xhs-batch
The user wants to batch-extract multiple Xiaohongshu posts. Process them with the following steps:
Constant Definitions
- Cookies file: `~/cookies.json`
- Obsidian save directory: `~/Documents/Obsidian Vault/xhs`
- Whisper model: `mlx-community/whisper-large-v3-turbo`
Input
List of links provided by the user: $ARGUMENTS
Workflow
Step 1: Parse Links
Extract all Xiaohongshu links from the input (multi-line, space-separated, and comma-separated input are all supported).
For each link, extract the post ID and xsec_token.
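The parsing step could be sketched roughly as follows. This is a minimal illustration, not the actual implementation behind `/xhs`: it assumes that links are http(s) URLs on a `xiaohongshu.com` host, that the post ID is the last path segment, and that `xsec_token` is a query parameter. The function name `parse_links` and the returned record shape are hypothetical.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumption: a link ends at whitespace, an ASCII comma, or a full-width comma.
URL_PATTERN = re.compile(r"https?://[^\s,,]+")


def parse_links(raw: str) -> list[dict]:
    """Collect Xiaohongshu links with their post ID and xsec_token (hypothetical helper)."""
    posts = []
    seen = set()
    for url in URL_PATTERN.findall(raw):
        parsed = urlparse(url)
        host = parsed.hostname or ""
        # Skip anything that is not on the (assumed) xiaohongshu.com domain.
        if host != "xiaohongshu.com" and not host.endswith(".xiaohongshu.com"):
            continue
        # Assumption: the post ID is the final path segment.
        post_id = parsed.path.rstrip("/").rsplit("/", 1)[-1]
        # xsec_token is None when the link carries no such parameter.
        token = parse_qs(parsed.query).get("xsec_token", [None])[0]
        if post_id and post_id not in seen:
            seen.add(post_id)
            posts.append({"url": url, "post_id": post_id, "xsec_token": token})
    return posts
```

Duplicate post IDs are dropped so that the same post is not extracted twice. Short links that redirect to the full URL would need to be resolved first, which this sketch does not attempt.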
Step 2: Check Cookies
Check whether `~/cookies.json` exists. If it does not, guide the user through exporting it by following Step 0 of `/xhs`.
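A minimal sketch of the existence check, assuming it is done in Python. The helper name `cookies_available` is hypothetical, and the sketch only tests that the file is present, not that the cookies inside are still valid.

```python
from pathlib import Path

# Path taken from the constants above; "~" is expanded to the user's home directory.
COOKIES_FILE = Path("~/cookies.json").expanduser()


def cookies_available(path: Path = COOKIES_FILE) -> bool:
    """Return True when the cookies file exists as a regular file (hypothetical helper)."""
    return path.is_file()
```

When `cookies_available()` returns `False`, the batch should stop before any request is made and fall back to the export instructions.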
Step 3: Extract One by One
For each link, run the complete extraction workflow of `/xhs` (Steps 2-4):
- Request the page → parse INITIAL_STATE
- Transcribe the audio of video posts
- Organize the content in Peter Thiel's style
- Save as `{YYYY-MM-DD} {Short Title}.md`
Wait 3 seconds between posts to avoid triggering anti-scraping measures.
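The sequential loop with its 3-second pause might look like the sketch below. It is an illustration under stated assumptions: `extract_one` stands in for the per-post `/xhs` workflow and is supplied by the caller, `process_links` and `build_filename` are hypothetical names, and replacing `/` in titles is an assumed sanitising rule that the source does not specify.

```python
import time
from datetime import date

DELAY_SECONDS = 3  # pause between posts, as stated in Step 3


def build_filename(day: date, short_title: str) -> str:
    """Build '{YYYY-MM-DD} {Short Title}.md' (hypothetical helper)."""
    # Assumed rule: replace path separators so the title is a valid file name.
    safe_title = short_title.replace("/", "-").strip()
    return f"{day.isoformat()} {safe_title}.md"


def process_links(links, extract_one, delay=DELAY_SECONDS, sleep=time.sleep):
    """Run extract_one on each link in order, pausing between posts."""
    results = []
    for index, link in enumerate(links):
        if index > 0:
            sleep(delay)  # no pause is needed before the first post
        try:
            results.append({"link": link, "ok": True, "file": extract_one(link)})
        except Exception as exc:
            # One failed post should not abort the rest of the batch.
            results.append({"link": link, "ok": False, "error": str(exc)})
    return results
```

Passing `sleep` as a parameter keeps the loop testable without real waiting, and catching exceptions per post lets the batch record failures and continue.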
Step 4: Generate Summary Report
After all posts have been processed, output a brief summary:
- Number of successful and failed extractions
- File name and one-sentence summary for each post
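One possible way to assemble that summary is sketched here. The record shape (`ok`, `link`, `file`, `summary`, `error`) and the function name `format_summary` are assumptions made for illustration. Listing the failed links with their error is also an addition beyond what the step strictly requires.

```python
def format_summary(results: list[dict]) -> str:
    """Render success/failure counts plus one line per post (hypothetical helper)."""
    succeeded = [r for r in results if r["ok"]]
    failed = [r for r in results if not r["ok"]]
    lines = [f"Succeeded: {len(succeeded)} | Failed: {len(failed)}"]
    for r in succeeded:
        # File name followed by the one-sentence summary of the post.
        lines.append(f"- {r['file']}: {r['summary']}")
    for r in failed:
        # Assumed extra: show which link failed and why.
        lines.append(f"- FAILED {r['link']}: {r['error']}")
    return "\n".join(lines)
```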