xhs-search-workflow
XHS Search Workflow
Setup
Run once on a new machine:

```bash
skills/xhs-search-workflow/scripts/setup_env.sh
```

This creates `skills/xhs-search-workflow/.venv` and installs Python dependencies.

Cookie Input
Use either:
- `--cookie "..."`
- `--env-file /path/to/.env` with `COOKIES="..."` set in the file

Add `--no-env-proxy` when host proxy environment variables break network access.
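As a rough illustration of the two cookie inputs, the sketch below reads a `COOKIES="..."` line from `.env`-style text and falls back to it when no `--cookie` value is given. The function names `read_cookies` and `resolve_cookie`, the quote handling, and the precedence of `--cookie` over the env file are assumptions made for illustration; the skill's scripts may resolve cookies differently.

```python
from pathlib import Path
from typing import Optional


def read_cookies(env_text: str) -> str:
    """Return the COOKIES value from .env-style text, or "" if absent."""
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, since cookie strings contain "=" too.
        key, sep, value = line.partition("=")
        if sep and key.strip() == "COOKIES":
            value = value.strip()
            # Drop one pair of matching surrounding quotes, if present.
            if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
                value = value[1:-1]
            return value
    return ""


def resolve_cookie(cookie_arg: Optional[str], env_file: Optional[str]) -> str:
    """Assumed precedence: an explicit --cookie value wins over the env file."""
    if cookie_arg:
        return cookie_arg
    if env_file:
        return read_cookies(Path(env_file).read_text(encoding="utf-8"))
    return ""
```

Splitting on the first `=` matters here because a cookie string is itself a list of `name=value` pairs.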
Main Scripts
- `scripts/search_notes.py`: note search (supports advanced filters)
- `scripts/fetch_note_texts.py`: extract note text and image URLs, with optional image download
- `scripts/xhs_full_cli.py`: unified entry point for the user/comment/message/homefeed/creator/no-watermark APIs
- `scripts/export_notes.py`: export note data to Excel and/or media files
Typical Commands
1) Search notes

```bash
skills/xhs-search-workflow/.venv/bin/python \
  skills/xhs-search-workflow/scripts/search_notes.py "汇丰银行" \
  --num 10 --sort 0 --note-type 0 --no-env-proxy --json
```

2) Extract note text + image URLs

```bash
skills/xhs-search-workflow/.venv/bin/python \
  skills/xhs-search-workflow/scripts/fetch_note_texts.py \
  --url-file note_urls.txt --no-env-proxy \
  --timeout 30 --retries 2 --min-interval 4 --max-interval 7 \
  --out note_content.json
```

3) Download note images while extracting

```bash
skills/xhs-search-workflow/.venv/bin/python \
  skills/xhs-search-workflow/scripts/fetch_note_texts.py \
  --url-file note_urls.txt --no-env-proxy \
  --download-images --image-dir xhs_images \
  --timeout 30 --retries 2 --min-interval 4 --max-interval 7 \
  --out note_content.json
```

4) Full API CLI examples
Search users:

```bash
skills/xhs-search-workflow/.venv/bin/python \
  skills/xhs-search-workflow/scripts/xhs_full_cli.py \
  --env-file .env --no-env-proxy search-users --query "汇丰银行" --num 10
```

Get note comments:

```bash
skills/xhs-search-workflow/.venv/bin/python \
  skills/xhs-search-workflow/scripts/xhs_full_cli.py \
  --env-file .env --no-env-proxy note-comments \
  --url "https://www.xiaohongshu.com/explore/<note_id>?xsec_token=<token>"
```

Get creator posted notes:

```bash
skills/xhs-search-workflow/.venv/bin/python \
  skills/xhs-search-workflow/scripts/xhs_full_cli.py \
  --env-file .env --no-env-proxy creator-posted
```

No-watermark URL conversion:

```bash
skills/xhs-search-workflow/.venv/bin/python \
  skills/xhs-search-workflow/scripts/xhs_full_cli.py \
  --no-env-proxy no-water-img --img-url "https://..."
```

5) Export Excel/media
From query:

```bash
skills/xhs-search-workflow/.venv/bin/python \
  skills/xhs-search-workflow/scripts/export_notes.py \
  --query "汇丰银行" --num 10 --save all \
  --excel xhs_notes.xlsx --media-dir xhs_media --no-env-proxy
```

From URL file:

```bash
skills/xhs-search-workflow/.venv/bin/python \
  skills/xhs-search-workflow/scripts/export_notes.py \
  --url-file note_urls.txt --save excel --excel xhs_notes.xlsx --no-env-proxy
```

xhs_full_cli.py Subcommands
- `user-info --user-id <id>`
- `user-self-info`
- `user-self-info2`
- `user-posts --user-url <url>`
- `user-likes --user-url <url>`
- `user-collects --user-url <url>`
- `note-info --url <url>`
- `note-comments --url <url>`
- `search-keyword --word <kw>`
- `search-users --query <kw> --num <n>`
- `messages-unread`
- `messages-mentions`
- `messages-likes`
- `messages-connections`
- `homefeed-channels`
- `homefeed-recommend --category <name> --num <n>`
- `creator-posted`
- `no-water-video --note-id <id>`
- `no-water-img --img-url <url>`
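When these subcommands are driven from another Python program, a small helper can keep the argument order consistent with the command examples in this document, where `--env-file` and `--no-env-proxy` come before the subcommand. The helper name `build_cli_argv` and its parameters are hypothetical and are not part of the skill; the interpreter and script paths are the ones used throughout this document.

```python
from typing import List, Optional, Sequence

# Paths as used in this document's command examples.
PYTHON = "skills/xhs-search-workflow/.venv/bin/python"
CLI = "skills/xhs-search-workflow/scripts/xhs_full_cli.py"


def build_cli_argv(
    subcommand: str,
    sub_args: Sequence[str] = (),
    env_file: Optional[str] = None,
    no_env_proxy: bool = True,
) -> List[str]:
    """Build an argv list with global flags placed before the subcommand."""
    argv = [PYTHON, CLI]
    # Global flags first.
    if env_file:
        argv += ["--env-file", env_file]
    if no_env_proxy:
        argv.append("--no-env-proxy")
    # Then the subcommand and its own options.
    argv.append(subcommand)
    argv.extend(sub_args)
    return argv
```

The resulting list can be passed to `subprocess.run(argv, check=True)`. Passing a list instead of a shell string avoids quoting problems with non-ASCII queries and URLs that contain `?` or `&`.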
Offline/Portable Design
- Skill bundles signing JS in `assets/js/`.
- Skill bundles offline `crypto-js` at `assets/js/vendor/crypto-js.js`.
- Skill does not import `apis/` or `xhs_utils/` from the original repository.
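Before copying the skill to another machine, a quick pre-flight check that the bundled files are present can save a debugging round. The following is a minimal sketch, not the skill's own asset check: the function name `missing_assets` is hypothetical, and only the two paths named in the list above are checked.

```python
from pathlib import Path
from typing import List

# Relative paths named in this section; other bundled files are not listed here.
REQUIRED_ASSETS = ("assets/js", "assets/js/vendor/crypto-js.js")


def missing_assets(skill_root: str) -> List[str]:
    """Return the required relative paths that do not exist under skill_root."""
    root = Path(skill_root)
    return [rel for rel in REQUIRED_ASSETS if not (root / rel).exists()]
```

For example, `missing_assets("skills/xhs-search-workflow")` returns an empty list when both paths are in place.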
Validation
Run after edits:

```bash
skills/xhs-search-workflow/.venv/bin/python \
  "$CODEX_HOME/skills/.system/skill-creator/scripts/quick_validate.py" \
  skills/xhs-search-workflow
```

Basic smoke tests:

```bash
skills/xhs-search-workflow/.venv/bin/python skills/xhs-search-workflow/scripts/xhs_full_cli.py --help
skills/xhs-search-workflow/.venv/bin/python skills/xhs-search-workflow/scripts/export_notes.py --help
```

Execution Notes
- Prefer `skills/xhs-search-workflow/.venv/bin/python` over the system `python`.
- If the environment changed, rerun `scripts/setup_env.sh` before debugging.
- Keep `assets/js/vendor/crypto-js.js` with the skill for cross-machine offline use.
- Scripts force UTF-8 stdout/stderr; on Windows, also set `PYTHONUTF8=1` and `PYTHONIOENCODING=utf-8`.
- `scripts/xhs_client.py` auto-checks JS assets and syncs `assets/js/static/xhs_xray_pack{1,2}.js` for runtime compatibility.
- For `xhs_full_cli.py`, place global flags before the subcommand:
  - Correct: `xhs_full_cli.py --env-file .env --no-env-proxy <subcommand> ...`
  - Wrong: `xhs_full_cli.py <subcommand> ... --env-file .env`
- `messages-mentions`, `messages-likes`, and `messages-connections` can return very large JSON; prefer writing the result to a file with `--out`.
- `fetch_note_texts.py` defaults to serial throttling and retries to reduce hangs and risk-control issues.
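The idea of serial throttling with retries can be sketched as follows. This is not the code in `fetch_note_texts.py`; it is a minimal illustration under stated assumptions: requests run one at a time, a random pause between `min_interval` and `max_interval` seconds separates consecutive URLs, and `retries` counts extra attempts after the first failure. The function `fetch_serially` and the `fetch_one` callback are hypothetical names.

```python
import random
import time
from typing import Callable, Iterable, List, Optional, Tuple


def fetch_serially(
    urls: Iterable[str],
    fetch_one: Callable[[str], object],
    retries: int = 2,
    min_interval: float = 4.0,
    max_interval: float = 7.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[str, Optional[object], Optional[Exception]]]:
    """Fetch URLs one by one; return (url, result, error) for each."""
    results: List[Tuple[str, Optional[object], Optional[Exception]]] = []
    for index, url in enumerate(urls):
        # Pause a random amount between consecutive URLs (not before the first).
        if index > 0:
            sleep(random.uniform(min_interval, max_interval))
        last_error: Optional[Exception] = None
        # One initial attempt plus `retries` further attempts (assumed semantics).
        for _attempt in range(retries + 1):
            try:
                results.append((url, fetch_one(url), None))
                break
            except Exception as exc:
                last_error = exc
        else:
            # Every attempt failed: record the last error and move on.
            results.append((url, None, last_error))
    return results
```

Recording a failure and moving on, instead of raising, means one blocked note does not abort a long batch. The `sleep` parameter is injectable so the pause can be skipped in tests.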
Troubleshooting
See `references/troubleshooting.md`.