# deep-research: Deep Research Skill
## Trigger

Activate this skill when the user wants to:

- "Research a topic", "literature review", "find papers about", "survey papers on"
- "Deep dive into [topic]", "what's the state of the art in [topic]"
- Uses the slash command `/research <topic>`
## Overview

This skill conducts systematic academic literature reviews in 6 phases, producing structured notes, a curated paper database, and a synthesized final report. Output is organized by phase for clarity.

Installation: `~/.claude/skills/deep-research/` — scripts, references, and this skill definition.

Output: `/Users/lingzhi/Code/deep-research-output/{slug}/`, relative to the current working directory.
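Output paths include a `{slug}` placeholder derived from the research topic. A minimal sketch of one way such a slug could be generated (a hypothetical helper for illustration, not part of the skill's bundled scripts):

```python
import re

def topic_slug(topic: str) -> str:
    """Lowercase the topic, keep alphanumeric runs, join with hyphens."""
    words = re.findall(r"[a-z0-9]+", topic.lower())
    return "-".join(words)

# topic_slug("Long-Horizon Reasoning Agents") -> "long-horizon-reasoning-agents"
```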
## CRITICAL: Strict Sequential Phase Execution
You MUST execute all 6 phases in strict order: 1 → 2 → 3 → 4 → 5 → 6. NEVER skip any phase.
This is the single most important rule of this skill. Violations include:
- ❌ Jumping from Phase 2 to Phase 5/6 (skipping Deep Dive and Code)
- ❌ Writing synthesis or report before completing Phase 3 deep reading
- ❌ Producing a final report based only on abstracts/titles from search results
- ❌ Combining or merging phases (e.g., doing "Phase 3-5 together")
## Phase Gate Protocol
Before starting Phase N+1, you MUST verify that Phase N's required output files exist on disk. If they don't exist, you have NOT completed that phase.

| Phase | Gate: Required Output Files |
|---|---|
| 1 → 2 | `phase1_frontier/frontier.md` |
| 2 → 3 | `phase2_survey/survey.md`, `paper_db.jsonl` |
| 3 → 4 | `phase3_deep_dive/deep_dive.md` |
| 4 → 5 | `phase4_code/code_repos.md` |
| 5 → 6 | `phase5_synthesis/synthesis.md`, `phase5_synthesis/gaps.md` |

After completing each phase, print a phase completion checkpoint:

✅ Phase N complete. Output: [list files written]. Proceeding to Phase N+1.
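The gate check is mechanical: before phase N+1, confirm phase N's files exist. A sketch of that check, taking the required file lists from each phase's "→ Output" line (an illustration of the protocol, not a bundled script):

```python
from pathlib import Path

# Required output files per completed phase, per this skill's phase outputs
PHASE_GATES = {
    1: ["phase1_frontier/frontier.md"],
    2: ["phase2_survey/survey.md", "paper_db.jsonl"],
    3: ["phase3_deep_dive/deep_dive.md"],
    4: ["phase4_code/code_repos.md"],
    5: ["phase5_synthesis/synthesis.md", "phase5_synthesis/gaps.md"],
}

def gate_passed(output_root: str, phase: int) -> bool:
    """True only if every required file for this phase exists on disk."""
    return all((Path(output_root) / f).is_file() for f in PHASE_GATES[phase])
```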
## Why Every Phase Matters
- Phase 3 (Deep Dive) is where you actually READ papers — without it, your synthesis is superficial and based only on abstracts
- Phase 4 (Code & Tools) grounds the research in practical implementations — without it, you miss the open-source ecosystem
- Phase 5 (Synthesis) requires deep knowledge from Phase 3 — you cannot synthesize papers you haven't read
- Phase 6 (Report) assembles content from ALL prior phases — it should cite specific findings from Phase 3 notes
## Paper Quality Policy
Peer-reviewed conference papers take priority over arXiv preprints. Many arXiv papers have not undergone peer review and may contain unverified claims.
### Source Priority (highest to lowest)
1. Top AI conferences: NeurIPS, ICLR, ICML, ACL, EMNLP, NAACL, AAAI, IJCAI, CVPR, KDD, CoRL
2. Peer-reviewed journals: JMLR, TACL, Nature, Science, etc.
3. Workshop papers: NeurIPS/ICML workshops (lower bar but still reviewed)
4. arXiv preprints with high citations: likely high-quality but unverified
5. Recent arXiv preprints: use cautiously, note "preprint" status explicitly
### When to Use arXiv Papers
- As supplementary evidence alongside peer-reviewed work
- For very recent results (< 3 months old) not yet at conferences
- When a peer-reviewed version doesn't exist yet — note `(preprint)` in citations
- For survey/review papers (these are useful even without peer review)
## Search Tools (by priority)
### 1. paper_finder (primary — conference papers only)
Location: `/Users/lingzhi/Code/documents/tool/paper_finder/paper_finder.py`

Searches ai-paper-finder.info (HuggingFace Space) for published conference papers. Supports filtering by conference + year. Outputs JSONL with BibTeX.

```bash
python /Users/lingzhi/Code/documents/tool/paper_finder/paper_finder.py --mode scrape --config <config.yaml>
python /Users/lingzhi/Code/documents/tool/paper_finder/paper_finder.py --mode download --jsonl <results.jsonl>
python /Users/lingzhi/Code/documents/tool/paper_finder/paper_finder.py --list-venues
```

Config example:

```yaml
searches:
  - query: "long horizon reasoning agent"
    num_results: 100
    venues:
      neurips: [2024, 2025]
      iclr: [2024, 2025, 2026]
      icml: [2024, 2025]
output:
  root: /Users/lingzhi/Code/deep-research-output/{slug}/phase1_frontier/search_results
  overwrite: true
```
### 2. search_semantic_scholar.py (supplementary — citation data + broader coverage)
Location: `/Users/lingzhi/.claude/skills/deep-research/scripts/search_semantic_scholar.py`

Supports `--peer-reviewed-only` and `--top-conferences` filters. API key: `/Users/lingzhi/Code/keys.md` (field `S2_API_Key`).
### 3. search_arxiv.py (supplementary — latest preprints)
Location: `/Users/lingzhi/.claude/skills/deep-research/scripts/search_arxiv.py`

For searching recent papers not yet published at conferences. Mark citations with `(preprint)`.

### Other Scripts
| Script | Location | Key Flags |
|---|---|---|
| `paper_db.py` | `/Users/lingzhi/.claude/skills/deep-research/scripts/paper_db.py` | subcommands: `merge`, `filter` (`--min-score`, `--max-papers`) |
| `download_papers.py` | `/Users/lingzhi/.claude/skills/deep-research/scripts/download_papers.py` | `--jsonl`, `--output-dir`, `--sort-by-citations`, `--max-downloads` |
| `search_semantic_scholar.py` | `/Users/lingzhi/.claude/skills/deep-research/scripts/search_semantic_scholar.py` | `--peer-reviewed-only`, `--top-conferences` |
| `search_arxiv.py` | `/Users/lingzhi/.claude/skills/deep-research/scripts/search_arxiv.py` | (none noted) |
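Conceptually, the merge and filter steps deduplicate results by paper ID and keep the highest-relevance records. A simplified sketch (the real `paper_db.py` behavior may differ; the `score` field is an assumption for illustration):

```python
def merge_records(*result_lists):
    """Deduplicate search results by arxiv_id, falling back to paperId."""
    seen, merged = set(), []
    for results in result_lists:
        for rec in results:
            key = rec.get("arxiv_id") or rec.get("paperId")
            if key and key not in seen:
                seen.add(key)
                merged.append(rec)
    return merged

def filter_records(records, min_score=0.80, max_papers=70):
    """Keep top-scoring records, mimicking filter --min-score / --max-papers."""
    kept = [r for r in records if r.get("score", 0) >= min_score]
    kept.sort(key=lambda r: r.get("score", 0), reverse=True)
    return kept[:max_papers]
```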
### WebFetch Mode (no Bash)
- Paper discovery: `WebSearch` + `WebFetch` to query Semantic Scholar/arXiv APIs
- Paper reading: `WebFetch` on ar5iv HTML or `Read` tool on downloaded PDFs
- Writing: `Write` tool for JSONL, notes, report files
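In WebFetch mode, discovery reduces to building a query URL and parsing the response. A minimal sketch for the arXiv Atom API (URL parameters follow arXiv's public query API; parsing is shown on a canned snippet rather than a live fetch):

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

def arxiv_query_url(query: str, max_results: int = 20) -> str:
    """Build an arXiv API search URL suitable for WebFetch."""
    params = {"search_query": f"all:{query}", "max_results": max_results}
    return "http://export.arxiv.org/api/query?" + urlencode(params)

ATOM = "{http://www.w3.org/2005/Atom}"

def parse_titles(atom_xml: str):
    """Extract entry titles from an arXiv Atom feed."""
    root = ET.fromstring(atom_xml)
    return [e.findtext(f"{ATOM}title") for e in root.findall(f"{ATOM}entry")]
```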
## 6-Phase Workflow
### Phase 1: Frontier
Search the latest conference proceedings and preprints to understand current trends.

- Write `phase1_frontier/paper_finder_config.yaml` targeting the latest 1-2 years
- Run paper_finder scrape
- WebSearch for latest accepted paper lists
- Identify trending directions, key breakthroughs

→ Output: `phase1_frontier/frontier.md`, `phase1_frontier/search_results/`
### Phase 2: Survey
Build a comprehensive landscape with a broader time range. Target 35-80 papers after filtering.

- Write `phase2_survey/paper_finder_config.yaml` covering 2023-2025
- Run paper_finder + Semantic Scholar + arXiv
- Merge all results: `python /Users/lingzhi/.claude/skills/deep-research/scripts/paper_db.py merge`
- Filter to the 35-80 most relevant: `python /Users/lingzhi/.claude/skills/deep-research/scripts/paper_db.py filter --min-score 0.80 --max-papers 70`
- Cluster by theme, write survey notes

→ Output: `phase2_survey/survey.md`, `phase2_survey/search_results/`, `paper_db.jsonl`
### Phase 3: Deep Dive ⚠️ DO NOT SKIP
This phase is MANDATORY. You must actually READ 8-15 full papers, not just their abstracts.

- Select 8-15 papers from paper_db.jsonl with rationale → write `phase3_deep_dive/selection.md`
- Download PDFs: `python download_papers.py --jsonl paper_db.jsonl --output-dir phase3_deep_dive/papers/ --sort-by-citations --max-downloads 15`
- For EACH selected paper, read the full text (PDF via `Read`, or HTML via `WebFetch` on ar5iv)
- Write detailed structured notes per paper (see the note-format.md template): problem, contributions, methodology, experiments, limitations, connections
- Write ALL notes → `phase3_deep_dive/deep_dive.md`

Phase 3 Gate: `deep_dive.md` must contain detailed notes for ≥8 papers, each with methodology and experiment sections filled in. Abstract-only summaries do NOT count.

→ Output: `phase3_deep_dive/selection.md`, `phase3_deep_dive/deep_dive.md`, `phase3_deep_dive/papers/`
### Phase 4: Code & Tools ⚠️ DO NOT SKIP
This phase is MANDATORY. You must survey the open-source ecosystem.

- Extract GitHub URLs from papers read in Phase 3
- WebSearch for implementations: "site:github.com {method name}", "site:paperswithcode.com {topic}"
- For each repo found: record URL, stars, language, last updated, documentation quality
- Search for related benchmarks and datasets
- Write `phase4_code/code_repos.md` (must contain ≥3 repositories)

Phase 4 Gate: `code_repos.md` must exist and contain at least 3 repositories with metadata.

→ Output: `phase4_code/code_repos.md`
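The per-repo metadata maps naturally onto a small record type. An illustrative sketch of collecting it into a code_repos.md table row (the field names and row format here are assumptions, not a fixed schema):

```python
from dataclasses import dataclass

@dataclass
class Repo:
    """One surveyed repository: URL plus the metadata fields noted above."""
    url: str
    stars: int
    language: str
    last_updated: str   # e.g. "2025-01"
    docs_quality: str   # e.g. "good", "minimal"

def to_md_row(r: Repo) -> str:
    """Render one repository as a markdown table row for code_repos.md."""
    return f"| {r.url} | {r.stars} | {r.language} | {r.last_updated} | {r.docs_quality} |"
```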
### Phase 5: Synthesis (REQUIRES Phase 3 + 4 complete)
Cross-paper analysis. Weight peer-reviewed findings higher.

This phase MUST build on the detailed notes from Phase 3 and the code landscape from Phase 4. Produce a taxonomy, comparative tables, and a gap analysis.

Before starting: Verify that `phase3_deep_dive/deep_dive.md` and `phase4_code/code_repos.md` exist. If not, go back and complete those phases first.

→ Output: `phase5_synthesis/synthesis.md`, `phase5_synthesis/gaps.md`
### Phase 6: Compilation (REQUIRES Phase 1-5 complete)
Assemble the final report from ALL prior phase outputs. Mark preprint citations with the `(preprint)` suffix.

Before starting: Verify that ALL phase outputs exist:

- `phase1_frontier/frontier.md`
- `phase2_survey/survey.md`
- `phase3_deep_dive/deep_dive.md`
- `phase4_code/code_repos.md`
- `phase5_synthesis/synthesis.md` + `gaps.md`

If ANY are missing, go back and complete the missing phase(s) first.

→ Output: `phase6_report/report.md`, `phase6_report/references.bib`
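A references.bib entry following the firstAuthorYearWord key convention might look like the sketch below. The first entry is a real paper often cited with this key; the second is a hypothetical example showing one way a preprint could be annotated:

```bibtex
@inproceedings{vaswani2017attention,
  title     = {Attention Is All You Need},
  author    = {Vaswani, Ashish and others},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2017}
}

@misc{someauthor2025method,
  title  = {A Hypothetical Example Preprint},
  author = {Some Author and others},
  year   = {2025},
  note   = {arXiv preprint; cite as (preprint)}
}
```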
## Output Directory
```
output/{topic-slug}/
├── paper_db.jsonl              # Master database (accumulated)
├── phase1_frontier/
│   ├── paper_finder_config.yaml
│   ├── search_results/
│   └── frontier.md
├── phase2_survey/
│   ├── paper_finder_config.yaml
│   ├── search_results/
│   └── survey.md
├── phase3_deep_dive/
│   ├── papers/
│   ├── selection.md
│   └── deep_dive.md
├── phase4_code/
│   └── code_repos.md
├── phase5_synthesis/
│   ├── synthesis.md
│   └── gaps.md
└── phase6_report/
    ├── report.md
    └── references.bib
```

## Key Conventions
- Paper IDs: Use `arxiv_id` when available, otherwise Semantic Scholar `paperId`
- Citations: `[@key]` format, key = firstAuthorYearWord (e.g., `[@vaswani2017attention]`)
- JSONL schema: title, authors, abstract, year, venue, venue_normalized, peer_reviewed, citationCount, paperId, arxiv_id, pdf_url, tags, source
- Preprint marking: Always note `(preprint)` when citing non-peer-reviewed work
- Incremental saves: Each phase writes to disk immediately
- Paper count: Target 35-80 papers in the final paper_db.jsonl (use `paper_db.py filter`)
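The schema and citation-key conventions above can be checked mechanically. A sketch (the field list is copied from the schema bullet; the helpers themselves are illustrative, not part of the skill's scripts):

```python
REQUIRED_FIELDS = [
    "title", "authors", "abstract", "year", "venue", "venue_normalized",
    "peer_reviewed", "citationCount", "paperId", "arxiv_id", "pdf_url",
    "tags", "source",
]

def missing_fields(rec: dict) -> list:
    """Return the schema fields absent from a paper_db.jsonl record."""
    return [f for f in REQUIRED_FIELDS if f not in rec]

def citation_key(first_author_last: str, year: int, word: str) -> str:
    """firstAuthorYearWord, e.g. ('Vaswani', 2017, 'attention') -> 'vaswani2017attention'."""
    return f"{first_author_last.lower()}{year}{word.lower()}"
```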
## References
- `/Users/lingzhi/.claude/skills/deep-research/references/workflow-phases.md` — detailed 6-phase methodology
- `/Users/lingzhi/.claude/skills/deep-research/references/note-format.md` — note templates, BibTeX format, report structure
- `/Users/lingzhi/.claude/skills/deep-research/references/api-reference.md` — arXiv, Semantic Scholar, ar5iv API guide
## Related Skills
- Downstream: literature-search, literature-review, citation-management
- See also: novelty-assessment, survey-generation