dossier-collect
Dossier Collect
Recursive parallel investigation that builds a graph-structured dossier on a seed entity.
When to use
You have a seed (a username, file, symbol, ADR-id, URL, or concept) and want to expand outward, discovering every connected entity with provenance per claim, rather than answer a specific question.
For specific questions use `deep-research`. For multi-step plans use `goal-plan`.
Steps
1. Detect seed type: classify the seed as one of `username` (handle), `file` (path), `symbol` (code identifier), `adr` (ADR-NNN), `url`, or `concept` (free text).
2. Pick sources: match the source matrix to the seed type. Default: all applicable.
3. Start trajectory: call `mcp__claude-flow__hooks_intelligence_trajectory-start` with task `dossier:<slug>`.
4. Round 0 fan-out: issue ALL source queries in ONE message. Examples:
   - For `username`: `WebSearch`, `WebFetch` on github.com/<user>, `mcp__claude-flow__memory_search_unified`
   - For `adr`: `Read` ADR file, `Grep` references, `mcp__claude-flow__memory_search` namespace `adr`
   - For `symbol`: `Grep`, `Glob`, `mcp__claude-flow__embeddings_search`
5. Extract entities: from each hit, surface entities (people, repos, files, ADRs, URLs, terms). Use lightweight regex and heuristics; no LLM extraction unless a hit is ambiguous.
6. De-dup: drop entities already in the dossier. If `--exact` is unset, also drop entities whose embedding cosine similarity to an existing node is ≥ 0.92.
7. Round k recursion: for each new entity (capped at `--max-breadth` per source), recurse to step 4 until depth ≥ `--max-depth` OR the budget is exhausted.
8. Aggregate: build a `{ nodes, edges }` graph. Each node carries `{ id, type, attrs, sources: [...] }`. Each edge carries `{ from, to, kind, source, confidence }`.
9. Render artifacts:
   - `<slug>.md`: executive summary, entity table, mermaid graph, source-provenance footnotes
   - `<slug>.json`: machine-readable graph
   - Default location: `v3/docs/examples/dossiers/<slug>/`
10. Persist: `mcp__claude-flow__memory_store` namespace `dossier` key `<slug>`.
11. End trajectory: `mcp__claude-flow__hooks_intelligence_trajectory-end` with success status.
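The seed classification in the first step could be approximated with a handful of ordered regex checks. The sketch below is only an illustration: the function name `detect_seed_type`, the patterns, and the order in which they are tried are all assumptions, not the skill's actual rules.

```python
import re

# Hypothetical heuristics; the patterns and their order are assumptions.
URL_RE = re.compile(r"^https?://\S+$", re.IGNORECASE)
ADR_RE = re.compile(r"^ADR-\d+$", re.IGNORECASE)
FILE_RE = re.compile(r"^[\w.\-]+\.[A-Za-z0-9]{1,5}$")        # bare name with an extension
IDENT_RE = re.compile(r"^[A-Za-z_]\w*(?:::[A-Za-z_]\w*)*$")  # foo_bar, FooBar, ns::name
HANDLE_RE = re.compile(r"^@?[A-Za-z0-9][A-Za-z0-9-]{0,38}$")


def looks_like_code(s: str) -> bool:
    """True for identifiers with an underscore, a '::' separator, or inner capitals."""
    inner_upper = any(c.isupper() for c in s[1:]) and any(c.islower() for c in s)
    return "_" in s or "::" in s or inner_upper


def detect_seed_type(seed: str) -> str:
    """Return one of: url, adr, file, symbol, username, concept."""
    s = seed.strip()
    has_space = any(c.isspace() for c in s)
    if URL_RE.match(s):
        return "url"
    if ADR_RE.match(s):
        return "adr"
    if not has_space and ("/" in s or FILE_RE.match(s)):
        return "file"
    if IDENT_RE.match(s) and looks_like_code(s):
        return "symbol"
    if HANDLE_RE.match(s):
        return "username"
    return "concept"  # free text is the fallback


if __name__ == "__main__":
    for seed in ["ruvnet", "ADR-097", "src/memory/hnsw.ts", "graph-structured dossier"]:
        print(seed, "->", detect_seed_type(seed))
```

A bare lowercase token is inherently ambiguous between a handle and a concept; this sketch resolves it toward `username`, which may not match the real behavior.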
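The de-dup step combines an exact-id check with a near-duplicate check at cosine similarity ≥ 0.92. A minimal sketch of that rule follows, assuming each entity already has an embedding vector; the names `cosine` and `dedupe` and the dict-of-vectors layout are hypothetical, and treating entities accepted earlier in the same round as existing nodes is also an assumption.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def dedupe(candidates, existing, exact=False, threshold=0.92):
    """Return the candidates (id -> embedding) that are new to the dossier.

    exact=True mirrors --exact being set: only identical ids are dropped.
    """
    kept = {}
    for entity_id, vector in candidates.items():
        if entity_id in existing or entity_id in kept:
            continue  # already in the dossier
        if not exact:
            # Assumption: entities accepted earlier in this pass count as existing nodes.
            pool = list(existing.values()) + list(kept.values())
            if any(cosine(vector, other) >= threshold for other in pool):
                continue  # near-duplicate of an existing node
        kept[entity_id] = vector
    return kept


if __name__ == "__main__":
    existing = {"ruvnet": [1.0, 0.0]}
    candidates = {"ruvnet": [1.0, 0.0], "ruv-net": [0.99, 0.05], "ruflo": [0.0, 1.0]}
    print(list(dedupe(candidates, existing)))              # near-duplicate dropped
    print(list(dedupe(candidates, existing, exact=True)))  # only the exact id dropped
```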
Output schema (JSON)
```json
{
  "seed": "ruvnet",
  "seedType": "username",
  "depth": 2,
  "truncated": false,
  "generatedAt": "ISO-8601",
  "nodes": [
    { "id": "ruvnet", "type": "username", "attrs": { "...": "..." }, "sources": ["WebSearch", "github.com"] }
  ],
  "edges": [
    { "from": "ruvnet", "to": "ruflo", "kind": "owns", "source": "github.com", "confidence": "high" }
  ],
  "stats": { "nodesByType": {}, "sourcesUsed": [], "tokensSpent": 0 }
}
```
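As a rough illustration of how the aggregate step might produce this shape, the sketch below folds a list of observed relationships into `nodes`, `edges`, and `stats`. The function name `build_dossier`, the tuple layout of an observation, and the `repo` type label are assumptions made for the example; token accounting is left at 0 because it is not modelled here.

```python
import json
from datetime import datetime, timezone


def build_dossier(seed, seed_type, depth, observations, truncated=False):
    """Aggregate observations into the JSON shape shown above.

    Each observation is a hypothetical tuple:
    (from_id, from_type, to_id, to_type, kind, source, confidence)
    """
    nodes = {seed: {"id": seed, "type": seed_type, "attrs": {}, "sources": []}}
    edges = []

    def touch(node_id, node_type, source):
        # Create the node on first sight, then record provenance once per source.
        node = nodes.setdefault(
            node_id, {"id": node_id, "type": node_type, "attrs": {}, "sources": []}
        )
        if source not in node["sources"]:
            node["sources"].append(source)

    for from_id, from_type, to_id, to_type, kind, source, confidence in observations:
        touch(from_id, from_type, source)
        touch(to_id, to_type, source)
        edges.append({"from": from_id, "to": to_id, "kind": kind,
                      "source": source, "confidence": confidence})

    nodes_by_type = {}
    for node in nodes.values():
        nodes_by_type[node["type"]] = nodes_by_type.get(node["type"], 0) + 1
    sources_used = sorted({s for node in nodes.values() for s in node["sources"]})

    return {
        "seed": seed,
        "seedType": seed_type,
        "depth": depth,
        "truncated": truncated,
        "generatedAt": datetime.now(timezone.utc).isoformat(),
        "nodes": list(nodes.values()),
        "edges": edges,
        "stats": {"nodesByType": nodes_by_type, "sourcesUsed": sources_used, "tokensSpent": 0},
    }


if __name__ == "__main__":
    observations = [
        ("ruvnet", "username", "ruflo", "repo", "owns", "github.com", "high"),
        ("ruvnet", "username", "ruflo", "repo", "owns", "WebSearch", "high"),
    ]
    print(json.dumps(build_dossier("ruvnet", "username", 1, observations), indent=2))
```

Keeping `sources` on every node and `source` on every edge is what preserves provenance per claim when the same entity is reached through more than one source.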
Budget discipline
- If `--budget-usd` is set, track approximate cost via trajectory. On exhaustion: emit partial dossier with `truncated: true` and the entities still queued.
- BFS expansion only: finish round k before round k+1.
- Never silently truncate. Always mark the dossier as truncated and record what was skipped.
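These rules can be sketched as a round-by-round breadth-first loop that stops cleanly when the budget runs out. Everything named here is hypothetical: `expand`, the `neighbors` callable standing in for the per-entity fan-out, and the flat `cost_per_query`. For brevity the breadth cap is applied per entity rather than per source.

```python
def expand(seed, neighbors, max_depth, max_breadth, cost_per_query=0.0, budget_usd=None):
    """Breadth-first expansion that reports truncation instead of hiding it."""
    discovered = [seed]   # insertion-ordered list of every entity seen
    seen = {seed}
    frontier = [seed]
    spent = 0.0
    rounds = 0

    while frontier and rounds < max_depth:
        next_frontier = []
        for i, entity in enumerate(frontier):
            if budget_usd is not None and spent + cost_per_query > budget_usd:
                # Budget exhausted: record everything still queued, then stop.
                skipped = frontier[i:] + next_frontier
                return {"discovered": discovered, "truncated": True,
                        "skipped": skipped, "spent": spent}
            spent += cost_per_query
            fresh = [n for n in neighbors(entity) if n not in seen][:max_breadth]
            seen.update(fresh)
            discovered.extend(fresh)
            next_frontier.extend(fresh)
        # Round k is fully finished before round k+1 starts.
        frontier = next_frontier
        rounds += 1

    return {"discovered": discovered, "truncated": False, "skipped": [], "spent": spent}


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["d"], "c": ["d", "e"], "d": [], "e": []}
    lookup = lambda entity: graph.get(entity, [])
    print(expand("a", lookup, max_depth=3, max_breadth=5))
    print(expand("a", lookup, max_depth=3, max_breadth=5, cost_per_query=0.25, budget_usd=0.5))
```

In the second call the budget covers only two queries, so the result carries `truncated: True` together with the entities that were queued but never expanded.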
Examples
```
/ruflo-goals:dossier-collect ruvnet
/ruflo-goals:dossier-collect ADR-097 --max-depth 1
/ruflo-goals:dossier-collect "src/memory/hnsw.ts" --sources codebase,git,memory
/ruflo-goals:dossier-collect "ruflo-goals" --max-breadth 5 --budget-usd 1
```