synthesis

Synthesis

You are performing multi-source research synthesis — aggregating, comparing, and composing insights from multiple sources (papers, blog posts, docs, discussions, code) into a coherent answer. The goal is a single composed response that draws on all sources, not a list of summaries.
This skill exists because loading 3+ full web pages or documents into your main context window is wasteful and noisy. The pattern (Recursive Language Model — see the RLM substrate section in CLAUDE.md) is: fetch each source, store it in shared state, dispatch a sub-agent per source to extract key claims, then dispatch a final synthesis sub-agent to compose the answer.
This skill complements the existing `research` skill. Research handles single small sources (fetch → read → answer). Synthesis handles the cases where multiple sources or large sources make direct reading impractical.

When to use

Trigger this skill when ANY of these hold:
  • 3 or more sources on a single topic need to be compared or aggregated
  • Any single source exceeds 5KB (roughly 150 lines / 1,250 tokens) — too large to read efficiently in main context
  • A question requires cross-referencing claims from different authors or documents
  • A community issue links to multiple external references that need to be digested together

When NOT to use

  • 1-2 sources, all under 5KB. Use the `research` skill's single-source procedure: direct `curl` and read. Sub-agent overhead exceeds the savings.
  • The sources are code files, not prose. Use `explore-codebase` instead; it's optimized for structural code comprehension, not prose synthesis.
  • You already know the answer. Don't synthesize to confirm what you already understand. That's burning sub-agent budget for validation theater.
  • You're inside a sub-agent at depth 3. Stop. Return what you have. Do not dispatch further.

Procedure

1. Frame the research question (single sentence)

Examples of well-framed questions:
  • "What are the tradeoffs between streaming JSON parsing libraries in Rust (serde_json, simd-json, sonic-rs)?"
  • "How do different coding agents (Aider, Continue, Cursor) handle context window management?"
  • "What does the academic literature say about recursive LLM agent architectures?"
A good question names a specific topic and what you want to learn. Don't ask vague questions like "tell me about Rust".

2. Gather sources

Fetch each source via `bash`. Typical patterns:

Web page

curl -sL "https://example.com/article" | sed 's/<[^>]*>//g' | head -500

GitHub README

Documentation page

curl -sL "https://docs.rs/crate/latest/crate/" | sed 's/<[^>]*>//g' | head -500

Search results (to find sources)

curl -s "https://lite.duckduckgo.com/lite?q=your+query" | sed 's/<[^>]*>//g' | head -60

Aim for 3-7 sources. More than 7 rarely adds insight — diminishing returns set in fast.
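
When a page's HTML is already held in a string, the `sed 's/<[^>]*>//g' | head -500` step above can be approximated in Python. This is a minimal sketch, not part of the skill itself: `strip_tags` and `max_lines` are hypothetical names, and the regex is as crude as the `sed` version. One difference is that `sed` works line by line, while this pattern also removes tags that span several lines.

```python
import re

# Anything between "<" and the next ">" is treated as a tag (same pattern as the sed call).
TAG_RE = re.compile(r"<[^>]*>")


def strip_tags(html: str, max_lines: int = 500) -> str:
    """Drop tag-like spans, then keep only the first max_lines lines (like head -N)."""
    text = TAG_RE.sub("", html)
    return "\n".join(text.splitlines()[:max_lines])


page = "<html><body><h1>Title</h1>\n<p>First paragraph.</p>\n<p>Second.</p></body></html>"
print(strip_tags(page, max_lines=2))
```

Script and style contents survive this kind of stripping, so the result is only a rough plain-text view of the page.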

3. Decide: direct synthesis or sub-agent dispatch?

Estimate total content size (repeat for each source):
echo "$SOURCE_1" | wc -c

- **Total < 5KB across all sources**: Synthesize directly in your main context. Skip sub-agents — the overhead isn't worth it.
- **Total ≥ 5KB**: Proceed with sub-agent dispatch (Step 4).
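
The size check above can be sketched in Python as follows. The 5,000-byte reading of "5KB" and the names `total_bytes` and `choose_mode` are assumptions made for illustration. Sizes are measured in UTF-8 bytes to match `wc -c`, although `echo` adds a trailing newline that this sketch does not count.

```python
DIRECT_LIMIT_BYTES = 5_000  # assumption: "5KB" is read as 5,000 bytes


def total_bytes(sources: list[str]) -> int:
    """Sum the UTF-8 byte length of every fetched source."""
    return sum(len(source.encode("utf-8")) for source in sources)


def choose_mode(sources: list[str]) -> str:
    """Return "direct" below the limit, otherwise "dispatch" (Step 4 onward)."""
    return "direct" if total_bytes(sources) < DIRECT_LIMIT_BYTES else "dispatch"


print(choose_mode(["a" * 1000, "b" * 2000]))
print(choose_mode(["a" * 3000, "b" * 2000]))
```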

4. Store sources in shared state

Store each source under a namespaced key:
shared_state set key="synthesis.<topic>.source-1" value="<source 1 content>"
shared_state set key="synthesis.<topic>.source-2" value="<source 2 content>"
shared_state set key="synthesis.<topic>.source-3" value="<source 3 content>"
Namespace convention: `synthesis.<topic>.source-<N>`, where `<topic>` is a short kebab-case slug (e.g., `synthesis.rust-json-parsers.source-1`).
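
One way to build keys that follow this convention is sketched below. The helper names `slugify` and `source_key` are hypothetical, and the slug rule (lowercase, runs of non-alphanumerics collapsed to a single hyphen) is an assumption about what "kebab-case" should mean here.

```python
import re


def slugify(topic: str) -> str:
    """Lowercase the topic and collapse every non-alphanumeric run into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")


def source_key(topic: str, n: int) -> str:
    """Build the shared-state key for the n-th source of a topic."""
    return f"synthesis.{slugify(topic)}.source-{n}"


print(source_key("Rust JSON parsers", 1))
```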

4a. Chunking for large sources (>30KB)

If any single source exceeds 30KB (~30,000 bytes):
  1. Split into chunks of ~25KB at paragraph boundaries (double newline `\n\n`). Don't split mid-sentence.
  2. Store each chunk separately:
    shared_state set key="synthesis.<topic>.source-<N>.chunk-1" value="<first ~25KB>"
    shared_state set key="synthesis.<topic>.source-<N>.chunk-2" value="<next ~25KB>"
  3. Dispatch one sub-agent per chunk (same prompt as Step 5, but referencing the chunk key instead of the source key).
  4. Merge chunk summaries before the final synthesis: combine `key_claims` and `key_quotes` from all chunks of the same source, deduplicate, and store the merged result as the source's summary.
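
The splitting and merging steps above could look roughly like this in Python. It is a sketch under stated assumptions: `chunk_paragraphs` and `merge_chunk_summaries` are hypothetical helpers, a paragraph larger than the limit is kept whole rather than cut mid-sentence, and deduplication is by exact string match only.

```python
def chunk_paragraphs(text: str, max_bytes: int = 25_000) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most about max_bytes."""
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for para in text.split("\n\n"):
        para_size = len(para.encode("utf-8")) + 2  # +2 for the "\n\n" separator
        if current and size + para_size > max_bytes:
            chunks.append("\n\n".join(current))
            current, size = [], 0
        current.append(para)
        size += para_size
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def merge_chunk_summaries(summaries: list[dict]) -> dict:
    """Combine key_claims and key_quotes across chunks, keeping first-seen order."""
    merged: dict = {"key_claims": [], "key_quotes": []}
    for field in ("key_claims", "key_quotes"):
        seen = set()
        for summary in summaries:
            for item in summary.get(field, []):
                if item not in seen:
                    seen.add(item)
                    merged[field].append(item)
    return merged


text = "\n\n".join(["a" * 10] * 5)
print(len(chunk_paragraphs(text, max_bytes=25)))
```

Joining the chunks back with `"\n\n"` reproduces the original text, so no content is lost at chunk boundaries.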

5. Dispatch per-source sub-agents

For each source, dispatch a sub-agent with a focused extraction question. One source per sub-agent — sources are the natural unit of synthesis.
sub_agent: You are extracting key claims from a research source.

The source is stored in shared state under key "synthesis.<topic>.source-<N>".
Read it with: shared_state get key="synthesis.<topic>.source-<N>"

Research question: <your single-sentence question from step 1>

Extract the source's relevant claims and evidence. Reply with ONLY a JSON object (no markdown fences, no prose):
{
  "key_claims": ["claim 1", "claim 2", ...],
  "key_quotes": ["exact quote or close paraphrase with attribution", ...],
  "relevance": "high|medium|low",
  "confidence": 0.0-1.0,
  "source_type": "paper|blog|docs|discussion|code|other",
  "deeper_question": "a follow-up question if something is unclear, or null"
}
Skills do not chain. Sub-agents don't load this skill or any other; include the full question and shared-state key reference directly in the sub-agent's prompt.
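
Because every sub-agent prompt has to be self-contained, it can help to generate the prompt from the key and the question rather than retype it. The sketch below assumes a hypothetical `build_extraction_prompt` helper and shortens the field list of the prompt above into one sentence; it does not call any real dispatch API.

```python
EXTRACTION_TEMPLATE = """You are extracting key claims from a research source.

The source is stored in shared state under key "{key}".
Read it with: shared_state get key="{key}"

Research question: {question}

Extract the source's relevant claims and evidence. Reply with ONLY a JSON object
(no markdown fences, no prose) with the fields key_claims, key_quotes, relevance,
confidence, source_type, and deeper_question."""


def build_extraction_prompt(key: str, question: str) -> str:
    """Fill in the shared-state key and the research question from Step 1."""
    return EXTRACTION_TEMPLATE.format(key=key, question=question)


prompt = build_extraction_prompt(
    "synthesis.rust-json-parsers.source-1",
    "What are the tradeoffs between streaming JSON parsing libraries in Rust?",
)
print(prompt)
```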

5a. Handle sub-agent responses

Parse each sub-agent's response as JSON:
  1. Valid JSON with all fields: Store the summary in shared state under `synthesis.<topic>.summary-<N>`.
  2. Malformed JSON but readable text: Extract what you can. Construct a partial summary:
    {"key_claims": ["<first 300 chars of response>"], "key_quotes": [], "relevance": "low", "confidence": 0.2, "source_type": "other", "deeper_question": null}
  3. Empty or errored: Fall back to direct read of the source via `curl | head -100`. Produce a low-confidence summary manually from what you can see.
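
The three cases above map onto a small parser. This is a sketch rather than the skill's actual implementation: `parse_summary` is a hypothetical name, returning `None` is just one way to signal the direct-read fallback, and treating valid JSON with missing fields as the "partial" case is an assumption the text does not spell out.

```python
import json

REQUIRED_FIELDS = {
    "key_claims", "key_quotes", "relevance",
    "confidence", "source_type", "deeper_question",
}


def parse_summary(response: str):
    """Return a summary dict, or None when the caller should read the source directly."""
    text = response.strip()
    if not text:
        return None  # case 3: empty or errored response
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        data = None
    if isinstance(data, dict) and REQUIRED_FIELDS.issubset(data):
        return data  # case 1: valid JSON with all fields
    # case 2: readable but not usable as-is, so build the partial summary
    return {
        "key_claims": [text[:300]],
        "key_quotes": [],
        "relevance": "low",
        "confidence": 0.2,
        "source_type": "other",
        "deeper_question": None,
    }


print(parse_summary("The source says X."))
```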

5b. Recurse on deeper questions

If a sub-agent returns a non-null `deeper_question` AND `confidence` < 0.5:
  1. Dispatch another sub-agent with the narrower question, referencing the same shared-state key.
  2. Merge the answer into the existing summary (append new claims, update confidence).
Hard cap: recursion depth = 3. That's: initial dispatch → 1st recursion → 2nd recursion. After depth 3, accept whatever you have. If you find yourself wanting depth 4, your original question was probably too vague — go back to Step 1 and narrow it.
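
The recursion rule and the merge step can be expressed as two small functions. The names `should_recurse` and `merge_followup` are hypothetical, depth 1 is taken to mean the initial dispatch, and "update confidence" is interpreted here as keeping the higher of the two values, which is an assumption.

```python
MAX_DEPTH = 3           # initial dispatch = 1, first recursion = 2, second recursion = 3
CONFIDENCE_FLOOR = 0.5  # recurse only below this confidence


def should_recurse(summary: dict, depth: int) -> bool:
    """True when a follow-up dispatch is both warranted and still allowed."""
    return (
        summary.get("deeper_question") is not None
        and summary.get("confidence", 0.0) < CONFIDENCE_FLOOR
        and depth < MAX_DEPTH
    )


def merge_followup(summary: dict, followup: dict) -> dict:
    """Append new claims and update confidence (assumed: keep the higher value)."""
    merged = dict(summary)
    new_claims = [c for c in followup["key_claims"] if c not in summary["key_claims"]]
    merged["key_claims"] = summary["key_claims"] + new_claims
    merged["confidence"] = max(summary["confidence"], followup["confidence"])
    merged["deeper_question"] = followup.get("deeper_question")
    return merged


first = {"key_claims": ["a"], "confidence": 0.3, "deeper_question": "why?"}
print(should_recurse(first, depth=1), should_recurse(first, depth=3))
```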

6. Store summaries and dispatch synthesis sub-agent

After all per-source summaries are collected, store them together:
shared_state set key="synthesis.<topic>.summaries" value="<JSON array of all summaries>"
Then dispatch a synthesis sub-agent to compose the final answer:
sub_agent: You are synthesizing research from multiple sources into a composed answer.

The per-source summaries are stored in shared state under key "synthesis.<topic>.summaries".
Read them with: shared_state get key="synthesis.<topic>.summaries"

Research question: <your single-sentence question from step 1>

Compose a synthesis that:
1. Identifies areas of AGREEMENT across sources
2. Identifies areas of DISAGREEMENT or tension
3. Notes any gaps — important aspects of the question that no source addressed
4. Weighs claims by source confidence and relevance

Reply with ONLY a JSON object (no markdown fences, no prose):
{
  "answer": "3-5 paragraph composed answer to the research question",
  "consensus": ["claims that multiple sources agree on"],
  "disagreements": ["claims where sources conflict, with attribution"],
  "gaps": ["aspects of the question not covered by any source"],
  "confidence": 0.0-1.0,
  "source_count": <number of sources that contributed>
}
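
Before relying on the synthesis sub-agent's reply, a quick shape check can catch missing fields. The sketch below is illustrative only: `summaries_payload` and `check_synthesis` are hypothetical helpers, and the specific checks (field presence, confidence range, `source_count` not exceeding the number of sources) are assumptions about what is worth validating.

```python
import json

SYNTHESIS_FIELDS = {
    "answer", "consensus", "disagreements",
    "gaps", "confidence", "source_count",
}


def summaries_payload(summaries: list[dict]) -> str:
    """Serialize the per-source summaries as the JSON array stored under .summaries."""
    return json.dumps(summaries, ensure_ascii=False)


def check_synthesis(reply: str, n_sources: int) -> list[str]:
    """Return a list of problems; an empty list means the reply has the expected shape."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    if not isinstance(data, dict):
        return ["not a JSON object"]
    problems = [f"missing field: {name}" for name in sorted(SYNTHESIS_FIELDS.difference(data))]
    confidence = data.get("confidence")
    if isinstance(confidence, (int, float)) and not 0.0 <= confidence <= 1.0:
        problems.append("confidence outside 0.0-1.0")
    count = data.get("source_count")
    if isinstance(count, int) and count > n_sources:
        problems.append("source_count exceeds number of sources")
    return problems


print(check_synthesis("nope", n_sources=3))
```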

7. Use the synthesis

The synthesis sub-agent's `answer` field is your composed response. Use it to:
  • Answer the user's original question
  • Inform a technical decision in an evolve session
  • Write a journal entry or issue comment with cited sources
  • Add to `memory/learnings.jsonl` if the finding is novel and would change future behavior
Store the final synthesis in shared state under `synthesis.<topic>.result` so it can be referenced later in the session without re-running.

Relationship to the research skill

| Scenario | Use |
| --- | --- |
| 1 source, < 5KB | `research`: direct curl + read |
| 1 source, ≥ 5KB | `synthesis`: store in shared state, sub-agent extract |
| 2 sources, both < 5KB | `research`: direct curl + read both |
| 2 sources, any ≥ 5KB | `synthesis`: sub-agent dispatch |
| 3+ sources, any size | `synthesis`: always |
The research skill finds and fetches sources. This skill processes and composes them. They're complementary: research is the scout, synthesis is the analyst.
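
The routing in the table above reduces to two checks. A minimal sketch, assuming sizes are given in bytes, "5KB" means 5,000 bytes, and `pick_skill` is a hypothetical helper name:

```python
SIZE_LIMIT_BYTES = 5_000  # assumption: "5KB" is read as 5,000 bytes


def pick_skill(source_sizes: list[int]) -> str:
    """Choose between the research and synthesis skills from source count and sizes."""
    if len(source_sizes) >= 3:
        return "synthesis"  # 3+ sources: always synthesis
    if any(size >= SIZE_LIMIT_BYTES for size in source_sizes):
        return "synthesis"  # 1-2 sources, at least one large
    return "research"       # 1-2 sources, all small


for sizes in ([1000], [6000], [1000, 2000], [1000, 6000], [100, 100, 100]):
    print(sizes, pick_skill(sizes))
```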

Pitfalls

  • Don't ask sub-agents to make decisions. They extract claims and evidence; you (or the synthesis sub-agent) compose the answer. Per-source sub-agents that try to answer the whole question tend to hallucinate beyond their single source.
  • Don't dump multiple sources to one sub-agent. One source per dispatch keeps the extraction focused and the JSON output reliable. Cross-source reasoning belongs in the synthesis step (Step 6).
  • Don't forget the recursion cap. 3 is the hard limit. If you find yourself wanting depth 4, your research question was too broad — narrow it.
  • Don't synthesize without a question. "Research topic X" is not a question. "What are the tradeoffs of X vs Y for use case Z?" is. The question shapes what each sub-agent extracts.
  • Don't over-fetch. 3-7 sources is the sweet spot. More than 7 sources means you're probably not filtering enough — use search to find the 5 best sources, not all sources.
  • Don't re-synthesize within the same session. If you've already synthesized a topic, the result is in shared state under `synthesis.<topic>.result`. Read it with `shared_state get` instead of re-dispatching sub-agents.
  • Skills do not chain. Sub-agents can't load skills. Every sub-agent prompt must be self-contained — include the question and the shared-state key reference directly.

Verification

A synthesis is "good enough" when ALL of:
  • The answer addresses the specific research question (not a generic overview of the topic)
  • Multiple sources contributed claims to the answer (not just one source restated)
  • Areas of agreement and disagreement are explicitly identified
  • The answer cites specific claims to specific sources (even if informally — "the docs.rs page says X while the blog post argues Y")
  • The total work used ≤ N+2 sub-agent dispatches where N is the number of sources (N per-source + 1 synthesis + 1 possible recursion)
  • The work stayed within the depth-3 recursion cap
If the synthesis fails any of these, either add another source to fill the gap, or accept the partial result and note the open question.
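
The two quantitative criteria above (dispatch budget and recursion cap) are easy to check mechanically; the qualitative ones still need a read-through. A sketch with a hypothetical `within_budget` helper, assuming the caller tracks how many dispatches were made and the deepest recursion level reached:

```python
MAX_DEPTH = 3  # hard recursion cap from Step 5b


def within_budget(n_sources: int, dispatches: int, max_depth_reached: int) -> bool:
    """Check the N+2 dispatch budget and the depth-3 recursion cap."""
    return dispatches <= n_sources + 2 and max_depth_reached <= MAX_DEPTH


# 4 sources: 4 extractions + 1 synthesis + 1 recursion = 6 dispatches
print(within_budget(n_sources=4, dispatches=6, max_depth_reached=2))
```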

What this skill deliberately does NOT do

  • Does not find sources. Source discovery is the `research` skill's job (search → evaluate → pick). This skill takes sources as input and produces synthesis as output.
  • Does not modify code. Synthesis produces understanding, not changes. If the synthesis informs a code change, that's a separate task.
  • Does not write to the audit-log branch. Synthesis results live in shared state for the current session only.
  • Does not replace human judgment. The synthesis is a starting point for decisions, not a verdict. Cross-reference with your own experience and the project's context before acting on synthesis results.