doubt-driven-development

Doubt-Driven Development

Overview

A confident answer is not a correct one. Long sessions accumulate context that quietly turns assumptions into "facts" without anyone noticing. Doubt-driven development is the discipline of materializing a fresh-context reviewer — biased to disprove, not approve — before any non-trivial output stands.
This is not `/review`. `/review` is a verdict on a finished artifact. This is an in-flight posture: non-trivial decisions get cross-examined while course-correction is still cheap.

When to Use

A decision is non-trivial when at least one of these is true:
  • It introduces or modifies branching logic
  • It crosses a module or service boundary
  • It asserts a property the type system or compiler cannot verify (thread safety, idempotence, ordering, invariants)
  • Its correctness depends on context the future reader cannot see
  • Its blast radius is irreversible (production deploy, data migration, public API change)
Apply the skill when:
  • About to make an architectural decision under uncertainty
  • About to commit non-trivial code
  • About to claim a non-obvious fact ("this is safe", "this scales", "this matches the spec")
  • Working in code you don't fully understand
When NOT to use:
  • Mechanical operations (renaming, formatting, file moves)
  • Following a clear, unambiguous user instruction
  • Reading or summarizing existing code
  • One-line changes with obvious correctness
  • Pure tooling operations (running tests, listing files)
  • The user has explicitly asked for speed over verification
If you doubt every keystroke, you ship nothing. The skill applies only to non-trivial decisions as defined above.
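The non-triviality test above is an any-of check over five criteria. A minimal sketch in Python, assuming a hypothetical `Decision` record whose field names are invented here for illustration and are not part of the skill:

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """Hypothetical record of a pending decision; field names are illustrative."""
    changes_branching_logic: bool = False
    crosses_boundary: bool = False               # module or service boundary
    asserts_unverifiable_property: bool = False  # e.g. thread safety, idempotence
    depends_on_hidden_context: bool = False      # context a future reader cannot see
    irreversible_blast_radius: bool = False      # deploy, migration, public API change


def is_non_trivial(d: Decision) -> bool:
    """A decision is non-trivial when at least one criterion holds."""
    return any((
        d.changes_branching_logic,
        d.crosses_boundary,
        d.asserts_unverifiable_property,
        d.depends_on_hidden_context,
        d.irreversible_blast_radius,
    ))
```

A mechanical rename would be `Decision()` and stays trivial; a claim such as "this cache is thread-safe" sets `asserts_unverifiable_property=True` and qualifies for a doubt cycle.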

Loading Constraints

This skill is designed for the main-session orchestrator, where Step 3 (DOUBT, detailed below) can spawn a fresh-context reviewer.
  • Do NOT add this skill to a persona's `skills:` frontmatter. A persona that follows Step 3 would spawn another persona, the orchestration anti-pattern explicitly forbidden by `references/orchestration-patterns.md` ("personas do not invoke other personas").
  • If you find yourself applying this skill from inside a subagent context (where Claude Code prevents nested subagent spawn): the preferred path is to surface to the user that doubt-driven cannot run nested and let the main session handle it. As a last resort only, a degraded self-questioning fallback exists — rewrite ARTIFACT + CONTRACT as a fresh self-prompt with a hard mental separator from your prior reasoning, and walk Steps 1–5. This is not fresh-context review (you carry your own context with you), so flag the result as degraded and prefer escalation whenever the user is reachable.

The Process

Copy this checklist when applying the skill:
Doubt cycle:
- [ ] Step 1: CLAIM — wrote the claim + why-it-matters
- [ ] Step 2: EXTRACT — isolated artifact + contract, stripped reasoning
- [ ] Step 3: DOUBT — invoked fresh-context reviewer with adversarial prompt
- [ ] Step 4: RECONCILE — classified every finding against the artifact text
- [ ] Step 5: STOP — met stop condition (trivial findings, 3 cycles, or user override)

Step 1: CLAIM — Surface what stands

Name the decision in two or three lines:
CLAIM: "The new caching layer is thread-safe under the
        read-heavy workload described in the spec."
WHY THIS MATTERS: a race here corrupts user data and is
                  hard to detect in QA.
If you can't write the claim that compactly, you have a vibe, not a decision. Surface it before scrutinizing it.

Step 2: EXTRACT — Smallest reviewable unit

A fresh-context reviewer needs the artifact and the contract, not the journey.
  • Code: the diff or the function — not the whole file
  • Decision: the proposal in 3–5 sentences plus the constraints it has to satisfy
  • Assertion: the claim plus the evidence that supposedly supports it (kept distinct from the Step 1 CLAIM block, which is the orchestrator's hypothesis under scrutiny)
Strip your reasoning. If you hand over conclusions, you'll get back validation of your conclusions. The unit must be small enough that a reviewer can hold it in mind in one read — if it's a 500-line PR, decompose first.
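The "small enough for one read" rule can be made checkable. The sketch below is an illustration only: the `ReviewUnit` type and the 150-line threshold are assumptions introduced here (the text gives only the 500-line PR as an example of "too big").

```python
from dataclasses import dataclass

MAX_LINES = 150  # assumed threshold for "one read"; not specified by the skill


@dataclass(frozen=True)
class ReviewUnit:
    """What the reviewer receives: artifact and contract, no reasoning."""
    artifact: str  # the diff or the function, not the whole file
    contract: str  # the constraints the artifact has to satisfy


def needs_decomposition(unit: ReviewUnit, max_lines: int = MAX_LINES) -> bool:
    """True when the artifact is too long to hold in mind in one read."""
    return len(unit.artifact.splitlines()) > max_lines
```

Note that `ReviewUnit` has no field for the author's reasoning or conclusions, so there is nowhere to put them by accident.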

Step 3: DOUBT — Invoke the fresh-context reviewer

The reviewer's prompt must be adversarial. Framing decides the answer.
Adversarial review. Find what is wrong with this artifact.
Assume the author is overconfident. Look for:
- Unstated assumptions
- Edge cases not handled
- Hidden coupling or shared state
- Ways the contract could be violated
- Existing conventions this might break
- Failure modes under unexpected input

Do NOT validate. Do NOT summarize. Find issues, or state
explicitly that you cannot find any after thorough examination.

ARTIFACT: <paste artifact>
CONTRACT: <paste contract>
Pass ARTIFACT + CONTRACT only. Do NOT pass the CLAIM. Handing the reviewer your conclusion biases it toward agreement. The reviewer must independently determine whether the artifact satisfies the contract.
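One way to enforce "ARTIFACT + CONTRACT only" is to assemble the reviewer input with a function that has no parameter for the claim. This is a sketch under that assumption; the function name `build_reviewer_prompt` is hypothetical, and the prompt text is the adversarial prompt above, copied verbatim.

```python
ADVERSARIAL_PROMPT = """\
Adversarial review. Find what is wrong with this artifact.
Assume the author is overconfident. Look for:
- Unstated assumptions
- Edge cases not handled
- Hidden coupling or shared state
- Ways the contract could be violated
- Existing conventions this might break
- Failure modes under unexpected input

Do NOT validate. Do NOT summarize. Find issues, or state
explicitly that you cannot find any after thorough examination.
"""


def build_reviewer_prompt(artifact: str, contract: str) -> str:
    """Assemble the reviewer input from ARTIFACT + CONTRACT only.

    There is deliberately no `claim` parameter, so the orchestrator's
    conclusion cannot leak into the prompt.
    """
    return f"{ADVERSARIAL_PROMPT}\nARTIFACT: {artifact}\nCONTRACT: {contract}\n"
```

The returned string is what gets handed to the fresh-context reviewer, whether that is a subagent or a file piped to an external CLI.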
In Claude Code, the role-based reviewers in `agents/` start with isolated context by design and are usable here; see `agents/` for the roster and per-domain match.
The adversarial prompt above takes precedence over the persona's default response shape. Personas like `code-reviewer` are written to produce balanced verdicts with both strengths and weaknesses; doubt-driven needs issues-only output. Paste the adversarial prompt verbatim into the invocation so it overrides the persona's default. If a persona's response shape can't be overridden cleanly, fall back to a generic subagent with the adversarial prompt.

Cross-model escalation

A single-model reviewer shares blind spots with the original author — a colder, different-architecture model catches them. Doubt-driven is already opt-in for non-trivial decisions, so within that scope offering cross-model is part of the skill's value, not optional friction.
Interactive sessions: always offer. Never silently skip.
Step 1: Ask the user
After the single-model review in Step 3 above, but before RECONCILE, pause and ask:
"Single-model review complete. Want a cross-model second opinion? Options: Gemini CLI, Codex CLI, manual external review (you paste it elsewhere), or skip."
This question is mandatory in every interactive doubt cycle — even on artifacts that feel low-stakes. The user — not the agent — decides whether the cost is worth it. The agent's job is to surface the choice.
Step 2: If the user picks a CLI — verify, then invoke
  1. Check the tool is in PATH (`which gemini`, `which codex`).
  2. Test it works (`gemini --version` or equivalent) before passing the full prompt: a stale or broken binary may pass `which` but fail on real input.
  3. Confirm the exact invocation with the user, including required flags, auth, and env vars (e.g., API keys). Implementations vary; never assume.
  4. Pass ARTIFACT + CONTRACT + the adversarial prompt only. No session context, no CLAIM.
  5. Mind shell escaping. If the artifact contains quotes, `$(...)`, or backticks, prefer stdin (`echo … | gemini`) or a heredoc over inline `-p "…"`. When in doubt, ask the user to confirm the invocation before running it.
  6. Take the output into Step 4 (RECONCILE).
Never interpolate the artifact into a shell-quoted argument. Code, markdown, and review prompts routinely contain backticks, `$(...)`, and quote characters that will either truncate the prompt or execute embedded shell. Write the full prompt to a file and pipe it through stdin.
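The verify-then-invoke steps can also be scripted without any shell in the loop. The sketch below uses only the Python standard library; the helper names `preflight` and `run_with_prompt_on_stdin` are hypothetical, and the `argv` passed in is assumed to be the exact invocation the user has already confirmed.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def preflight(tool: str, version_args=("--version",)) -> bool:
    """Return True if `tool` resolves on PATH and a version probe exits 0."""
    path = shutil.which(tool)
    if path is None:
        return False
    probe = subprocess.run([path, *version_args], capture_output=True, text=True)
    return probe.returncode == 0


def run_with_prompt_on_stdin(argv: list[str], prompt: str) -> str:
    """Write the prompt to a temp file and feed it to `argv` via stdin.

    `argv` is a list, so no shell parses it, and the prompt never appears
    in an argument: metacharacters in the artifact stay inert.
    """
    with tempfile.TemporaryDirectory() as tmp:
        prompt_path = Path(tmp) / "doubt-prompt.md"
        prompt_path.write_text(prompt, encoding="utf-8")
        with prompt_path.open("r", encoding="utf-8") as stdin:
            result = subprocess.run(argv, stdin=stdin, capture_output=True,
                                    text=True, check=True)
    return result.stdout
```

With `check=True`, a failing CLI raises `subprocess.CalledProcessError` instead of returning empty output, which makes it harder to fall back silently when the tool errors.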
Example shapes (verify flags against your installed tool; syntax differs across implementations and versions):

```bash
# Write the adversarial prompt + ARTIFACT + CONTRACT to a temp file first.
# Then pipe via stdin so shell metacharacters in the artifact stay inert.

# Codex (read-only sandbox keeps the CLI from writing to your workspace):
codex exec --sandbox read-only -C <repo-path> - < /tmp/doubt-prompt.md

# Gemini ('--approval-mode plan' is read-only; '-p ""' triggers non-interactive
# mode and the prompt is read from stdin):
gemini --approval-mode plan -p "" < /tmp/doubt-prompt.md
```

A read-only sandbox is the load-bearing detail: a doubt artifact may itself contain instructions (intentional or accidental prompt injection) that the cross-model CLI would otherwise execute against your workspace.
Step 3: If the CLI is unavailable or fails
Surface the failure explicitly. Offer: run it manually, try a different tool, or skip. Do not silently fall back to single-model; the user should know cross-model didn't happen.
Step 4: If the user skips
Acknowledge the skip in the output ("Proceeding with single-model findings only") and continue to RECONCILE. Skipping is fine; silent skipping is not.
Non-interactive contexts (CI, `/loop`, autonomous-loop, scheduled runs):
  • Cross-model is skipped, and the skip must be announced in the output: "Cross-model skipped: non-interactive context."
  • Never invoke an external CLI without explicit user authorization; this is a load-bearing safety property.
Cross-model adds cost, latency, and tool fragility. The agent surfaces the choice every cycle; the user decides whether this artifact warrants it.

Step 4: RECONCILE — Fold findings back

The reviewer's output is data, not verdict. You are still the orchestrator. Re-read the artifact text against each finding before classifying — rubber-stamping the reviewer is the same failure mode as ignoring it.
For each finding, classify in this precedence order (first matching class wins):
  1. Contract misread — reviewer flagged something specifically because the CONTRACT you provided was unclear or incomplete. Fix the contract first, re-classify on the next cycle.
  2. Valid + actionable — real issue requiring a change to the artifact. Change it, re-loop.
  3. Valid trade-off — issue is real but cost of fixing exceeds cost of accepting. Document the trade-off explicitly so the user sees it.
  4. Noise — reviewer flagged something that's actually correct under context the reviewer didn't have. Note it, move on, and ask: would adding that context to the contract have prevented the false flag?
A fresh reviewer can be wrong because it lacks context. Don't defer just because it's "fresh."
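The precedence order above (first matching class wins) can be sketched as a chain of early returns. The `Finding` fields below are hypothetical: they stand for judgments the orchestrator makes after re-reading the artifact, not for anything the reviewer outputs.

```python
from dataclasses import dataclass
from enum import Enum


class FindingClass(Enum):
    CONTRACT_MISREAD = 1
    VALID_ACTIONABLE = 2
    VALID_TRADE_OFF = 3
    NOISE = 4


@dataclass
class Finding:
    """The orchestrator's judgments about one reviewer finding (illustrative)."""
    text: str
    caused_by_unclear_contract: bool = False
    is_real_issue: bool = False
    fix_cost_exceeds_accept_cost: bool = False


def classify(f: Finding) -> FindingClass:
    """Apply the precedence order: the first matching class wins."""
    if f.caused_by_unclear_contract:
        return FindingClass.CONTRACT_MISREAD  # fix the contract, re-classify next cycle
    if f.is_real_issue and not f.fix_cost_exceeds_accept_cost:
        return FindingClass.VALID_ACTIONABLE  # change the artifact, re-loop
    if f.is_real_issue:
        return FindingClass.VALID_TRADE_OFF   # document the trade-off for the user
    return FindingClass.NOISE                 # correct under context the reviewer lacked
```

Because the contract check comes first, a finding that looks like a real issue is still treated as a contract problem when an unclear contract caused it.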

Step 5: STOP — Bounded loop, not recursion

Stop when:
  • Next iteration returns only trivial or already-considered findings, or
  • 3 cycles completed (escalate to user, don't grind a fourth alone), or
  • User explicitly says "ship it"
If after 3 cycles the reviewer still surfaces substantive issues, the artifact may not be ready. Surface this to the user — three unresolved cycles is information about the artifact, not a reason to keep looping.
If 3 cycles is "obviously insufficient" because the artifact is large: the artifact is too big — return to Step 2 and decompose. Do not lift the bound.
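The three stop conditions can be expressed as one small function. This is a sketch: the function name, its parameters, and the order in which the conditions are checked are assumptions made here, since the text lists the conditions without ranking them.

```python
from typing import Optional

MAX_CYCLES = 3  # the bound from the text; decompose the artifact instead of lifting it


def stop_reason(cycles_completed: int,
                substantive_findings: int,
                user_said_ship: bool) -> Optional[str]:
    """Return why the doubt loop should stop, or None to run another cycle."""
    if user_said_ship:
        return "user override"
    if substantive_findings == 0:
        return "only trivial or already-considered findings"
    if cycles_completed >= MAX_CYCLES:
        return "cycle bound reached: escalate to user"
    return None
```

Hitting the cycle bound returns an escalation reason, not a success signal: the artifact still has substantive findings at that point.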

Common Rationalizations

| Rationalization | Reality |
| --- | --- |
| "I'm confident, skip the doubt step" | Confidence correlates poorly with correctness on novel problems. Moments of certainty are exactly when blind spots hide. |
| "Spawning a reviewer is expensive" | Debugging a wrong commit in production is more expensive. The check is bounded; the bug isn't. |
| "The reviewer will just nitpick" | Only if unscoped. Constrain the prompt to "issues that would make this fail under the contract." |
| "I'll do doubt at the end with `/review`" | `/review` is a final gate. Doubt-driven catches wrong directions early when course-correction is cheap. By PR time it's too late. |
| "If I doubt every step I'll never ship" | The skill applies to non-trivial decisions, not every keystroke. Re-read "When NOT to Use." |
| "Two opinions are always better than one" | Not when the second has less context and produces noise. Reconcile, don't defer. |
| "The reviewer disagreed so I was wrong" | The reviewer lacks your context; disagreement is information, not verdict. Re-read the artifact, classify, then decide. |
| "Cross-model is always better" | Cross-model catches blind spots a single model shares with itself, but it adds cost and tool fragility. Offer it every interactive doubt cycle; the user decides whether the artifact warrants it. The agent's job is to surface the choice, not to gate it. |
| "User said yes once, so I can keep invoking the CLI" | Each invocation is its own authorization. The artifact, the prompt, and the flags change between calls; re-confirm the exact command with the user before every run. |

Red Flags

  • Spawning a fresh-context reviewer for a one-line rename or formatting change
  • Treating reviewer output as authoritative without re-reading the artifact text
  • Looping >3 cycles without escalating to the user
  • Prompting the reviewer with "is this good?" instead of "find issues"
  • Skipping doubt under time pressure on a high-stakes decision
  • Re-spawning fresh-context on an unchanged artifact (you'll get the same findings; you're stalling)
  • Doubt theater (checkable signal): across 2 or more cycles where the reviewer surfaced substantive findings, zero findings were classified as actionable. You are validating, not doubting. Stop and escalate.
  • Doubting only after committing: that's `/review`, not doubt-driven development
  • Hardcoding an external CLI invocation without confirming with the user that the tool exists, is configured, and accepts that exact syntax
  • Silently skipping cross-model in an interactive doubt cycle. Even when not recommending it, the offer must be visible. Skipping is fine; silent skipping is not.
  • Falling back silently when an external CLI errors or is missing — surface the failure and let the user redirect
  • Stripping the contract from the reviewer's input
  • Passing the CLAIM to the reviewer (biases toward agreement)
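The "doubt theater" red flag is described as a checkable signal, so it can be computed from per-cycle counts. A minimal sketch, assuming a hypothetical `CycleSummary` record kept by the orchestrator:

```python
from dataclasses import dataclass


@dataclass
class CycleSummary:
    """Per-cycle tallies recorded during RECONCILE (illustrative type)."""
    substantive_findings: int  # findings the reviewer raised that were not trivial
    actionable_findings: int   # findings classified as valid + actionable


def is_doubt_theater(cycles: list[CycleSummary]) -> bool:
    """True when 2+ cycles had substantive findings and none were actionable."""
    substantive = [c for c in cycles if c.substantive_findings > 0]
    return len(substantive) >= 2 and all(
        c.actionable_findings == 0 for c in substantive
    )
```

When this returns True, the orchestrator has been validating, not doubting, and the right move is to stop and escalate.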

Interaction with Other Skills

  • `code-review-and-quality` / `/review`: complementary. `/review` is post-hoc PR verdict; doubt-driven is in-flight per-decision. Use both.
  • `source-driven-development`: SDD verifies facts about frameworks against official docs. Doubt-driven verifies your reasoning about the artifact. SDD checks the API exists; doubt-driven checks you used it correctly under the contract.
  • `test-driven-development`: TDD's RED step is doubt made concrete: a failing test is a disproof attempt. When TDD applies, that failing test is the doubt step for behavioral claims.
  • `debugging-and-error-recovery`: when the reviewer surfaces a real failure mode, drop into the debugging skill to localize and fix.
  • Repo orchestration rules (`references/orchestration-patterns.md`): this skill orchestrates from the main session. A persona calling another persona is anti-pattern B; see Loading Constraints above.

Verification

After applying doubt-driven development:
  • Every non-trivial decision (per the definition above) was named explicitly as a CLAIM before standing
  • At least one fresh-context review per non-trivial artifact (a failing test produced by TDD's RED step satisfies this for behavioral claims, per Interaction with Other Skills)
  • The reviewer received ARTIFACT + CONTRACT — NOT the CLAIM, NOT your reasoning
  • The reviewer's prompt was adversarial ("find issues"), not validating ("is it good")
  • Findings were classified against the artifact text (not rubber-stamped) using the precedence: contract misread / actionable / trade-off / noise
  • A stop condition was met (trivial findings, 3 cycles, or user override)
  • In interactive mode, cross-model was explicitly offered to the user (regardless of artifact stakes) and the response was acknowledged in the output
  • In non-interactive mode, cross-model was skipped and the skip was announced
  • Any external CLI invocation was preceded by a PATH check, a working-binary test, syntax confirmation with the user, and explicit authorization to run