TDD
Value: Feedback -- short cycles with verifiable evidence keep AI-generated
code honest and the human in control. Tests express intent; evidence confirms
progress.
Purpose
Teaches a five-step TDD cycle (RED, DOMAIN, GREEN, DOMAIN, COMMIT) that
adapts to whatever harness runs it. Detects available delegation primitives
and routes to guided mode (human drives each phase) or automated mode
(system orchestrates phases). Prevents primitive obsession, skipped reviews,
and untested complexity regardless of mode.
Practices
The Five-Step Cycle
Every feature is built by repeating: RED -> DOMAIN -> GREEN -> DOMAIN -> COMMIT.
1. RED -- Write one failing test with one assertion. Only edit test files. Write the code you wish you had -- reference types and functions that do not exist yet. Run the test. Paste the failure output. Stop. Done when: tests run and FAIL (compilation error OR assertion failure).
2. DOMAIN (after RED) -- Review the test for primitive obsession and invalid-state risks. Create type definitions with stub bodies (`raise NotImplementedError`, `todo!()`, etc.). Do not implement logic. Stop. Done when: tests COMPILE but still FAIL (assertion/panic, not compilation error).
3. GREEN -- Write the minimal code to make the test pass. Only edit production files. Run the test. Paste passing output. Stop. Done when: tests PASS with minimal implementation.
4. DOMAIN (after GREEN) -- Review the implementation for domain violations: anemic models, leaked validation, primitive obsession that slipped through. If violations are found, raise a concern and propose a revision. Done when: types are clean and tests still pass.
5. COMMIT -- Run the full test suite. Stage all changes and create a git commit referencing the GWT scenario. This is a hard gate: no new RED phase may begin until this commit exists. Done when: git commit created with all tests passing.
After step 5, either start the next RED phase or tidy the code (structural
changes only, separate commit).
A compilation failure IS a test failure. Do not pre-create types to avoid
compilation errors. Types flow FROM tests, never precede them.
Domain review has veto power over primitive obsession and invalid-state
representability. Vetoes escalate to the human after two rounds.
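As a rough illustration of the first three phases, the sketch below walks one assertion from RED through DOMAIN to GREEN. The `EmailAddress` type, the `red_test` helper, and the `outcome` classifier are hypothetical names chosen for this example, not part of the skill. The test takes the type as a parameter only so that the stub and the implementation can coexist in one runnable file; in a real cycle the same type would simply be edited in place.

```python
from dataclasses import dataclass


# RED: written first, against a type that does not exist yet.
# One test, one assertion.
def red_test(email_type) -> None:
    assert email_type.parse("ada@example.com").value == "ada@example.com"


# DOMAIN (after RED): a semantic type replaces the bare str.
# Stub body only, so the test now runs but still fails.
@dataclass(frozen=True)
class EmailAddressStub:
    value: str

    @classmethod
    def parse(cls, raw: str) -> "EmailAddressStub":
        raise NotImplementedError


# GREEN: the minimal body that makes the single assertion pass.
# Validation would arrive in a later cycle, driven by its own failing test.
@dataclass(frozen=True)
class EmailAddress:
    value: str

    @classmethod
    def parse(cls, raw: str) -> "EmailAddress":
        return cls(raw)


def outcome(email_type) -> str:
    """Run the test against one phase's version of the type and classify it."""
    try:
        red_test(email_type)
    except NotImplementedError:
        return "FAIL (stub)"
    except AssertionError:
        return "FAIL (assertion)"
    return "PASS"
```

Calling `outcome(EmailAddressStub)` reports the failure expected after the first DOMAIN step, and `outcome(EmailAddress)` reports the pass expected after GREEN.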
User-Facing Modes
Guided mode (`/tdd red`, `/tdd domain`, `/tdd green`, `/tdd commit`):
Each phase loads `references/{phase}.md` with detailed instructions for that
step. For experienced engineers who want explicit phase control. Works on
any harness -- no delegation primitives required. The human decides when to
advance phases.
Automated mode (`/tdd` or `/tdd auto`):
The system detects harness capabilities, selects an execution strategy, and
orchestrates the full cycle. The user sees working code, not sausage-making.
For verbose output showing phase transitions and evidence, use `/tdd auto --verbose`.
Capability Detection (Automated Mode)
When automated mode activates, detect available primitives in this order:
- Agent teams available? Check for TeamCreate tool. If present, use the agent teams strategy with persistent pair sessions.
- Subagents available? Check for Task tool (subagent spawning). If present, use the serial subagents strategy with focused per-phase agents.
- Fallback. Use the chaining strategy -- role-switch internally between phases within a single context.
Select the most capable strategy available. Do not attempt a higher strategy
when its primitives are missing.
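The detection order above can be sketched as a single function. The tool names `TeamCreate` and `Task` come from the text; the `Strategy` enum, the `select_strategy` name, and the idea that the harness exposes its tools as a set of strings are assumptions made for illustration.

```python
from enum import Enum


class Strategy(Enum):
    AGENT_TEAMS = "agent-teams"
    SERIAL_SUBAGENTS = "serial-subagents"
    CHAINING = "chaining"


def select_strategy(available_tools: set[str]) -> Strategy:
    """Pick the most capable strategy whose primitives are present."""
    # Checked in descending order of capability, as in the list above.
    if "TeamCreate" in available_tools:
        return Strategy.AGENT_TEAMS
    if "Task" in available_tools:
        return Strategy.SERIAL_SUBAGENTS
    # No delegation primitives: role-switch within a single context.
    return Strategy.CHAINING
```

Because the checks run from most to least capable and each requires its own primitive, the function never returns a strategy whose tool is missing.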
Execution Strategy: Chaining (Fallback)
Used when no delegation primitives are available. The agent plays each role
sequentially:
- Load `references/red.md`. Execute the RED phase.
- Load `references/domain.md`. Execute DOMAIN review of the test.
- Load `references/green.md`. Execute the GREEN phase.
- Load `references/domain.md`. Execute DOMAIN review of the implementation.
- Load `references/commit.md`. Execute the COMMIT phase.
- Repeat.
Role boundaries are advisory in this mode. The agent must self-enforce phase
boundaries: only edit file types permitted by the current phase (see
`references/phase-boundaries.md`).
Execution Strategy: Serial Subagents
Used when the Task tool is available for spawning focused subagents. Each
phase runs in an isolated subagent with constrained scope.
- Spawn each phase agent using the prompt template in `references/{phase}-prompt.md`.
- The orchestrator follows `references/orchestrator.md` for coordination rules.
- Structural handoff schema (`references/handoff-schema.md`): every phase agent must return evidence fields (test output, file paths changed, domain concerns). Missing evidence fields = handoff blocked. The orchestrator does not proceed to the next phase until the schema is satisfied.
- Context isolation provides structural enforcement: each subagent receives only the files relevant to its phase.
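A minimal sketch of the "missing evidence blocks the handoff" rule follows. The actual schema lives in `references/handoff-schema.md` and is not reproduced here, so the field names `test_output`, `files_changed`, and `domain_concerns` are assumed stand-ins for the three evidence fields the text lists.

```python
# Hypothetical field names for the three evidence fields named in the text.
REQUIRED_EVIDENCE_FIELDS = ("test_output", "files_changed", "domain_concerns")


def missing_evidence(handoff: dict) -> list[str]:
    """Return the required fields that are absent or explicitly null."""
    # An empty list of domain concerns is valid evidence ("none found"),
    # so only absence or None counts as missing.
    return [
        name
        for name in REQUIRED_EVIDENCE_FIELDS
        if handoff.get(name) is None
    ]


def may_proceed(handoff: dict) -> bool:
    """The orchestrator advances only when every evidence field is present."""
    return not missing_evidence(handoff)
```

For example, a handoff that omits `files_changed` yields `["files_changed"]` from `missing_evidence`, and `may_proceed` returns `False` until the phase agent supplies it.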
Execution Strategy: Agent Teams
Used when TeamCreate is available for persistent agent sessions. Maximum
enforcement through role specialization and persistent pair context.
- Follow `references/ping-pong-pairing.md` for pair session lifecycle, role selection, structured handoffs, and drill-down ownership.
- Both engineers persist for the entire TDD cycle of a vertical slice. Handoffs happen via lightweight structured messages, not agent recreation.
- Track pairing history in `.team/pairing-history.json`. Do not repeat either of the last 2 pairings.
- The orchestrator monitors and intervenes only for external clarification routing or blocking disagreements.
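The pairing-rotation rule can be sketched as a small check. The layout of `.team/pairing-history.json` is not given in the text, so this example assumes the history has already been loaded as a list of two-name pairs ordered oldest to newest, and it assumes a pairing is the same regardless of who drives. Both are illustrative assumptions.

```python
def is_pairing_allowed(
    candidate: tuple[str, str],
    history: list[tuple[str, str]],
) -> bool:
    """Reject a candidate that repeats either of the last two pairings."""
    # frozenset makes ("ada", "grace") and ("grace", "ada") compare equal
    # (assumption: swapping driver and navigator is still the same pairing).
    recent = {frozenset(pair) for pair in history[-2:]}
    return frozenset(candidate) not in recent
```

With a history of `[("ada", "grace"), ("linus", "ken"), ("barbara", "dennis")]`, the pair `("ken", "linus")` is rejected while `("ada", "grace")` is allowed again.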
Phase Boundary Rules
Each phase edits only its own file types. This prevents drift. See
`references/phase-boundaries.md` for the complete file-type matrix.

| Phase | Can Edit | Cannot Edit |
|---|---|---|
| RED | Test files | Production code, type definitions |
| DOMAIN | Type definitions (stubs) | Test logic, implementation bodies |
| GREEN | Implementation bodies | Test files, type signatures |
| COMMIT | Nothing -- git operations only | All source files |
If blocked by a boundary, stop and return to the orchestrator (automated) or
report to the user (guided). Never circumvent boundaries.
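The matrix above can be expressed as a lookup, sketched below. The `EditKind` enum and `can_edit` function are hypothetical names, and how a given change is classified as test, type definition, or implementation body is left out, since that depends on the project and on `references/phase-boundaries.md`.

```python
from enum import Enum


class EditKind(Enum):
    TEST = "test"
    TYPE_DEFINITION = "type-definition"
    IMPLEMENTATION = "implementation"


# One row per phase, mirroring the "Can Edit" column of the table.
EDITABLE_BY_PHASE: dict[str, frozenset[EditKind]] = {
    "RED": frozenset({EditKind.TEST}),
    "DOMAIN": frozenset({EditKind.TYPE_DEFINITION}),
    "GREEN": frozenset({EditKind.IMPLEMENTATION}),
    "COMMIT": frozenset(),  # git operations only
}


def can_edit(phase: str, kind: EditKind) -> bool:
    """True when the phase is allowed to make this kind of edit."""
    # Unknown phases fall through to an empty set, so they permit nothing.
    return kind in EDITABLE_BY_PHASE.get(phase, frozenset())
```

A `False` result corresponds to the "blocked by a boundary" case: the caller stops and reports rather than working around the restriction.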
Walking Skeleton First
The first vertical slice must be a walking skeleton: the thinnest end-to-end
path proving all architectural layers connect. It may use hardcoded values or
stubs. Build it before any other slice. It de-risks the architecture and gives
subsequent slices a proven wiring path to extend.
Outside-In TDD
Start from an acceptance test at the application boundary -- the point where
external input enters the system. Drill inward through unit tests. The outer
acceptance test stays RED while inner unit tests go through their own
red-green-domain-commit cycles. The slice is complete only when the outer
acceptance test passes.
A test that calls internal functions directly is a unit test, not an acceptance
test -- even if it asserts on user-visible behavior.
Boundary enforcement by mode:
- Pipeline mode: The CYCLE_COMPLETE evidence must include `boundary_type` and `boundary_evidence` on the acceptance test. The pipeline's TDD gate rejects evidence where the acceptance test calls internal functions directly.
- Automated mode (non-pipeline): The orchestrator checks boundary scope and re-delegates if the first test is not a boundary test. Advisory -- no gate blocks progression.
- Guided mode: The human is responsible for ensuring boundary-level tests. The skill text instructs correct behavior but cannot enforce it.
Cycle-Complete Evidence
At the end of each complete RED-DOMAIN-GREEN-DOMAIN-COMMIT cycle, produce
a CYCLE_COMPLETE evidence packet containing: slice_id, acceptance_test
{file, name, output, boundary_type, boundary_evidence}, unit_tests
{count, all_passing, output}, domain_reviews [{phase, verdict, concerns}],
commits [{hash, message}], rework_cycles, pair {driver, navigator}.
When `pipeline-state` is provided in context metadata, the TDD skill
operates in pipeline mode: it receives a `slice_id` and stores
evidence to `.factory/audit-trail/slices/<slice-id>/tdd-cycles/cycle-NNN.json`.
When running standalone, the evidence is informational only (not stored).
See `references/cycle-evidence.md` for full schema.
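The packet fields listed above map naturally onto nested records. The sketch below uses the field names from the text; the Python types for each field, the class names, and the reading of `NNN` as a zero-padded three-digit cycle number are assumptions, and the authoritative schema remains `references/cycle-evidence.md`.

```python
from dataclasses import dataclass


@dataclass
class AcceptanceTest:
    file: str
    name: str
    output: str
    boundary_type: str
    boundary_evidence: str


@dataclass
class UnitTests:
    count: int
    all_passing: bool
    output: str


@dataclass
class DomainReview:
    phase: str
    verdict: str
    concerns: list[str]


@dataclass
class Commit:
    hash: str
    message: str


@dataclass
class Pair:
    driver: str
    navigator: str


@dataclass
class CycleComplete:
    slice_id: str
    acceptance_test: AcceptanceTest
    unit_tests: UnitTests
    domain_reviews: list[DomainReview]
    commits: list[Commit]
    rework_cycles: int
    pair: Pair


def evidence_path(slice_id: str, cycle: int) -> str:
    """Build the pipeline-mode storage path for one cycle's evidence."""
    # Assumption: NNN is the cycle number zero-padded to three digits.
    return (
        f".factory/audit-trail/slices/{slice_id}"
        f"/tdd-cycles/cycle-{cycle:03d}.json"
    )
```

In pipeline mode a packet could be serialized with `json.dumps(dataclasses.asdict(packet))` and written to `evidence_path(packet.slice_id, n)`; in standalone mode it would only be reported.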
Harness-Specific Guidance
If running on Claude Code, also read `references/claude-code.md` for
harness-specific rules including hook-based enforcement. For maximum
mechanical enforcement, ask the bootstrap skill to install optional hooks
from `references/hooks/claude-code-hooks.json`.
Enforcement Note
Enforcement is proportional to capability:
- Guided mode: Advisory. The skill text instructs correct behavior but cannot prevent violations. The human enforces by controlling phase transitions.
- Automated mode (chaining): Advisory with self-enforcement. The agent follows phase boundaries by convention.
- Automated mode (serial subagents): Structural enforcement via context isolation and handoff schemas. Subagents receive only phase-relevant files. Missing evidence blocks handoffs.
- Automated mode (agent teams): Maximum enforcement through role specialization. Neither engineer can skip review because the other is watching. Persistent context means accumulated understanding, not just rules.
- Optional hooks (Claude Code): Mechanical enforcement. Pre-tool-use hooks block unauthorized file edits per phase. See `references/claude-code.md`.
No mode guarantees perfect discipline. If you observe violations -- production
code edited during RED, domain review skipped, commits missing -- point it out.
Verification
After completing a cycle, verify:
- Every failing test was written BEFORE its implementation
- Domain review occurred after EVERY RED and GREEN phase
- Phase boundary rules were respected (file-type restrictions)
- Evidence (test output) was provided at each handoff
- Commit exists for every completed RED-GREEN cycle
- Walking skeleton completed first (first vertical slice)
HARD GATE -- COMMIT (must pass before any new RED phase):
- All tests pass
- Git commit created with message referencing the current GWT scenario
- No new RED phase started before this commit was made
Dependencies
This skill works standalone. For enhanced workflows, it integrates with:
- domain-modeling: Strengthens the domain review phases with parse-don't-validate, semantic types, and invalid-state prevention principles.
- code-review: Three-stage review (spec compliance, code quality, domain integrity) after TDD cycles complete.
- mutation-testing: Validates test quality by checking that tests detect injected mutations in production code.
- ensemble-team: Provides real-world expert personas for pair selection and mob review.
Missing a dependency? Install with:
npx skills add jwilger/agent-skills --skill domain-modeling