tdd

Test-Driven Development (TDD)

Strict Red-Green-Refactor workflow for robust, self-documenting, production-ready code.

Quick Navigation

Situation	Go To
New to this codebase	Step 1: Explore Environment
Know the framework, starting work	Step 2: Select Mode
Need the core loop reference	Step 3: Core TDD Loop
Complex edge cases to cover	Property-Based Testing
Tests are flaky/unreliable	Flaky Test Management
Need isolated test environment	Hermetic Testing
Measuring test quality	Mutation Testing

The Three Rules (Robert C. Martin)

No Production Code without a failing test
Write Only Enough Test to Fail (compilation errors count)
Write Only Enough Code to Pass (no optimizations yet)

The Loop: 🔴 RED (write failing test) → 🟢 GREEN (minimal code to pass) → 🔵 REFACTOR (clean up) → Repeat

Step 1: Explore Test Environment

Do NOT assume anything. Explore the codebase first.

Checklist:

Search for test files: `glob("**/*.test.*")`, `glob("**/*.spec.*")`, `glob("**/test_*.py")`
Check `package.json` scripts, `Makefile`, or CI workflows
Look for config: `vitest.config.*`, `jest.config.*`, `pytest.ini`, `Cargo.toml`

Framework Detection:

Language	Config Files	Test Command
Node.js	`package.json` , `vitest.config.*`	`npm test` , `bun test`
Python	`pyproject.toml` , `pytest.ini`	`pytest`
Go	`go.mod` , `*_test.go`	`go test ./...`
Rust	`Cargo.toml`	`cargo test`
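
The detection table above can be turned into a small lookup. The following is a minimal sketch, not part of the original skill: the `FRAMEWORK_MARKERS` mapping, its ordering, and the `detect_test_commands` helper are illustrative assumptions, and a real project should still be confirmed by reading its scripts and CI configuration.

```python
from pathlib import Path

# Marker patterns -> suggested test command, mirroring the table above.
# The mapping and its order are illustrative assumptions.
FRAMEWORK_MARKERS = [
    (("vitest.config.*", "jest.config.*", "package.json"), "npm test"),
    (("pytest.ini", "pyproject.toml"), "pytest"),
    (("go.mod",), "go test ./..."),
    (("Cargo.toml",), "cargo test"),
]


def detect_test_commands(root: str) -> list[str]:
    """Return candidate test commands for the marker files found in `root`."""
    base = Path(root)
    commands = []
    for patterns, command in FRAMEWORK_MARKERS:
        # Path.glob handles both literal names and wildcard patterns.
        if any(any(base.glob(pattern)) for pattern in patterns):
            commands.append(command)
    return commands
```

Calling `detect_test_commands(".")` in a repository root returns a list of candidates rather than a single answer, because polyglot repositories often contain more than one marker file.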

Step 2: Select Mode

Mode	When	First Action
New Feature	Adding functionality	Read existing module tests, confirm green baseline
Bug Fix	Reproducing issue	Write failing reproduction test FIRST
Refactor	Cleaning code	Ensure ≥80% coverage on target code
Legacy	No tests exist	Add characterization tests before changing

Tie-breaker: If coverage <20% or tests absent → use Legacy Mode first.
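
As a rough illustration of how the table and the tie-breaker interact, the decision can be written as a single function. This is a sketch under assumptions: the `select_mode` name, the task labels, and the convention of passing coverage as a fraction are all hypothetical.

```python
def select_mode(task: str, coverage: float, has_tests: bool) -> str:
    """Pick a TDD mode; `task` is 'feature', 'bugfix', or 'refactor'."""
    # Tie-breaker first: with almost no safety net, Legacy Mode wins.
    if not has_tests or coverage < 0.20:
        return "legacy"
    modes = {"feature": "new-feature", "bugfix": "bug-fix", "refactor": "refactor"}
    if task not in modes:
        raise ValueError(f"unknown task: {task!r}")
    return modes[task]
```

Checking the tie-breaker before the task type keeps the rule from being skipped when a bug fix or feature lands in untested code.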

Mode: New Feature

Read existing tests for the module
Run tests to confirm green baseline
Enter Core Loop for new behavior

Commits: `test(module): add test for X` → `feat(module): implement X`

Mode: Bug Fix

Write failing reproduction test (MUST fail before fix)
Confirm failure is assertion error, not syntax error
Write minimal fix
Run full test suite

Commits: `test: add failing test for bug #123` → `fix: description (#123)`

Mode: Refactor

Run coverage on the specific function you'll refactor
If coverage <80% → add characterization tests first
Refactor in small steps (ONE change → run tests → repeat)
Never change behavior during refactor

Mode: Legacy Code

Find Seams - insertion points for tests (Sensing Seams, Separation Seams)
Break Dependencies - use Sprout Method or Wrap Method
Add characterization tests (capture current behavior)
Build safety net: happy path + error cases + boundaries
Then apply TDD for your changes

→ See `references/examples.md` for full code examples of each mode.
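
A characterization test records what legacy code does today, including behavior that looks wrong. The sketch below uses a hypothetical `legacy_price` function invented for illustration; the point is the shape of the safety net (happy path, boundary, odd error case), not the pricing rule itself.

```python
import unittest


def legacy_price(quantity, unit_price):
    # Hypothetical legacy code: nobody remembers why 10+ items get 7% off
    # or why negative quantities return 0 instead of raising.
    if quantity < 0:
        return 0
    total = quantity * unit_price
    if quantity >= 10:
        total = total - total * 7 // 100
    return total


class CharacterizeLegacyPrice(unittest.TestCase):
    """Pin down what the code does today, not what it should do."""

    def test_happy_path(self):
        self.assertEqual(legacy_price(2, 100), 200)

    def test_boundary_at_ten_items(self):
        self.assertEqual(legacy_price(9, 100), 900)
        self.assertEqual(legacy_price(10, 100), 930)

    def test_negative_quantity_returns_zero(self):
        # Surprising, but recorded as-is so a later change is a visible decision.
        self.assertEqual(legacy_price(-1, 100), 0)
```

Once these pass, any refactor that changes an output fails loudly, and fixing the negative-quantity behavior becomes a deliberate TDD change rather than an accident.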

Step 3: The Core TDD Loop

Before Starting: Scenario List

List all behaviors to cover:

Happy path cases
Edge cases and boundaries
Error/failure cases
Pessimism: 3 ways this could fail (network, null, invalid state)

🔴 RED Phase

Write ONE test (single behavior or edge case)
Use AAA: Arrange → Act → Assert
Run test, verify it FAILS for expected reason

Checks:

Is failure an assertion error? (Not `SyntaxError`/`ModuleNotFoundError`)
Can I explain why this should fail?
If test passes immediately → STOP. Test is broken or feature exists.
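
The RED checks above can be made concrete with a small sketch. Everything named here is hypothetical: `slugify` stands in for the function under test, and `run_red_check` is an illustrative helper, not something a test runner provides. Under pytest the check function would normally carry a `test_` prefix.

```python
def slugify(title: str) -> str:
    """Hypothetical function under test; deliberately not implemented yet."""
    return title  # placeholder so the test fails on the assertion, not on import


def check_slugify_replaces_spaces_with_hyphens():
    # Arrange
    title = "hello world"
    # Act
    result = slugify(title)
    # Assert
    assert result == "hello-world", f"expected 'hello-world', got {result!r}"


def run_red_check(test) -> str:
    """Classify the outcome of a test the way the RED checks describe."""
    try:
        test()
    except AssertionError:
        return "RED: failed on the assertion (expected)"
    except Exception as exc:  # import or setup failures land here
        return f"WRONG REASON: {type(exc).__name__}"
    return "STOP: passed immediately"
```

The placeholder body matters: it lets the module import cleanly, so the first failure is about behavior rather than a missing name.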

🟢 GREEN Phase

Write minimal code to pass
Do NOT implement "perfect" solution
Verify test passes

Checks:

Is this the simplest solution?
Can I delete any of this code and still pass?

🔵 REFACTOR Phase

Look for duplication, unclear names, magic values
Clean up without changing behavior
Verify tests still pass

Repeat

Select next scenario, return to RED.

Triangulation: If implementation is too specific (hardcoded), write another test with different inputs to force generalization.
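
A toy example of triangulation, with hypothetical names (`add_v1`, `add_v2`) chosen only for illustration: the first GREEN step may legitimately hardcode the answer, and a second test with different inputs is what forces the general rule.

```python
def add_v1(a: int, b: int) -> int:
    # GREEN for the first test only: the hardcoded value is the minimal passing code.
    return 4


def add_v2(a: int, b: int) -> int:
    # After triangulation, the general rule is the simplest code passing both tests.
    return a + b


def first_test(add) -> bool:
    return add(2, 2) == 4


def triangulating_test(add) -> bool:
    # Different inputs that a hardcoded 4 cannot satisfy.
    return add(3, 5) == 8
```

Here `add_v1` passes `first_test` but fails `triangulating_test`, while `add_v2` passes both.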

Stop Conditions

Signal	Response
Test passes immediately	Check assertions, verify feature isn't already built
Test fails for wrong reason	Fix setup/imports first
Flaky test	STOP. Fix non-determinism immediately
Slow feedback (>5s)	Optimize or mock external calls
Coverage decreased	Add tests for uncovered paths

Test Distribution: The Testing Trophy

The Testing Trophy (Kent C. Dodds) reflects modern testing reality: integration tests give the best confidence-to-effort ratio.

          _____________
         /   System    \      ← Few, slow, high confidence; brittle (E2E)
        /_______________\
       /                 \
      /    Integration    \   ← Real interactions between units — **BEST ROI** (Integration)
      \                   /
       \_________________/
         \    Unit     /      ← Fast & cheap but test in isolation (Unit) 
          \___________/
          /   Static  \       ← Typecheck, linting — typos/types (Static)
         /_____________\

Layer Breakdown

Layer	What	Tools	When
Static	Type errors, syntax, linting	TypeScript, ESLint	Always on, catches 50%+ of bugs for free
Unit	Pure functions, algorithms, utilities	vitest, jest, pytest	Isolated logic with no dependencies
Integration	Components + hooks + services together	Testing Library, MSW, Testcontainers	Real user flows, real(ish) data
E2E	Full app in browser	Playwright, Cypress	Critical paths only (login, checkout)

Why Integration Tests Win

Unit tests prove code works in isolation. Integration tests prove code works together.

Concern	Unit Test	Integration Test
Component renders	✅	✅
Component + hook works	❌	✅
Component + API works	❌	✅
User flow works	❌	✅
Catches real bugs	Sometimes	Usually

The insight: Most bugs live in the seams between modules, not inside pure functions. Integration tests catch seam bugs; unit tests don't.

Practical Guidance

Start with integration tests - Test the way users use your code
Drop to unit tests for complex algorithms or edge cases
Use E2E sparingly - Slow, flaky, expensive to maintain
Let static analysis do the heavy lifting - TypeScript catches more bugs than most unit tests
Prefer fakes over mocks - Fakes have real behavior; mocks just return canned data
SMURF quality: Sustainable, Maintainable, Useful, Resilient, Fast

Anti-Patterns

Pattern	Problem	Fix
Mirror Blindness	Same agent writes test AND code	State test intent before GREEN
Happy Path Bias	Only success scenarios	Include errors in Scenario List
Refactoring While Red	Changing structure with failing tests	Get to GREEN first
The Mockery	Over-mocking hides bugs	Prefer fakes or real implementations
Coverage Theater	Tests without meaningful assertions	Assert behavior, not lines
Multi-Test Step	Multiple tests before implementing	One test at a time
Verification Trap 🤖	AI tests what the code does, not what it should do	State intent in plain language; have a separate agent review
Test Exploitation 🤖	LLMs exploit weak assertions or overload operators	Use PBT alongside examples; strict equality
Assertion Omission 🤖	Missing edge cases (null, undefined, boundaries)	Scenario list with errors; `test.each`
Hallucinated Mock 🤖	AI generates fake mocks without proper setup	Testcontainers for integration; real Fakes for unit

Critical: Verify tests by (1) running them, (2) having a separate agent review them, (3) never trusting generated tests blindly.

Advanced Techniques

Use these techniques at specific points in your workflow:

Technique	Use During	Purpose
Test Doubles	🔴 RED phase	Isolate dependencies when writing tests
Property-Based Testing	🔴 RED phase	Cover edge cases for complex logic
Contract Testing	🔴 RED phase	Define API expectations between services
Snapshot Testing	🔴 RED phase	Capture UI/response structure
Hermetic Testing	🔵 Setup	Ensure test isolation and determinism
Mutation Testing	✅ After GREEN	Validate test suite effectiveness
Coverage Analysis	✅ After GREEN	Find untested code paths
Flaky Test Management	🔧 Maintenance	Fix unreliable tests blocking CI

Test Doubles (Use: Writing Tests with Dependencies)

When: Your code depends on something slow, unreliable, or complex (DB, API, filesystem).

Type	Purpose	When
Stub	Returns canned answers	Need specific return values
Mock	Verifies interactions	Need to verify calls made
Fake	Simplified implementation	Need real behavior without cost
Spy	Records calls	Need to observe without changing

Decision: Dependency slow/unreliable? → Fake (complex) or Stub (simple). Need to verify calls? → Mock/Spy. Otherwise → real implementation.
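
The four kinds of double can be shown side by side with the standard library's `unittest.mock`. This is a minimal sketch: `StubRates`, `FakeUserStore`, `convert`, and `register` are hypothetical names made up for the example.

```python
from unittest.mock import Mock


class StubRates:
    """Stub: returns a canned answer."""

    def rate(self, currency: str) -> float:
        return 2.0


class FakeUserStore:
    """Fake: a working in-memory implementation of a store."""

    def __init__(self):
        self._users = {}

    def save(self, user_id: str, name: str) -> None:
        self._users[user_id] = name

    def get(self, user_id: str):
        return self._users.get(user_id)


def convert(amount: float, currency: str, rates) -> float:
    return amount * rates.rate(currency)


def register(user_id: str, name: str, store, mailer) -> None:
    store.save(user_id, name)
    mailer.send_welcome(user_id)


# Stub: only the return value matters.
converted = convert(10.0, "EUR", StubRates())

# Fake + mock: the fake holds real state, the mock verifies the interaction.
store, mailer = FakeUserStore(), Mock()
register("u1", "Ada", store, mailer)
mailer.send_welcome.assert_called_once_with("u1")

# Spy: wrap a real object so calls are recorded but behavior is unchanged.
spy_store = Mock(wraps=FakeUserStore())
spy_store.save("u2", "Grace")
```

The fake is the only double here that would notice a bug in how `register` stores data, which is one reason to reach for it before a mock.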

→ See `references/examples.md` → Test Double Examples

Hermetic Testing (Use: Test Environment Setup)

When: Setting up test infrastructure. Tests must be isolated and deterministic.

Principles:

Isolation: Unique temp directories/state per test
Reset: Clean up in setUp/tearDown
Determinism: No time-based logic or shared mutable state

Database Strategies:

Strategy	Speed	Fidelity	Use When
In-memory (SQLite)	Fast	Low	Unit tests, simple queries
Testcontainers	Medium	High	Integration tests
Transactional Rollback	Fast	High	Tests sharing schema (80x faster than TRUNCATE)
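
The isolation and reset principles, together with the transactional rollback row, might look like the sketch below. It uses an in-memory SQLite database only so the example runs anywhere; the `users` table and the class name are assumptions, and a production setup would apply the same pattern to its own database.

```python
import sqlite3
import tempfile
import unittest
from pathlib import Path


class HermeticExample(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Shared schema, created once. isolation_level=None lets us issue
        # BEGIN/ROLLBACK ourselves instead of relying on implicit transactions.
        cls.db = sqlite3.connect(":memory:", isolation_level=None)
        cls.db.execute("CREATE TABLE users (name TEXT)")

    @classmethod
    def tearDownClass(cls):
        cls.db.close()

    def setUp(self):
        # Isolation: a unique temp directory and a fresh transaction per test.
        self._tmp = tempfile.TemporaryDirectory()
        self.workdir = Path(self._tmp.name)
        self.db.execute("BEGIN")

    def tearDown(self):
        # Reset: roll back every write and delete the directory.
        self.db.execute("ROLLBACK")
        self._tmp.cleanup()

    def count_users(self) -> int:
        return self.db.execute("SELECT COUNT(*) FROM users").fetchone()[0]

    def test_insert_is_visible_inside_the_test(self):
        self.db.execute("INSERT INTO users VALUES ('ada')")
        self.assertEqual(self.count_users(), 1)

    def test_table_starts_empty_regardless_of_order(self):
        self.assertEqual(self.count_users(), 0)
        (self.workdir / "scratch.txt").write_text("temp")
        self.assertTrue((self.workdir / "scratch.txt").exists())
```

Because every write is rolled back in `tearDown`, the two tests pass in either order, which is the property hermetic testing is after.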

→ See `references/examples.md` → Hermetic Testing Examples

Property-Based Testing (Use: Writing Tests for Complex Logic)

When: Writing tests for algorithms, state machines, serialization, or code with many edge cases.

Tools: fast-check (JS/TS), Hypothesis (Python), proptest (Rust)

Properties to Test:

Commutativity: `f(a, b) == f(b, a)`
Associativity: `f(f(a, b), c) == f(a, f(b, c))`
Identity: `f(a, identity) == a`
Round-trip: `decode(encode(x)) == x`
Metamorphic: If input changes by X, output changes by Y (useful when you don't know expected output)

How: Replace multiple example-based tests with one property test that generates random inputs.

Critical: Always log the seed on failure. Without it, you cannot reproduce the failing case.
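
To show the idea without depending on a library, here is a hand-rolled round-trip property using only the standard `random` module. It is a sketch, not a substitute for Hypothesis or fast-check, which also shrink failing inputs; the toy run-length `encode`/`decode` pair and `check_round_trip` are hypothetical names for illustration.

```python
import random


def encode(text: str) -> str:
    """Toy run-length encoder: 'aaab' -> '3a1b'. Input must not contain digits."""
    out, i = [], 0
    while i < len(text):
        j = i
        while j < len(text) and text[j] == text[i]:
            j += 1
        out.append(f"{j - i}{text[i]}")
        i = j
    return "".join(out)


def decode(data: str) -> str:
    out, count = [], ""
    for ch in data:
        if ch.isdigit():
            count += ch
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)


def check_round_trip(seed: int, runs: int = 200) -> None:
    """Property: decode(encode(x)) == x for random letter-only strings."""
    rng = random.Random(seed)
    for _ in range(runs):
        x = "".join(rng.choice("abc") for _ in range(rng.randint(0, 20)))
        # Report the seed and the input so a failure can be replayed exactly.
        assert decode(encode(x)) == x, f"seed={seed} input={x!r}"
```

Passing the seed in explicitly, and echoing it in the failure message, is what makes a randomly found failure reproducible on the next run.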

→ See `references/examples.md` → Property-Based Testing Examples

Mutation Testing (Use: Validating Test Quality)

When: After tests pass, to verify they actually catch bugs. Use for critical code (auth, payments) or before major refactors.

Tools: Stryker (JS/TS), PIT (Java), mutmut (Python)

How: The tool mutates your code (e.g., changes `>` to `>=`). If tests still pass → your tests are weak.

Interpretation:

>80% mutation score = good test suite
Survived mutants = tests don't catch those changes → add tests for these
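
What a surviving mutant looks like can be shown by writing one by hand. This sketch does not use Stryker, PIT, or mutmut; `is_over_limit`, its mutant, and the two suites are hypothetical and exist only to show why a boundary assertion matters.

```python
def is_over_limit(weight: int) -> bool:
    return weight > 100


def is_over_limit_mutant(weight: int) -> bool:
    # Hand-written mutant: `>` changed to `>=`.
    return weight >= 100


def weak_suite(fn) -> bool:
    # Never probes the boundary, so it cannot tell the two versions apart.
    return fn(150) is True and fn(50) is False


def strong_suite(fn) -> bool:
    # Adds the boundary case, which kills the mutant.
    return weak_suite(fn) and fn(100) is False
```

The mutant survives `weak_suite` and is killed by `strong_suite`; the extra assertion at exactly 100 is the test a mutation report would prompt you to add.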

Equivalent Mutant Problem: Some mutants change syntax but not behavior (e.g., `i < 10` → `i != 10` in a loop where `i` starts below 10 and only increments by one). These can't be killed, so a 100% score is often impossible. Focus on surviving mutants in critical paths, not chasing perfect scores.

When NOT to use: Tool-generated code (OpenAPI clients, Protobuf stubs, ORM models), simple DTOs/getters, legacy code with slow tests, or CI pipelines that must finish in <5 minutes. Use `--incremental --since main` for PR-focused runs (exact flag names vary by tool, so check your runner's documentation). Note: This does NOT mean skip mutation testing on code you (the agent) wrote; always validate your own work.

→ See `references/examples.md` → Mutation Testing Examples

Flaky Test Management (Use: CI/CD Maintenance)

When: Tests fail intermittently, blocking CI or eroding trust in the test suite.

Root Causes:

Cause	Fix
Timing ( `setTimeout` , races)	Fake timers, await properly
Shared state	Isolate per test
Randomness	Seed or mock
Network	Use MSW or fakes
Order dependency	Make tests independent
Parallel transaction conflicts	Isolate DB connections per worker
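
Two of the fixes in the table, timing and randomness, usually come down to passing the clock and the random source in as parameters. A minimal sketch, with `is_expired` and `pick_winner` as hypothetical functions:

```python
import random
from datetime import datetime, timezone


def is_expired(expires_at: datetime, now: datetime) -> bool:
    # The clock is a parameter, so tests never depend on the wall clock.
    return now >= expires_at


def pick_winner(entrants: list[str], rng: random.Random) -> str:
    # The RNG is a parameter, so tests can pass a seeded instance.
    return rng.choice(entrants)


fixed_now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
deadline = datetime(2024, 1, 1, 11, 59, tzinfo=timezone.utc)
expired = is_expired(deadline, now=fixed_now)

entrants = ["ada", "grace", "linus"]
first = pick_winner(entrants, random.Random(42))
second = pick_winner(entrants, random.Random(42))
```

With a fixed `now` the expiry check gives the same answer on every run, and two generators created from the same seed always pick the same entrant, so neither test can flake on timing or chance.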

How: Detect (`--repeat 10`) → Quarantine (separate suite) → Fix root cause → Restore

Quarantine Rules:

Issue-linked: Every quarantined test MUST link to a tracking issue. Prevents "quarantine-and-forget."
Mute, don't skip: Prefer muting (runs but doesn't fail build) over skipping. You still collect failure data.
Reintroduction criteria: Test must pass N consecutive runs (e.g., 100) on main before leaving quarantine.

→ See `references/examples.md` → Flaky Test Examples

Contract Testing (Use: Writing Tests for Service Boundaries)

When: Writing tests for code that calls or exposes APIs. Prevents integration breakage.

How (Pact): Consumer defines expected interactions → Contract published → Provider verifies → CI fails if contract broken.
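
The consumer-defines, provider-verifies flow can be sketched without Pact itself. The code below is a hand-rolled illustration and does not use the Pact API: the `CONTRACT` dictionary format, the `/users/42` endpoint, and both functions are assumptions made for the example.

```python
# Consumer side: the interaction the consumer relies on (hypothetical endpoint).
CONTRACT = {
    "request": {"method": "GET", "path": "/users/42"},
    "response": {"status": 200, "body_types": {"id": int, "name": str}},
}


def provider_handler(method: str, path: str):
    """Stand-in for the real provider; returns (status, body)."""
    if method == "GET" and path.startswith("/users/"):
        return 200, {"id": int(path.rsplit("/", 1)[1]), "name": "Ada", "plan": "free"}
    return 404, {}


def verify_contract(contract: dict, handler) -> list[str]:
    """Provider side: replay the request and list every contract violation."""
    request, expected = contract["request"], contract["response"]
    status, body = handler(request["method"], request["path"])
    problems = []
    if status != expected["status"]:
        problems.append(f"status {status} != {expected['status']}")
    for field, kind in expected["body_types"].items():
        if not isinstance(body.get(field), kind):
            problems.append(f"field {field!r} missing or not {kind.__name__}")
    return problems
```

The check only covers fields the consumer declared, so the provider can add `plan` freely; renaming `name` or changing the type of `id` would return a non-empty problem list and should fail CI.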

→ See `references/examples.md` → Contract Testing Examples

Coverage Analysis (Use: Finding Gaps After Tests Pass)

When: After writing tests, to find untested code paths. NOT a goal in itself.

Metric	Measures	Threshold
Line	Lines executed	70-80%
Branch	Decision paths	60-70%
Mutation	Test effectiveness	>80%

Risk-Based Prioritization: P0 (auth, payments) → P1 (core logic) → P2 (helpers) → P3 (config)

Warning: High coverage ≠ good tests. Tests must assert meaningful behavior.

Snapshot Testing (Use: Writing Tests for UI/Output Structure)

When: Writing tests for UI components, API responses, or error message formats.

Appropriate: UI structure, API response shapes, error formats. Avoid: Behavior testing, dynamic content, entire pages.

How: Capture output once, verify it doesn't change unexpectedly. Always review diffs carefully.
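
The capture-then-compare mechanism is small enough to sketch by hand. This is an illustration of the idea, not the API of any snapshot library: `assert_matches_snapshot` and `error_response` are hypothetical, and the snapshot is stored as sorted JSON so key order does not cause false diffs.

```python
import json
from pathlib import Path


def assert_matches_snapshot(value, snapshot_path: Path) -> None:
    """Write the snapshot on first run; compare against it on later runs."""
    rendered = json.dumps(value, indent=2, sort_keys=True)
    if not snapshot_path.exists():
        snapshot_path.write_text(rendered)
        return
    stored = snapshot_path.read_text()
    assert rendered == stored, (
        f"snapshot mismatch:\n--- stored\n{stored}\n--- new\n{rendered}"
    )


def error_response(code: str) -> dict:
    # Hypothetical output whose shape we want to pin down.
    return {"error": {"code": code, "message": "Not found"}}
```

A first call such as `assert_matches_snapshot(error_response("E404"), Path("error.snap.json"))` records the file; later calls fail with a readable diff if the structure drifts, which is the diff to review before accepting a new snapshot.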

→ See `references/examples.md` → Snapshot Testing Examples

Integration with Other Skills

Task	Skill	Usage
Committing	`git-commit`	`test:` for RED, `feat:` for GREEN
Code Quality	`code-quality`	Run during REFACTOR phase
Documentation	`docs-check`	Check if behavior changes need docs

References

Foundational:

Three Rules of TDD - Robert C. Martin
Test Pyramid - Martin Fowler
Testing Trophy - Kent C. Dodds
Working Effectively with Legacy Code - Michael Feathers

Tools: Testcontainers | fast-check | Stryker | MSW | Pact
