Test-Driven Development (TDD)

Strict Red-Green-Refactor workflow for robust, self-documenting, production-ready code.

Quick Navigation

| Situation | Go To |
| --- | --- |
| New to this codebase | Step 1: Explore Environment |
| Know the framework, starting work | Step 2: Select Mode |
| Need the core loop reference | Step 3: Core TDD Loop |
| Complex edge cases to cover | Property-Based Testing |
| Tests are flaky/unreliable | Flaky Test Management |
| Need isolated test environment | Hermetic Testing |
| Measuring test quality | Mutation Testing |

The Three Rules (Robert C. Martin)

  1. No Production Code without a failing test
  2. Write Only Enough Test to Fail (compilation errors count)
  3. Write Only Enough Code to Pass (no optimizations yet)
The Loop: 🔴 RED (write failing test) → 🟢 GREEN (minimal code to pass) → 🔵 REFACTOR (clean up) → Repeat

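One full turn of the loop can be sketched in Python (a minimal illustration; `add` and the test name are hypothetical):

```python
# RED: the test is written first. Before `add` exists, running this test
# fails -- that failure is the expected RED signal.
def test_add_two_numbers():
    assert add(2, 3) == 5

# GREEN: write only enough production code to make the test pass.
def add(a, b):
    return a + b

# REFACTOR: nothing to clean up yet, so run the test and pick the next scenario.
test_add_two_numbers()
```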

Step 1: Explore Test Environment

Do NOT assume anything. Explore the codebase first.
Checklist:
  • Search for test files: `glob("**/*.test.*")`, `glob("**/*.spec.*")`, `glob("**/test_*.py")`
  • Check `package.json` scripts, `Makefile`, or CI workflows
  • Look for config: `vitest.config.*`, `jest.config.*`, `pytest.ini`, `Cargo.toml`
Framework Detection:
| Language | Config Files | Test Command |
| --- | --- | --- |
| Node.js | `package.json`, `vitest.config.*` | `npm test`, `bun test` |
| Python | `pyproject.toml`, `pytest.ini` | `pytest` |
| Go | `go.mod`, `*_test.go` | `go test ./...` |
| Rust | `Cargo.toml` | `cargo test` |

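The detection table above can be sketched as a small lookup (the helper and its rule ordering are hypothetical; real repositories may mix ecosystems and need finer rules):

```python
from pathlib import Path

# Config-file -> likely test command, taken from the detection table.
# First match wins, which is a simplification.
DETECTION_RULES = [
    ("package.json", "npm test"),
    ("pyproject.toml", "pytest"),
    ("pytest.ini", "pytest"),
    ("go.mod", "go test ./..."),
    ("Cargo.toml", "cargo test"),
]

def detect_test_command(repo_root):
    root = Path(repo_root)
    for config_file, command in DETECTION_RULES:
        if (root / config_file).exists():
            return command
    return None  # keep exploring (Makefile, CI workflows) before guessing
```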

Step 2: Select Mode

| Mode | When | First Action |
| --- | --- | --- |
| New Feature | Adding functionality | Read existing module tests, confirm green baseline |
| Bug Fix | Reproducing issue | Write failing reproduction test FIRST |
| Refactor | Cleaning code | Ensure ≥80% coverage on target code |
| Legacy | No tests exist | Add characterization tests before changing |

Tie-breaker: If coverage <20% or tests absent → use Legacy Mode first.

Mode: New Feature

  1. Read existing tests for the module
  2. Run tests to confirm green baseline
  3. Enter Core Loop for new behavior
  4. Commits: `test(module): add test for X`, `feat(module): implement X`

Mode: Bug Fix

  1. Write failing reproduction test (MUST fail before fix)
  2. Confirm failure is assertion error, not syntax error
  3. Write minimal fix
  4. Run full test suite
  5. Commits: `test: add failing test for bug #123`, `fix: description (#123)`
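The bug-fix sequence can be sketched as follows. The function, the bug, and issue #123 are hypothetical; the point is that the reproduction test exists before the fix and would fail against the buggy version:

```python
def parse_price(text):
    # Hypothetical bug: the original called int(text) directly, so "1,299"
    # raised ValueError. Minimal fix: strip thousands separators first.
    return int(text.replace(",", ""))

def test_parses_price_with_thousands_separator():  # reproduction test for the bug
    assert parse_price("1,299") == 1299

test_parses_price_with_thousands_separator()
```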

Mode: Refactor

  1. Run coverage on the specific function you'll refactor
  2. If coverage <80% → add characterization tests first
  3. Refactor in small steps (ONE change → run tests → repeat)
  4. Never change behavior during refactor

Mode: Legacy Code

  1. Find Seams - insertion points for tests (Sensing Seams, Separation Seams)
  2. Break Dependencies - use Sprout Method or Wrap Method
  3. Add characterization tests (capture current behavior)
  4. Build safety net: happy path + error cases + boundaries
  5. Then apply TDD for your changes
→ See `references/examples.md` for full code examples of each mode.

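The Sprout Method can be sketched like this (all names are hypothetical): instead of editing an untested legacy function, the new behavior is "sprouted" as a separate, fully tested function, and the only change to the legacy code is one call:

```python
def post_entries(entries):
    # Legacy function: large and untested; the single sprout call below is
    # the only edit made to it.
    valid = sprout_validate(entries)
    return list(valid)  # ...existing legacy logic would continue here

def sprout_validate(entries):
    # New code, written test-first and fully covered.
    return [e for e in entries if e.get("amount", 0) > 0]

# Characterization-style check of the sprouted function:
assert sprout_validate([{"amount": 5}, {"amount": -1}]) == [{"amount": 5}]
```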

Step 3: The Core TDD Loop

Before Starting: Scenario List

List all behaviors to cover:
  • Happy path cases
  • Edge cases and boundaries
  • Error/failure cases
  • Pessimism: 3 ways this could fail (network, null, invalid state)
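A scenario list maps naturally onto table-driven tests (the pytest equivalent would be `@pytest.mark.parametrize`; `slugify` and its cases are hypothetical):

```python
def slugify(title):
    # Lowercase, replace non-alphanumerics with spaces, join words with "-".
    cleaned = "".join(c if c.isalnum() else " " for c in title.lower())
    return "-".join(cleaned.split())

SCENARIOS = [
    ("Hello World", "hello-world"),  # happy path
    ("", ""),                        # boundary: empty input
    ("  spaced  ", "spaced"),        # edge: surrounding whitespace
    ("C++ & Go!", "c-go"),           # failure-ish: punctuation stripped
]

for raw, expected in SCENARIOS:
    assert slugify(raw) == expected, (raw, expected)
```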

🔴 RED Phase

  1. Write ONE test (single behavior or edge case)
  2. Use AAA: Arrange → Act → Assert
  3. Run test, verify it FAILS for expected reason
Checks:
  • Is the failure an assertion error? (Not `SyntaxError`/`ModuleNotFoundError`)
  • Can I explain why this should fail?
  • If test passes immediately → STOP. Test is broken or feature exists.
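The AAA shape can be sketched in Python (`Cart` is a hypothetical class; in a real RED phase the test would exist before the implementation, which is included here only so the snippet runs):

```python
class Cart:
    def __init__(self):
        self.items = []
    def add(self, name, price):
        self.items.append((name, price))
    def total(self):
        return sum(price for _, price in self.items)

def test_total_sums_item_prices():
    cart = Cart()              # Arrange: set up the object under test
    cart.add("book", 12)       # Act: exercise the behavior
    cart.add("pen", 3)
    assert cart.total() == 15  # Assert: one clear expectation

test_total_sums_item_prices()
```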

🟢 GREEN Phase

  1. Write minimal code to pass
  2. Do NOT implement "perfect" solution
  3. Verify test passes
Checks:
  • Is this the simplest solution?
  • Can I delete any of this code and still pass?

🔵 REFACTOR Phase

  1. Look for duplication, unclear names, magic values
  2. Clean up without changing behavior
  3. Verify tests still pass

Repeat

Select next scenario, return to RED.
Triangulation: If implementation is too specific (hardcoded), write another test with different inputs to force generalization.

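Triangulation, sketched with a hypothetical `total_price`: a hardcoded GREEN implementation passes the first test, so a second test with different inputs forces the general form:

```python
# After test 1 alone, a hardcoded GREEN step would pass:
#     def total_price(qty, unit_price): return 120
# Test 2 uses different inputs, so the hardcoding fails and the
# implementation must generalize:
def total_price(qty, unit_price):
    return qty * unit_price

assert total_price(3, 40) == 120   # test 1
assert total_price(2, 15) == 30    # test 2 forces generalization
```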

Stop Conditions

| Signal | Response |
| --- | --- |
| Test passes immediately | Check assertions, verify feature isn't already built |
| Test fails for wrong reason | Fix setup/imports first |
| Flaky test | STOP. Fix non-determinism immediately |
| Slow feedback (>5s) | Optimize or mock external calls |
| Coverage decreased | Add tests for uncovered paths |


Test Distribution: The Testing Trophy

The Testing Trophy (Kent C. Dodds) reflects modern testing reality: integration tests give the best confidence-to-effort ratio.
          _____________
         /   System    \      ← Few, slow, high confidence; brittle (E2E)
        /_______________\
       /                 \
      /    Integration    \   ← Real interactions between units — **BEST ROI** (Integration)
      \                   /
       \_________________/
         \    Unit     /      ← Fast & cheap but test in isolation (Unit) 
          \___________/
          /   Static  \       ← Typecheck, linting — typos/types (Static)
         /_____________\

Layer Breakdown

| Layer | What | Tools | When |
| --- | --- | --- | --- |
| Static | Type errors, syntax, linting | TypeScript, ESLint | Always on, catches 50%+ of bugs for free |
| Unit | Pure functions, algorithms, utilities | vitest, jest, pytest | Isolated logic with no dependencies |
| Integration | Components + hooks + services together | Testing Library, MSW, Testcontainers | Real user flows, real(ish) data |
| E2E | Full app in browser | Playwright, Cypress | Critical paths only (login, checkout) |

Why Integration Tests Win

Unit tests prove code works in isolation. Integration tests prove code works together.
| Concern | Unit Test | Integration Test |
| --- | --- | --- |
| Component renders | ✅ | ✅ |
| Component + hook works | ❌ | ✅ |
| Component + API works | ❌ | ✅ |
| User flow works | ❌ | ✅ |
| Catches real bugs | Sometimes | Usually |

The insight: Most bugs live in the seams between modules, not inside pure functions. Integration tests catch seam bugs; unit tests don't.

Practical Guidance

  1. Start with integration tests - Test the way users use your code
  2. Drop to unit tests for complex algorithms or edge cases
  3. Use E2E sparingly - Slow, flaky, expensive to maintain
  4. Let static analysis do the heavy lifting - TypeScript catches more bugs than most unit tests
  5. Prefer fakes over mocks - Fakes have real behavior; mocks just return canned data
  6. SMURF quality: Sustainable, Maintainable, Useful, Resilient, Fast


Anti-Patterns

| Pattern | Problem | Fix |
| --- | --- | --- |
| Mirror Blindness | Same agent writes test AND code | State test intent before GREEN |
| Happy Path Bias | Only success scenarios | Include errors in Scenario List |
| Refactoring While Red | Changing structure with failing tests | Get to GREEN first |
| The Mockery | Over-mocking hides bugs | Prefer fakes or real implementations |
| Coverage Theater | Tests without meaningful assertions | Assert behavior, not lines |
| Multi-Test Step | Multiple tests before implementing | One test at a time |
| Verification Trap 🤖 | AI tests what code does, not what it should do | State intent in plain language; separate agent review |
| Test Exploitation 🤖 | LLMs exploit weak assertions or overload operators | Use PBT alongside examples; strict equality |
| Assertion Omission 🤖 | Missing edge cases (null, undefined, boundaries) | Scenario list with errors; `test.each` |
| Hallucinated Mock 🤖 | AI generates fake mocks without proper setup | Testcontainers for integration; real Fakes for unit |

Critical: Verify tests by (1) running them, (2) having separate agent review, (3) never trusting generated tests blindly.


Advanced Techniques

Use these techniques at specific points in your workflow:
| Technique | Use During | Purpose |
| --- | --- | --- |
| Test Doubles | 🔴 RED phase | Isolate dependencies when writing tests |
| Property-Based Testing | 🔴 RED phase | Cover edge cases for complex logic |
| Contract Testing | 🔴 RED phase | Define API expectations between services |
| Snapshot Testing | 🔴 RED phase | Capture UI/response structure |
| Hermetic Testing | 🔵 Setup | Ensure test isolation and determinism |
| Mutation Testing | ✅ After GREEN | Validate test suite effectiveness |
| Coverage Analysis | ✅ After GREEN | Find untested code paths |
| Flaky Test Management | 🔧 Maintenance | Fix unreliable tests blocking CI |


Test Doubles (Use: Writing Tests with Dependencies)

When: Your code depends on something slow, unreliable, or complex (DB, API, filesystem).
| Type | Purpose | When |
| --- | --- | --- |
| Stub | Returns canned answers | Need specific return values |
| Mock | Verifies interactions | Need to verify calls made |
| Fake | Simplified implementation | Need real behavior without cost |
| Spy | Records calls | Need to observe without changing |

Decision: Dependency slow/unreliable? → Fake (complex) or Stub (simple). Need to verify calls? → Mock/Spy. Otherwise → real implementation.
→ See `references/examples.md` Test Double Examples

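A Fake can be sketched as a simplified in-memory implementation with real behavior, unlike a stub that only returns canned data (`FakeUserRepo` and `rename_user` are hypothetical names):

```python
class FakeUserRepo:
    """In-memory stand-in for a database-backed repository."""
    def __init__(self):
        self._users = {}
    def save(self, user_id, name):
        self._users[user_id] = name
    def find(self, user_id):
        return self._users.get(user_id)

def rename_user(repo, user_id, new_name):
    # Code under test: works against any object with save/find.
    if repo.find(user_id) is None:
        raise KeyError(user_id)
    repo.save(user_id, new_name)

repo = FakeUserRepo()
repo.save(1, "Ada")
rename_user(repo, 1, "Ada Lovelace")
assert repo.find(1) == "Ada Lovelace"
```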

Hermetic Testing (Use: Test Environment Setup)

When: Setting up test infrastructure. Tests must be isolated and deterministic.
Principles:
  • Isolation: Unique temp directories/state per test
  • Reset: Clean up in setUp/tearDown
  • Determinism: No time-based logic or shared mutable state
Database Strategies:
| Strategy | Speed | Fidelity | Use When |
| --- | --- | --- | --- |
| In-memory (SQLite) | Fast | Low | Unit tests, simple queries |
| Testcontainers | Medium | High | Integration tests |
| Transactional Rollback | Fast | High | Tests sharing schema (80x faster than TRUNCATE) |

→ See `references/examples.md` Hermetic Testing Examples


Property-Based Testing (Use: Writing Tests for Complex Logic)

When: Writing tests for algorithms, state machines, serialization, or code with many edge cases.
Tools: fast-check (JS/TS), Hypothesis (Python), proptest (Rust)
Properties to Test:
  • Commutativity: `f(a, b) == f(b, a)`
  • Associativity: `f(f(a, b), c) == f(a, f(b, c))`
  • Identity: `f(a, identity) == a`
  • Round-trip: `decode(encode(x)) == x`
  • Metamorphic: If input changes by X, output changes by Y (useful when you don't know expected output)
How: Replace multiple example-based tests with one property test that generates random inputs.
Critical: Always log the seed on failure. Without it, you cannot reproduce the failing case.
→ See `references/examples.md` Property-Based Testing Examples

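A round-trip property check can be hand-rolled with the stdlib to show the idea (a real project would use Hypothesis or fast-check; the seed handling here illustrates why you log it on failure):

```python
import json
import random

def encode(obj): return json.dumps(obj)
def decode(s): return json.loads(s)

seed = 1234  # in CI, draw a fresh seed per run and log it
rng = random.Random(seed)
for _ in range(100):
    # Generate random lists of ints and assert the round-trip property.
    value = [rng.randint(-10**6, 10**6) for _ in range(rng.randint(0, 5))]
    assert decode(encode(value)) == value, f"round-trip failed, seed={seed}"
```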

Mutation Testing (Use: Validating Test Quality)

When: After tests pass, to verify they actually catch bugs. Use for critical code (auth, payments) or before major refactors.
Tools: Stryker (JS/TS), PIT (Java), mutmut (Python)
How: The tool mutates your code (e.g., changes `>` to `>=`). If tests still pass → your tests are weak.
Interpretation:
  • >80% mutation score = good test suite
  • Survived mutants = tests don't catch those changes → add tests for these
Equivalent Mutant Problem: Some mutants change syntax but not behavior (e.g., `i < 10` → `i != 10` in a loop where i only increments). These can't be killed—100% score is often impossible. Focus on surviving mutants in critical paths, not chasing perfect scores.
When NOT to use: Tool-generated code (OpenAPI clients, Protobuf stubs, ORM models), simple DTOs/getters, legacy code with slow tests, or CI pipelines that must finish in <5 minutes. Use `--incremental --since main` for PR-focused runs. Note: This does NOT mean skip mutation testing on code you (the agent) wrote—always validate your own work.
→ See `references/examples.md` Mutation Testing Examples

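Why weak assertions let mutants survive can be shown by hand, simulating the `>` to `>=` mutation a tool would apply (`is_adult` is a hypothetical example):

```python
def is_adult(age):          # original code
    return age > 17

def is_adult_mutant(age):   # what a mutation tool might produce: > becomes >=
    return age >= 17

# Weak test: probes far from the boundary, so both versions pass and the
# mutant would SURVIVE.
assert is_adult(30) == is_adult_mutant(30) == True

# Boundary test: the versions disagree at age 17, so this test KILLS the mutant.
assert is_adult(17) is False and is_adult_mutant(17) is True
```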

Flaky Test Management (Use: CI/CD Maintenance)

When: Tests fail intermittently, blocking CI or eroding trust in the test suite.
Root Causes:
| Cause | Fix |
| --- | --- |
| Timing (`setTimeout`, races) | Fake timers, await properly |
| Shared state | Isolate per test |
| Randomness | Seed or mock |
| Network | Use MSW or fakes |
| Order dependency | Make tests independent |
| Parallel transaction conflicts | Isolate DB connections per worker |

How: Detect (`--repeat 10`) → Quarantine (separate suite) → Fix root cause → Restore
Quarantine Rules:
  • Issue-linked: Every quarantined test MUST link to a tracking issue. Prevents "quarantine-and-forget."
  • Mute, don't skip: Prefer muting (runs but doesn't fail build) over skipping. You still collect failure data.
  • Reintroduction criteria: Test must pass N consecutive runs (e.g., 100) on main before leaving quarantine.
→ See `references/examples.md` Flaky Test Examples

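The timing fix can be sketched by injecting the clock instead of reading real time, so the test needs no sleeps and has no races (names are hypothetical):

```python
import time

def make_token(ttl_seconds, clock=time.time):
    issued = clock()  # clock is injectable; production uses time.time
    return {"issued": issued, "expires": issued + ttl_seconds}

def is_expired(token, clock=time.time):
    return clock() >= token["expires"]

# Deterministic test: a fake clock replaces real time entirely.
fake_now = 1_000_000.0
token = make_token(60, clock=lambda: fake_now)
assert not is_expired(token, clock=lambda: fake_now + 59)
assert is_expired(token, clock=lambda: fake_now + 60)
```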

Contract Testing (Use: Writing Tests for Service Boundaries)

When: Writing tests for code that calls or exposes APIs. Prevents integration breakage.
How (Pact): Consumer defines expected interactions → Contract published → Provider verifies → CI fails if contract broken.
→ See `references/examples.md` Contract Testing Examples

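The consumer-driven shape can be sketched in plain Python (a real setup would use Pact; the contract format, handler, and verifier here are all hypothetical and only show the flow: consumer records its expectation, provider's response is checked against it):

```python
# Consumer side: what the consumer expects from GET /users/1.
CONSUMER_CONTRACT = {
    "status": 200,
    "body_keys": {"id", "name"},
}

def provider_handler():
    # Provider side: the actual (hypothetical) response.
    return {"status": 200, "body": {"id": 1, "name": "Ada"}}

def verify_contract(contract, response):
    # Provider verification step: CI would fail if this returns False.
    return (response["status"] == contract["status"]
            and contract["body_keys"] <= set(response["body"]))

assert verify_contract(CONSUMER_CONTRACT, provider_handler())
```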

Coverage Analysis (Use: Finding Gaps After Tests Pass)

When: After writing tests, to find untested code paths. NOT a goal in itself.
| Metric | Measures | Threshold |
| --- | --- | --- |
| Line | Lines executed | 70-80% |
| Branch | Decision paths | 60-70% |
| Mutation | Test effectiveness | >80% |

Risk-Based Prioritization: P0 (auth, payments) → P1 (core logic) → P2 (helpers) → P3 (config)
Warning: High coverage ≠ good tests. Tests must assert meaningful behavior.


Snapshot Testing (Use: Writing Tests for UI/Output Structure)

When: Writing tests for UI components, API responses, or error message formats.
Appropriate: UI structure, API response shapes, error formats. Avoid: Behavior testing, dynamic content, entire pages.
How: Capture output once, verify it doesn't change unexpectedly. Always review diffs carefully.
→ See `references/examples.md` Snapshot Testing Examples

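The capture-then-compare mechanism can be sketched with the stdlib (real snapshot tooling like Jest's `toMatchSnapshot` does this with diffing and review flows; the helper below is a hypothetical minimum):

```python
import json, tempfile
from pathlib import Path

def assert_matches_snapshot(value, snapshot_path):
    rendered = json.dumps(value, sort_keys=True, indent=2)
    path = Path(snapshot_path)
    if not path.exists():
        path.write_text(rendered)  # first run: capture the snapshot
        return
    # Later runs: fail on any unexpected change; a human reviews the diff.
    assert path.read_text() == rendered, "snapshot changed; review the diff"

snap = Path(tempfile.mkdtemp()) / "error_format.snap"
payload = {"code": 404, "message": "not found"}
assert_matches_snapshot(payload, snap)  # captures
assert_matches_snapshot(payload, snap)  # verifies
```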

Integration with Other Skills

| Task | Skill | Usage |
| --- | --- | --- |
| Committing | `git-commit` | `test:` for RED, `feat:` for GREEN |
| Code Quality | `code-quality` | Run during REFACTOR phase |
| Documentation | `docs-check` | Check if behavior changes need docs |


References

Foundational: