tdd

Test-Driven Development (TDD)

Philosophy

Core principle: Tests should verify behavior through public interfaces, not implementation details. The code can be rewritten entirely; the tests should not have to change with it.
Good tests are integration-style: they exercise real code paths through the public API. They describe what the system does, not how it does it. A good test reads like a specification: "a user can check out with a valid cart" tells you exactly which capability exists. Because these tests do not care about internal structure, they survive refactoring.
Bad tests are coupled to implementation. They mock internal collaborators, test private methods, or verify results by going around the interface (for example, querying the database directly instead of using the interface). The warning sign: a test fails after a refactor even though behavior has not changed. If renaming an internal function breaks a test while user-observable behavior stays the same, that test was testing implementation, not behavior.
See tests.md for more examples and mocking.md for mocking guidelines.
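To make the contrast concrete, here is a minimal sketch in Python. The `Cart` class, its methods, and the receipt format are hypothetical and exist only for illustration; they are not taken from tests.md. The first test goes through the public interface, while the second reaches into private state.

```python
class Cart:
    """Hypothetical shopping cart. The public interface is add() and checkout()."""

    def __init__(self):
        self._items = []  # internal detail; tests should not touch this

    def add(self, name, price_cents):
        self._items.append((name, price_cents))

    def checkout(self):
        if not self._items:
            raise ValueError("cannot check out an empty cart")
        return {"status": "paid", "total_cents": self._total()}

    def _total(self):
        return sum(price for _, price in self._items)


def test_user_can_check_out_with_a_valid_cart():
    # Good: uses only the public interface and asserts on observable behavior.
    cart = Cart()
    cart.add("book", 1200)
    cart.add("pen", 300)
    assert cart.checkout() == {"status": "paid", "total_cents": 1500}


def test_total_helper_sums_internal_list():
    # Bad: depends on a private attribute and a private method. Renaming
    # _items or _total breaks this test even though checkout() is unchanged.
    cart = Cart()
    cart._items = [("book", 1200), ("pen", 300)]
    assert cart._total() == 1500
```

Both tests pass today, but only the first would keep passing if `Cart` were rewritten to store its items differently.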

Anti-Pattern: Horizontal Slices

Do not write all the tests first and then all the implementation.
That is horizontal slicing: it treats RED as "write all the tests" and GREEN as "write all the code at once."
This produces crap tests:
  • Tests written in bulk verify imagined behavior, not actual behavior
  • Tests end up checking the shape of things (data structures, function signatures) instead of user-visible behavior
  • Tests become insensitive to real changes: they can pass when behavior breaks and fail when behavior is fine
  • You outrun your headlights, committing to a test structure before you understand the implementation
Correct approach: vertical slices via tracer bullets. One test → one implementation → repeat. Each test builds on what you learned from the previous cycle. Because you have just written the code, you know which behaviors matter and how to verify them.
WRONG (horizontal):
  RED:   test1, test2, test3, test4, test5
  GREEN: impl1, impl2, impl3, impl4, impl5

RIGHT (vertical):
  RED→GREEN: test1→impl1
  RED→GREEN: test2→impl2
  RED→GREEN: test3→impl3
  ...

Workflow

1. Planning

Before writing any code:
  • Confirm with the user which interface changes are needed
  • Confirm with the user which behaviors to test, and in what priority
  • Identify opportunities to extract deep modules (small interface, deep implementation)
  • Design interfaces for testability
  • List the behaviors to test (not implementation steps)
  • Get the user's approval for the plan
Ask:
What should the public interface look like? Which behaviors matter most and should be tested first?
You can't test everything. Agree with the user on exactly which behaviors are most critical. Focus testing effort on critical paths and complex logic rather than on every possible edge case.
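One possible outcome of the planning step is sketched below in Python. It assumes a checkout feature; the names `PaymentGateway`, `Checkout`, `place_order`, and the behavior list are all hypothetical. The sketch shows a small public surface with the logic behind it, an external dependency passed in so a test can substitute it, and behaviors written as user-level statements.

```python
from typing import List, Protocol


class PaymentGateway(Protocol):
    """Hypothetical boundary to an external payment service."""

    def charge(self, amount_cents: int) -> bool: ...


class Checkout:
    """Small public surface (one method); validation and totals live behind it."""

    def __init__(self, gateway: PaymentGateway) -> None:
        # The dependency is passed in, so a test can supply a stand-in.
        self._gateway = gateway

    def place_order(self, prices_cents: List[int]) -> str:
        if not prices_cents:
            return "rejected: empty cart"
        paid = self._gateway.charge(sum(prices_cents))
        return "confirmed" if paid else "rejected: payment declined"


# Behaviors agreed with the user, most critical first (not implementation steps).
PLANNED_BEHAVIORS = [
    "user can place an order with a valid cart",
    "order is rejected when payment is declined",
    "empty cart cannot be ordered",
]
```

In a real session the class body would not exist yet at this stage; it is filled in here only so the sketch runs. The plan itself is the interface shape plus the ordered behavior list.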

2. Tracer Bullet

Write one test that confirms one thing about the system:
RED:   Write a test for the first behavior → test fails
GREEN: Write the minimal code to pass the test → test passes
This is your tracer bullet: it proves the path works end to end.
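A single RED→GREEN pass might look like the following Python sketch. The `slugify` function and its expected output are hypothetical, chosen only to keep the example small.

```python
# RED: written first. Run before slugify() exists, this fails with a NameError.
def test_title_is_lowercased_and_joined_with_hyphens():
    assert slugify("Hello World") == "hello-world"


# GREEN: the least code that makes the test above pass.
def slugify(title):
    return title.lower().replace(" ", "-")
```

The implementation handles nothing beyond what the one test demands. Punctuation, repeated spaces, and other cases wait for their own tests.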

3. Incremental Loop

For each remaining behavior:
RED:   Write the next test → fails
GREEN: Write the minimal implementation to pass it → passes
Rules:
  • One test at a time
  • Write only enough code to pass the current test
  • Don't build ahead for future tests
  • Keep tests focused on observable behavior
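As an illustration of one further cycle, the Python sketch below assumes a hypothetical `slugify` function whose first test already passes. The new test is added alone, and the implementation grows only enough to satisfy it.

```python
import re


def test_title_is_lowercased_and_joined_with_hyphens():
    # Earlier cycle: already GREEN and must stay GREEN.
    assert slugify("Hello World") == "hello-world"


def test_punctuation_is_dropped():
    # RED for this cycle: a plain lower() + replace(" ", "-") version
    # would return "hello,-world!" here.
    assert slugify("Hello, World!") == "hello-world"


def slugify(title):
    # GREEN: extended only as far as the new test requires.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)
```

Both tests call only `slugify` itself, so the switch from string replacement to a regular expression is invisible to them.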

4. Refactor

Once all tests pass, look for refactoring candidates (see refactoring.md):
  • Extract duplicated logic
  • Deepen modules (move complexity behind simple interfaces)
  • Apply SOLID principles where they fit naturally
  • Consider what the new code reveals about problems in existing code
  • Run the tests after each refactoring step
Never refactor while RED. Get to GREEN first.

Checklist Per Cycle

[ ] The test describes behavior, not implementation
[ ] The test uses only the public interface
[ ] The test would survive an internal refactor
[ ] The code written for this test is minimal
[ ] No speculative features were added