Test-Driven Development (TDD)
Philosophy
Core Principle: Tests should verify behavior through public interfaces, not rely on implementation details. The implementation can be completely rewritten; tests should rarely need to change as a result.
Good Tests exercise real code paths through public APIs. They describe what the system "does", not "how" it does it. A good test reads like a specification—for example, "A user can check out with a valid shopping cart" makes it clear that the system has this capability. Because good tests don't care about internal structure, they survive refactoring.
Bad Tests are coupled to implementation. They mock internal collaborators, test private methods, or verify by bypassing the interface (such as querying the database directly instead of going through the public API). Red flags: tests fail when you refactor even though behavior doesn't change; tests fail when you rename an internal function even though user-observable behavior is identical. These tests are testing implementation, not behavior.
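To make the contrast concrete, here is a minimal Python sketch using a hypothetical Cart class (the names and API are illustrative, not from any real codebase):

```python
class Cart:
    """Hypothetical system under test with one public API: add() and checkout()."""

    def __init__(self):
        self._items = []  # internal detail; tests should never touch this

    def add(self, name, price):
        self._items.append((name, price))

    def checkout(self):
        if not self._items:
            raise ValueError("cart is empty")
        return sum(price for _, price in self._items)


# GOOD: verifies observable behavior through the public interface.
# This test survives any rewrite of Cart's internals.
def test_checkout_totals_items():
    cart = Cart()
    cart.add("book", 10)
    cart.add("pen", 2)
    assert cart.checkout() == 12


# BAD: coupled to implementation. It breaks if _items is renamed or
# replaced with a dict, even though checkout() behaves exactly the same.
def test_internal_list_shape():
    cart = Cart()
    cart.add("book", 10)
    assert cart._items == [("book", 10)]
```

The bad test would fail under a harmless refactor (say, storing items as a dict of name → price), which is exactly the red flag described above.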
See tests.md for more examples, and mocking.md for mocking guidelines.
Anti-Pattern: Horizontal Slices
Don't write "all tests first, then all implementation".
This is a horizontal slice that interprets RED as "write all tests first" and GREEN as "write all code at once."
This results in low-quality tests, including:
- A large number of batch-written tests that verify "imagined behavior" rather than "real behavior"
- Tests that end up testing "structural shape" (data structures, function signatures) instead of user-visible behavior
- Tests that become insensitive to real changes: tests may still pass when behavior breaks; tests may fail when behavior hasn't changed
- You "turn on the headlights" before understanding the implementation, committing to test structure prematurely
Correct Approach: Vertical slices via tracer bullets. One test → one implementation → repeat. Each round builds on what you learned from the previous round's code. Since you just wrote the code, you know exactly which behaviors matter and how to verify them.
WRONG (Horizontal):
RED: test1, test2, test3, test4, test5
GREEN: impl1, impl2, impl3, impl4, impl5
RIGHT (Vertical):
RED→GREEN: test1→impl1
RED→GREEN: test2→impl2
RED→GREEN: test3→impl3
...
Workflow
1. Planning
Before writing any code:
Ask: What should the public interface look like? Which behaviors are most critical and need to be tested first?
You can't test everything. Clearly define with the user "which behaviors are most critical." Focus testing resources on critical paths and complex logic, not covering all possible edge cases.
2. Tracer Bullet
Write one test to confirm one key point of the system:
RED: Write test for the first behavior → test fails
GREEN: Write minimal code to pass the test → test passes
This is your tracer bullet—it proves this path works end-to-end.
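A sketch of one tracer-bullet cycle in Python, using a hypothetical slugify() utility as the system under test (the function and its behavior are assumptions for illustration):

```python
# RED: this test is written first. At that point slugify() does not
# exist, so the test fails with a NameError -- that failure is the point.
def test_lowercases_and_joins_words():
    assert slugify("Hello World") == "hello-world"


# GREEN: the minimal implementation that makes this one test pass.
# No punctuation handling, no Unicode handling -- those wait for their
# own tests.
def slugify(text):
    return "-".join(text.lower().split())
```

Running the test now passes, proving the path works end-to-end: the public API exists, is callable, and produces the specified behavior.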
3. Incremental Loop
For each remaining behavior:
RED: Write next test → fails
GREEN: Minimal implementation to pass it → passes
Rules:
- One test at a time
- Only write enough code to pass the current test
- Don't build ahead for future tests
- Keep tests focused on observable behavior
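The loop above can be sketched as a second round on the same hypothetical slugify() utility—each new test drives just enough new code, and the earlier test stays green:

```python
def slugify(text):
    # Round 2 behavior: strip punctuation. This line was added only once
    # test_strips_punctuation existed and was failing.
    cleaned = "".join(c for c in text.lower() if c.isalnum() or c.isspace())
    # Round 1 behavior: lowercase, words joined by hyphens.
    return "-".join(cleaned.split())


def test_lowercases_and_joins_words():  # round 1 -- already green
    assert slugify("Hello World") == "hello-world"


def test_strips_punctuation():  # round 2 -- the one new test for this cycle
    assert slugify("Hello, World!") == "hello-world"
```

Note the implementation handles exactly the two tested behaviors and nothing more; Unicode normalization, length limits, etc. would each get their own RED→GREEN cycle.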
4. Refactor
Once all tests pass, look for refactoring candidates.
Never refactor in the RED state. Get to GREEN first, then refactor.
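A small sketch of refactoring in GREEN: the internals change, but a behavior-level test passes for both versions because it only uses the public interface. The function and both implementations are hypothetical:

```python
from collections import Counter


# Original implementation: a hand-rolled counting loop.
def count_words(text):
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


# Refactored implementation: same observable behavior via Counter.
def count_words_refactored(text):
    return dict(Counter(text.split()))


def test_counts_repeated_words():
    # The same behavior test passes before and after the refactor --
    # this is what "tests survive internal refactoring" means.
    assert count_words("a b a") == {"a": 2, "b": 1}
    assert count_words_refactored("a b a") == {"a": 2, "b": 1}
```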
Checklist Per Cycle
[ ] Tests describe behavior, not implementation
[ ] Tests only use public interfaces
[ ] Tests survive internal refactoring
[ ] Implementation is minimal—just enough to pass the current test
[ ] No speculative features added for future tests