Test-Driven Development (TDD)
Overview
Write a test first. Watch it fail. Write the minimal code to make it pass.
Core Principle: If you don't see the test fail, you don't know if it's testing the right thing.
Violating the letter of the rule is violating the spirit of the rule.
When to Use
Always Use:
- New features
- Bug fixes
- Refactoring
- Behavior changes
Exceptions (Ask your human partner):
- One-off prototypes
- Generated code
- Configuration files
Thinking "I'll skip TDD just this once"? Stop. That's making excuses.
Iron Law
No production code is written without a failing test
Wrote code before tests? Delete it. Start over.
No exceptions:
- Don't keep it as "reference"
- Don't "adapt" it while writing tests
- Don't look at it
- Delete means delete
Reimplement starting from tests. Period.
Red-Green-Refactor
```dot
digraph tdd_cycle {
  rankdir=LR;
  red [label="Red\nWrite failing test", shape=box, style=filled, fillcolor="#ffcccc"];
  verify_red [label="Verify correct failure", shape=diamond];
  green [label="Green\nMinimal code", shape=box, style=filled, fillcolor="#ccffcc"];
  verify_green [label="Verify pass\nAll green", shape=diamond];
  refactor [label="Refactor\nClean up code", shape=box, style=filled, fillcolor="#ccccff"];
  next [label="Next", shape=ellipse];
  red -> verify_red;
  verify_red -> green [label="Yes"];
  verify_red -> red [label="Incorrect\nfailure"];
  green -> verify_green;
  verify_green -> refactor [label="Yes"];
  verify_green -> green [label="No"];
  refactor -> verify_green [label="Keep\ngreen"];
  verify_green -> next;
  next -> red;
}
```
Red - Write Failing Test
Write a minimal test that demonstrates the desired behavior.
<Good>
```typescript
test('retries failed operations 3 times', async () => {
  let attempts = 0;
  const operation = async () => {
    attempts++;
    if (attempts < 3) throw new Error('fail');
    return 'success';
  };
  const result = await retryOperation(operation);
  expect(result).toBe('success');
  expect(attempts).toBe(3);
});
```
Clear name, tests real behavior, tests only one thing
</Good>
<Bad>
```typescript
test('retry works', async () => {
  const mock = jest.fn()
    .mockRejectedValueOnce(new Error())
    .mockRejectedValueOnce(new Error())
    .mockResolvedValueOnce('success');
  await retryOperation(mock);
  expect(mock).toHaveBeenCalledTimes(3);
});
```
Vague name, tests mock instead of code
</Bad>
Requirements:
- One behavior
- Clear name
- Use real code (mock only if absolutely necessary)
Verify Red - Watch It Fail
Must execute. Never skip.
```bash
npm test path/to/test.test.ts
```
Confirm:
- Test fails (not errors)
- Failure message matches expectations
- Failure is caused by missing functionality (not a typo)
Test passed? You're testing existing behavior. Modify the test.
Test errored? Fix the error and rerun until it fails correctly.
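What a correct failure looks like depends on your test runner; the Jest-style output below is illustrative only, contrasting a proper red (assertion failure) with an error that must be fixed first:
```bash
$ npm test path/to/test.test.ts
# Correct red: the assertion fails because the behavior is missing
#   ✕ retries failed operations 3 times
#     Expected: "success"
#     Received: undefined
#
# Not a valid red: the test errors before reaching the assertion
#   ReferenceError: retryOperation is not defined
```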
Green - Minimal Code
Write the simplest code to make the test pass.
<Good>
```typescript
async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
  for (let i = 0; i < 3; i++) {
    try {
      return await fn();
    } catch (e) {
      if (i === 2) throw e;
    }
  }
  throw new Error('unreachable');
}
```
Just enough to pass the test
</Good>
<Bad>
```typescript
async function retryOperation<T>(
  fn: () => Promise<T>,
  options?: {
    maxRetries?: number;
    backoff?: 'linear' | 'exponential';
    onRetry?: (attempt: number) => void;
  }
): Promise<T> {
  // YAGNI
}
```
Over-engineered
</Bad>
Don't add features, refactor other code, or make "improvements" beyond what the test requires.
Verify Green - Watch It Pass
Must execute.
```bash
npm test path/to/test.test.ts
```
Confirm:
- Test passes
- Other tests still pass
- Output is clean (no errors or warnings)
Test failed? Modify the code, not the test.
Other tests failed? Fix immediately.
Refactor - Clean Up Code
Only refactor after green:
- Eliminate duplication
- Improve naming
- Extract helper functions
Keep tests green. Don't add behavior.
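As an illustration, one possible refactor of the Green-step retryOperation (a sketch, not a prescribed change): name the retry count and tidy the control flow without changing behavior, so the Red-step test stays green.
```typescript
// Hypothetical refactor: same behavior as the minimal Green-step code,
// clearer names, no new features. The existing retry test still passes.
const MAX_ATTEMPTS = 3;

async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}
```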
Repeat
Write the next failing test for the next feature.
Good Tests
| Characteristic | Good | Bad |
|---|---|---|
| Minimal | Tests only one thing. Name has "and"? Split it. | `test('validates email and domain and whitespace')` |
| Clear | Name describes behavior | `test('retry works')` |
| Demonstrates Intent | Shows expected API | Hides what the code should do |
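For instance, a test whose name needs "and" can be split into one minimal test per behavior. This is an illustrative sketch; validateEmail is a hypothetical function, not part of the examples above.
```typescript
// Hypothetical split of test('validates email and domain and whitespace')
// into one minimal, clearly named test per behavior.
test('rejects empty email', () => {
  expect(validateEmail('')).toBe(false);
});

test('rejects email without a domain', () => {
  expect(validateEmail('user@')).toBe(false);
});

test('trims surrounding whitespace before validating', () => {
  expect(validateEmail('  user@example.com  ')).toBe(true);
});
```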
Why Order Matters
"I'll write tests later to verify after finishing"
Tests written afterwards pass immediately. Passing immediately proves nothing:
- Might be testing the wrong thing
- Might be testing implementation instead of behavior
- Might miss edge cases you forgot
- You never see it catch bugs
Writing tests first forces you to see the test fail, proving it's actually testing something.
"I've manually tested all edge cases"
Manual testing is temporary. You think you tested everything, but:
- No test record
- Can't rerun after code changes
- Easy to forget under pressure
- "I tried it and it works" ≠ comprehensive testing
Automated tests are systematic. They run the same way every time.
"Deleting X hours of work is too wasteful"
Sunk cost fallacy. The time is already spent. Your options now:
- Delete and rewrite with TDD (another X hours, high confidence)
- Keep and add tests later (30 minutes, low confidence, potential bugs)
The "waste" is keeping code you can't trust. Working code without real tests is technical debt.
"TDD is too dogmatic, pragmatism means flexibility"
TDD is pragmatic:
- Find bugs before commit (faster than debugging later)
- Prevent regressions (tests catch breaks immediately)
- Document behavior (tests show how to use code)
- Support refactoring (modify with confidence, tests catch breaks)
"Pragmatic" shortcuts = debugging in production = slower.
"Writing tests later achieves the same purpose—what matters is the spirit not the ritual"
No. Tests written later answer "What does this code do?" Tests written first answer "What should this code do?"
Tests written later are biased by your implementation. You test what you built, not what the requirements demand. You verify edge cases you remembered, not those you discovered.
Writing tests first forces you to discover edge cases before implementation. Tests written later verify you remembered all cases (you didn't).
30 minutes of tests written later ≠ TDD. You get coverage, but lose proof that the tests are effective.
Common Excuses
| Excuse | Reality |
|---|---|
| "It's too simple to test" | Simple code can still have bugs. Tests take 30 seconds. |
| "I'll add tests later" | Tests that pass immediately prove nothing. |
| "Writing tests later achieves the same purpose" | Tests later = "What does this do?" Tests first = "What should this do?" |
| "I've tested it manually" | Temporary testing ≠ systematic testing. No record, cannot reproduce. |
| "Deleting X hours of work is too wasteful" | Sunk cost fallacy. Keeping unvalidated code is technical debt. |
| "Keep it as reference, then write tests first" | You'll adapt it. That's writing tests later. Delete means delete. |
| "Need to explore first" | You can. After exploring, throw it away and start with TDD. |
| "Hard to test = unclear design" | Listen to the tests. Hard to test = hard to use. |
| "TDD slows me down" | TDD is faster than debugging. Pragmatism = write tests first. |
| "Manual testing is faster" | Manual testing can't prove edge cases. You have to retest every time you modify code. |
| "Existing code has no tests" | You're improving it. Add tests to existing code. |
Red Flags - Stop, Start Over
- Wrote code before tests
- Added tests after implementation
- Test passes immediately
- Can't explain why the test failed
- "I'll add tests later"
- Convincing yourself "just this once"
- "I've tested it manually"
- "Writing tests later achieves the same purpose"
- "What matters is the spirit not the ritual"
- "Keep as reference" or "adapt existing code"
- "I've spent X hours, deleting is too wasteful"
- "TDD is too dogmatic, I'm being pragmatic"
- "This situation is different because..."
All of the above mean: Delete the code. Start over with TDD.
Example: Bug Fix
Bug: Empty email is accepted
Red
```typescript
test('rejects empty email', async () => {
  const result = await submitForm({ email: '' });
  expect(result.error).toBe('Email required');
});
```
Verify Red
```bash
$ npm test
FAIL: expected 'Email required', got undefined
```
Green
```typescript
function submitForm(data: FormData) {
  if (!data.email?.trim()) {
    return { error: 'Email required' };
  }
  // ...
}
```
Verify Green
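Run the test again and confirm it passes, along with the rest of the suite. Exact output varies by runner; shown here in the same stylized form as Verify Red:
```bash
$ npm test
PASS: rejects empty email
```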
Refactor
If needed, extract validation logic to support multiple fields.
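One possible shape of that refactor, as a hedged sketch: requireField and its reuse for other fields are assumptions for illustration, not part of the original example.
```typescript
// Hypothetical extraction of the validation logic so other required fields
// can reuse it. The 'rejects empty email' test must stay green.
function requireField(value: string | undefined, message: string) {
  return value?.trim() ? undefined : { error: message };
}

function submitForm(data: FormData) {
  const emailError = requireField(data.email, 'Email required');
  if (emailError) return emailError;
  // ...
}
```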
Verification Checklist
Before marking work complete:
- [ ] Every new behavior has a test that was written before the code
- [ ] You watched each test fail, and the failure matched your expectations
- [ ] You wrote the minimal code to make each test pass
- [ ] All tests pass with clean output (no errors or warnings)
- [ ] Any code written before its test was deleted and redone with TDD
Can't check all boxes? You skipped TDD. Start over.
When Stuck
| Problem | Solution |
|---|---|
| Don't know how to test | Write the API you expect. Write assertions first. Ask your human partner. |
| Tests are too complex | Design is too complex. Simplify the interface. |
| Have to mock everything | Code is too tightly coupled. Use dependency injection (see the sketch after this table). |
| Test setup is too bulky | Extract helper functions. Still complex? Simplify design. |
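A minimal sketch of that idea, assuming a hypothetical UserService and UserStore (names invented for illustration): pass the dependency in, so the test can supply a plain in-memory implementation instead of mocking everything.
```typescript
// Hypothetical example: the dependency (a user store) is injected,
// so tests can provide a trivial fake that satisfies the interface.
interface UserStore {
  findById(id: string): Promise<{ id: string; name: string } | undefined>;
}

class UserService {
  constructor(private readonly store: UserStore) {}

  async greet(id: string): Promise<string> {
    const user = await this.store.findById(id);
    return user ? `Hello, ${user.name}` : 'Hello, guest';
  }
}

test('greets a known user by name', async () => {
  // No mocking framework needed: a plain object implements the interface.
  const store: UserStore = {
    findById: async () => ({ id: '1', name: 'Ada' }),
  };
  expect(await new UserService(store).greet('1')).toBe('Hello, Ada');
});
```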
Debugging Integration
Found a bug? Write a failing test that reproduces the bug. Follow the TDD cycle. The test both proves the fix works and prevents regressions.
Never fix a bug without a test.
Testing Anti-Patterns
When adding mocks or testing tools, read @testing-anti-patterns.md to avoid common pitfalls:
- Testing mock behavior instead of real behavior
- Adding test-only methods to production classes
- Using mocks without understanding dependencies
Final Rule
Production code → Test exists and failed first
Otherwise → Not TDD
No exceptions without permission from your human partner.