test-driven-development

Test-Driven Development (TDD)


Overview


Write the test first. Watch it fail. Write minimal code to pass.
Core principle: If you didn't watch the test fail, you don't know if it tests the right thing.
Violating the letter of the rules is violating the spirit of the rules.

When to Use


Always:
  • New features
  • Bug fixes
  • Refactoring
  • Behavior changes
Exceptions (ask your human partner):
  • Throwaway prototypes
  • Generated code
  • Configuration files
Thinking "skip TDD just this once"? Stop. That's rationalization.

The Iron Law


NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
Write code before the test? Delete it. Start over.
No exceptions:
  • Don't keep it as "reference"
  • Don't "adapt" it while writing tests
  • Don't look at it
  • Delete means delete
Implement fresh from tests. Period.

Test Process Discipline (CRITICAL)


Problem: Vitest starts in watch mode by default in an interactive terminal, and some npm test scripts start Jest in watch mode as well. A watching runner never exits, which leaves processes hanging indefinitely.
Mandatory Rules:
  1. Always use run mode; never invoke watch mode:
    • Vitest: npx vitest run (NOT npx vitest)
    • Jest: CI=true npx jest or npx jest --watchAll=false
    • npm scripts: CI=true npm test, or npm test -- --run when the script runs Vitest
  2. Prefer the CI=true prefix for all test commands:
    CI=true npm test
  3. After the TDD cycle is complete, verify that no orphaned processes remain:
    pgrep -f "vitest|jest" || echo "Clean"
  4. Kill any that are found:
    pkill -f "vitest|jest" 2>/dev/null || true

Red-Green-Refactor


    ┌─────────┐       ┌─────────┐       ┌───────────┐
    │   RED   │──────>│  GREEN  │──────>│ REFACTOR  │
    │ (Fail)  │       │ (Pass)  │       │ (Clean)   │
    └─────────┘       └─────────┘       └───────────┘
         ^                                    │
         │                                    │
         └────────────────────────────────────┘
                    Next Feature

RED - Write Failing Test


Write one minimal test showing what should happen.
Good:
typescript
test('retries failed operations 3 times', async () => {
  let attempts = 0;
  const operation = async () => {
    attempts++;
    if (attempts < 3) throw new Error('fail');
    return 'success';
  };

  const result = await retryOperation(operation);

  expect(result).toBe('success');
  expect(attempts).toBe(3);
});
Clear name, tests real behavior, one thing
Bad:
typescript
test('retry works', async () => {
  const mock = jest.fn()
    .mockRejectedValueOnce(new Error())
    .mockRejectedValueOnce(new Error())
    .mockResolvedValueOnce('success');
  await retryOperation(mock);
  expect(mock).toHaveBeenCalledTimes(3);
});
Vague name, tests mock not code
Requirements:
  • One behavior
  • Clear name
  • Real code (no mocks unless unavoidable)

Verify RED - Watch It Fail


MANDATORY. Never skip.
bash
CI=true npm test path/to/test.test.ts
Confirm:
  • Test fails (not errors)
  • Failure message is expected
  • Fails because feature missing (not typos)
Test passes? You're testing existing behavior. Fix test.
Test errors? Fix error, re-run until it fails correctly.
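The difference between a test that fails and a test that errors can be sketched without any test runner. This is an illustration only, not how Vitest or Jest classify results internally; `expectEqual`, `classify`, and `trimName` are hypothetical names introduced for the example.

```typescript
// Hypothetical mini-harness: separates "the assertion ran and was false"
// (a real RED) from "the test blew up before asserting" (fix this first).
function expectEqual<T>(actual: T, expected: T): void {
  if (actual !== expected) {
    const failure = new Error(`expected ${String(expected)}, got ${String(actual)}`);
    failure.name = 'AssertionFailure'; // tag so the harness can recognize it
    throw failure;
  }
}

type Outcome = 'pass' | 'fail' | 'error';

function classify(test: () => void): Outcome {
  try {
    test();
    return 'pass';
  } catch (e) {
    const isAssertion = e instanceof Error && e.name === 'AssertionFailure';
    return isAssertion ? 'fail' : 'error';
  }
}

// Feature missing: the code runs, but the behavior is not there yet.
const trimName = (name: string): string => name; // not implemented yet
const red: Outcome = classify(() => expectEqual(trimName(' Ada '), 'Ada'));

// Mistake in the test itself: it never reaches a meaningful assertion.
const broken: Outcome = classify(() => {
  const input = undefined as unknown as string;
  expectEqual(input.trim(), 'Ada'); // throws TypeError before comparing
});

console.log(red, broken); // fail error
```

Only the first outcome counts as a valid RED; the second needs the test repaired and re-run until it fails for the right reason.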

GREEN - Minimal Code


Write simplest code to pass the test.
Good:
typescript
async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
  for (let i = 0; i < 3; i++) {
    try {
      return await fn();
    } catch (e) {
      if (i === 2) throw e;
    }
  }
  throw new Error('unreachable');
}
Just enough to pass
Bad:
typescript
async function retryOperation<T>(
  fn: () => Promise<T>,
  options?: {
    maxRetries?: number;
    backoff?: 'linear' | 'exponential';
    onRetry?: (attempt: number) => void;
  }
): Promise<T> {
  // YAGNI - You Ain't Gonna Need It
}
Over-engineered
Don't add features, refactor other code, or "improve" beyond the test. Don't hard-code test values - implement general logic that works for ALL inputs.
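The warning against hard-coding test values might look like this in practice. The `slugify` function and its test case are hypothetical, chosen only to contrast the two approaches.

```typescript
// Test under consideration (hypothetical): slugify('Hello World') === 'hello-world'

// Hard-coded: passes that single test, wrong for every other input.
function slugifyHardCoded(_title: string): string {
  return 'hello-world';
}

// General: still minimal, but derived from the input.
function slugify(title: string): string {
  return title.trim().toLowerCase().split(/\s+/).join('-');
}

console.log(slugifyHardCoded('Hello World')); // hello-world
console.log(slugify('Hello World'));          // hello-world
console.log(slugifyHardCoded('Red Green'));   // hello-world (wrong)
console.log(slugify('Red Green'));            // red-green
```

Both versions turn the one test green, but only the second would survive the next test.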

Verify GREEN - Watch It Pass


MANDATORY.
bash
CI=true npm test path/to/test.test.ts
Confirm:
  • Test passes
  • Other tests still pass
  • Output pristine (no errors, warnings)
Test fails? Fix code, not test.
Other tests fail? Fix now.

REFACTOR - Clean Up


After green only:
  • Remove duplication
  • Improve names
  • Extract helpers
Keep tests green. Don't add behavior.
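As one possible illustration, here is a refactoring of a three-attempt retry function that changes names and structure but not behavior. The constant `MAX_ATTEMPTS` and the `demo` function are assumptions added for the sketch.

```typescript
// Refactored sketch: same behavior as a loop with literal 3 and 2, clearer names.
const MAX_ATTEMPTS = 3;

async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error; // remember the failure and try again
    }
  }
  throw lastError; // every attempt failed: surface the last error
}

// The same scenario the test exercises: two failures, then success.
async function demo(): Promise<void> {
  let attempts = 0;
  const result = await retryOperation(async () => {
    attempts++;
    if (attempts < 3) throw new Error('fail');
    return 'success';
  });
  console.log(result, attempts); // success 3
}

demo();
```

No option or parameter was added; anything new, such as a configurable attempt count, would need its own failing test first.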

Repeat


Next failing test for next feature.

Good Tests


| Quality | Good | Bad |
| --- | --- | --- |
| Minimal | One thing. "and" in name? Split it. | test('validates email and domain and whitespace') |
| Clear | Name describes behavior | test('test1') |
| Shows intent | Demonstrates desired API | Obscures what code should do |

Factory Pattern for Tests (Reference Pattern)


Create
getMockX(overrides?: Partial<X>)
functions for reusable test data:
typescript
interface User {
  id: string;
  name: string;
  email: string;
  role: 'admin' | 'user';
}

const getMockUser = (overrides?: Partial<User>): User => ({
  id: '123',
  name: 'John Doe',
  email: 'john@example.com',
  role: 'user',
  ...overrides,
});

// Usage - override only what matters for the test
it('shows admin badge for admin users', () => {
  const user = getMockUser({ role: 'admin' });
  render(<UserCard user={user} />);
  expect(screen.getByText('Admin')).toBeTruthy();
});
Benefits:
  • Sensible defaults - less boilerplate per test
  • Override specific properties - focus on what test cares about
  • Type-safe - catches missing properties
  • DRY - change mock in one place

Mocking External Dependencies (When Unavoidable)


Rule: Prefer real code. Mock only when:
  • External API (network calls)
  • Database (test isolation)
  • Time-dependent logic
  • Third-party services

Common Mock Patterns


Supabase:
typescript
jest.mock('@/lib/supabase', () => ({
  supabase: {
    from: jest.fn(() => ({
      select: jest.fn(() => ({
        eq: jest.fn(() => Promise.resolve({ data: mockData, error: null }))
      }))
    }))
  }
}))
Fetch/API:
typescript
global.fetch = jest.fn(() =>
  Promise.resolve({ ok: true, json: () => Promise.resolve(mockResponse) })
) as jest.Mock
Redis:
typescript
jest.mock('@/lib/redis', () => ({
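  // Name starts with `mock`: babel-jest rejects other out-of-scope variables in a jest.mock factory
  get: jest.fn(() => Promise.resolve(mockCachedValue)),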
  set: jest.fn(() => Promise.resolve('OK'))
}))
Environment Variables:
typescript
beforeEach(() => {
  process.env.API_KEY = 'test-key'
})
afterEach(() => {
  delete process.env.API_KEY
})
Time:
typescript
jest.useFakeTimers()
// In test:
jest.advanceTimersByTime(1000)
Mock quality check: If the mock setup is longer than the test itself, reconsider the design.

Why Order Matters


"I'll write tests after to verify it works"
Tests written after code pass immediately. Passing immediately proves nothing:
  • Might test wrong thing
  • Might test implementation, not behavior
  • Might miss edge cases you forgot
  • You never saw it catch the bug
Test-first forces you to see the test fail, proving it actually tests something.
"I already manually tested all the edge cases"
Manual testing is ad-hoc. You think you tested everything but:
  • No record of what you tested
  • Can't re-run when code changes
  • Easy to forget cases under pressure
  • "It worked when I tried it" ≠ comprehensive
Automated tests are systematic. They run the same way every time.
"Deleting X hours of work is wasteful"
Sunk cost fallacy. The time is already gone. Your choice now:
  • Delete and rewrite with TDD (X more hours, high confidence)
  • Keep it and add tests after (30 min, low confidence, likely bugs)
The "waste" is keeping code you can't trust. Working code without real tests is technical debt.

Red Flags - STOP and Start Over


If you catch yourself:
  • Code before test
  • Test after implementation
  • Test passes immediately
  • Can't explain why test failed
  • Tests added "later"
  • Rationalizing "just this once"
  • "I already manually tested it"
  • "Tests after achieve the same purpose"
  • "It's about spirit not ritual"
  • "Keep as reference" or "adapt existing code"
  • "Already spent X hours, deleting is wasteful"
  • "TDD is dogmatic, I'm being pragmatic"
  • "This is different because..."
All of these mean: Delete code. Start over with TDD.

Rationalization Prevention


| Excuse | Reality |
| --- | --- |
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I'll test after" | Tests passing immediately prove nothing. |
| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
| "Already manually tested" | Ad-hoc ≠ systematic. No record, can't re-run. |
| "Deleting X hours is wasteful" | Sunk cost fallacy. Keeping unverified code is technical debt. |
| "Keep as reference, write tests first" | You'll adapt it. That's testing after. Delete means delete. |
| "Need to explore first" | Fine. Throw away exploration, start with TDD. |
| "Test hard = design unclear" | Listen to test. Hard to test = hard to use. |
| "TDD will slow me down" | TDD faster than debugging. Pragmatic = test-first. |
| "Manual test faster" | Manual doesn't prove edge cases. You'll re-test every change. |
| "Existing code has no tests" | You're improving it. Add tests for existing code. |

Example: Bug Fix


Bug: Empty email accepted
RED
typescript
test('rejects empty email', async () => {
  const result = await submitForm({ email: '' });
  expect(result.error).toBe('Email required');
});
Verify RED
bash
$ npm test
FAIL: expected 'Email required', got undefined
GREEN
typescript
function submitForm(data: { email?: string }) {
  if (!data.email?.trim()) {
    return { error: 'Email required' };
  }
  // ...
}
Verify GREEN
bash
$ npm test
PASS
REFACTOR Extract validation for multiple fields if needed.
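If the refactor step is taken, the extraction could look roughly like the sketch below. `SignupForm`, `FormResult`, and `requireField` are hypothetical names; only the email field is validated, because that is the only behavior the test covers so far.

```typescript
// Hypothetical form shape; the example above only shows an `email` field.
interface SignupForm {
  email?: string;
}

type FormResult = { error: string } | { ok: true };

// Extracted helper: returns an error message for a blank field, otherwise null.
function requireField(value: string | undefined, label: string): string | null {
  return value?.trim() ? null : `${label} required`;
}

function submitForm(data: SignupForm): FormResult {
  const error = requireField(data.email, 'Email');
  if (error) return { error };
  // ... real submission would go here
  return { ok: true };
}

console.log(submitForm({ email: '' }));      // { error: 'Email required' }
console.log(submitForm({ email: 'a@b.co' })); // { ok: true }
```

The helper is ready for a second field, but wiring one in would be new behavior and so would start with its own failing test.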

Verification Checklist


Before marking work complete:
  • Every new function/method has a test
  • Watched each test fail before implementing
  • Each test failed for expected reason (feature missing, not typo)
  • Wrote minimal code to pass each test
  • All tests pass
  • Output pristine (no errors, warnings)
  • Tests use real code (mocks only if unavoidable)
  • Edge cases and errors covered
  • No hanging test processes (pgrep -f "vitest|jest" returns empty)
Can't check all boxes? You skipped TDD. Start over.

Coverage Threshold (Project Default)


Target: 80%+ code coverage across:
  • Branches: 80%
  • Functions: 80%
  • Lines: 80%
  • Statements: 80%
Verify with:
npm run test:coverage
or equivalent.
Below threshold? Add missing tests before claiming completion.
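For projects that run Jest, these thresholds can be enforced by the runner itself through its `coverageThreshold` option, so a coverage run fails when any metric drops below the target. The fragment below is a sketch that assumes a `jest.config.ts` file and a `test:coverage` script that collects coverage; Vitest projects would use that tool's own coverage settings instead.

```typescript
// jest.config.ts (sketch; assumes the project uses Jest)
const config = {
  coverageThreshold: {
    // Applied to the whole project; the run fails if any metric is below 80%.
    global: {
      branches: 80,
      functions: 80,
      lines: 80,
      statements: 80,
    },
  },
};

export default config;
```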

Test Smells (Anti-Patterns)


| Smell | Bad Example | Why It's Bad | Fix |
| --- | --- | --- | --- |
| Testing implementation | expect(component.state.count).toBe(5) | Breaks when internals change | Test user-visible behavior |
| Dependent tests | Test B relies on Test A's state | Flaky, order-dependent | Each test sets up own data |
| Mocking everything | Every dependency mocked | Tests mock, not code | Use real code where feasible |
| Giant setup | 50 lines of setup per test | Hard to understand | Extract factories |
| Magic numbers | expect(result).toBe(42) | Meaning unclear | Use named constants |
| Test name lies | test('works') passes but doesn't test 'works' | Misleading | Name describes actual behavior |
| No assertions | test('loads', () => { loadData() }) | Tests nothing | Always assert outcomes |
| Commented tests | // test('edge case'... | Dead code, skipped coverage | Delete or uncomment |
If you spot these in your tests: Fix before claiming TDD cycle complete.
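The "dependent tests" smell and its fix can be sketched with plain functions standing in for test cases. The cart and all function names here are hypothetical, used only to show why shared state makes results depend on execution order.

```typescript
// Hypothetical cart used only for illustration.
interface Cart {
  items: string[];
}
const createCart = (): Cart => ({ items: [] });
const addItem = (cart: Cart, item: string): void => {
  cart.items.push(item);
};

// Smell: one shared cart; check B only holds if check A ran before it.
const sharedCart = createCart();
const dependentA = (): boolean => {
  addItem(sharedCart, 'book');
  return sharedCart.items.length === 1;
};
const dependentB = (): boolean => {
  addItem(sharedCart, 'pen');
  return sharedCart.items.length === 2;
};

// Fix: each check builds the state it needs, so order no longer matters.
const independentA = (): boolean => {
  const cart = createCart();
  addItem(cart, 'book');
  return cart.items.length === 1;
};
const independentB = (): boolean => {
  const cart = createCart();
  addItem(cart, 'book');
  addItem(cart, 'pen');
  return cart.items.length === 2;
};

// Run both pairs in reverse order (B first, then A).
const independentResults = [independentB(), independentA()];
const dependentResults = [dependentB(), dependentA()];

console.log(independentResults); // [ true, true ]
console.log(dependentResults);   // [ false, false ]
```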

When Stuck


| Problem | Solution |
| --- | --- |
| Don't know how to test | Write wished-for API. Write assertion first. Ask your human partner. |
| Test too complicated | Design too complicated. Simplify interface. |
| Must mock everything | Code too coupled. Use dependency injection. |
| Test setup huge | Extract helpers. Still complex? Simplify design. |
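Dependency injection, suggested above for tightly coupled code, can be as small as passing the dependency in as a parameter. The sketch below uses a clock as the dependency; `isExpired`, `Clock`, and the one-minute lifetime are hypothetical choices for illustration.

```typescript
// Hypothetical example: a token that expires after a fixed lifetime.
type Clock = () => number; // returns milliseconds since the epoch

const TOKEN_LIFETIME_MS = 60_000;

// The clock is a parameter with a real default, so production callers pass nothing.
function isExpired(issuedAtMs: number, now: Clock = Date.now): boolean {
  return now() - issuedAtMs >= TOKEN_LIFETIME_MS;
}

// In a test, pass a fixed clock instead of mocking a module or a global.
const fixedClock: Clock = () => 1_000_000;

console.log(isExpired(1_000_000 - 59_999, fixedClock)); // false
console.log(isExpired(1_000_000 - 60_000, fixedClock)); // true
```

Because the test controls time through an argument, it needs no mocking library and exercises the real function.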

Output Format



TDD Cycle


Requirements


[What functionality is being built]

RED Phase


  • Test: [test name]
  • Command:
    CI=true npm test -- -t "test name"
  • Result: exit 1 (FAIL as expected)
  • Failure reason: [function not defined / expected X got Y]

GREEN Phase


  • Implementation: [summary]
  • File: [path:line]
  • Command:
    CI=true npm test -- -t "test name"
  • Result: exit 0 (PASS)

REFACTOR Phase


  • Changes: [what was improved]
  • Command:
    CI=true npm test
  • Result: exit 0 (all tests pass)

Final Rule


Production code → test exists and failed first
Otherwise → not TDD
No exceptions without your human partner's permission.