debugging-strategies

Debugging Strategies

Transform debugging from frustrating guesswork into systematic problem-solving with proven strategies, powerful tools, and methodical approaches.

When to Use This Skill

  • Tracking down elusive bugs
  • Investigating performance issues
  • Understanding unfamiliar codebases
  • Debugging production issues
  • Analyzing crash dumps and stack traces
  • Profiling application performance
  • Investigating memory leaks
  • Debugging distributed systems

Core Principles

1. The Scientific Method

  1. Observe: What's the actual behavior?
  2. Hypothesize: What could be causing it?
  3. Experiment: Test your hypothesis
  4. Analyze: Did it prove/disprove your theory?
  5. Repeat: Until you find the root cause

2. Debugging Mindset

Don't Assume:
  • "It can't be X" - Yes it can
  • "I didn't change Y" - Check anyway
  • "It works on my machine" - Find out why
Do:
  • Reproduce consistently
  • Isolate the problem
  • Keep detailed notes
  • Question everything
  • Take breaks when stuck

3. Rubber Duck Debugging

Explain your code and problem out loud (to a rubber duck, colleague, or yourself). Putting the logic into words often reveals the issue.

Systematic Debugging Process

Phase 1: Reproduce

Reproduction Checklist

  1. Can you reproduce it?
    • Always? Sometimes? Randomly?
    • Specific conditions needed?
    • Can others reproduce it?
  2. Create minimal reproduction
    • Simplify to smallest example
    • Remove unrelated code
    • Isolate the problem
  3. Document steps
    • Write down exact steps
    • Note environment details
    • Capture error messages
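
The first checklist question ("Always? Sometimes? Randomly?") can be answered with a number rather than an impression. The sketch below is one way to do that, assuming the failure surfaces as a raised exception; `reproduction_rate` and `flaky_operation` are hypothetical names introduced here for illustration, not part of any library.

```python
import random
from typing import Callable


def reproduction_rate(attempt: Callable[[], None], runs: int = 100) -> float:
    """Run `attempt` repeatedly and return the fraction of runs that raised."""
    failures = 0
    for _ in range(runs):
        try:
            attempt()
        except Exception:
            failures += 1
    return failures / runs


def flaky_operation(rng: random.Random) -> None:
    """Hypothetical stand-in for the code under investigation (fails roughly 30% of the time)."""
    if rng.random() < 0.3:
        raise RuntimeError("simulated intermittent failure")


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the experiment itself is repeatable
    rate = reproduction_rate(lambda: flaky_operation(rng), runs=1000)
    print(f"Failed in {rate:.0%} of runs")
```

A rate of 100% suggests a deterministic bug; anything in between points toward timing, ordering, or data-dependent causes.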

Phase 2: Gather Information

Information Collection

  1. Error Messages
    • Full stack trace
    • Error codes
    • Console/log output
  2. Environment
    • OS version
    • Language/runtime version
    • Dependency versions
    • Environment variables
  3. Recent Changes
    • Git history
    • Deployment timeline
    • Configuration changes
  4. Scope
    • Affects all users or specific ones?
    • All browsers or specific ones?
    • Production only or also dev?
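
The "Environment" items above are easy to capture automatically so that every bug report carries the same details. This is a minimal sketch for a Python service, not a definitive tool; `environment_snapshot` is a hypothetical helper, and the package and variable names in the demo are placeholders.

```python
import json
import os
import platform
import sys
from importlib import metadata


def environment_snapshot(packages: list[str], env_vars: list[str]) -> dict:
    """Collect the environment details worth attaching to a bug report."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "os": platform.platform(),
        "python": sys.version.split()[0],
        "packages": versions,
        # Record only whether a variable is set, so secrets never land in a ticket.
        "env_vars_set": {name: name in os.environ for name in env_vars},
    }


if __name__ == "__main__":
    # "requests" and "DATABASE_URL" are placeholder names; substitute your own.
    print(json.dumps(environment_snapshot(["requests"], ["DATABASE_URL"]), indent=2))
```

Running the same snapshot on a working and a broken machine gives two documents that can be compared directly.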

Phase 3: Form Hypothesis

Hypothesis Formation

Based on gathered info, ask:
  1. What changed?
    • Recent code changes
    • Dependency updates
    • Infrastructure changes
  2. What's different?
    • Working vs broken environment
    • Working vs broken user
    • Before vs after
  3. Where could this fail?
    • Input validation
    • Business logic
    • Data layer
    • External services

Phase 4: Test & Verify

Testing Strategies

  1. Binary Search
    • Comment out half the code
    • Narrow down problematic section
    • Repeat until found
  2. Add Logging
    • Strategic console.log/print
    • Track variable values
    • Trace execution flow
  3. Isolate Components
    • Test each piece separately
    • Mock dependencies
    • Remove complexity
  4. Compare Working vs Broken
    • Diff configurations
    • Diff environments
    • Diff data
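
The binary search strategy applies to data as well as code. The sketch below halves a failing batch until a single offending item remains. It assumes a batch fails whenever it contains a bad item; `find_failing_item` and `batch_fails` are hypothetical names used only for this illustration.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def find_failing_item(items: Sequence[T], fails: Callable[[Sequence[T]], bool]) -> T:
    """Locate one item that makes `fails` return True by repeatedly halving the batch.

    Assumes a batch fails whenever it contains a bad item.
    """
    if not fails(items):
        raise ValueError("the full batch does not fail; nothing to search for")
    candidates = list(items)
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        # Keep whichever half still reproduces the failure.
        candidates = first_half if fails(first_half) else candidates[mid:]
    return candidates[0]


def batch_fails(rows: Sequence[dict]) -> bool:
    """Hypothetical check: processing breaks when any row has a missing price."""
    return any(row["price"] is None for row in rows)


if __name__ == "__main__":
    rows = [{"id": i, "price": 9.99} for i in range(1000)]
    rows[637]["price"] = None  # plant one bad record
    print(find_failing_item(rows, batch_fails))
```

For 1,000 rows this needs about ten checks instead of up to a thousand, which matters when each check is a slow end-to-end run.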

Debugging Tools

JavaScript/TypeScript Debugging

typescript
// Chrome DevTools Debugger
function processOrder(order: Order) {
  debugger; // Execution pauses here

  const total = calculateTotal(order);
  console.log("Total:", total);

  // Conditional breakpoint
  if (order.items.length > 10) {
    debugger; // Only breaks if condition true
  }

  return total;
}

// Console debugging techniques
console.log("Value:", value); // Basic
console.table(arrayOfObjects); // Table format
console.time("operation");
/* code */ console.timeEnd("operation"); // Timing
console.trace(); // Stack trace
console.assert(value > 0, "Value must be positive"); // Assertion

// Performance profiling
performance.mark("start-operation");
// ... operation code
performance.mark("end-operation");
performance.measure("operation", "start-operation", "end-operation");
console.log(performance.getEntriesByType("measure"));
VS Code Debugger Configuration:
json
// .vscode/launch.json
{
  "version": "0.2.0",
  "configurations": [
    {
      "type": "node",
      "request": "launch",
      "name": "Debug Program",
      "program": "${workspaceFolder}/src/index.ts",
      "preLaunchTask": "tsc: build - tsconfig.json",
      "outFiles": ["${workspaceFolder}/dist/**/*.js"],
      "skipFiles": ["<node_internals>/**"]
    },
    {
      "type": "node",
      "request": "launch",
      "name": "Debug Tests",
      "program": "${workspaceFolder}/node_modules/jest/bin/jest",
      "args": ["--runInBand", "--no-cache"],
      "console": "integratedTerminal"
    }
  ]
}

Python Debugging

python
# Built-in debugger (pdb)
import pdb

def calculate_total(items):
    total = 0
    pdb.set_trace()  # Debugger starts here

    for item in items:
        total += item.price * item.quantity

    return total

# Breakpoint (Python 3.7+)
def process_order(order):
    breakpoint()  # More convenient than pdb.set_trace()
    # ... code

# Post-mortem debugging
try:
    risky_operation()
except Exception:
    import pdb
    pdb.post_mortem()  # Debug at exception point

# IPython debugging (ipdb)
from ipdb import set_trace
set_trace()  # Better interface than pdb

# Logging for debugging
import logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)

def fetch_user(user_id):
    logger.debug(f'Fetching user: {user_id}')
    user = db.query(User).get(user_id)
    logger.debug(f'Found user: {user}')
    return user

# Profile performance
import cProfile
import pstats

cProfile.run('slow_function()', 'profile_stats')
stats = pstats.Stats('profile_stats')
stats.sort_stats('cumulative')
stats.print_stats(10)  # Top 10 by cumulative time

Go Debugging

go
// Delve debugger
// Install: go install github.com/go-delve/delve/cmd/dlv@latest
// Run: dlv debug main.go

package main

import (
    "fmt"
    "log"
    "net/http"
    _ "net/http/pprof" // Registers the /debug/pprof/ handlers on the default mux
    "os"
    "runtime/debug"
    "runtime/pprof"
)

// Print stack trace
func debugStack() {
    debug.PrintStack()
}

// Panic recovery with debugging
func processRequest() {
    defer func() {
        if r := recover(); r != nil {
            fmt.Println("Panic:", r)
            debug.PrintStack()
        }
    }()

    // ... code that might panic
}

func main() {
    // Memory profiling: serve the pprof endpoints in the background
    // Visit http://localhost:6060/debug/pprof/
    go func() {
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()

    // CPU profiling
    f, err := os.Create("cpu.prof")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()
    if err := pprof.StartCPUProfile(f); err != nil {
        log.Fatal(err)
    }
    defer pprof.StopCPUProfile()

    processRequest()
    // ... code to profile
}

Advanced Debugging Techniques

Technique 1: Binary Search Debugging

bash
# Git bisect for finding regression
git bisect start
git bisect bad          # Current commit is bad
git bisect good v1.0.0  # v1.0.0 was good

# Git checks out middle commit
# Test it, then:
git bisect good  # if it works
git bisect bad   # if it's broken

# Continue until bug found
git bisect reset  # when done
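
The manual good/bad marking can be automated with `git bisect run`, which reads the exit status of a script: 0 marks the commit good, 125 asks Git to skip it, and any other status from 1 to 127 marks it bad. The sketch below shows one possible check script under those rules. The file name `bisect_check.py`, the `src` directory, and the test path are placeholders, not paths taken from this document.

```python
import subprocess
import sys

# Exit statuses interpreted by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def classify(build_ok: bool, tests_ok: bool) -> int:
    """Map the outcome of one commit to a bisect exit status."""
    if not build_ok:
        return SKIP  # commit cannot be judged, so let Git pick a neighbour
    return GOOD if tests_ok else BAD


def command_succeeds(command: list[str]) -> bool:
    """Run a command and report whether it exited with status 0."""
    return subprocess.run(command).returncode == 0


def main() -> int:
    # Placeholder commands: byte-compile the sources, then run one regression test.
    build_ok = command_succeeds([sys.executable, "-m", "compileall", "-q", "src"])
    tests_ok = build_ok and command_succeeds(
        [sys.executable, "-m", "pytest", "-q", "tests/test_regression.py"]
    )
    return classify(build_ok, tests_ok)
```

Saved as `bisect_check.py` with a final line `sys.exit(main())`, it would be driven by `git bisect run python bisect_check.py` after the `start`, `bad`, and `good` commands.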

Technique 2: Differential Debugging

Compare working vs broken:

What's Different?

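| Aspect       | Working     | Broken         |
| ------------ | ----------- | -------------- |
| Environment  | Development | Production     |
| Node version | 18.16.0     | 18.15.0        |
| Data         | Empty DB    | 1M records     |
| User         | Admin       | Regular user   |
| Browser      | Chrome      | Safari         |
| Time         | During day  | After midnight |

Hypothesis: Time-based issue? Check timezone handling.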
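
A comparison like this can be generated rather than typed by hand once both environments are described as key/value settings. The following is a small sketch under that assumption; `diff_settings` and the `MISSING` marker are hypothetical names, and the `locale` key in the demo is invented to show that identical values are filtered out.

```python
from typing import Any

MISSING = "<missing>"  # marker for a key that exists on only one side


def diff_settings(working: dict[str, Any], broken: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {key: (working_value, broken_value)} for every key whose values differ."""
    differences = {}
    for key in sorted(working.keys() | broken.keys()):
        left = working.get(key, MISSING)
        right = broken.get(key, MISSING)
        if left != right:
            differences[key] = (left, right)
    return differences


if __name__ == "__main__":
    # "locale" is an invented key, identical on both sides, so it is not reported.
    working = {"environment": "development", "node": "18.16.0", "browser": "Chrome", "locale": "en-US"}
    broken = {"environment": "production", "node": "18.15.0", "browser": "Safari", "locale": "en-US"}
    for key, (good, bad) in diff_settings(working, broken).items():
        print(f"{key}: working={good!r} broken={bad!r}")
```

Each reported key is a candidate hypothesis; changing them one at a time in the working environment shows which one actually triggers the bug.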

Technique 3: Trace Debugging

typescript
// Function call tracing
function trace(
  target: any,
  propertyKey: string,
  descriptor: PropertyDescriptor,
) {
  const originalMethod = descriptor.value;

  descriptor.value = function (...args: any[]) {
    console.log(`Calling ${propertyKey} with args:`, args);
    const result = originalMethod.apply(this, args);
    console.log(`${propertyKey} returned:`, result);
    return result;
  };

  return descriptor;
}

class OrderService {
  @trace
  calculateTotal(items: Item[]): number {
    return items.reduce((sum, item) => sum + item.price, 0);
  }
}

Technique 4: Memory Leak Detection

typescript
// Chrome DevTools Memory Profiler
// 1. Take heap snapshot
// 2. Perform action
// 3. Take another snapshot
// 4. Compare snapshots

// Node.js memory debugging
if (process.memoryUsage().heapUsed > 500 * 1024 * 1024) {
  console.warn("High memory usage:", process.memoryUsage());

  // Generate heap dump
  require("v8").writeHeapSnapshot();
}

// Find memory leaks in tests
let beforeMemory: number;

beforeEach(() => {
  beforeMemory = process.memoryUsage().heapUsed;
});

afterEach(() => {
  const afterMemory = process.memoryUsage().heapUsed;
  const diff = afterMemory - beforeMemory;

  if (diff > 10 * 1024 * 1024) {
    // 10MB threshold
    console.warn(`Possible memory leak: ${diff / 1024 / 1024}MB`);
  }
});

Debugging Patterns by Issue Type

Pattern 1: Intermittent Bugs

Strategies for Flaky Bugs

  1. Add extensive logging
    • Log timing information
    • Log all state transitions
    • Log external interactions
  2. Look for race conditions
    • Concurrent access to shared state
    • Async operations completing out of order
    • Missing synchronization
  3. Check timing dependencies
    • setTimeout/setInterval
    • Promise resolution order
    • Animation frame timing
  4. Stress test
    • Run many times
    • Vary timing
    • Simulate load
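
"Missing synchronization" and "stress test" can be illustrated together. The sketch below guards a shared counter with a lock and then hammers it from several threads; `Counter` and `hammer` are hypothetical names. Without the lock, the read-modify-write in `increment` is the kind of operation where updates can be lost intermittently.

```python
import threading


class Counter:
    """Shared state guarded by a lock so concurrent increments are not lost."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # The read-modify-write must happen as one unit; the lock ensures that
        # two threads never read the same old value and overwrite each other.
        with self._lock:
            self.value += 1


def hammer(counter: Counter, threads: int = 8, increments: int = 10_000) -> int:
    """Stress test: many threads incrementing at once; returns the final count."""

    def worker() -> None:
        for _ in range(increments):
            counter.increment()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
    return counter.value


if __name__ == "__main__":
    expected = 8 * 10_000
    print(f"final={hammer(Counter())} expected={expected}")
```

A stress test like this is most useful as a regression check: a final count that ever differs from the expected value shows the synchronization is incomplete.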

Pattern 2: Performance Issues

Performance Debugging

  1. Profile first
    • Don't optimize blindly
    • Measure before and after
    • Find bottlenecks
  2. Common culprits
    • N+1 queries
    • Unnecessary re-renders
    • Large data processing
    • Synchronous I/O
  3. Tools
    • Browser DevTools Performance tab
    • Lighthouse
    • Python: cProfile, line_profiler
    • Node: clinic.js, 0x
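
Of the culprits listed, N+1 queries are the easiest to confirm by measurement: count the statements a code path issues. The sketch below does this against an in-memory SQLite database. The schema, `CountingConnection`, and both query functions are invented for illustration and do not come from a real application.

```python
import sqlite3


class CountingConnection:
    """Thin wrapper that counts statements so query volume can be measured."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.queries = 0

    def execute(self, sql: str, params: tuple = ()) -> sqlite3.Cursor:
        self.queries += 1
        return self.conn.execute(sql, params)


def make_db(authors: int = 50) -> sqlite3.Connection:
    """Build a throwaway database with one book per author."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)")
    for i in range(authors):
        conn.execute("INSERT INTO authors VALUES (?, ?)", (i, f"author-{i}"))
        conn.execute("INSERT INTO books (author_id, title) VALUES (?, ?)", (i, f"book-{i}"))
    return conn


def titles_n_plus_one(db: CountingConnection) -> dict[str, list[str]]:
    """One query for the authors, then one more query per author."""
    result = {}
    for author_id, name in db.execute("SELECT id, name FROM authors").fetchall():
        rows = db.execute("SELECT title FROM books WHERE author_id = ?", (author_id,))
        result[name] = [title for (title,) in rows]
    return result


def titles_joined(db: CountingConnection) -> dict[str, list[str]]:
    """The same data fetched with a single JOIN."""
    result: dict[str, list[str]] = {}
    sql = "SELECT a.name, b.title FROM authors a JOIN books b ON b.author_id = a.id"
    for name, title in db.execute(sql):
        result.setdefault(name, []).append(title)
    return result


if __name__ == "__main__":
    conn = make_db()
    slow, fast = CountingConnection(conn), CountingConnection(conn)
    titles_n_plus_one(slow)
    titles_joined(fast)
    print(f"N+1 version: {slow.queries} queries, JOIN version: {fast.queries} query")
```

Measuring the query count before and after the change follows the "measure before and after" rule and confirms the fix independently of wall-clock noise.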

Pattern 3: Production Bugs

Production Debugging

  1. Gather evidence
    • Error tracking (Sentry, Bugsnag)
    • Application logs
    • User reports
    • Metrics/monitoring
  2. Reproduce locally
    • Use production data (anonymized)
    • Match environment
    • Follow exact steps
  3. Safe investigation
    • Don't change production
    • Use feature flags
    • Add monitoring/logging
    • Test fixes in staging
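
One low-risk way to "add monitoring/logging" without changing production code paths is to make log verbosity a deployment setting. The sketch below reads the level from an environment variable; `APP_LOG_LEVEL` and `configure_logging` are hypothetical names chosen for this example, not an established convention.

```python
import logging
import os

LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def configure_logging(default_level: str = "WARNING") -> int:
    """Set the root log level from APP_LOG_LEVEL (a hypothetical variable name)."""
    name = os.environ.get("APP_LOG_LEVEL", default_level).upper()
    # An unrecognised value falls back to WARNING rather than crashing at startup.
    level = LEVELS.get(name, logging.WARNING)
    logging.basicConfig(format="%(asctime)s %(levelname)s %(name)s: %(message)s")
    logging.getLogger().setLevel(level)
    return level


if __name__ == "__main__":
    configure_logging()
    log = logging.getLogger("orders")
    log.debug("visible only when APP_LOG_LEVEL=DEBUG")
    log.warning("visible at the default level")
```

With this in place, verbose output can be enabled for a single instance during an investigation and switched off again without a code change or redeploy of new logic.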

Best Practices

  1. Reproduce First: Can't fix what you can't reproduce
  2. Isolate the Problem: Remove complexity until minimal case
  3. Read Error Messages: They're usually helpful
  4. Check Recent Changes: Most bugs are recent
  5. Use Version Control: Git bisect, blame, history
  6. Take Breaks: Fresh eyes see better
  7. Document Findings: Help future you
  8. Fix Root Cause: Not just symptoms

Common Debugging Mistakes

  • Making Multiple Changes: Change one thing at a time
  • Not Reading Error Messages: Read the full stack trace
  • Assuming It's Complex: Often it's simple
  • Debug Logging in Prod: Remove before shipping
  • Not Using Debugger: console.log isn't always best
  • Giving Up Too Soon: Persistence pays off
  • Not Testing the Fix: Verify it actually works

Quick Debugging Checklist

When Stuck, Check:

  • Spelling errors (typos in variable names)
  • Case sensitivity (fileName vs filename)
  • Null/undefined values
  • Array index off-by-one
  • Async timing (race conditions)
  • Scope issues (closure, hoisting)
  • Type mismatches
  • Missing dependencies
  • Environment variables
  • File paths (absolute vs relative)
  • Cache issues (clear cache)
  • Stale data (refresh database)

Resources

  • references/debugging-tools-guide.md: Comprehensive tool documentation
  • references/performance-profiling.md: Performance debugging guide
  • references/production-debugging.md: Debugging live systems
  • assets/debugging-checklist.md: Quick reference checklist
  • assets/common-bugs.md: Common bug patterns
  • scripts/debug-helper.ts: Debugging utility functions