error-handling-patterns

Error Handling Patterns

Error Classification

| Type | Cause | Can Recover? | Example |
|---|---|---|---|
| Operational | Runtime problem in a correctly-written program | Yes | Network timeout, disk full, invalid input |
| Programmer | Bug in the code | No | TypeError, null dereference, assertion failure |

Operational errors: Anticipate them, handle explicitly, retry if transient, return clear error to caller, log with context.

Programmer errors: Crash immediately (fail fast), fix the bug, log full stack trace.
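
One way to make this distinction actionable in code is to tag the errors you anticipate and treat everything untagged as a bug. The sketch below is only an illustration: `OperationalError`, its `transient` flag, `nextAction`, and the action names are hypothetical and not taken from any library.

```typescript
// Hypothetical marker class: anything thrown deliberately for an anticipated
// runtime condition (timeout, disk full, invalid input) extends this.
class OperationalError extends Error {
  readonly isOperational = true;
  constructor(message: string, readonly transient = false) {
    super(message);
    this.name = "OperationalError";
    // Keeps instanceof working when compiling to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

type Action = "retry" | "report-to-caller" | "crash";

// Decide what a top-level handler should do with an error.
function nextAction(error: unknown): Action {
  if (error instanceof OperationalError) {
    return error.transient ? "retry" : "report-to-caller";
  }
  // TypeError, failed assertions, and anything unrecognized are treated as bugs.
  return "crash";
}

console.log(nextAction(new OperationalError("Network timeout", true))); // retry
console.log(nextAction(new OperationalError("Invalid input")));         // report-to-caller
console.log(nextAction(new TypeError("x is not a function")));          // crash
```

Defaulting unknown errors to "crash" follows the fail-fast advice above: an error nobody anticipated is assumed to be a bug until proven otherwise.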

Try/Catch Patterns

When to Catch

  • You can meaningfully recover (retry, fallback, return default)
  • You need to translate the error for the caller
  • You are at a boundary (HTTP handler, event listener, queue consumer)
  • You need to add context before re-throwing

When to Propagate

  • You cannot recover; let the caller decide
  • The error is already descriptive enough
  • You are in a pure business logic layer (no I/O awareness)

Catch, Add Context, Re-throw

typescript
async function getUser(id: string): Promise<User> {
  try {
    return await db.query("SELECT * FROM users WHERE id = $1", [id]);
  } catch (error) {
    throw new DatabaseError(`Failed to fetch user ${id}`, { cause: error });
  }
}
python
def get_user(user_id: str) -> User:
    try:
        return db.query("SELECT * FROM users WHERE id = %s", (user_id,))
    except DatabaseError as e:
        raise UserFetchError(f"Failed to fetch user {user_id}") from e
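
The benefit of wrapping with a cause shows up when the error is finally logged: the handler can walk the chain and report every layer. A minimal sketch follows, assuming errors link to their origin through a `cause` property as in the examples above. `causeChain` and `withCause` are hypothetical helper names, and `withCause` exists only so the sketch does not depend on the ES2022 constructor option.

```typescript
// Walks the `cause` links of an error and returns one message per level,
// outermost first. `maxDepth` guards against accidental cycles.
function causeChain(error: unknown, maxDepth = 10): string[] {
  const messages: string[] = [];
  let current: unknown = error;
  for (let depth = 0; current !== undefined && current !== null && depth < maxDepth; depth++) {
    messages.push(current instanceof Error ? current.message : String(current));
    // Read `cause` defensively: it is optional and may hold any value.
    current = (current as { cause?: unknown }).cause;
  }
  return messages;
}

// Builds an error with a cause attached as a plain property.
function withCause(message: string, cause: unknown): Error {
  const wrapped: Error & { cause?: unknown } = new Error(message);
  wrapped.cause = cause;
  return wrapped;
}

const rootCause = new Error("connection refused");
const wrappedError = withCause("Failed to fetch user 42", rootCause);
// Logs the outer message first, then the root cause.
console.log(causeChain(wrappedError).join(" <- "));
```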

Catch at Boundaries

typescript
app.get("/users/:id", async (req, res) => {
  try {
    const user = await getUser(req.params.id);
    res.json(user);
  } catch (error) {
    if (error instanceof NotFoundError) {
      return res.status(404).json({ error: { code: "RESOURCE_NOT_FOUND", message: "User not found." } });
    }
    logger.error("Unhandled error in GET /users/:id", { error, requestId: req.id });
    res.status(500).json({ error: { code: "INTERNAL_ERROR", message: "An unexpected error occurred." } });
  }
});

Never Swallow Errors

typescript
// BAD: empty catch block
try { await saveData(data); } catch (error) { }

// GOOD: handle or re-throw
try { await saveData(data); } catch (error) {
  logger.error("Failed to save data", { error, data });
  throw error;
}

Custom Error Classes

JavaScript / TypeScript

typescript
class AppError extends Error {
  constructor(message: string, public readonly code: string,
    public readonly statusCode = 500, public readonly isOperational = true, cause?: Error) {
    super(message, { cause });
    this.name = this.constructor.name;
  }
}

class NotFoundError extends AppError {
  constructor(resource: string, id: string) {
    super(`${resource} with id ${id} not found`, "RESOURCE_NOT_FOUND", 404);
  }
}

class ValidationError extends AppError {
  constructor(public readonly details: { field: string; message: string }[]) {
    super("Validation failed", "VALIDATION_FAILED", 422);
  }
}

class ExternalServiceError extends AppError {
  constructor(service: string, cause: Error) {
    super(`External service ${service} failed`, "EXTERNAL_SERVICE_ERROR", 502, true, cause);
  }
}
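
With `code`, `statusCode`, and `isOperational` on every error, a boundary can turn any thrown value into a response in one place. The following is a sketch under assumptions: `toHttpResponse` and `HttpErrorResponse` are hypothetical names, and `AppError` is restated in reduced form (without `cause`) only so the block runs on its own.

```typescript
// Reduced restatement of the AppError shape above.
class AppError extends Error {
  constructor(
    message: string,
    public readonly code: string,
    public readonly statusCode = 500,
    public readonly isOperational = true
  ) {
    super(message);
    this.name = new.target.name;
    // Keeps instanceof working when compiling to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class NotFoundError extends AppError {
  constructor(resource: string, id: string) {
    super(`${resource} with id ${id} not found`, "RESOURCE_NOT_FOUND", 404);
  }
}

interface HttpErrorResponse {
  status: number;
  body: { error: { code: string; message: string } };
}

// Converts any thrown value into a response the client can safely see.
function toHttpResponse(error: unknown): HttpErrorResponse {
  if (error instanceof AppError && error.isOperational) {
    return { status: error.statusCode, body: { error: { code: error.code, message: error.message } } };
  }
  // Unknown or non-operational errors: never leak internals to the client.
  return { status: 500, body: { error: { code: "INTERNAL_ERROR", message: "An unexpected error occurred." } } };
}

console.log(toHttpResponse(new NotFoundError("User", "42")).status); // 404
console.log(toHttpResponse(new TypeError("boom")).status);           // 500
```

Routing every handler through one mapping function keeps status codes consistent and makes the generic 500 path the default rather than something each route has to remember.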

Python

python
class AppError(Exception):
    def __init__(self, message: str, code: str, status_code: int = 500, is_operational: bool = True):
        super().__init__(message)
        self.message, self.code, self.status_code, self.is_operational = message, code, status_code, is_operational

class NotFoundError(AppError):
    def __init__(self, resource: str, resource_id: str):
        super().__init__(f"{resource} with id {resource_id} not found", "RESOURCE_NOT_FOUND", 404)

class ValidationError(AppError):
    def __init__(self, details: list[dict]):
        super().__init__("Validation failed", "VALIDATION_FAILED", 422)
        self.details = details

class ExternalServiceError(AppError):
    def __init__(self, service: str):
        super().__init__(f"External service {service} failed", "EXTERNAL_SERVICE_ERROR", 502)

Error Boundaries in React

tsx
import { Component, ErrorInfo, ReactNode } from "react";

interface Props { fallback: ReactNode; children: ReactNode; onError?: (error: Error, info: ErrorInfo) => void; }
interface State { hasError: boolean; }

class ErrorBoundary extends Component<Props, State> {
  constructor(props: Props) { super(props); this.state = { hasError: false }; }
  static getDerivedStateFromError(): State { return { hasError: true }; }
  componentDidCatch(error: Error, info: ErrorInfo) {
    this.props.onError?.(error, info);
    logger.error("React error boundary caught error", { error: error.message, componentStack: info.componentStack });
  }
  render() { return this.state.hasError ? this.props.fallback : this.props.children; }
}

// Usage
<ErrorBoundary fallback={<p>Something went wrong. Please refresh the page.</p>}
  onError={(error) => reportToMonitoring(error)}>
  <Dashboard />
</ErrorBoundary>
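
  • Wrap each independent UI section in its own boundary
  • Provide a meaningful fallback (not a blank page)
  • Report errors to your monitoring service
  • Error boundaries catch errors thrown during rendering, in lifecycle methods, and in constructors of the components below them; they do not catch errors in event handlers or asynchronous code, so handle those with try/catch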

Retry with Exponential Backoff

Formula:
delay = min(base_delay * 2^attempt + random_jitter, max_delay)
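typescript
interface RetryOptions { maxAttempts: number; baseDelayMs: number; maxDelayMs: number; }

async function withRetry<T>(
  fn: () => Promise<T>,
  options: Partial<RetryOptions> = {}
): Promise<T> {
  // Merge overrides into the defaults so a partial options object keeps the remaining defaults.
  const opts: RetryOptions = { maxAttempts: 3, baseDelayMs: 1000, maxDelayMs: 30000, ...options };
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    try { return await fn(); }
    catch (error) {
      lastError = error;
      if (attempt === opts.maxAttempts - 1) break; // out of attempts
      // Exponential backoff plus up to 1s of jitter, capped at maxDelayMs and rounded for logging.
      const delay = Math.round(Math.min(opts.baseDelayMs * 2 ** attempt + Math.random() * 1000, opts.maxDelayMs));
      const message = error instanceof Error ? error.message : String(error);
      logger.warn(`Attempt ${attempt + 1} failed, retrying in ${delay}ms`, { error: message });
      await new Promise((r) => setTimeout(r, delay));
    }
  }
  throw lastError;
}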
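
To see what the formula produces, it helps to compute the schedule with the jitter held at zero. The sketch below is an illustration with a hypothetical helper, `backoffDelay`, that takes the jitter as a parameter so the result is deterministic.

```typescript
// Computes the delay before retry number `attempt` (0-based).
// `jitterMs` is passed in so the function stays deterministic and testable.
function backoffDelay(attempt: number, baseDelayMs: number, maxDelayMs: number, jitterMs = 0): number {
  return Math.min(baseDelayMs * 2 ** attempt + jitterMs, maxDelayMs);
}

// Delays without jitter for base = 1000 ms and cap = 30000 ms.
const schedule = [0, 1, 2, 3, 4, 5].map((attempt) => backoffDelay(attempt, 1000, 30000));
console.log(schedule.join(", ")); // 1000, 2000, 4000, 8000, 16000, 30000
```

The sixth value would be 32000 ms uncapped, so the cap takes over from attempt 5 onward. In real use the jitter argument would come from something like `Math.random() * 1000`.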

Retry Rules

  • Only retry transient errors (network, 429, 503), never 400 or 401
  • Always use jitter to prevent thundering herd
  • Set a maximum number of attempts (3-5 typical)
  • Set a maximum delay cap
  • Log every retry attempt with the error and delay
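
The first rule can be enforced with a predicate that the retry loop consults before sleeping. The sketch below makes several assumptions that should be adapted to the HTTP client in use: that failures expose a numeric `status` or a Node-style string `code`, and that the listed codes are the transient ones. `isTransient` and `retryTransient` are hypothetical names.

```typescript
// Assumed error shape: a numeric HTTP `status` or a string network `code`.
interface MaybeHttpError { status?: number; code?: string; }

const RETRYABLE_STATUS = [429, 503];
const RETRYABLE_NETWORK_CODES = ["ECONNRESET", "ECONNREFUSED", "ETIMEDOUT"];

function isTransient(error: unknown): boolean {
  if (typeof error !== "object" || error === null) return false;
  const { status, code } = error as MaybeHttpError;
  if (typeof status === "number") return RETRYABLE_STATUS.indexOf(status) !== -1;
  return typeof code === "string" && RETRYABLE_NETWORK_CODES.indexOf(code) !== -1;
}

// Retries `fn` only while the failure is transient; other errors propagate at once.
async function retryTransient<T>(fn: () => Promise<T>, maxAttempts = 3, baseDelayMs = 1000): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Give up on non-transient errors or when attempts are exhausted.
      if (!isTransient(error) || attempt >= maxAttempts - 1) throw error;
      // Exponential backoff with jitter, capped at 30 seconds.
      const delay = Math.min(baseDelayMs * 2 ** attempt + Math.random() * baseDelayMs, 30000);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

A 400 or 401 fails on the first attempt with no delay, while a 503 is retried up to `maxAttempts` times.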

Circuit Breaker Pattern

CLOSED --[failure threshold reached]--> OPEN --[timeout expires]--> HALF-OPEN
HALF-OPEN --[success]--> CLOSED    |    HALF-OPEN --[failure]--> OPEN

| State | Behavior |
|---|---|
| Closed | Requests pass through normally. Failures are counted. |
| Open | Requests fail immediately without calling the service. |
| Half-Open | A limited number of test requests are allowed through. |

typescript
class CircuitBreaker {
  private state: "closed" | "open" | "half-open" = "closed";
  private failureCount = 0;
  private lastFailureTime = 0;

  constructor(private failureThreshold = 5, private resetTimeoutMs = 30000) {}

  async execute<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      if (Date.now() - this.lastFailureTime > this.resetTimeoutMs) this.state = "half-open";
      else throw new Error("Circuit breaker is open. Service unavailable.");
    }
    try {
      const result = await fn();
      this.failureCount = 0; this.state = "closed";
      return result;
    } catch (error) {
      this.failureCount++; this.lastFailureTime = Date.now();
      if (this.failureCount >= this.failureThreshold) this.state = "open";
      throw error;
    }
  }
}
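
The class above lets every request through once the timeout expires. One way to honor the "limited number of test requests" behavior is to allow a single probe at a time while half-open. The variant below is a sketch, not a replacement: `ProbingCircuitBreaker`, `getState`, and the injectable `now` clock (added so the transitions can be exercised without waiting) are assumptions introduced for illustration.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class ProbingCircuitBreaker {
  private state: BreakerState = "closed";
  private failureCount = 0;
  private openedAt = 0;
  private probeInFlight = false;

  constructor(
    private readonly failureThreshold = 5,
    private readonly resetTimeoutMs = 30000,
    private readonly now: () => number = Date.now
  ) {}

  getState(): BreakerState { return this.state; }

  async execute<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      if (this.now() - this.openedAt < this.resetTimeoutMs) {
        throw new Error("Circuit breaker is open. Service unavailable.");
      }
      this.state = "half-open"; // timeout expired: allow a probe
    }
    const isProbe = this.state === "half-open";
    if (isProbe) {
      // Only one test request at a time while half-open.
      if (this.probeInFlight) throw new Error("Circuit breaker is half-open. Probe in progress.");
      this.probeInFlight = true;
    }
    try {
      const result = await fn();
      this.failureCount = 0;
      this.state = "closed";
      return result;
    } catch (error) {
      this.failureCount++;
      // A failed probe reopens immediately; otherwise open at the threshold.
      if (isProbe || this.failureCount >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      throw error;
    } finally {
      if (isProbe) this.probeInFlight = false;
    }
  }
}
```

Usage is the same as for the original class: create one breaker per downstream dependency and route every call through `execute`, for example `await breaker.execute(() => fetchProfile(id))`, where `fetchProfile` stands in for whatever call is being protected.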

User-Facing Error Messages

| Rule | Good | Bad |
|---|---|---|
| Be clear about what happened | "Your payment could not be processed." | "Error 500." |
| Be actionable | "Please check your card details and try again." | "Something went wrong." |
| Avoid technical jargon | "We could not connect to the server." | "ECONNREFUSED 10.0.0.1:5432" |
| Do not blame the user | "We could not find that page." | "You entered the wrong URL." |
| Do not expose internal details | "Please try again later." | "NullPointerException in UserService.java" |
| Provide a way forward | "Contact support if this continues." | (nothing) |

Template:
[What happened]. [What the user can do]. [How to get help if needed].
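
The template can be captured in a small formatter so every message follows the same three-part shape. This is a sketch; `UserMessageParts` and `formatUserMessage` are hypothetical names.

```typescript
interface UserMessageParts {
  whatHappened: string;   // e.g. "Your payment could not be processed"
  whatToDo: string;       // e.g. "Please check your card details and try again"
  howToGetHelp?: string;  // optional, e.g. "Contact support if this continues"
}

// Joins the parts into sentences, adding a trailing period only where one is missing.
function formatUserMessage(parts: UserMessageParts): string {
  const sentences = [parts.whatHappened, parts.whatToDo, parts.howToGetHelp]
    .filter((s): s is string => typeof s === "string" && s.trim().length > 0)
    .map((s) => s.trim())
    .map((s) => (/[.!?]$/.test(s) ? s : `${s}.`));
  return sentences.join(" ");
}

console.log(formatUserMessage({
  whatHappened: "Your payment could not be processed",
  whatToDo: "Please check your card details and try again",
  howToGetHelp: "Contact support if this continues",
}));
// Your payment could not be processed. Please check your card details and try again. Contact support if this continues.
```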

Logging Error Context

typescript
// Good: structured context
logger.error("Failed to process payment", {
  error: error.message, stack: error.stack,
  orderId: order.id, userId: user.id, amount: order.total,
  requestId: req.id, correlationId: req.headers["x-correlation-id"],
});

// Bad: string concatenation
logger.error(`Error: ${error} for order ${order.id}`);

| Field | Purpose | Example |
|---|---|---|
| error.message | What went wrong | "Connection refused" |
| error.stack | Where it happened | Full stack trace |
| requestId | Trace a single request across systems | "req_abc123" |
| correlationId | Trace a workflow across services | "corr_xyz789" |
| userId | Who was affected | "user_456" |
| operation | What was being attempted | "createOrder" |

  • Never log passwords, tokens, credit card numbers, or PII
  • Always include a request or correlation ID
  • Log at the appropriate level (error for failures, warn for recoverable)
  • Include enough context to reproduce the issue
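
The first rule is easier to keep when the context object passes through a redaction step before it reaches the logger. The following is a sketch under assumptions: the key list in `SENSITIVE_KEYS` is illustrative and incomplete, and `redact` is a hypothetical helper, not a feature of any particular logging library.

```typescript
// Key fragments treated as sensitive. This list is an assumption; extend it for your data.
const SENSITIVE_KEYS = ["password", "token", "authorization", "cardnumber", "ssn"];

type LogContext = { [key: string]: unknown };

function isPlainObject(value: unknown): value is LogContext {
  return typeof value === "object" && value !== null && Object.getPrototypeOf(value) === Object.prototype;
}

// Returns a copy of `context` with sensitive values replaced, recursing into plain nested objects.
// Arrays and class instances are copied by reference and are not inspected.
function redact(context: LogContext): LogContext {
  const safe: LogContext = {};
  for (const key of Object.keys(context)) {
    const value = context[key];
    // Normalize so "card_number", "cardNumber", and "Card-Number" all match.
    const normalized = key.toLowerCase().replace(/[^a-z]/g, "");
    if (SENSITIVE_KEYS.some((s) => normalized.indexOf(s) !== -1)) {
      safe[key] = "[REDACTED]";
    } else if (isPlainObject(value)) {
      safe[key] = redact(value);
    } else {
      safe[key] = value;
    }
  }
  return safe;
}

console.log(redact({ orderId: "o1", password: "hunter2", user: { id: "u1", accessToken: "abc" } }));
```

Key-based redaction only catches fields whose names give them away; free-text fields that may contain PII still need to be kept out of the context by the caller.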

Fail-Fast vs Graceful Degradation

| Question | Fail-Fast | Degrade Gracefully |
|---|---|---|
| Would continuing corrupt data? | Yes | |
| Is the feature critical to the core workflow? | Yes | |
| Is it a configuration or startup issue? | Yes | |
| Is the failed component optional? | | Yes |
| Can the user still complete their primary task? | | Yes |
| Is there a reasonable fallback? | | Yes |

typescript
// Fail fast on missing config
for (const name of ["DATABASE_URL", "JWT_SECRET", "REDIS_URL"]) {
  if (!process.env[name]) throw new Error(`Missing required env var: ${name}`);
}

// Graceful degradation for optional service
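async function getRecommendations(userId: string): Promise<Product[]> {
  try { return await recommendationService.getForUser(userId); }
  catch (error) {
    // `error` is `unknown` under strict compiler settings, so narrow it before reading `.message`.
    const message = error instanceof Error ? error.message : String(error);
    logger.warn("Recommendation service unavailable, using fallback", { error: message, userId });
    return getDefaultRecommendations();
  }
}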
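
A small refinement of the startup check is to collect every missing name before failing, so one restart reveals all configuration problems rather than one at a time. The sketch below assumes the environment is passed in as a plain object, which keeps it testable; `findMissing` and `assertConfig` are hypothetical names.

```typescript
type Env = { [name: string]: string | undefined };

// Returns the names that are unset or empty.
function findMissing(env: Env, required: string[]): string[] {
  return required.filter((name) => !env[name]);
}

// Throws a single error listing every missing variable.
function assertConfig(env: Env, required: string[]): void {
  const missing = findMissing(env, required);
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
}

const exampleEnv: Env = { DATABASE_URL: "postgres://localhost/app" };
console.log(findMissing(exampleEnv, ["DATABASE_URL", "JWT_SECRET", "REDIS_URL"]).join(", "));
// JWT_SECRET, REDIS_URL
```

In a Node application the call at startup would look like `assertConfig(process.env, ["DATABASE_URL", "JWT_SECRET", "REDIS_URL"])`.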

Error Codes and Catalogs

typescript
const ErrorCatalog = {
  // Authentication (1xxx)
  AUTH_TOKEN_EXPIRED:      { code: 1001, status: 401, message: "Authentication token has expired." },
  AUTH_TOKEN_INVALID:      { code: 1002, status: 401, message: "Authentication token is invalid." },
  AUTH_INSUFFICIENT_PERMS: { code: 1003, status: 403, message: "Insufficient permissions." },
  // Validation (2xxx)
  VALIDATION_REQUIRED:     { code: 2001, status: 422, message: "Required field is missing." },
  VALIDATION_FORMAT:       { code: 2002, status: 422, message: "Field format is invalid." },
  VALIDATION_RANGE:        { code: 2003, status: 422, message: "Value is out of allowed range." },
  // Resources (3xxx)
  RESOURCE_NOT_FOUND:      { code: 3001, status: 404, message: "Resource not found." },
  RESOURCE_CONFLICT:       { code: 3002, status: 409, message: "Resource conflict." },
  // External Services (4xxx)
  EXT_SERVICE_UNAVAILABLE: { code: 4001, status: 503, message: "External service unavailable." },
  EXT_SERVICE_TIMEOUT:     { code: 4002, status: 504, message: "External service timeout." },
  // Internal (5xxx)
  INTERNAL_UNEXPECTED:     { code: 5001, status: 500, message: "An unexpected error occurred." },
} as const;
  • Group codes by category with numeric prefixes
  • Every error returned by the API must have a catalog entry
  • Document error codes in the API reference
  • Never reuse or reassign error codes
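
Two of these rules lend themselves to automation: building every API error from a catalog entry, and checking that no numeric code appears twice. The sketch below uses a two-entry excerpt of the catalog so it is self-contained; `toApiError`, `duplicateCodes`, and the response body shape are assumptions for illustration.

```typescript
// Two entries copied from the catalog above, enough to keep the sketch self-contained.
const Catalog = {
  RESOURCE_NOT_FOUND:  { code: 3001, status: 404, message: "Resource not found." },
  INTERNAL_UNEXPECTED: { code: 5001, status: 500, message: "An unexpected error occurred." },
} as const;

type CatalogKey = keyof typeof Catalog;

// Builds the response for a catalog entry; the body shape is an assumption.
function toApiError(key: CatalogKey) {
  const entry = Catalog[key];
  return { status: entry.status, body: { error: { code: entry.code, name: key, message: entry.message } } };
}

// Returns any numeric codes that appear more than once.
function duplicateCodes(catalog: { [key: string]: { code: number } }): number[] {
  const seen: { [code: number]: boolean } = {};
  const duplicates: number[] = [];
  for (const key of Object.keys(catalog)) {
    const entry = catalog[key];
    if (!entry) continue;
    if (seen[entry.code]) duplicates.push(entry.code);
    seen[entry.code] = true;
  }
  return duplicates;
}

console.log(toApiError("RESOURCE_NOT_FOUND").status); // 404
console.log(duplicateCodes(Catalog).length);          // 0
```

Because `toApiError` only accepts `CatalogKey`, the compiler rejects any error name that has no catalog entry, and `duplicateCodes` can run in a unit test to catch accidental reuse.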