senior-backend

Senior Backend Engineer

Overview

Design and implement robust, scalable backend systems with a focus on API design, service architecture, data management, and operational excellence. This skill covers RESTful and GraphQL API patterns, message-driven architecture, caching strategies, rate limiting, health checks, and full observability with OpenTelemetry.

Announce at start: "I'm using the senior-backend skill for backend system design and implementation."

Phase 1: API Design

Goal: Define the contract before writing implementation code.

Actions

Define resource models and relationships
Design endpoint structure (REST) or schema (GraphQL)
Establish authentication and authorization strategy
Define rate limiting and throttling policies
Create API documentation (OpenAPI/GraphQL schema)

API Style Decision Table

Factor	REST	GraphQL	gRPC
Multiple consumers with different data needs	Poor fit	Strong fit	Poor fit
Simple CRUD operations	Strong fit	Overkill	Overkill
Real-time subscriptions	Requires WebSocket add-on	Built-in	Built-in (streaming)
Service-to-service	Good	Overkill	Strong fit
Public API	Strong fit	Good	Poor fit (tooling)
Mobile with bandwidth constraints	Overfetching risk	Strong fit	Strong fit

STOP — Do NOT proceed to Phase 2 until:

Resource models are defined
Endpoint structure or schema is documented
Auth strategy is chosen
API contract is reviewable (OpenAPI/GraphQL schema)

Phase 2: Implementation

Goal: Build the service layer with clear separation of concerns.

Actions

Set up project structure with clear layering
Implement data access layer (repositories/DAOs)
Build service layer with business logic
Create API controllers/resolvers
Add middleware (auth, logging, error handling, CORS)
Implement caching strategy
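
As a rough illustration of the layering listed above, the sketch below separates a repository, a service, and a controller. Every name here (`User`, `UserRepository`, `UserService`, `UserController`) and the in-memory store are assumptions made for the example, not something this skill prescribes.

```typescript
// Hypothetical domain type for the example.
interface User { id: string; name: string; }

// Data access layer: the only place that knows how users are stored.
interface UserRepository {
  findById(id: string): User | undefined;
  save(user: User): void;
}

class InMemoryUserRepository implements UserRepository {
  private readonly rows = new Map<string, User>();
  findById(id: string): User | undefined { return this.rows.get(id); }
  save(user: User): void { this.rows.set(user.id, user); }
}

// Service layer: business rules, with no HTTP or storage details.
class UserService {
  private readonly repo: UserRepository;
  constructor(repo: UserRepository) { this.repo = repo; }

  rename(id: string, name: string): User {
    const user = this.repo.findById(id);
    if (!user) throw new Error("USER_NOT_FOUND");
    const updated: User = { ...user, name: name.trim() };
    this.repo.save(updated);
    return updated;
  }
}

// Controller layer: translates between the transport and the service.
class UserController {
  private readonly service: UserService;
  constructor(service: UserService) { this.service = service; }

  patchUser(id: string, body: { name: string }): { status: number; body: unknown } {
    try {
      return { status: 200, body: { data: this.service.rename(id, body.name) } };
    } catch (err) {
      if (err instanceof Error && err.message === "USER_NOT_FOUND") {
        return { status: 404, body: { error: { code: "NOT_FOUND", message: "User not found" } } };
      }
      throw err; // unexpected errors propagate to shared error-handling middleware
    }
  }
}
```

Because the service depends only on the `UserRepository` interface, the in-memory store could be swapped for a database-backed one without touching the controller.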

RESTful URL Structure

GET    /api/v1/users              # List users (paginated)
GET    /api/v1/users/:id          # Get single user
POST   /api/v1/users              # Create user
PUT    /api/v1/users/:id          # Full update
PATCH  /api/v1/users/:id          # Partial update
DELETE /api/v1/users/:id          # Delete user
GET    /api/v1/users/:id/orders   # Nested resources
POST   /api/v1/users/:id/activate # State transitions

HTTP Status Code Decision Table

Code	Meaning	When to Use
200	OK	Successful GET, PUT, PATCH
201	Created	Successful POST creating resource
204	No Content	Successful DELETE
400	Bad Request	Malformed request or failed input validation
401	Unauthorized	Missing or invalid authentication credentials
403	Forbidden	Authenticated but lacking permission
404	Not Found	Resource does not exist
409	Conflict	Duplicate or state conflict
422	Unprocessable Entity	Well-formed request with semantically invalid input
429	Too Many Requests	Rate limit exceeded
500	Internal Server Error	Unexpected server failure
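
One way to apply the table above is to map domain error kinds to status codes in a single place, so handlers never pick codes ad hoc. The error kinds, the `DomainError` class, and `statusFor` are hypothetical names chosen for this sketch.

```typescript
// Hypothetical domain error kinds; a real service would define its own.
type ErrorKind =
  | "validation"
  | "unauthenticated"
  | "forbidden"
  | "not_found"
  | "conflict"
  | "semantic"
  | "rate_limited";

const STATUS_BY_KIND: Record<ErrorKind, number> = {
  validation: 400,
  unauthenticated: 401,
  forbidden: 403,
  not_found: 404,
  conflict: 409,
  semantic: 422,
  rate_limited: 429,
};

class DomainError extends Error {
  readonly kind: ErrorKind;
  constructor(kind: ErrorKind, message: string) {
    super(message);
    this.kind = kind;
  }
}

// Anything that is not a recognized domain error is treated as an unexpected failure.
function statusFor(err: unknown): number {
  const kind = err instanceof Error ? (err as Partial<DomainError>).kind : undefined;
  return kind !== undefined && kind in STATUS_BY_KIND ? STATUS_BY_KIND[kind] : 500;
}
```

Keeping the fallback at 500 means only errors the service has explicitly classified reach clients with a specific code.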

Response Format

```json
// Success (single)
{ "data": { "id": "123", "name": "Alice" }, "meta": { "requestId": "req_abc123" } }

// Success (collection)
{ "data": [...], "meta": { "page": 1, "pageSize": 20, "totalCount": 150, "totalPages": 8 } }

// Error
{ "error": { "code": "VALIDATION_ERROR", "message": "Invalid input", "details": [...] } }
```
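
The envelopes above could be typed and built with small helpers like the following. This is a minimal sketch; the helper names (`single`, `collection`, `failure`) are assumptions, and only the field names come from the format shown above.

```typescript
interface PageMeta {
  page: number;
  pageSize: number;
  totalCount: number;
  totalPages: number;
}

interface SuccessEnvelope<T, M> { data: T; meta: M; }

interface ErrorEnvelope {
  error: { code: string; message: string; details: unknown[] };
}

// Single-resource success response.
function single<T>(data: T, requestId: string): SuccessEnvelope<T, { requestId: string }> {
  return { data, meta: { requestId } };
}

// Collection success response; totalPages is derived so it cannot drift from totalCount.
function collection<T>(
  items: T[],
  page: number,
  pageSize: number,
  totalCount: number,
): SuccessEnvelope<T[], PageMeta> {
  const totalPages = Math.ceil(totalCount / pageSize);
  return { data: items, meta: { page, pageSize, totalCount, totalPages } };
}

// Structured error response.
function failure(code: string, message: string, details: unknown[] = []): ErrorEnvelope {
  return { error: { code, message, details } };
}
```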

Caching Strategy Decision Table

Strategy	Description	Use Case
Cache-Aside	App checks cache, falls back to DB	General purpose
Write-Through	Write to cache and DB synchronously in the same operation	Strong consistency
Write-Behind	Write to cache, async write to DB	High write throughput
Read-Through	Cache loads from DB on miss	Transparent caching
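
To make the Cache-Aside row concrete, here is a small in-memory sketch with a TTL and explicit invalidation. The `CacheAside` class, its synchronous loader, and the injected clock are simplifying assumptions; a real service would more likely use an async loader and an external cache such as Redis.

```typescript
type Loader<V> = (key: string) => V | undefined;

class CacheAside<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly load: Loader<V>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so expiry can be exercised without waiting.
  constructor(load: Loader<V>, ttlMs: number, now: () => number = Date.now) {
    this.load = load;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) return hit.value; // cache hit

    const value = this.load(key); // miss or expired: fall back to the source of truth
    if (value !== undefined) {
      this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    } else {
      this.entries.delete(key);
    }
    return value;
  }

  // Call after writes so readers do not keep seeing the old value until the TTL ends.
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}
```

The TTL bounds how long stale data can be served, while `invalidate` covers the case where the application knows a write has just happened.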

STOP — Do NOT proceed to Phase 3 until:

Project structure follows layered architecture
Input validation is at the edge (Zod, Joi, class-validator)
Error handling returns structured error responses
Caching strategy is implemented with invalidation plan
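
Validation at the edge combined with structured errors might look like the sketch below. It is hand-rolled to stay dependency-free; in practice a library such as Zod, Joi, or class-validator would replace the manual checks. The `CreateUserInput` shape and the deliberately loose email pattern are assumptions for the example.

```typescript
interface FieldIssue { field: string; message: string; }

interface CreateUserInput { name: string; email: string; }

type ValidationResult =
  | { ok: true; value: CreateUserInput }
  | { ok: false; error: { code: "VALIDATION_ERROR"; message: string; details: FieldIssue[] } };

// Validates an untrusted request body before it reaches the service layer.
function validateCreateUser(body: unknown): ValidationResult {
  const issues: FieldIssue[] = [];
  const record =
    typeof body === "object" && body !== null ? (body as Record<string, unknown>) : {};
  const name = record.name;
  const email = record.email;

  if (typeof name !== "string" || name.trim().length === 0) {
    issues.push({ field: "name", message: "name must be a non-empty string" });
  }
  // Loose shape check only; real email validation is more involved.
  if (typeof email !== "string" || !/^[^@\s]+@[^@\s]+$/.test(email)) {
    issues.push({ field: "email", message: "email must look like user@host" });
  }

  if (issues.length > 0 || typeof name !== "string" || typeof email !== "string") {
    return {
      ok: false,
      error: { code: "VALIDATION_ERROR", message: "Invalid input", details: issues },
    };
  }
  return { ok: true, value: { name: name.trim(), email } };
}
```

Returning a result object rather than throwing lets the controller map the failure straight to a 400 response with the `details` array intact.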

Phase 3: Hardening

Goal: Prepare the service for production operation.

Actions

Add comprehensive error handling
Implement health checks and readiness probes
Set up observability (traces, metrics, logs)
Load test critical paths
Document runbooks for operational scenarios

Health Check Endpoints

```json
// GET /health: lightweight liveness check
{ "status": "healthy" }

// GET /health/ready: readiness with dependency checks
{
  "status": "healthy",
  "checks": {
    "database": { "status": "healthy", "latency": "5ms" },
    "redis": { "status": "healthy", "latency": "2ms" },
    "queue": { "status": "healthy", "latency": "8ms" }
  },
  "uptime": "72h15m",
  "version": "1.4.2"
}
```
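
A readiness handler typically aggregates the individual dependency checks into one response. The sketch below covers only that aggregation step. `summarizeReadiness` is a hypothetical helper, and returning 503 when any dependency is unhealthy is a common convention assumed here rather than something stated above.

```typescript
type CheckStatus = "healthy" | "unhealthy";

interface CheckResult { status: CheckStatus; latencyMs: number; }

interface ReadinessResponse {
  httpStatus: number;
  body: {
    status: CheckStatus;
    checks: Record<string, { status: CheckStatus; latency: string }>;
  };
}

// Combines per-dependency results into the readiness payload shape shown above.
function summarizeReadiness(checks: Record<string, CheckResult>): ReadinessResponse {
  const formatted: Record<string, { status: CheckStatus; latency: string }> = {};
  let allHealthy = true;

  for (const name of Object.keys(checks)) {
    const check = checks[name];
    if (check.status !== "healthy") allHealthy = false;
    formatted[name] = { status: check.status, latency: `${check.latencyMs}ms` };
  }

  return {
    httpStatus: allHealthy ? 200 : 503, // assumed convention: not ready maps to 503
    body: { status: allHealthy ? "healthy" : "unhealthy", checks: formatted },
  };
}
```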

Observability: RED Method Metrics

Metric	Description	Implementation
Rate	Requests per second	Counter incremented per request
Errors	Failed requests per second	Counter incremented per failed request
Duration	Latency distribution	Histogram (p50, p95, p99)
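
The three RED signals can be sketched with a tiny in-memory recorder. This is an illustration only: the `RedMetrics` class is hypothetical, it keeps raw samples instead of histogram buckets, and a production service would normally export these through a metrics library such as the OpenTelemetry SDK.

```typescript
class RedMetrics {
  private requests = 0;
  private errors = 0;
  private readonly durationsMs: number[] = [];

  // Call once per completed request.
  record(durationMs: number, failed: boolean): void {
    this.requests += 1;
    if (failed) this.errors += 1;
    this.durationsMs.push(durationMs);
  }

  // Nearest-rank percentile over the recorded samples.
  percentile(p: number): number {
    if (this.durationsMs.length === 0) return 0;
    const sorted = [...this.durationsMs].sort((a, b) => a - b);
    const rank = Math.ceil((p * sorted.length) / 100);
    return sorted[Math.max(rank, 1) - 1];
  }

  // Rate and Errors are per-second averages over the given window.
  snapshot(windowSeconds: number): { rate: number; errors: number; p50: number; p95: number; p99: number } {
    return {
      rate: this.requests / windowSeconds,
      errors: this.errors / windowSeconds,
      p50: this.percentile(50),
      p95: this.percentile(95),
      p99: this.percentile(99),
    };
  }
}
```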

Structured Logging Format

```json
{
  "timestamp": "2025-01-15T10:30:00.123Z",
  "level": "info",
  "message": "User created",
  "service": "user-service",
  "traceId": "abc123",
  "spanId": "def456",
  "userId": "usr_123",
  "duration": 45
}
```
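
A log line in this shape could be produced by a small formatter like the one below. `formatLog` and `LogContext` are hypothetical names for this sketch; established loggers already emit JSON and would usually be preferred over a hand-written function.

```typescript
type Level = "debug" | "info" | "warn" | "error";

interface LogContext {
  service: string;
  traceId?: string;
  spanId?: string;
}

// Builds one JSON log line; `now` is a parameter so output is deterministic in tests.
function formatLog(
  level: Level,
  message: string,
  context: LogContext,
  fields: Record<string, unknown> = {},
  now: Date = new Date(),
): string {
  return JSON.stringify({
    timestamp: now.toISOString(),
    level,
    message,
    ...context,
    ...fields, // request-specific fields such as userId or duration
  });
}
```

Passing the trace and span identifiers on every line is what lets logs be joined with traces later.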

Rate Limiting Algorithm Decision Table

Algorithm	Pros	Cons	Best For
Fixed Window	Simple, low memory	Burst at boundaries	Internal APIs
Sliding Window	Smooth distribution	More memory	Public APIs
Token Bucket	Controlled bursts	Slightly more complex to implement	General-purpose default (widely used)
Leaky Bucket	Constant output	No burst allowed	Strict rate control
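
For the Token Bucket row, a minimal single-process sketch follows. The `TokenBucket` class and its parameters are assumptions for illustration; a multi-instance deployment would need shared state (for example in Redis), which this example does not cover.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly now: () => number;

  // capacity bounds the burst size; refillPerSecond sets the sustained rate.
  constructor(capacity: number, refillPerSecond: number, now: () => number = Date.now) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity;
    this.lastRefillMs = now();
  }

  // Returns true if the request may proceed; false would map to a 429 response.
  tryConsume(cost: number = 1): boolean {
    const nowMs = this.now();
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;

    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }
}
```

Refilling lazily on each call avoids a background timer: the bucket only needs the elapsed time since the last request.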

STOP — Hardening complete when:

Event-Driven Architecture Patterns

Message Queue Pattern Decision Table

Pattern	Use Case	Example
Pub/Sub	Broadcast to multiple consumers	User registered -> email, analytics, CRM
Work Queue	Distribute tasks across workers	Image processing, PDF generation
Request/Reply	Async request with response	Price calculation service
Dead Letter	Handle failed messages	Retry policy exceeded
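
The Work Queue and Dead Letter rows can be illustrated together with an in-memory queue that retries failed messages and sets them aside once a retry limit is reached. `RetryingQueue` is a hypothetical stand-in; real brokers provide these semantics through their own configuration, and handlers are usually asynchronous.

```typescript
interface Message<T> { id: string; payload: T; attempts: number; }

class RetryingQueue<T> {
  readonly deadLetters: Message<T>[] = [];
  private readonly pending: Message<T>[] = [];
  private readonly maxAttempts: number;

  constructor(maxAttempts: number) {
    this.maxAttempts = maxAttempts;
  }

  publish(id: string, payload: T): void {
    this.pending.push({ id, payload, attempts: 0 });
  }

  // Processes messages until the queue is empty, re-queueing failures.
  drain(handler: (payload: T) => void): void {
    let message = this.pending.shift();
    while (message !== undefined) {
      message.attempts += 1;
      try {
        handler(message.payload);
      } catch {
        // The failure is not lost: the message is retried or moved to the dead-letter list.
        if (message.attempts >= this.maxAttempts) {
          this.deadLetters.push(message);
        } else {
          this.pending.push(message);
        }
      }
      message = this.pending.shift();
    }
  }
}
```

Messages in `deadLetters` stay available for inspection or manual replay instead of blocking the rest of the queue.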

Event Schema

```json
{
  "eventId": "evt_abc123",
  "eventType": "user.created",
  "timestamp": "2025-01-15T10:30:00Z",
  "version": "1.0",
  "source": "user-service",
  "data": { "userId": "usr_123", "email": "alice@example.com" },
  "metadata": { "correlationId": "corr_xyz789", "causationId": "cmd_def456" }
}
```
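
The schema above could be expressed as a generic type with a factory that fills in the envelope fields. `createEvent` and `EventDeps` are hypothetical; the ID generator and clock are injected so the sketch stays deterministic, and the fixed `"1.0"` version is an assumption carried over from the example payload.

```typescript
interface EventMetadata { correlationId: string; causationId: string; }

interface DomainEvent<T> {
  eventId: string;
  eventType: string;
  timestamp: string;
  version: string;
  source: string;
  data: T;
  metadata: EventMetadata;
}

// Injected dependencies: how IDs and timestamps are produced is left to the caller.
interface EventDeps { newId: () => string; now: () => Date; }

function createEvent<T>(
  eventType: string,
  source: string,
  data: T,
  metadata: EventMetadata,
  deps: EventDeps,
): DomainEvent<T> {
  return {
    eventId: `evt_${deps.newId()}`,
    eventType,
    timestamp: deps.now().toISOString(), // ISO 8601 in UTC, with milliseconds
    version: "1.0",
    source,
    data,
    metadata,
  };
}
```

Carrying `correlationId` and `causationId` on every event makes it possible to reconstruct which command or event led to which downstream effect.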

GraphQL Anti-Patterns

Anti-Pattern	Problem	Fix
N+1 queries	Performance degradation	DataLoader for batching
Unbounded queries	DoS vulnerability	Enforce depth and complexity limits
Over-fetching in resolvers	Wasted DB queries	Select only requested fields
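
The depth-limit fix for unbounded queries can be sketched on a simplified selection tree. The `Selection` type below is a stand-in invented for this example, not the real GraphQL AST; an actual server would apply the same idea through a validation rule in its GraphQL library.

```typescript
// Simplified stand-in for a parsed selection set: field name -> nested selections.
interface Selection { [field: string]: Selection; }

// Depth of the deepest field path; an empty selection has depth 0.
function queryDepth(selection: Selection): number {
  const childDepths = Object.keys(selection).map((field) => queryDepth(selection[field]));
  return childDepths.length === 0 ? 0 : 1 + Math.max(...childDepths);
}

// Rejects queries that nest deeper than the configured limit.
function assertWithinDepth(selection: Selection, maxDepth: number): void {
  const depth = queryDepth(selection);
  if (depth > maxDepth) {
    throw new Error(`Query depth ${depth} exceeds limit ${maxDepth}`);
  }
}
```

Running this check before execution means an over-deep query is rejected without touching any resolver or database.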

Anti-Patterns / Common Mistakes

Anti-Pattern	Why It Is Wrong	Correct Approach
Exposing sequential database IDs directly	Enumeration risk, coupling to DB	Use UUIDs or prefixed IDs
Synchronous external service calls in request path	Single point of failure, latency	Async with queues or circuit breaker
N+1 query patterns	Linear performance degradation	Eager loading or DataLoader
Catching and swallowing errors	Silent failures, impossible debugging	Log and propagate with context
Shared mutable state across handlers	Race conditions, unpredictable behavior	Stateless request handling
Skipping input validation	Injection, data corruption	Validate at the edge, always
Generic 500 for all errors	Poor developer experience	Specific error codes and messages
No API versioning	Breaking changes affect all consumers	Version from day one (`/v1/`)
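
The circuit breaker named in the table above can be sketched as a small state machine. This version wraps a synchronous function to keep the example short; real external calls are asynchronous, and the `CircuitBreaker` class, threshold, and timeout parameters are assumptions rather than a specific library's API.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAtMs = 0;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, resetTimeoutMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  currentState(): BreakerState {
    return this.state;
  }

  call<T>(fn: () => T): T {
    if (this.state === "open") {
      if (this.now() - this.openedAtMs < this.resetTimeoutMs) {
        throw new Error("CIRCUIT_OPEN"); // fail fast without calling the dependency
      }
      this.state = "half-open"; // timeout elapsed: allow one trial call
    }
    try {
      const result = fn();
      this.failures = 0;
      this.state = "closed";
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.state === "half-open" || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAtMs = this.now();
      }
      throw err; // propagate the original failure to the caller
    }
  }
}
```

While the breaker is open, callers get an immediate `CIRCUIT_OPEN` error instead of waiting on a dependency that is already known to be failing.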

Documentation Lookup (Context7)

Use `mcp__context7__resolve-library-id` then `mcp__context7__query-docs` for up-to-date docs. Returned docs override memorized knowledge.

`express`: for middleware patterns, routing, or request/response API
`fastify`: for plugin system, hooks, or schema validation
`nestjs`: for decorators, modules, providers, or guards
`prisma`: for schema syntax, client API, or migration commands

Integration Points

Skill	Relationship
`senior-architect`	Architecture decisions guide backend service boundaries
`security-review`	Backend security follows OWASP and auth patterns
`performance-optimization`	Backend performance uses caching and query tuning
`testing-strategy`	Backend test strategy defines integration test approach
`code-review`	Review verifies API design and error handling
`acceptance-testing`	API behavior becomes acceptance criteria
`senior-fullstack`	Backend serves the full-stack tRPC layer

Key Principles

API versioning from day one (`/v1/`)
Input validation at the edge (Zod, Joi, class-validator)
Idempotency keys for non-idempotent endpoints (POST, and PATCH where retries are possible)
Graceful shutdown (drain connections, finish in-flight requests)
Circuit breaker for external service calls
Database migrations versioned and reversible
Secrets in environment variables, never in code
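
As an illustration of the idempotency-key principle, the sketch below replays a stored response when the same key is seen again. `IdempotencyStore` is a hypothetical in-memory class; it omits expiry, concurrent-request handling, and checking that a repeated key carries the same request body, all of which a production implementation would need.

```typescript
interface StoredResponse { status: number; body: unknown; }

class IdempotencyStore {
  private readonly responses = new Map<string, StoredResponse>();

  // Runs the handler once per key; repeats with the same key get the stored response.
  execute(key: string, handler: () => StoredResponse): StoredResponse {
    const previous = this.responses.get(key);
    if (previous) return previous;

    const response = handler();
    this.responses.set(key, response);
    return response;
  }
}
```

A client that retries a timed-out POST with the same key would then receive the original 201 response rather than creating a second resource.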

Skill Type

FLEXIBLE — Adapt API style and architecture to the project context. The three-phase process (design, implement, harden) is strongly recommended. Health checks, structured logging, and error handling are non-negotiable for production services.
