backend-principle-eng-nodejs-pro-max

Backend Principle Eng Node.js Pro Max

Principal-level guidance for Node.js backend systems and runtime behavior. Optimized for Bun runtime with Node 20 LTS compatibility.

When to Apply

  • Designing or refactoring Node.js services and platform components
  • Reviewing runtime, event loop, and concurrency behavior
  • Diagnosing latency spikes, memory leaks, and throughput regressions
  • Planning scalability, cost, or reliability improvements

Priority Model (highest to lowest)

| Priority | Category | Goal | Signals |
|---|---|---|---|
| 1 | Correctness & Contracts | No wrong answers | Validation, invariants, idempotency |
| 2 | Reliability & Resilience | Survive failures | Timeouts, retries, graceful degradation |
| 3 | Security & Privacy | Zero trust by default | Authz, secrets, minimal exposure |
| 4 | Performance & Efficiency | Predictable latency | Event loop health, bounded queues |
| 5 | Observability & Operability | Fast triage | Tracing, metrics, runbooks |
| 6 | Data & Consistency | Integrity over time | Safe migrations, outbox |
| 7 | Scalability & Evolution | Safe growth | Statelessness, partitioning |
| 8 | Developer Experience & Testing | Sustainable velocity | CI gates, deterministic tests |

Quick Reference (Rules)

1. Correctness & Contracts (CRITICAL)

  • api-contracts
    - Versioned schemas and explicit validation
  • input-validation
    - Validate at boundaries, reject unknowns
  • idempotency
    - Safe retries with idempotency keys
  • invariants
    - Enforce domain rules in service and database
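
The idempotency rule above can be sketched as follows. This is a minimal illustration, not a production design: `IdempotencyStore` is a hypothetical name, and the in-memory `Map` stands in for a shared store (for example a database table with a unique constraint on the key), which a multi-instance service would need.

```typescript
// Hypothetical helper: remembers the outcome of each idempotency key so that
// a retried request does not repeat its side effect.
type Handler<T> = () => Promise<T>;

class IdempotencyStore<T> {
  // Assumption: in-memory only; entries never expire in this sketch.
  private readonly results = new Map<string, Promise<T>>();

  // Runs `handler` at most once per key. Concurrent or later calls with the
  // same key receive the first call's promise.
  run(key: string, handler: Handler<T>): Promise<T> {
    const existing = this.results.get(key);
    if (existing) return existing;

    const pending = handler().catch((err) => {
      // Forget failed attempts so the client can retry with the same key.
      this.results.delete(key);
      throw err;
    });
    this.results.set(key, pending);
    return pending;
  }
}
```

A caller would pass the client-supplied key, for example `store.run(requestKey, () => chargeCard(order))`, where `requestKey` and `chargeCard` are placeholders. Storing the promise rather than the resolved value is what lets two simultaneous retries share one execution.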

2. Reliability & Resilience (CRITICAL)

  • timeouts
    - Set per dependency; no unbounded waits
  • retries
    - Bounded with jitter; avoid retry storms
  • circuit-breakers
    - Fail fast for degraded dependencies
  • bulkheads
    - Isolate heavy dependencies and queues
  • load-shedding
    - Graceful degradation under load
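
As one possible reading of the timeout and retry rules above, the sketch below pairs a per-call timeout with bounded retries using exponential backoff and full jitter. The names `withTimeout`, `retryWithJitter`, and `RetryOptions` are illustrative assumptions, as are the injectable `random` and `sleep` hooks, which exist only to make the behavior testable.

```typescript
// Rejects if `work` does not settle within `ms` milliseconds.
// Note: this does not cancel the underlying work; it only stops waiting.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

interface RetryOptions {
  maxAttempts: number;                    // hard upper bound on attempts
  baseDelayMs: number;                    // backoff ceiling for the first retry
  maxDelayMs: number;                     // ceiling for any single delay
  random?: () => number;                  // injectable for tests
  sleep?: (ms: number) => Promise<void>;  // injectable for tests
}

// Retries `op` up to `maxAttempts` times. Each delay is drawn uniformly from
// [0, min(maxDelayMs, baseDelayMs * 2^attempt)), so clients do not retry in
// lockstep.
async function retryWithJitter<T>(op: () => Promise<T>, opts: RetryOptions): Promise<T> {
  const random = opts.random ?? Math.random;
  const sleep = opts.sleep ?? ((ms: number) => new Promise<void>((r) => setTimeout(r, ms)));
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      if (attempt === opts.maxAttempts - 1) break;
      const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** attempt);
      await sleep(random() * ceiling);
    }
  }
  throw lastError;
}
```

The two compose naturally, for example `retryWithJitter(() => withTimeout(callDependency(), 500), { maxAttempts: 3, baseDelayMs: 100, maxDelayMs: 2000 })`, where `callDependency` is a placeholder. Retrying is only safe for operations that are idempotent, and a real client would usually also pass an abort signal so timed-out work is actually cancelled.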

3. Security & Privacy (CRITICAL)

  • authz
    - Enforce at every service boundary
  • secrets
    - Use vault/KMS; never in code or logs
  • data-min
    - Redact PII by default
  • crypto
    - TLS everywhere; strong defaults
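
One way to read "redact PII by default" is an allowlist: a value is logged only when its field name has been explicitly marked safe. The sketch below assumes plain JSON-like data; `SAFE_KEYS` and `redactForLog` are hypothetical names, and the key list is an example rather than a recommendation.

```typescript
// Assumption: only these field names are considered safe to log verbatim.
const SAFE_KEYS = new Set(["id", "status", "route", "durationMs"]);

// Returns a copy of `value` in which every primitive is replaced by
// "[REDACTED]" unless it sits under an allowlisted key. Objects and arrays
// are walked recursively; array items inherit the safety of their parent key.
function redactForLog(value: unknown, keyIsSafe = false): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redactForLog(item, keyIsSafe));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = redactForLog(inner, SAFE_KEYS.has(key));
    }
    return out;
  }
  return keyIsSafe ? value : "[REDACTED]";
}
```

With this shape, a newly added field such as `email` is hidden until someone deliberately allowlists it, which is the opposite failure mode from a denylist that silently leaks unknown fields.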

4. Performance & Efficiency (HIGH)

  • event-loop
    - Monitor lag; avoid blocking sync work
  • streams
    - Use backpressure-aware streams for large payloads
  • pooling
    - Right-size DB/HTTP pools; avoid starvation
  • cache
    - TTL and stampede protection for hot reads
  • profiling
    - Measure before optimizing
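
The cache rule above (TTL plus stampede protection) might look like the following single-flight sketch: concurrent misses for the same key share one load instead of each hitting the backing store. `SingleFlightCache` and its constructor parameters are assumptions made for illustration, and the cache here is unbounded, which a real service would want to cap.

```typescript
type Loader<V> = (key: string) => Promise<V>;

class SingleFlightCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly inFlight = new Map<string, Promise<V>>();
  private readonly loader: Loader<V>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so tests can control time.
  constructor(loader: Loader<V>, ttlMs: number, now: () => number = Date.now) {
    this.loader = loader;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): Promise<V> {
    // Fresh entry: serve from cache.
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) return Promise.resolve(hit.value);

    // A load for this key is already running: join it instead of starting another.
    const pending = this.inFlight.get(key);
    if (pending) return pending;

    const load = this.loader(key)
      .then((value) => {
        this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
        return value;
      })
      .finally(() => {
        this.inFlight.delete(key);
      });
    this.inFlight.set(key, load);
    return load;
  }
}
```

Failed loads are not cached, so the next caller retries the loader. Adding a small random offset to `ttlMs` is a common refinement that keeps many hot keys from expiring at the same instant.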

5. Observability & Operability (HIGH)

  • structured-logs
    - JSON logs with trace ids
  • metrics
    - RED/USE metrics plus business KPIs
  • tracing
    - Propagate context end-to-end
  • alerts
    - SLO-based with runbooks
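
To illustrate JSON logs that carry a trace id without threading it through every function, here is a minimal sketch built on `AsyncLocalStorage` from `node:async_hooks`. The helper names `withTrace` and `formatLog` and the log field names are assumptions; a real service would more likely rely on an established logging or tracing library.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";
import { randomUUID } from "node:crypto";

// Holds the trace id for the current asynchronous call chain.
const traceContext = new AsyncLocalStorage<{ traceId: string }>();

// Runs `fn` with a trace id bound to its async context. If the caller has no
// incoming id (for example no upstream trace header), a new one is generated.
function withTrace<T>(traceId: string | undefined, fn: () => T): T {
  return traceContext.run({ traceId: traceId ?? randomUUID() }, fn);
}

// Builds one JSON log line. Core fields are written last so caller-supplied
// fields cannot overwrite them.
function formatLog(level: string, message: string, fields: Record<string, unknown> = {}): string {
  return JSON.stringify({
    ...fields,
    ts: new Date().toISOString(),
    level,
    message,
    traceId: traceContext.getStore()?.traceId ?? null,
  });
}
```

A request handler would wrap its work once, for example `withTrace(incomingTraceId, () => handle(req))` with placeholder names, and every `formatLog` call made during that request, including after `await`, picks up the same id.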

6. Data & Consistency (HIGH)

  • transactions
    - Clear boundaries; avoid cross-service tx
  • schema-evolution
    - Backward compatible migrations
  • outbox
    - Reliable event publishing
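
The outbox rule above can be sketched in miniature. Everything here is a stand-in: `FakeDb` simulates a relational database with an all-or-nothing `transaction`, and `placeOrder`, `relayOutbox`, and the `order.placed` topic are hypothetical names. The point is only the shape: the business write and the event row commit together, and a separate relay publishes pending rows.

```typescript
interface Order { id: string; total: number }
interface OutboxRow { id: number; topic: string; payload: string; publishedAt: number | null }

// Stand-in for a real database. Both "tables" are assumed to live in the
// same database so a single transaction can cover them.
class FakeDb {
  orders: Order[] = [];
  outbox: OutboxRow[] = [];

  // Applies every write made by `work`, or none of them if it throws.
  transaction(work: (tx: FakeDb) => void): void {
    const snapshot = { orders: [...this.orders], outbox: [...this.outbox] };
    try {
      work(this);
    } catch (err) {
      this.orders = snapshot.orders;
      this.outbox = snapshot.outbox;
      throw err;
    }
  }
}

// Writes the order and its event in one transaction, so an event exists if
// and only if the order does.
function placeOrder(db: FakeDb, order: Order): void {
  db.transaction((tx) => {
    tx.orders.push(order);
    tx.outbox.push({
      id: tx.outbox.length + 1,
      topic: "order.placed",
      payload: JSON.stringify(order),
      publishedAt: null,
    });
  });
}

// Publishes pending rows in order and marks each one only after `publish`
// succeeds. If `publish` throws, the row stays pending for the next run.
async function relayOutbox(db: FakeDb, publish: (row: OutboxRow) => Promise<void>): Promise<number> {
  let sent = 0;
  for (const row of db.outbox) {
    if (row.publishedAt !== null) continue;
    await publish(row);
    row.publishedAt = Date.now();
    sent += 1;
  }
  return sent;
}
```

Because a crash between `publish` and the mark would cause a re-send, this gives at-least-once delivery, so consumers are expected to deduplicate, typically by event id.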

7. Scalability & Evolution (MEDIUM)

  • stateless
    - Externalize state, scale horizontally
  • partitioning
    - Shard by stable keys
  • versioning
    - API and event versioning
  • backpressure
    - Bounded queues, explicit limits
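
As a small illustration of sharding by stable keys, the sketch below hashes a key with an FNV-1a-style function and maps it to a shard index. The names `fnv1a` and `shardFor` are assumptions, and the hash runs over UTF-16 code units rather than bytes, so it matches the reference FNV-1a only for ASCII input.

```typescript
// FNV-1a-style 32-bit hash. Deterministic across processes and restarts,
// which is what makes the key-to-shard mapping stable.
function fnv1a(key: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0; // force unsigned
}

// Maps a stable key (for example a tenant or account id) to a shard index
// in [0, shardCount).
function shardFor(key: string, shardCount: number): number {
  if (!Number.isInteger(shardCount) || shardCount <= 0) {
    throw new RangeError("shardCount must be a positive integer");
  }
  return fnv1a(key) % shardCount;
}
```

Plain modulo placement moves most keys when `shardCount` changes, so systems that expect to reshard often use consistent hashing or a fixed number of virtual partitions instead.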

8. Developer Experience & Testing (MEDIUM)

  • tests
    - Unit, integration, contract, load tests
  • determinism
    - Hermetic tests, fixed seeds, stable time
  • lint
    - Static analysis and formatting
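
For the determinism rule above, two common ingredients are a seeded random source and an injectable clock. The sketch below uses the widely circulated mulberry32 generator; `seededRandom`, `Clock`, `FakeClock`, and `isExpired` are illustrative names, not part of any library.

```typescript
// mulberry32: a small deterministic PRNG. The same seed always yields the
// same sequence of numbers in [0, 1). Not suitable for cryptography.
function seededRandom(seed: number): () => number {
  let state = seed | 0;
  return () => {
    state = (state + 0x6d2b79f5) | 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Code under test asks a Clock for the time instead of calling Date.now().
interface Clock {
  now(): number;
}

// Test double whose time only moves when the test says so.
class FakeClock implements Clock {
  private current: number;
  constructor(start: number) {
    this.current = start;
  }
  now(): number {
    return this.current;
  }
  advance(ms: number): void {
    this.current += ms;
  }
}

// Example of time-dependent logic written against the Clock interface.
function isExpired(issuedAt: number, ttlMs: number, clock: Clock): boolean {
  return clock.now() - issuedAt >= ttlMs;
}
```

In a test, `new FakeClock(0)` plus `advance(ttlMs)` exercises expiry without real waiting, and logging the seed passed to `seededRandom` makes a failing randomized test reproducible.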

Execution Workflow

  1. Clarify product goals, SLOs, latency and cost budgets
  2. Map data flow, dependencies, and event loop risks
  3. Choose storage and consistency model (document tradeoffs)
  4. Define contracts: API schemas, events, and idempotency
  5. Implement with safe defaults, observability, and resilience
  6. Validate with tests, load, and failure scenarios
  7. Review risks and publish runbooks

Runtime Guidance

See references/node-core.md for event loop, memory, and Bun-first runtime patterns.