Engineering Emergency Changes

Emergency review is for rare cases where getting the whole change out quickly matters more than normal review depth. It does not mean "skip engineering judgment"; it means narrow the change, verify the emergency, prioritize correctness, and schedule follow-up review.
Adapted from the Google Engineering Practices documentation, especially "Emergencies" and the reviewer speed guidance. Source: https://google.github.io/eng-practices/review/emergencies.html. License: CC-BY 3.0.

Emergency Gate

Treat a change as an emergency only when it is a small targeted change and at least one condition is true:
  • It prevents a major launch from failing when rollback is not the right option.
  • It fixes a bug that is significantly affecting production users.
  • It addresses a pressing legal or compliance issue.
  • It closes a major security vulnerability.
  • It meets a hard external deadline where missing it would cause contractual, legal, security, or production-user harm, or a clearly disastrous launch.
These are usually not emergencies by themselves:
  • Wanting to launch sooner.
  • A feature has taken a long time and the author wants it merged.
  • Reviewers are asleep, away, or in another time zone.
  • It is late Friday.
  • A manager wants it done today for a soft deadline.
  • The team wants to avoid normal review discomfort.
Most deadlines are soft. Do not sacrifice code health for a soft deadline.
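
The gate above can be sketched as a small predicate. This is only an illustration under stated assumptions: the names `Condition`, `ChangeRequest`, and `passes_emergency_gate` are hypothetical and do not come from the source, and judging whether a change is "small and targeted" is left to a human who sets the flag.

```python
from dataclasses import dataclass
from enum import Enum


class Condition(Enum):
    """Qualifying conditions from the gate (identifiers are illustrative)."""
    LAUNCH_FAILURE_NO_ROLLBACK = "prevents a major launch failure; rollback is not the right option"
    PRODUCTION_USER_BUG = "fixes a bug significantly affecting production users"
    LEGAL_OR_COMPLIANCE = "addresses a pressing legal or compliance issue"
    MAJOR_SECURITY_VULNERABILITY = "closes a major security vulnerability"
    HARD_EXTERNAL_DEADLINE = "meets a hard external deadline"


@dataclass(frozen=True)
class ChangeRequest:
    summary: str
    is_small_and_targeted: bool  # a human judgment, not something computed here
    conditions: frozenset = frozenset()  # the Condition values that apply


def passes_emergency_gate(change: ChangeRequest) -> bool:
    """A change qualifies only if it is small AND at least one condition holds."""
    return change.is_small_and_targeted and len(change.conditions) > 0


hotfix = ChangeRequest(
    summary="Fix crash on checkout for production users",
    is_small_and_targeted=True,
    conditions=frozenset({Condition.PRODUCTION_USER_BUG}),
)
late_friday = ChangeRequest(
    summary="Merge long-running feature before the weekend",
    is_small_and_targeted=True,  # small, but no qualifying condition
)
```

Non-reasons such as "it is late Friday" have no entry in `Condition`, so they cannot make a change qualify on their own.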

Emergency Workflow

  1. State the emergency claim in one sentence.
  2. Decide whether it passes the emergency gate.
  3. Shrink the change to the minimum fix.
  4. Confirm the change directly addresses the emergency.
  5. Review for correctness, blast radius, rollback, and obvious safety issues first.
  6. Run the fastest meaningful verification available.
  7. Record skipped normal checks and why they were skipped.
  8. Merge or release only if the emergency risk is greater than the code risk.
  9. After the emergency is resolved, perform a normal-depth follow-up review.
  10. File cleanup, tests, documentation, and postmortem tasks immediately so they are not lost.
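
Steps 1, 7, 8, and 10 lend themselves to a small record that travels with the change. The sketch below is one hypothetical way to keep that record; `Risk`, `EmergencyRecord`, and the three-level risk scale are assumptions made for illustration, not part of the source guidance.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Risk(IntEnum):
    """Coarse ordinal scale (an assumption; the source names no scale)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class EmergencyRecord:
    claim: str            # step 1: the one-sentence emergency claim
    emergency_risk: Risk  # risk of NOT shipping the fix
    code_risk: Risk       # risk introduced by shipping it
    skipped_checks: dict = field(default_factory=dict)  # step 7: check -> reason

    def skip(self, check: str, reason: str) -> None:
        """Record a skipped normal check; a reason is mandatory."""
        if not reason.strip():
            raise ValueError("a skipped check needs a reason")
        self.skipped_checks[check] = reason

    def may_release(self) -> bool:
        """Step 8: release only if emergency risk exceeds code risk."""
        return self.emergency_risk > self.code_risk

    def follow_up_tasks(self) -> list:
        """Steps 9 and 10: tasks to file immediately so they are not lost."""
        tasks = ["Normal-depth follow-up review"]
        tasks += [f"Run skipped check: {check}" for check in self.skipped_checks]
        tasks += ["Cleanup", "Tests", "Documentation", "Postmortem"]
        return tasks


record = EmergencyRecord(
    claim="Checkout fails for all production users",
    emergency_risk=Risk.HIGH,
    code_risk=Risk.LOW,
)
record.skip("full integration suite", "too slow; ran the checkout smoke test instead")
```

Requiring a reason in `skip` mirrors step 7, and the strict comparison in `may_release` means a tie between the two risks does not justify a release.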

Emergency Review Checklist

  • Scope: Is this the smallest change that resolves the urgent condition?
  • Correctness: Does it actually fix the production, security, legal, or launch issue?
  • Containment: Can it be rolled back or disabled quickly?
  • Tests: What automated or manual check gives the most confidence right now?
  • Observability: How will the team know whether the fix worked?
  • Side effects: What users, data, jobs, APIs, or deployments can be harmed?
  • Follow-up: What normal review items, tests, docs, or cleanup must happen after?
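
A reviewer could keep the checklist honest by refusing to proceed while any item is blank. The following is a minimal sketch, assuming answers are collected as free text keyed by checklist item; `CHECKLIST` and `unanswered_items` are hypothetical names.

```python
# Checklist items and questions, copied from the list above.
CHECKLIST = {
    "Scope": "Is this the smallest change that resolves the urgent condition?",
    "Correctness": "Does it actually fix the production, security, legal, or launch issue?",
    "Containment": "Can it be rolled back or disabled quickly?",
    "Tests": "What automated or manual check gives the most confidence right now?",
    "Observability": "How will the team know whether the fix worked?",
    "Side effects": "What users, data, jobs, APIs, or deployments can be harmed?",
    "Follow-up": "What normal review items, tests, docs, or cleanup must happen after?",
}


def unanswered_items(answers: dict) -> list:
    """Return checklist items with no answer, in checklist order."""
    return [item for item in CHECKLIST if not answers.get(item, "").strip()]


answers = {
    "Scope": "One-line guard around the failing call",
    "Correctness": "Reproduced the failure; the guard prevents it",
    "Containment": "Plain revert; no data migration involved",
}
for item in unanswered_items(answers):
    print(f"{item}: {CHECKLIST[item]}")
```

With the sample answers, the loop prints the four remaining questions, starting with Tests and ending with Follow-up.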

Output Template

Emergency Decision

QUALIFIES | DOES_NOT_QUALIFY | NEEDS_MORE_INFO

Reason

[Concrete reason tied to production user impact, security, legal/compliance, major launch, or hard external deadline.]

Minimum Safe Change

  • ...

Immediate Verification

  • ...

Release / Rollback Notes

  • ...

Follow-up Required

  • Normal-depth review:
  • Tests:
  • Cleanup:
  • Documentation:
  • Owner:
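
If the template is filled in by a script rather than by hand, a small renderer can reject unknown decision values and keep the section order fixed. This is a sketch only: `render_decision` and its parameters are hypothetical, and the plain-text layout is an assumption about how the filled-in template should look.

```python
DECISIONS = ("QUALIFIES", "DOES_NOT_QUALIFY", "NEEDS_MORE_INFO")
FOLLOW_UP_FIELDS = ("Normal-depth review", "Tests", "Cleanup", "Documentation", "Owner")


def _bullets(items) -> str:
    """Format items as the bulleted lines used by the template."""
    return "\n".join(f"  • {item}" for item in items)


def render_decision(decision, reason, minimum_change, verification,
                    release_notes, follow_up) -> str:
    """Fill in the output template; follow_up maps field name -> text."""
    if decision not in DECISIONS:
        raise ValueError(f"decision must be one of {DECISIONS}")
    # Missing follow-up fields are rendered as an empty "Field:" line.
    follow_up_lines = [
        f"{name}: {follow_up.get(name, '')}".rstrip() for name in FOLLOW_UP_FIELDS
    ]
    sections = [
        ("Emergency Decision", decision),
        ("Reason", reason),
        ("Minimum Safe Change", _bullets(minimum_change)),
        ("Immediate Verification", _bullets(verification)),
        ("Release / Rollback Notes", _bullets(release_notes)),
        ("Follow-up Required", _bullets(follow_up_lines)),
    ]
    return "\n\n".join(f"{title}\n\n{body}" for title, body in sections)
```

Leaving unfilled follow-up fields visible, rather than dropping them, keeps the missing owner or test task obvious to whoever reads the report.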

Common Mistakes

  • Treating ordinary urgency, such as a soft deadline, as an emergency.
  • Expanding the hotfix to include nearby cleanup.
  • Skipping all review instead of changing the review focus.
  • Forgetting to perform a normal review after the incident is resolved.
  • Letting emergency exceptions become the team's normal release process.