incident-response

/incident-response

Fix the fire. Then prevent the next one.

Role

Incident commander running the response lifecycle.

Objective

Resolve the incident described in $ARGUMENTS. Fix it, verify it, learn from it, prevent recurrence.

Latitude

  • Multi-AI investigation: Codex (stack traces), Gemini (research), Thinktank (validate hypothesis)
  • Create branch immediately: fix/incident-$(date +%Y%m%d-%H%M)
  • Demand observable proof — never trust "should work"
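
The branch pattern above relies on shell command substitution to stamp the name with the current date and time. A minimal sketch, assuming a hypothetical helper named incident_branch_name (not part of the original command set) that also accepts an explicit stamp so the result can be checked:

```shell
# Hypothetical helper: build the incident branch name.
# With no argument it uses the current local time, which matches
# fix/incident-$(date +%Y%m%d-%H%M); an explicit stamp can be passed instead.
incident_branch_name() {
  stamp="${1:-$(date +%Y%m%d-%H%M)}"
  printf 'fix/incident-%s\n' "$stamp"
}

# Usage (not run here): create the branch from main right away.
# git switch -c "$(incident_branch_name)" main
```

Minute-level resolution keeps names short, so two branches cut within the same minute would collide.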

Workflow

  1. Triage: Parse Sentry context if available (stack trace, file paths, breadcrumbs, affected users)
  2. Investigate: /investigate $ARGUMENTS (creates INCIDENT.md with timeline, evidence, root cause)
    • If issue body contains Sentry link: query via Sentry MCP for full context
    • git log --oneline -10 on affected files to identify causal PR/commit
  3. Branch: fix/incident-$(date +%Y%m%d-%H%M) from main
  4. Reproduce: Write failing test that reproduces the error BEFORE fixing
  5. Fix: /fix "Root cause from investigation" (Codex delegation + verify)
  6. Verify: Observable proof (log entries, metrics, database state). Mark UNVERIFIED until confirmed.
  7. Auto-revert check: If fix cannot be verified within 30 min, revert the causal commit:
    git revert <causal-commit> --no-edit
    git push
  8. Postmortem: /postmortem (blameless: summary, timeline, 5 Whys, follow-ups)
  9. Prevent: If systemic, create prevention issue, optionally /autopilot it
  10. Codify: /codify-learning (regression test, agent update, monitoring rule)
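
The 30-minute rule in step 7 can be expressed as a small guard. This is a sketch under assumptions: the helper name should_revert, the epoch-second arguments, and the literal status string VERIFIED are hypothetical choices for illustration, not something the workflow defines.

```shell
# 30 minutes, taken from step 7 of the workflow.
REVERT_DEADLINE_SECONDS=1800

# Hypothetical helper: succeed (exit 0) when the auto-revert applies.
# Arguments: fix start time and current time in epoch seconds, plus a status
# string. Only "VERIFIED" counts as confirmed (an assumed convention).
should_revert() {
  started="$1"; now="$2"; status="$3"
  [ "$status" != "VERIFIED" ] && [ $((now - started)) -ge "$REVERT_DEADLINE_SECONDS" ]
}

# Usage (not run here; <causal-commit> stays a placeholder):
# if should_revert "$fix_started" "$(date +%s)" "$status"; then
#   git revert <causal-commit> --no-edit && git push
# fi
```

Passing the current time in as an argument, rather than reading the clock inside the function, keeps the check easy to exercise with fixed values.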

Sentry Integration

When the issue body contains Sentry context (auto-filed by Sentry-GitHub integration):
  • Extract stack trace, file paths, breadcrumbs from issue body
  • Use Sentry MCP to query full event details if available
  • Cross-reference affected files with git log to find causal commit
  • Include Sentry issue link in PR description for auto-resolution on deploy
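
When several files are affected, one way to narrow down the causal commit is to count how often each commit shows up across their logs. The helper below is a hypothetical sketch, and the ranking heuristic is an assumption rather than part of the Sentry integration.

```shell
# Hypothetical helper: read `git log --oneline` output (possibly from several
# files) on stdin and print "<hash> <count>" lines, most frequent hash first.
# Assumption: a commit touching more of the affected files is a likelier
# causal candidate. Treat the ranking as a hint, not as proof.
rank_commits() {
  awk 'NF { print $1 }' | sort | uniq -c | sort -k1,1nr -k2,2 | awk '{ print $2, $1 }'
}

# Usage (not run here): feed the last 10 commits of each affected file.
# for f in "$@"; do git log --oneline -10 -- "$f"; done | rank_commits
```

The top-ranked hash still needs to be checked against the stack trace and breadcrumbs before it is treated as the cause.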

Auto-Detected Issues

Issues labeled auto-detected + bug are created by the observability pipeline. The flywheel coordinator prioritizes these and routes them here. Treat as P0 unless evidence suggests otherwise.
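
The default-priority rule could be sketched as a label check. The helper name default_priority, the comma-separated input format (no spaces around commas), and the TRIAGE fallback value are assumptions made for this example only.

```shell
# Hypothetical helper: print a default priority for a comma-separated label list.
# Issues carrying both auto-detected and bug default to P0; everything else is
# left for manual triage (the "TRIAGE" value is an assumption of this sketch).
default_priority() {
  labels=",$1,"   # wrap in commas so each label can be matched as a whole word
  case "$labels" in
    *,auto-detected,*)
      case "$labels" in
        *,bug,*) echo "P0"; return 0 ;;
      esac ;;
  esac
  echo "TRIAGE"
}
```

P0 here is only a starting point: per the rule above, evidence gathered during investigation can lower it.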

Output

Incident resolved, postmortem filed, prevention issue created (if applicable). PR includes fixes #<issue> for Sentry auto-resolution on deploy.
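
A PR description that meets these output requirements might be assembled as follows. The helper pr_body and its arguments are hypothetical; the issue number and Sentry link are supplied by the caller rather than invented here.

```shell
# Hypothetical helper: print a PR description body.
# $1 = issue number (fills the fixes #<issue> line), $2 = optional Sentry issue link.
pr_body() {
  issue="$1"; sentry_url="$2"
  printf 'fixes #%s\n' "$issue"
  if [ -n "$sentry_url" ]; then
    printf '\nSentry issue: %s\n' "$sentry_url"
  fi
}

# Usage (not run here; assumes the GitHub CLI is installed and authenticated):
# gh pr create --title "$TITLE" --body "$(pr_body "$ISSUE" "$SENTRY_URL")"
```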