debug-investigator
Debug Investigator
Structured debugging methodology that replaces ad-hoc exploration with hypothesis-driven
investigation. Captures symptoms, builds a deterministic feedback loop, analyzes evidence
(stacktraces, logs, state), generates ranked hypotheses, designs bisection strategies,
identifies instrumentation points, and produces minimal reproductions — documenting every
step so dead ends are never revisited.
When to use this skill vs native debugging: The base model handles straightforward debugging (clear stacktraces, obvious errors) natively. Use this skill for non-obvious bugs requiring systematic investigation: intermittent failures, bugs with no clear stacktrace, performance regressions, or issues requiring git bisection and hypothesis ranking.
Reference Files
| File | Contents | Load When |
|---|---|---|
| `references/stacktrace-patterns.md` | Exception taxonomy, traceback reading, common Python/JS error signatures | Stacktrace or exception present |
| `references/hypothesis-templates.md` | Bug category catalog, probability ranking, confirmation/refutation tests | Always |
| `references/bisection-guide.md` | git bisect workflow, binary search debugging, narrowing techniques | Bug appeared after a change |
| | Log pattern extraction, anomaly detection, timeline correlation | Log output available |
| `references/instrumentation-points.md` | Strategic logging placement, breakpoint strategy, state inspection techniques | Investigation plan needed |
Prerequisites
- git — for bisection and history analysis
- Access to source code — cannot debug opaque binaries
- Reproducible environment — or at minimum, error output (stacktrace, logs)
Project Context
Before deep investigation, check for repo-local agent context:
- `docs/agents/domain.md` for `CONTEXT.md`, `CONTEXT-MAP.md`, and ADR lookup rules
- `CONTEXT.md` or relevant context-local glossary for domain vocabulary
- `docs/adr/` and context-local ADRs for decisions near the failing area
Use the project glossary in hypotheses, repro names, and prevention recommendations. If the
repo lacks these files, continue normally; do not block debugging on context setup.
Workflow
Phase 1: Symptom Capture
Before touching code, document the observable problem:
- What is happening? Describe the observed behavior precisely. "It crashes" is insufficient. "Raises `KeyError('user_id')` on line 42 of `auth.py` when calling `get_current_user()` with a valid session token" is actionable.
- What should happen? Define the expected behavior. If unknown, state that.
- Reproducibility: Always, intermittent (with frequency), or one-time? Intermittent bugs require different strategies than deterministic ones.
- Recency: When did this start? Correlate with recent changes: `git log --oneline -20`. If the bug appeared after a specific commit, bisection is the fastest path.
- Environment: Python version, OS, dependency versions, configuration differences between working and broken environments.
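The environment item can be captured mechanically rather than from memory. The following is a minimal sketch, assuming a Python project; the function name `capture_environment` and the package list passed to it are hypothetical and not part of this skill.

```python
import platform
import sys
from importlib import metadata


def capture_environment(packages):
    """Collect interpreter, OS, and dependency versions for a symptom report.

    `packages` is a hypothetical list of distribution names relevant to the bug.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # A missing package is itself evidence worth recording.
            versions[name] = "not installed"
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.platform(),
        "executable": sys.executable,
        "packages": versions,
    }
```

Running it in both a working and a broken environment and comparing the two dictionaries surfaces version and configuration differences directly.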
Phase 2: Build a Feedback Loop
Create a fast, deterministic pass/fail signal for the reported bug before ranking hypotheses
or changing production code. The loop must reproduce the user's symptom, not a nearby failure.
Try these seams in order:
- Failing test at the smallest public interface that reaches the bug.
- CLI or script invocation with fixture input and asserted output.
- Curl or HTTP request against a local server with asserted response, logs, or state.
- Browser automation for UI bugs with DOM, console, and network assertions.
- Replayed trace, event payload, HAR, or log fixture through the real code path.
- Throwaway harness that boots the minimal subsystem needed to trigger the path.
- Property, fuzz, or stress loop for intermittent failures.
- `git bisect run` harness when the bug appeared between known good and bad revisions.
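For the last seam, `git bisect run` interprets the exit status of the command it is given: 0 marks the revision good, 125 asks git to skip it, and any other status from 1 to 127 marks it bad. The following is a minimal sketch of such a harness. The build command, the test path `tests/test_repro.py`, and the `src` directory are hypothetical placeholders for the project's real build and repro steps.

```python
import subprocess

GOOD, BAD, SKIP = 0, 1, 125  # exit statuses understood by `git bisect run`

# Hypothetical commands; replace with the project's real build and repro steps.
BUILD_CMD = ["python", "-m", "compileall", "-q", "src"]
REPRO_CMD = ["python", "-m", "pytest", "-x", "tests/test_repro.py"]


def classify(build_returncode, repro_returncode):
    """Map build and repro results to a bisect verdict."""
    if build_returncode != 0:
        # The revision cannot be tested, so it is neither good nor bad.
        return SKIP
    return GOOD if repro_returncode == 0 else BAD


def main():
    """Build the current revision, run the repro, and return the verdict."""
    build = subprocess.run(BUILD_CMD)
    if build.returncode != 0:
        return classify(build.returncode, 0)
    repro = subprocess.run(REPRO_CMD)
    return classify(build.returncode, repro.returncode)
```

Saved as, say, `check_bug.py` with `raise SystemExit(main())` at the bottom, it could be invoked as `git bisect run python check_bug.py` after `git bisect start <bad> <good>`. Returning 125 for unbuildable revisions keeps a broken intermediate commit from being misread as the culprit.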
Improve the loop before moving on:
- Make it faster by narrowing setup and caching expensive fixtures.
- Make it sharper by asserting the exact symptom.
- Make it more deterministic by pinning time, seeds, filesystem paths, and network access.
- For intermittent bugs, raise reproduction rate with repeated runs, concurrency, stress, or timing probes until the failure is frequent enough to debug.
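Raising and measuring the reproduction rate can be sketched as a seeded loop. Here `flaky_operation` is a hypothetical stand-in for whatever code path triggers the intermittent failure, and its 10% failure window is invented for illustration. The point is that every failure is tied to a seed that replays it.

```python
import random


def flaky_operation(rng):
    """Hypothetical stand-in for the code path under investigation."""
    # Fails when the random draw lands in a narrow window.
    if rng.random() < 0.1:
        raise RuntimeError("simulated intermittent failure")


def measure_repro_rate(operation, runs=200, base_seed=0):
    """Run `operation` with a fresh seeded RNG per attempt and collect failing seeds."""
    failing_seeds = []
    for offset in range(runs):
        seed = base_seed + offset
        try:
            operation(random.Random(seed))
        except Exception:
            failing_seeds.append(seed)
    return len(failing_seeds) / runs, failing_seeds
```

Any seed in the returned list turns the intermittent failure into a deterministic one: calling `flaky_operation(random.Random(seed))` with it fails every time.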
If no credible loop can be built, stop and state what was tried. Request the missing artifact:
environment access, captured payloads, logs, screen recording with timestamps, or permission
for temporary instrumentation. Do not proceed to speculative fixes.
Phase 3: Evidence Analysis
Examine all available evidence before forming hypotheses:
- Stacktrace interpretation: If a traceback exists, read it bottom-up. The last frame is where the error manifested, but the cause is often several frames up. Identify:
  - Exception type and message
  - The frame where the error originated vs. where it was raised
  - Any familiar patterns (see `references/stacktrace-patterns.md`)
- Log pattern extraction: Search logs for:
  - Temporal anomalies (timestamps out of sequence, gaps)
  - Repeated errors (same error appearing in bursts)
  - State transitions that didn't complete
  - Correlation with external events (deploys, config changes)
- State inspection: If the system is running, inspect:
  - Variable values at the failure point
  - Database state (missing rows, unexpected values)
  - Configuration values (environment variables, config files)
  - External dependency status (API availability, DB connectivity)
- Code diff analysis: If the bug is recent:
  - `git diff HEAD~5`: what changed?
  - Focus on files touched by the error's call chain
  - Look for typos, wrong variable names, missing null checks
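The temporal-anomaly check from the log extraction step can be sketched as follows. This assumes every log line starts with an ISO 8601 timestamp followed by a space, which real logs may not do. The function name, the 30-second threshold, and the sample lines are all made up for illustration.

```python
from datetime import datetime, timedelta


def find_temporal_anomalies(lines, max_gap=timedelta(seconds=30)):
    """Flag out-of-order timestamps and gaps longer than `max_gap`.

    Assumes each line starts with an ISO 8601 timestamp followed by a space.
    """
    anomalies = []
    previous = None
    for number, line in enumerate(lines, start=1):
        stamp = datetime.fromisoformat(line.split(" ", 1)[0])
        if previous is not None:
            if stamp < previous:
                anomalies.append((number, "out of order"))
            elif stamp - previous > max_gap:
                anomalies.append((number, f"gap of {stamp - previous}"))
        previous = stamp
    return anomalies


# Made-up sample lines, not real log output.
sample = [
    "2024-01-01T10:00:00 request started",
    "2024-01-01T10:00:05 cache miss",
    "2024-01-01T10:02:05 retrying upstream call",
    "2024-01-01T10:02:01 request finished",
]
print(find_temporal_anomalies(sample))
# prints [(3, 'gap of 0:02:00'), (4, 'out of order')]
```

The reported line numbers give a starting point for correlating the anomaly with deploys or config changes around the same time.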
Phase 4: Hypothesis Generation
Generate ranked hypotheses — never start fixing without a hypothesis:
- List 3-5 hypotheses ranked by likelihood. Each hypothesis must include:
  - A concrete claim about what is wrong
  - What evidence supports it
  - What evidence would confirm it (a test you can run)
  - What evidence would refute it
- Rank by likelihood using:
  - Proximity to recent changes (most bugs are in new code)
  - Simplicity (typos before race conditions)
  - Evidence fit (does the hypothesis explain ALL symptoms?)
- Common bug categories (see `references/hypothesis-templates.md`):
  - State bugs: wrong value, missing initialization, stale cache
  - Logic bugs: off-by-one, wrong operator, inverted condition
  - Integration bugs: API contract mismatch, serialization error
  - Concurrency bugs: race condition, deadlock, resource starvation
  - Environment bugs: missing dependency, wrong config, version mismatch
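The ranking criteria above can be made explicit in a small data structure. This is only an illustrative sketch: the field names and especially the scoring weights are assumptions chosen for the example, not values prescribed by this skill.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    claim: str                 # concrete claim about what is wrong
    supporting_evidence: str
    confirming_test: str
    refuting_test: str
    near_recent_change: bool   # proximity to recent changes
    simplicity: int            # 1 (exotic, e.g. race) to 3 (trivial, e.g. typo)
    symptoms_explained: int    # observed symptoms the claim accounts for
    symptoms_total: int


def score(h):
    """Illustrative scoring only; the weights are an assumption."""
    evidence_fit = h.symptoms_explained / h.symptoms_total
    return (2.0 if h.near_recent_change else 0.0) + h.simplicity + 3.0 * evidence_fit


def rank(hypotheses):
    """Return hypotheses ordered from most to least likely under `score`."""
    return sorted(hypotheses, key=score, reverse=True)
```

Evidence fit carries the largest weight in this sketch because a hypothesis that leaves symptoms unexplained tends to be incomplete even when it is simple.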
Phase 5: Investigation Plan
Design specific steps to test each hypothesis:
- Test H1 first: Always test the most likely hypothesis first. Design a single action that will confirm or refute it.
- Bisection: If the bug appeared after a change and H1 fails:
  - Identify the known-good and known-bad commits
  - Run `git bisect start <bad> <good>`
  - Define the test command for each commit
  - See `references/bisection-guide.md` for workflow
- Isolation: Remove variables one at a time:
  - Simplify input data
  - Disable features/plugins
  - Replace external calls with hardcoded values
  - Run in a clean environment
- Instrumentation: Add targeted logging/breakpoints:
  - At function entry/exit points in the call chain
  - Before and after state mutations
  - At decision points (if/else branches)
  - See `references/instrumentation-points.md`
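Entry/exit instrumentation can be added without editing function bodies. The following is a minimal sketch using a decorator. The logger name `debug_investigation` and the decorated function `apply_discount` are hypothetical examples, not part of this skill's reference files.

```python
import functools
import logging

logger = logging.getLogger("debug_investigation")


def trace_calls(func):
    """Log arguments on entry and the result or exception on exit."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.debug("enter %s args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # Records the traceback, then lets the original error propagate.
            logger.exception("raise in %s", func.__name__)
            raise
        logger.debug("exit %s result=%r", func.__name__, result)
        return result
    return wrapper


@trace_calls
def apply_discount(price, percent):
    """Hypothetical function in the failing call chain."""
    return price * (1 - percent / 100)
```

With `logging.basicConfig(level=logging.DEBUG)` enabled, each call emits one line on entry and one on exit. Because the decorator is a single line per function, it is easy to remove once the investigation ends.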
Phase 6: Execution
Execute the investigation plan, updating hypotheses as evidence arrives:
- Test one variable at a time — Changing multiple things simultaneously makes results uninterpretable.
- Record results — Document what each test revealed, even negative results. Dead-end documentation prevents revisiting failed paths.
- Update probabilities — After each test, re-rank hypotheses. If H1 is refuted, H2 becomes the new priority.
- Know when to escalate — If all hypotheses are exhausted, the bug is in a category you haven't considered. Step back and re-examine assumptions.
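Recording results and re-ranking can be kept honest with a small journal. This is a sketch under assumptions: the class name `InvestigationJournal` and its methods are hypothetical, and hypotheses are represented by plain string labels for brevity.

```python
class InvestigationJournal:
    """Track hypothesis outcomes so refuted paths are not retested."""

    def __init__(self, ranked_claims):
        self.pending = list(ranked_claims)  # most likely first
        self.results = []                   # (claim, outcome, note) in test order

    def next_hypothesis(self):
        """Return the current top-priority claim, or None when all are exhausted."""
        return self.pending[0] if self.pending else None

    def record(self, claim, confirmed, note):
        """Store the outcome, negative results included.

        Raises ValueError if the claim was already tested or never listed.
        """
        self.pending.remove(claim)
        self.results.append((claim, "confirmed" if confirmed else "refuted", note))

    def reprioritize(self, ordered_claims):
        """Re-rank untested claims; claims not mentioned keep their relative order at the end."""
        remaining = [c for c in ordered_claims if c in self.pending]
        remaining += [c for c in self.pending if c not in remaining]
        self.pending = remaining

    def dead_ends(self):
        """Return refuted claims with the notes explaining what was learned."""
        return [(claim, note) for claim, outcome, note in self.results if outcome == "refuted"]
```

When `next_hypothesis()` returns `None`, every listed hypothesis has been tested, which is the signal to step back and re-examine assumptions.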
Phase 7: Resolution Documentation
After finding the root cause:
- Root cause — What was actually wrong, precisely.
- Fix — What was changed and why.
- Prevention — How to prevent recurrence (test, lint rule, type check, etc.).
- Lessons — What was learned that applies beyond this specific bug.
Output Format
Debug Investigation: {Brief Description}
Symptom
Observed: {What is happening, described precisely}
Expected: {What should happen}
Reproducibility: {Always | Intermittent (~N% of attempts) | Once}
First noticed: {Date/time or triggering event}
Environment: {Relevant versions and configuration}
Evidence Analysis
Stacktrace
- Exception: {type}: {message}
- Origin: {file}:{line} in {function}
- Call chain: {caller} → {caller} → {failure point}
- Key insight: {What the traceback reveals about the cause}
Logs
- Anomaly: {What is unusual}
- Timeline: {When the anomaly started}
- Correlation: {Related events}
Code Changes
- Recent commits: {relevant commits since last known-good state}
- Files in error path: {which changed files appear in the traceback}
Hypotheses
| # | Hypothesis | Likelihood | Confirming Test | Refuting Test |
|---|---|---|---|---|
| H1 | {Specific claim} | High | {What to check} | {What would disprove} |
| H2 | {Specific claim} | Medium | {What to check} | {What would disprove} |
| H3 | {Specific claim} | Low | {What to check} | {What would disprove} |
Investigation Plan
Step 1: Test H1 ({action})
- Command/action: {specific step}
- If confirmed: {next action: fix}
- If refuted: proceed to Step 2
Step 2: Bisection
- Good commit: {hash}
- Bad commit: {hash}
- Test: {command to verify each commit}
- Command: `git bisect start {bad} {good}`
Step 3: Isolation
- Remove: {variable to eliminate}
- Expected change: {what should happen}
Instrumentation Points
- {file}:{line}: log {variable/state} to observe {what}
- {file}:{line}: breakpoint to inspect {what}
Minimal Reproduction
Minimal code that triggers the bug:
{code}
Resolution
Root cause: {What was wrong}
Fix: {What was changed: file:line, diff summary}
Prevention: {Test added, lint rule, type annotation, etc.}
Lessons: {What generalizes beyond this bug}
Configuring Scope
| Mode | Scope | Depth | When to Use |
|---|---|---|---|
| | Single error | H1 test + fix | Clear stacktrace, obvious cause |
| | Full investigation | 3 hypotheses + bisection plan | Default for non-obvious bugs |
| | Systemic analysis | 5+ hypotheses + instrumentation + reproduction | Intermittent bugs, no stacktrace, production issues |
Calibration Rules
- Hypotheses before code changes. Never start modifying code without at least one explicit hypothesis. "Let me try this" is not debugging — it's guessing.
- One variable at a time. Each investigation step should change exactly one thing. If you change two things and the bug disappears, you don't know which fixed it.
- Document dead ends. Failed hypotheses are valuable — they narrow the search space. Record what was tested and what was learned.
- Simplest explanation first. Test typos, wrong variable names, and missing imports before considering race conditions, compiler bugs, or cosmic rays.
- Feedback loop before hypotheses. If you cannot reproduce the bug with a controlled pass/fail signal, any fix is speculative. Invest in the loop first.
- Root cause, not symptoms. A fix that addresses the symptom (adding a null check) without understanding the root cause (why was it null?) leaves the real bug alive.
Error Handling
| Problem | Resolution |
|---|---|
| No stacktrace available | Focus on log analysis and state inspection. Use instrumentation to generate diagnostic output. |
| Bug is intermittent | Add persistent logging at key decision points. Run under stress (high load, concurrent requests) to increase reproduction rate. |
| Cannot reproduce locally | Compare environments systematically: versions, config, data, timing. Use |
| Multiple hypotheses equally likely | Design a single test that distinguishes between them. Binary decision: "If X, then H1; if Y, then H2." |
| Fix attempted but bug persists | The hypothesis was wrong. Revert the fix, update hypothesis rankings, and proceed to the next hypothesis. Do not stack fixes. |
| Bug is in a dependency | Confirm with a minimal reproduction that uses only the dependency. Check issue trackers. Pin to last known-good version while awaiting upstream fix. |
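For the "Cannot reproduce locally" row, a systematic comparison can be as simple as diffing two flat dictionaries of environment facts. This is a minimal sketch; the function name `diff_environments` and the idea of collecting each environment into a dictionary first are assumptions for illustration.

```python
def diff_environments(working, broken):
    """Return keys whose values differ, mapped to (working_value, broken_value)."""
    missing = object()  # sentinel so that None remains a legitimate value
    differences = {}
    for key in sorted(set(working) | set(broken)):
        left = working.get(key, missing)
        right = broken.get(key, missing)
        if left != right:
            differences[key] = (
                "<absent>" if left is missing else left,
                "<absent>" if right is missing else right,
            )
    return differences
```

Each key in the result is a candidate variable to change one at a time, in line with the one-variable rule in the calibration rules above.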
When NOT to Investigate
Push back if:
- The error message already contains the fix ("missing module X" → install X)
- The issue is a known environment setup problem (wrong Python version, missing env var)
- The "bug" is actually a feature request or design disagreement — redirect to ADR or discussion
- The code is not under the user's control (third-party SaaS, managed service) — file a support ticket instead
- The user wants to debug generated/minified code — debug the source, not the output