agent-introspection-debugging
Agent Introspection Debugging
Use this skill when an agent run is failing repeatedly, consuming tokens without progress, looping on the same tools, or drifting away from the intended task.
This is a workflow skill, not a hidden runtime. It teaches the agent to debug itself systematically before escalating to a human.
When to Activate
- Maximum tool call / loop-limit failures
- Repeated retries with no forward progress
- Context growth or prompt drift that starts degrading output quality
- File-system or environment state mismatch between expectation and reality
- Tool failures that are likely recoverable with diagnosis and a smaller corrective action
Scope Boundaries
Activate this skill for:
- capturing failure state before retrying blindly
- diagnosing common agent-specific failure patterns
- applying contained recovery actions
- producing a structured human-readable debug report
Do not use this skill as the primary source for:
- feature verification after code changes; use `verification-loop`
- framework-specific debugging when a narrower ECC skill already exists
- runtime promises the current harness cannot enforce automatically
Four-Phase Loop
Phase 1: Failure Capture
Before trying to recover, record the failure precisely.
Capture:
- error type, message, and stack trace when available
- last meaningful tool call sequence
- what the agent was trying to do
- current context pressure: repeated prompts, oversized pasted logs, duplicated plans, or runaway notes
- current environment assumptions: cwd, branch, relevant service state, expected files
Minimum capture template:
Failure Capture
- Session / task:
- Goal in progress:
- Error:
- Last successful step:
- Last failed tool / command:
- Repeated pattern seen:
- Environment assumptions to verify:
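The template above can also be filled in programmatically so that no field is skipped. The following is a minimal sketch, assuming a hypothetical `FailureCapture` class whose field names are illustrative and not part of any ECC API; it only mirrors the labels of the template.

```python
from dataclasses import dataclass, field


@dataclass
class FailureCapture:
    """Hypothetical container mirroring the minimum capture template."""

    session: str
    goal: str
    error: str
    last_successful_step: str
    last_failed_command: str
    repeated_pattern: str = "none observed"
    assumptions_to_verify: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the capture using the same labels as the template."""
        assumptions = "; ".join(self.assumptions_to_verify) or "none listed"
        lines = [
            "Failure Capture",
            f"- Session / task: {self.session}",
            f"- Goal in progress: {self.goal}",
            f"- Error: {self.error}",
            f"- Last successful step: {self.last_successful_step}",
            f"- Last failed tool / command: {self.last_failed_command}",
            f"- Repeated pattern seen: {self.repeated_pattern}",
            f"- Environment assumptions to verify: {assumptions}",
        ]
        return "\n".join(lines)
```

Required fields have no defaults on purpose: constructing the object without an error or a last successful step fails immediately, which is the point of capturing before retrying.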
Phase 2: Root-Cause Diagnosis
Match the failure to a known pattern before changing anything.
| Pattern | Likely Cause | Check |
|---|---|---|
| Maximum tool calls / repeated same command | loop or no-exit observer path | inspect the last N tool calls for repetition |
| Context overflow / degraded reasoning | unbounded notes, repeated plans, oversized logs | inspect recent context for duplication and low-signal bulk |
| Connection refused / timeout | service unavailable or wrong port | verify service health, URL, and port assumptions |
| Rate limit / quota exhaustion | retry storm or missing backoff | count repeated calls and inspect retry spacing |
| file missing after write / stale diff | race, wrong cwd, or branch drift | re-check path, cwd, git status, and actual file existence |
| tests still failing after “fix” | wrong hypothesis | isolate the exact failing test and re-derive the bug |
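The first check in the table, inspecting the last N tool calls for repetition, could look roughly like this. It is a sketch under assumptions: the tool-call log is taken to be a list of `(tool_name, arguments)` tuples, and `find_repeated_calls`, `window`, and `threshold` are hypothetical names chosen for illustration.

```python
from collections import Counter


def find_repeated_calls(tool_calls, window=10, threshold=3):
    """Return calls seen at least `threshold` times in the last `window` entries.

    `tool_calls` is assumed to be a list of hashable (tool_name, arguments)
    tuples, oldest first. The log format is an assumption of this sketch.
    """
    recent = tool_calls[-window:]
    counts = Counter(recent)
    return {call: n for call, n in counts.items() if n >= threshold}
```

An empty result means no call crossed the threshold inside the window; a non-empty result is evidence for the loop pattern and a reason to stop retrying.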
Diagnosis questions:
- is this a logic failure, state failure, environment failure, or policy failure?
- did the agent lose the real objective and start optimizing the wrong subtask?
- is the failure deterministic or transient?
- what is the smallest reversible action that would validate the diagnosis?
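One way to answer the deterministic-versus-transient question is to repeat a cheap, read-only check a few times and compare outcomes. The sketch below assumes a hypothetical zero-argument `check` callable supplied by the caller; it should observe state only, never re-run the failing action itself.

```python
def classify_repeatability(check, attempts=3):
    """Run a read-only `check` several times and report whether it is stable.

    `check` is a hypothetical callable returning True on success and False
    (or raising) on failure. It must be side-effect free, because it is
    invoked more than once.
    """
    outcomes = []
    for _ in range(attempts):
        try:
            outcomes.append(bool(check()))
        except Exception:
            # Any exception counts as a failed observation.
            outcomes.append(False)
    if all(outcomes):
        return "passing"
    if not any(outcomes):
        return "deterministic"
    return "transient"
```

A "deterministic" result points toward a logic or state problem; "transient" points toward timing, network, or rate-limit causes where backoff may be the smaller fix.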
Phase 3: Contained Recovery
Recover with the smallest action that changes the diagnosis surface.
Safe recovery actions:
- stop repeated retries and restate the hypothesis
- trim low-signal context and keep only the active goal, blockers, and evidence
- re-check the actual filesystem / branch / process state
- narrow the task to one failing command, one file, or one test
- switch from speculative reasoning to direct observation
- escalate to a human when the failure is high-risk or externally blocked
Do not claim unsupported auto-healing actions like “reset agent state” or “update harness config” unless you are actually doing them through real tools in the current environment.
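Re-checking the actual filesystem and branch state can be done through direct observation rather than memory. The following sketch assumes `git` may or may not be installed and that `expected_files` are paths relative to `repo_dir`; the function name and return shape are hypothetical.

```python
import os
import subprocess
from pathlib import Path


def snapshot_world_state(expected_files, repo_dir="."):
    """Collect cwd, current git branch, and which expected files are missing."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--abbrev-ref", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=False,
        )
        # A non-zero exit code means repo_dir is not a usable git repository.
        branch = result.stdout.strip() if result.returncode == 0 else None
    except OSError:
        # git is not installed, or repo_dir cannot be entered.
        branch = None
    root = Path(repo_dir)
    missing = [f for f in expected_files if not (root / f).exists()]
    return {"cwd": os.getcwd(), "branch": branch, "missing_files": missing}
```

The snapshot only reads state, so it is safe to run before any recovery action and again afterwards to compare.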
Contained recovery checklist:
Recovery Action
- Diagnosis chosen:
- Smallest action taken:
- Why this is safe:
- What evidence would prove the fix worked:
Phase 4: Introspection Report
End with a report that makes the recovery legible to the next agent or human.
Agent Self-Debug Report
- Session / task:
- Failure:
- Root cause:
- Recovery action:
- Result: success | partial | blocked
- Token / time burn risk:
- Follow-up needed:
- Preventive change to encode later:
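If the report is generated rather than typed, the `Result` field can be restricted to the three values the template allows. This is a minimal sketch, assuming a hypothetical `render_debug_report` helper that takes a dict keyed by the template labels.

```python
ALLOWED_RESULTS = ("success", "partial", "blocked")

REPORT_LABELS = [
    "Session / task",
    "Failure",
    "Root cause",
    "Recovery action",
    "Result",
    "Token / time burn risk",
    "Follow-up needed",
    "Preventive change to encode later",
]


def render_debug_report(fields):
    """Render report fields in template order, rejecting unknown results.

    `fields` is a hypothetical dict keyed by the template labels.
    """
    result = fields.get("Result")
    if result not in ALLOWED_RESULTS:
        raise ValueError(f"Result must be one of {ALLOWED_RESULTS}, got {result!r}")
    lines = ["Agent Self-Debug Report"]
    # Missing fields are rendered as "unknown" rather than silently dropped.
    lines += [f"- {label}: {fields.get(label, 'unknown')}" for label in REPORT_LABELS]
    return "\n".join(lines)
```

Rendering missing fields as "unknown" keeps gaps visible to the next agent or human instead of hiding them.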
Recovery Heuristics
Prefer these interventions in order:
1. Restate the real objective in one sentence.
2. Verify the world state instead of trusting memory.
3. Shrink the failing scope.
4. Run one discriminating check.
5. Only then retry.
Bad pattern:
- retrying the same action three times with slightly different wording
Good pattern:
- capture failure
- classify the pattern
- run one direct check
- change the plan only if the check supports it
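The good pattern amounts to gating any retry behind a direct check. A rough sketch follows, assuming three hypothetical callables supplied by the caller: `restate_objective` returns the one-sentence goal, `check` returns True when observation supports the current hypothesis, and `action` performs the retry.

```python
def guarded_retry(action, check, restate_objective):
    """Retry `action` at most once, and only if a direct check supports it.

    All three arguments are hypothetical callables provided by the caller;
    none of them is part of an existing harness API.
    """
    objective = restate_objective()
    if not check():
        # The observation contradicts the hypothesis: do not retry blindly.
        return {"objective": objective, "retried": False, "result": None}
    return {"objective": objective, "retried": True, "result": action()}
```

When `retried` is False, the caller goes back to diagnosis with a new hypothesis instead of rewording the same action.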
Integration with ECC
- Use `verification-loop` after recovery if code was changed.
- Use `continuous-learning-v2` when the failure pattern is worth turning into an instinct or later skill.
- Use `council` when the issue is not technical failure but decision ambiguity.
- Use `workspace-surface-audit` if the failure came from conflicting local state or repo drift.
Output Standard
When this skill is active, do not end with “I fixed it” alone.
Always provide:
- the failure pattern
- the root-cause hypothesis
- the recovery action
- the evidence that the situation is now better or still blocked