adk-debugger


ADK Debugger Skill

What is ADK Debugging?

Every ADK agent records its behavior as traces and logs — every conversation turn, tool call, LLM reasoning step, and error. These are the source of truth for understanding what your agent did and why.
The ADK CLI provides all the tools you need to debug. All commands support `--format json` for structured output, which you should always use when consuming output programmatically.
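
As a rough illustration of consuming that structured output, the sketch below parses captured CLI stdout into an array of records. The helper name `parseCliJson` is hypothetical, and the exact JSON shape the ADK CLI emits is not specified here, so the function accepts either a single JSON document or newline-delimited JSON as an assumption.

```typescript
// Hypothetical helper: the exact JSON shape emitted by the ADK CLI is not
// specified in this document, so this accepts either one JSON document
// (object or array) or newline-delimited JSON and always returns an array.
type JsonRecord = Record<string, unknown>;

function isRecord(value: unknown): value is JsonRecord {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function parseCliJson(stdout: string): JsonRecord[] {
  const text = stdout.trim();
  if (text === "") return [];
  try {
    // Case 1: the whole output is a single JSON value.
    const parsed: unknown = JSON.parse(text);
    const items: unknown[] = Array.isArray(parsed) ? parsed : [parsed];
    return items.filter(isRecord);
  } catch {
    // Case 2: one JSON object per line, as streamed output often is.
    // A line that is not valid JSON makes this throw, which is intentional.
    return text
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line !== "")
      .map((line) => JSON.parse(line) as unknown)
      .filter(isRecord);
  }
}
```

Returning plain records keeps the sketch independent of any particular field names; inspect real output before relying on specific keys.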

When to Use This Skill

Use this skill when the developer asks about:
  • Bot not working — not responding, wrong responses, unexpected behavior
  • Tool issues — wrong tool called, tool errors, hallucinated parameters
  • Workflow problems — stuck workflows, steps not executing, state issues
  • Reading traces/logs — how to query, filter, and interpret debug output
  • LLM misbehavior — hallucinations, refusals, looping, poor extraction
  • Build/deploy failures — validation errors, schema mismatches
  • Config issues — agent.json vs agent.local.json, integration setup
  • Post-fix verification — confirming a fix worked, writing regression evals
Trigger questions:
  • "My bot isn't responding"
  • "The wrong tool was called"
  • "My workflow is stuck"
  • "How do I read traces?"
  • "How do I check logs?"
  • "The LLM is hallucinating"
  • "Something broke after my last change"
  • "My deploy failed"
  • "`adk check` found errors"
  • "Summarize this trace"
  • "What happened in trace X?"
  • "Give me an overview of this conversation turn"
  • "Why did the bot do X in this trace?"
  • "Walk me through what happened"
  • "How do I debug this?"

Available Documentation

| File | Contents |
| --- | --- |
| `references/traces-and-logs.md` | CLI debugging tools, log querying, trace structure, span types, `onTrace` hooks, reproduction with `adk chat` |
| `references/common-failures.md` | Runtime failure patterns: validation, bot not responding, tool errors, workflow stuck, integration failures, build errors, config confusion |
| `references/llm-debugging.md` | LLM behavior issues: wrong tool, hallucinated params, refusals, token limits, looping, reading model reasoning |
| `references/debug-workflow.md` | The systematic 8-step debug loop: validate → reproduce → logs → traces → classify → fix → verify → prevent |
| `references/trace-summarization.md` | How to fetch, walk, and summarize traces as free-form natural-language narratives, adapting depth to context |

How to Answer

  1. "How do I read traces/logs?" → Read `traces-and-logs.md` for CLI commands and trace structure
  2. Something is broken, known pattern → Read `common-failures.md` for the matching failure pattern
  3. LLM is misbehaving → Read `llm-debugging.md` for the matching behavior issue
  4. Systematic investigation needed → Read `debug-workflow.md` and follow the 8-step loop
  5. "Summarize this trace" / "What happened?" → Read `trace-summarization.md` for how to fetch, walk, and narrate traces
  6. After fixing, need to prevent regression → Point to the `adk-evals` skill for writing evals


Quick Reference

The Debug Loop

symptom → validate (adk check) → reproduce (adk chat) → logs (adk logs) → traces (adk traces) → root cause → fix → verify
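
One way to keep the loop's order explicit in tooling is to encode it as data. The following is only an illustrative sketch: the stage names and commands come from this quick reference, while `DebugStage`, `DEBUG_LOOP`, and `stageAfter` are hypothetical names introduced here.

```typescript
// Illustrative sketch: stage names and commands are taken from the quick
// reference; the types and helper below are hypothetical.
interface DebugStage {
  name: string;
  command: string | null; // null marks a manual step with no CLI command
}

const DEBUG_LOOP: readonly DebugStage[] = [
  { name: "validate", command: "adk check --format json" },
  { name: "reproduce", command: 'adk chat --single "msg" --format json' },
  { name: "logs", command: "adk logs error --format json" },
  { name: "traces", command: "adk traces --format json" },
  { name: "root cause", command: null },
  { name: "fix", command: null },
  { name: "verify", command: 'adk chat --single "msg" --format json' },
];

// Returns the stage that follows `name`, or null for the last or an unknown stage.
function stageAfter(name: string): DebugStage | null {
  for (let i = 0; i < DEBUG_LOOP.length - 1; i++) {
    if (DEBUG_LOOP[i].name === name) return DEBUG_LOOP[i + 1];
  }
  return null;
}
```

Reusing the reproduction command for the verify stage reflects the idea that verification means re-running the same reproduction after the fix.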

CLI Commands (always use `--format json`)


```bash
adk check --format json                          # offline validation
adk logs error --format json                     # recent errors
adk logs --follow --format json                  # stream live
adk traces --format json                         # recent traces
adk traces --conversation-id <id> --format json  # specific conversation
adk chat --single "msg" --format json            # test message
adk dev --non-interactive --format json          # structured dev output
```

Span Types

| Type | What It Shows |
| --- | --- |
| `think` | LLM reasoning: why it chose an action |
| `tool_call` | Tool invocation: name, input, output, success/error |
| `code_execution_exception` | Runtime error: message and stack trace |
| `end` | Conversation turn completed |
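
The four span types map naturally onto a TypeScript discriminated union. The sketch below is an assumption-laden model, not the ADK's actual type definitions: `type`, `tool_name`, `success`, and `error` appear in this document's `onTrace` example, while `content`, `message`, `stack`, and the names `Span` and `describeSpan` are hypothetical.

```typescript
// Assumed span model. Only `type`, `tool_name`, `success`, and `error` are
// taken from this document; the other field names are hypothetical.
type Span =
  | { type: "think"; content: string }
  | { type: "tool_call"; tool_name: string; success: boolean; error?: string }
  | { type: "code_execution_exception"; message: string; stack?: string }
  | { type: "end" };

// Produces a one-line description of a span, switching on its type.
function describeSpan(span: Span): string {
  switch (span.type) {
    case "think":
      return `LLM reasoning: ${span.content}`;
    case "tool_call":
      return span.success
        ? `Tool ${span.tool_name} succeeded`
        : `Tool ${span.tool_name} failed: ${span.error ?? "unknown error"}`;
    case "code_execution_exception":
      return `Runtime error: ${span.message}`;
    case "end":
      return "Conversation turn completed";
  }
}
```

Switching on `type` lets the compiler flag any span kind the function forgets to handle, which is useful if real traces turn out to contain more types than the four listed here.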


Prerequisites Check

Before debugging, verify:
  • Project valid? Run `adk check --format json` and fix any reported issues first
  • Dev server running? `adk dev` (or `adk dev --non-interactive --format json` for structured output)
  • Bot linked? `agent.json` exists with `botId` and `workspaceId` (created by `adk link`)
  • Dev bot created? `agent.local.json` has `devId` (set automatically by the first `adk dev` run)
  • Integration configured? Check Control Panel at localhost:3001 for unconfigured integrations
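
The two config-file checks in this list can be automated with a small pure function. This is a minimal sketch under stated assumptions: it takes the already-parsed contents of `agent.json` and `agent.local.json` (loading them is left to the caller), it assumes the IDs are non-empty strings, and `missingPrerequisites` and `hasString` are hypothetical names.

```typescript
// Sketch of the config checks from the checklist. Assumes the IDs are stored
// as non-empty strings; only botId, workspaceId, and devId are inspected.
type ParsedConfig = Record<string, unknown> | null; // null: file not found

function hasString(config: Record<string, unknown>, key: string): boolean {
  const value = config[key];
  return typeof value === "string" && value.length > 0;
}

// Returns a list of human-readable problems; an empty list means both files look fine.
function missingPrerequisites(agent: ParsedConfig, agentLocal: ParsedConfig): string[] {
  const problems: string[] = [];
  if (agent === null) {
    problems.push("agent.json not found: run `adk link`");
  } else {
    if (!hasString(agent, "botId")) problems.push("agent.json is missing botId");
    if (!hasString(agent, "workspaceId")) problems.push("agent.json is missing workspaceId");
  }
  if (agentLocal === null || !hasString(agentLocal, "devId")) {
    problems.push("agent.local.json has no devId: run `adk dev` once");
  }
  return problems;
}
```

Keeping file access outside the function makes the check easy to run against fixtures; project validity, the dev server, and integration setup still need the commands listed above.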


Critical Patterns

✅ **Run `adk check` before debugging runtime issues**

```bash
# CORRECT: catch config/schema problems offline first
adk check --format json
# Then debug runtime issues
```

❌ **Skipping offline validation**

```bash
# WRONG: jumping straight to runtime debugging wastes time on config issues
adk traces --format json  # might be chasing a config problem
```

---

✅ **Use `--format json` on all CLI commands**

```bash
# CORRECT: structured output for reliable parsing
adk logs error --format json
adk traces --format json
adk chat --single "test" --format json
```

❌ **Parsing human-readable output**

```bash
# WRONG: human-readable format is for display, not parsing
adk logs error
adk traces
```

---

✅ **Use `adk logs error` to filter errors**

```bash
# CORRECT: focused error scan
adk logs error --format json
adk logs warning since=1h --format json
```

❌ **Scrolling through all output**

```bash
# WRONG: too much noise, easy to miss the actual error
adk logs --format json  # 50 entries of everything
```

---

✅ **Use `onTrace` hooks for programmatic monitoring**

```typescript
// CORRECT: structured, automated trace analysis
hooks: {
  onTrace: ({ trace }) => {
    if (trace.type === "tool_call" && !trace.success) {
      console.error(`[TOOL ERROR] ${trace.tool_name}`, trace.error);
    }
  }
}
```

❌ **Only checking console output**

```typescript
// WRONG: console.log in handlers misses the structured trace data
handler: async (input) => {
  console.log("tool called");  // not useful for debugging
}
```

---

✅ **Write a regression eval after fixing**

```typescript
// CORRECT: prevents the bug from coming back
export default new Eval({
  name: 'fix-order-lookup',
  type: 'regression',
  conversation: [{ user: 'Look up order 123', assert: { tools: [{ called: 'lookupOrder' }] } }],
})
```

❌ **Fixing and moving on**

```typescript
// WRONG: the same bug will return and you'll debug it again
```
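
Building on the `onTrace` pattern, a hook can aggregate failures rather than only log them. The following sketch tallies failed tool calls per tool so that repeated failures stand out. It assumes traces carry the `type`, `tool_name`, and `success` fields used in the hook example; `TraceLike` and `createFailureCounter` are hypothetical names, not part of the ADK.

```typescript
// Minimal trace shape based on the fields used in the onTrace example.
interface TraceLike {
  type: string;
  tool_name?: string;
  success?: boolean;
}

// Hypothetical helper: counts failed tool calls per tool name.
function createFailureCounter() {
  const failures: Record<string, number> = {};
  return {
    // Intended to be called from an onTrace hook with each incoming trace.
    record(trace: TraceLike): void {
      if (trace.type === "tool_call" && !trace.success) {
        const name = trace.tool_name ?? "unknown";
        failures[name] = (failures[name] ?? 0) + 1;
      }
    },
    // Names of tools whose failure count has reached the threshold.
    noisyTools(threshold: number): string[] {
      return Object.keys(failures).filter((name) => failures[name] >= threshold);
    },
  };
}
```

Inside a hook this might be wired up as `onTrace: ({ trace }) => counter.record(trace)`, with `counter.noisyTools(3)` checked periodically; the threshold of 3 is arbitrary.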

Example Questions

Basic:
  • "My bot isn't responding — how do I figure out why?"
  • "How do I check for errors in my ADK project?"
  • "What's the difference between agent.json and agent.local.json?"
Intermediate:
  • "The bot called createTicket instead of lookupTicket — how do I fix this?"
  • "My workflow starts but the second step never runs"
  • "How do I see what the LLM was thinking when it made a decision?"
  • "Integration actions are failing with auth errors"
Advanced:
  • "How do I set up onTrace hooks for automated error detection?"
  • "The model loops on the same tool call — how do I add a guardrail?"
  • "How do I monitor tool call performance with timing metrics?"
  • "How do I systematically debug a multi-step workflow failure?"


Response Format

When helping a developer debug:
  1. Check prerequisites: verify dev server, config files, project validation
  2. Start with `adk check --format json` to rule out offline issues
  3. Reproduce: use `adk chat --single "msg" --format json` to create a clean reproduction
  4. Read the evidence: `adk logs error --format json` for a quick scan, `adk traces --format json` for details
  5. Identify the root cause: point to the specific span, log entry, or config issue
  6. Suggest a targeted fix: reference the appropriate failure pattern doc
  7. Verify: re-run the reproduction, confirm clean output
  8. Suggest a regression eval: point to the `adk-evals` skill