agent-debugger

Agent Debugger

A debugger for AI agents. Set breakpoints, inspect state, evaluate expressions, test fixes in-place.

Philosophy

The debugger is a scalpel, not a flashlight. You don't turn it on to look around. You turn it on to make one precise cut — confirm or kill a specific hypothesis about why the program is broken. If you're "exploring" in the debugger, you've already lost.
Every session starts before the debugger. Read the code. Read the traceback. Form a theory. Know exactly what breakpoint you'll set and what eval you'll run before you type a single command. The debugger is the experiment, not the investigation.
`eval` is the only command that matters. `vars`, `step`, `stack`, `source`: these are all setup. The eval is the actual experiment. It's where you test your hypothesis against reality. Everything else is scaffolding to get you to the right eval at the right moment.
Half of all bugs don't need a debugger. Read the traceback. Read the code. Check the types. Grep for the error message. Look at git blame. Most bugs surrender to careful reading. Reach for the debugger only when the bug depends on runtime state you can't determine statically.

The Rules

  1. Read first, debug second. Never start a debug session without reading the relevant code and forming a hypothesis. The debugger confirms theories; it doesn't generate them.
  2. One breakpoint, one question. Each breakpoint should answer a specific question. "Is `x` a string here?" "Is `balance` negative after this call?" "Does this branch execute?" If you can't articulate the question, you're not ready to debug.
  3. Eval, don't dump. `vars` dumps everything and answers nothing. `eval "type(data['age'])"` answers exactly one question. Prefer eval. Always.
  4. Never step through loops. A loop with 100 iterations is 100 step commands. A conditional breakpoint is 1 command. Use `--break "file:line:i == 50"` to jump straight to the iteration that matters.
  5. Two strikes, new theory. If your hypothesis was wrong twice, stop. Your mental model of the code is broken, not the debugger session. Close, re-read the code, form a completely different theory, then start a new session with different breakpoints. Continuing to probe the same area has sharply diminishing returns.
  6. Test the fix before writing it. The debugger gives you a live REPL in the exact context of the bug. Use `eval` to run your proposed fix expression before editing any code. If it works in eval against the live state, it should work in the code.
  7. Prove the fix, write the test. After fixing, re-run the program to verify. Then write the smallest possible test that catches the bug. A fix without a test is a fix that will regress.
  8. Close the session. Always. A stale session blocks the next one.
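Rule 7 can be sketched as a minimal regression test. This is only an illustration: the function name `total_age` and the record layout are hypothetical, chosen to resemble the string-age bug used in the examples in this document.

```python
def total_age(users):
    """Sum ages, tolerating numeric strings such as '35' (the fix under test)."""
    return sum(int(u["age"]) for u in users)


def test_total_age_accepts_string_ages():
    # Smallest input that reproduces the bug: a single record with a string age.
    users = [{"name": "Charlie", "age": "35"}]
    assert total_age(users) == 35


# Run the test directly; under pytest the function would be collected by name.
test_total_age_accepts_string_ages()
```

The test uses one record rather than a full dataset, so a failure points straight at the regression.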

Bootstrap

  • If `agent-debugger` is available globally, use it directly.
  • Otherwise, use `npx -y agent-debugger` (zero-install, no prompts).

Commands

bash
# If installed globally:
agent-debugger start <script> --break file:line[:condition] [--runtime path] [--args ...]

# If not installed:
npx -y agent-debugger start <script> --break file:line[:condition] [--runtime path] [--args ...]

agent-debugger eval <expression>          # Run any expression in the current frame
agent-debugger vars                       # List local variables (prefer eval)
agent-debugger step [into|out]            # Step over / into function / out of function
agent-debugger continue                   # Run to next breakpoint or termination
agent-debugger stack                      # Show call stack
agent-debugger break file:line[:cond]     # Add breakpoint mid-session
agent-debugger source                     # Show source around current line
agent-debugger status                     # Show session state and location
agent-debugger close                      # Kill session, clean up

Multiple `--break` flags are supported. Conditions are expressions: `--break "app.py:42:len(items) > 10"`.

Supported Languages

| Language | Extension | Adapter | Requirement |
|---|---|---|---|
| Python | .py | debugpy | `pip install debugpy` |
| JavaScript/TypeScript | .js/.ts | Node Inspector | Node.js |
| Go | .go | Delve | `go install github.com/go-delve/delve/cmd/dlv@latest` |
| Rust/C/C++ | .rs/.c/.cpp | CodeLLDB | `CODELLDB_PATH` env var |

The Playbook

These are not suggestions. These are the right way to handle each class of bug.

Type Bugs

A value has the wrong type somewhere in the pipeline. Don't step through — go straight to the suspect and ask.
bash
agent-debugger start app.py --break "app.py:25"
agent-debugger eval "type(data['age'])"                  # <class 'str'> — found it
agent-debugger eval "int(data['age'])"                   # 35 — fix is safe
agent-debugger close
Two commands after the breakpoint. Done.
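For readers who want to see what those two evals are checking, here is a rough stand-in for the paused frame in plain Python. The contents of `data` and the value of `total` are assumptions made for illustration, not output from a real session.

```python
# Hypothetical state at the breakpoint: one record whose age arrived as text.
data = {"name": "Charlie", "age": "35"}
total = 55  # assumed running total before this record

# First eval: ask the one question that matters.
print(type(data["age"]))   # <class 'str'>

# Second eval: check that the candidate fix works on the live value.
print(int(data["age"]))    # 35

# The unfixed expression fails, which is the bug being confirmed.
try:
    total + data["age"]
except TypeError as exc:
    print(f"TypeError: {exc}")
```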

Data Pipeline Bugs

Something in a batch is wrong. Don't look at individual records — assert the shape of the whole batch.
bash
agent-debugger start etl.py --break "etl.py:90"          # after the transformation
agent-debugger eval "all(isinstance(v, int) for v in result.values())"   # False
agent-debugger eval "[k for k,v in result.items() if not isinstance(v, int)]"  # ['quantity']
agent-debugger close
One breakpoint, two evals. The first asks "is anything wrong?", the second asks "what exactly?"
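The same two expressions can be tried outside the debugger. In this sketch the contents of `result` are invented (only the `quantity` key comes from the output above); the expressions themselves are the ones passed to `eval`.

```python
# Hypothetical batch after the transformation step; field names are made up.
result = {"order_id": 1042, "quantity": "3", "unit_price_cents": 1999}

# First eval: is anything wrong with the batch as a whole?
all_ints = all(isinstance(v, int) for v in result.values())
print(all_ints)    # False

# Second eval: which keys break the expected shape?
offenders = [k for k, v in result.items() if not isinstance(v, int)]
print(offenders)   # ['quantity']
```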

Loop Bugs (The Wolf Fence)

A loop processes N items and something goes wrong at an unknown iteration. Binary search it.
bash
agent-debugger start app.py --break "app.py:45:i == 500"    # midpoint
agent-debugger eval "is_valid(result)"                       # True → bug is after 500
agent-debugger close

agent-debugger start app.py --break "app.py:45:i == 750"    # narrow
agent-debugger eval "is_valid(result)"                       # False → bug is between 500-750
agent-debugger close

agent-debugger start app.py --break "app.py:45:i == 625"    # narrow again
~10 iterations to find the bug in 1000 items. Not 1000 step commands.
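The probe count follows from ordinary bisection. The sketch below models each debugger session as one call to a predicate. The helper `first_bad_iteration` and the failing iteration 637 are hypothetical, and the approach assumes the state stays invalid once it goes bad.

```python
def first_bad_iteration(is_valid_after, n):
    """Return (index, probes): the first invalid iteration in [0, n) and the probes used.

    `is_valid_after(i)` stands in for one session: break at iteration i,
    eval the validity check, close. Assumes iteration n - 1 is known to be
    bad and that validity never recovers after the first failure.
    """
    lo, hi = 0, n - 1
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if is_valid_after(mid):
            lo = mid + 1   # still valid here, so the bug is after mid
        else:
            hi = mid       # already invalid, so the bug is at or before mid
    return lo, probes


# Hypothetical run: the state goes bad at iteration 637 of 1000.
bad_at, probes = first_bad_iteration(lambda i: i < 637, 1000)
print(bad_at, probes)
```

If validity can flip back and forth between iterations, bisection no longer applies and an invariant breakpoint is the better tool.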

Invariant Violations

You know what should never happen. Tell the debugger to catch the exact moment it does.
bash
# "balance should never go negative"
agent-debugger start bank.py --break "bank.py:68:account.balance < 0"

# "every value should be numeric"
agent-debugger start pipeline.py --break "pipeline.py:30:not isinstance(value, (int, float))"

# "list should never exceed 100 items"
agent-debugger start app.py --break "app.py:55:len(results) > 100"
If it hits, you've caught the crime in progress. If it doesn't hit, your theory was wrong, so move on.
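A conditional breakpoint behaves like a predicate that is checked every time the line runs. As a rough illustration in plain Python, the sketch below finds the first step at which `account.balance < 0` becomes true. The `Account` class, the withdrawal amounts, and `first_violation` are all hypothetical and not part of agent-debugger.

```python
from dataclasses import dataclass


@dataclass
class Account:
    balance: int


def first_violation(account, withdrawals, invariant_broken):
    """Apply withdrawals in order; return the index where the invariant first breaks.

    Mirrors a conditional breakpoint: the predicate is checked after every
    step, and execution "pauses" only when it evaluates to True.
    """
    for i, amount in enumerate(withdrawals):
        account.balance -= amount
        if invariant_broken(account):
            return i
    return None  # never hit: the theory was wrong


account = Account(balance=100)
hit = first_violation(account, [30, 50, 40, 10], lambda a: a.balance < 0)
print(hit, account.balance)   # 2 -20
```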

Recursion / Deep Call Chains

The stack tells you how you arrived. The eval tells you why you're wrong.
bash
agent-debugger start tree.py --break "tree.py:22"
agent-debugger stack                    # see the recursion depth
agent-debugger eval "current_depth"     # 3
agent-debugger eval "max_depth"         # 3 — off-by-one, should be <, not <=
agent-debugger close
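The off-by-one in that session can be reproduced in a few lines. This is a sketch under assumptions: depth counting starts at 0, `max_depth` is meant to be the number of levels visited, and `depths_visited` is a hypothetical helper.

```python
def depths_visited(max_depth, inclusive):
    """Return the depths a recursive descent visits under each comparison."""
    visited = []

    def descend(current_depth):
        if inclusive:
            within_limit = current_depth <= max_depth   # buggy comparison
        else:
            within_limit = current_depth < max_depth    # intended comparison
        if not within_limit:
            return
        visited.append(current_depth)
        descend(current_depth + 1)

    descend(0)
    return visited


print(depths_visited(3, inclusive=True))    # [0, 1, 2, 3]
print(depths_visited(3, inclusive=False))   # [0, 1, 2]
```

With `<=`, the frame where `current_depth` equals `max_depth` still recurses, which matches the two equal values seen in the evals.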

"Where Does This Bad Data Come From?"

You found bad data downstream. Pivot upstream.
bash
agent-debugger start app.py --break "handler.py:55"
agent-debugger eval "data['age']"          # '35' — string, wrong. But handler didn't create this.
agent-debugger close                       # pivot to the source

agent-debugger start app.py --break "loader.py:22"
agent-debugger eval "raw_row"              # CSV parser returns strings. Root cause.
agent-debugger close
Don't fix the symptom at the handler. Fix the cause at the loader.
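A loader-side fix might look like the sketch below. The CSV content and the `load_rows` function are assumptions for illustration. The one fact relied on is that Python's `csv.DictReader` yields every field as a string.

```python
import csv
import io

# Hypothetical input file contents.
RAW = "name,age\nAlice,30\nCharlie,35\n"


def load_rows(text):
    """Parse CSV text and convert `age` to int at the source."""
    rows = []
    for raw_row in csv.DictReader(io.StringIO(text)):
        rows.append({"name": raw_row["name"], "age": int(raw_row["age"])})
    return rows


# What the loader sees before conversion: strings only.
raw_first = next(csv.DictReader(io.StringIO(RAW)))
print(type(raw_first["age"]))   # <class 'str'>

# What downstream code sees after the fix.
rows = load_rows(RAW)
print(type(rows[0]["age"]))     # <class 'int'>
```

Converting once at the loader means every downstream consumer gets integers, so no handler needs its own cast.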

"Which of These 3 Functions Is the Culprit?"

Set breakpoints at all suspects. The runtime tells you which one fires.
bash
agent-debugger start app.py \
  --break "auth.py:30" \
  --break "validate.py:55" \
  --break "handler.py:80"
# Hits validate.py:55, so now you know where to focus
agent-debugger eval "request.payload"
agent-debugger close

Testing a Fix In-Place

You think you know the fix. Prove it before editing.
bash
# Paused at the crash: total + data['age'] where age is a string
agent-debugger eval "total + int(data['age'])"    # 90: works
agent-debugger eval "int(data['age'])"            # 35: safe cast

# Prove it works for the entire dataset
agent-debugger eval "sum(int(d['age']) if isinstance(d['age'], str) else d['age'] for d in users)"
agent-debugger close

# NOW edit the code, with confidence
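Once the expression has passed in eval, the edit is a matter of moving it into the code. A minimal sketch, assuming a `users` list in which only the Charlie record is taken from this document; the other two records and the helper name `to_int_age` are invented so that the total comes to 90.

```python
# Hypothetical dataset; the first two records are assumed.
users = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25},
    {"name": "Charlie", "age": "35"},
]


def to_int_age(record):
    """Return the age as an int, converting only when it is a string."""
    age = record["age"]
    return int(age) if isinstance(age, str) else age


total = sum(to_int_age(d) for d in users)
print(total)   # 90
```

A non-numeric string such as `"unknown"` would still raise `ValueError` here, which is one reason to run the dataset-wide eval before trusting the cast.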

Falsifying Your Theory

Design evals that would break your hypothesis, not confirm it. Confirmation bias is the #1 debugging trap.
bash
# Theory: "age is a string only in the third record"

# BAD: only confirms
agent-debugger eval "isinstance(data['age'], str)"        # True. But so what?

# GOOD: tries to disprove
agent-debugger eval "isinstance(users[0]['age'], str)"    # False: not all records
agent-debugger eval "isinstance(users[1]['age'], str)"    # False: so it IS specific to record 3
agent-debugger eval "users[2]"                            # {'name': 'Charlie', 'age': '35'}: source data is wrong
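When the dataset is small enough, a single expression can test the theory against every record at once. In this sketch the first two records are assumed; only the third appears in the eval output above.

```python
users = [
    {"name": "Alice", "age": 30},       # assumed
    {"name": "Bob", "age": 25},         # assumed
    {"name": "Charlie", "age": "35"},   # from the eval output
]

# Theory: "age is a string only in the third record".
# One expression that could refute it: list every index with a string age.
string_age_indexes = [i for i, u in enumerate(users) if isinstance(u["age"], str)]
print(string_age_indexes)   # [2]

# The theory survives only if the result is exactly [2].
theory_holds = string_age_indexes == [2]
```

Any other result, such as `[]` or `[0, 2]`, would refute the theory in one step.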

Never Do This

Never step blindly. If you're running `step` more than 3 times in a row, you need a breakpoint, not more steps.
Never start without reading code. The debugger doesn't find bugs. You find bugs by reading code and forming theories. The debugger just confirms them.
Never dump vars when you have a question. `vars` is for the rare case when you genuinely don't know what variables exist. If you have a theory, `eval` tests it directly.
Never debug timing bugs with the debugger. Pausing execution changes timing. Race conditions disappear under observation. Use logging.
Never keep going after 2 failed hypotheses. Close. Re-read. Rethink. Your mental model is wrong, and more debugger commands won't fix your mental model.
Never leave a session open. `agent-debugger close`. Always. Every time.
Never fix without verifying. Run the program after the fix. If you can, toggle the fix to prove causation. Then write a test.
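For the timing-bug case, a logging setup along these lines records ordering without pausing anything. It is a sketch using Python's standard `logging` and `threading` modules. The shared counter, the worker function, and the logger name are hypothetical.

```python
import logging
import threading

# Timestamps with milliseconds plus the thread name make interleavings visible.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s.%(msecs)03d %(threadName)s %(message)s",
    datefmt="%H:%M:%S",
)
log = logging.getLogger("race")

counter = 0
lock = threading.Lock()


def worker(n):
    """Increment the shared counter n times, logging each critical section."""
    global counter
    for _ in range(n):
        with lock:
            before = counter
            counter = before + 1
            log.debug("counter %d -> %d", before, counter)


threads = [threading.Thread(target=worker, args=(3,), name=f"worker-{i}") for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)   # 6
```

Reading the log afterwards shows which thread touched the counter and in what order.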

Notes

  • Use absolute paths for breakpoints
  • One session at a time: `close` before starting another
  • Python requires `debugpy` (`pip install debugpy`)
  • Program stdout goes to the daemon: use `eval` to inspect output values