Agent Architecture Audit
A diagnostic workflow for agent systems that hide failures behind wrapper layers, stale memory, retry loops, or transport/rendering mutations.
When to Activate
MANDATORY for:
- Releasing any agent or LLM-powered application to production
- Shipping features with tool calling, memory, or multi-step workflows
- Agent behavior degrades after adding wrapper layers
- User reports "the agent is getting worse" or "tools are flaky"
- Same model works in playground but breaks inside your wrapper
- Debugging agent behavior for more than 15 minutes without finding root cause
Especially critical when:
- You've added new prompt layers, tool definitions, or memory systems
- Different agents in your system behave inconsistently
- The model was fine yesterday but is hallucinating today
- You suspect hidden repair/retry loops silently mutating responses
Do not use for:
- General code debugging — use agent-introspection-debugging
- Code review — use language-specific reviewer agents
- Security scanning — use security-review or security-review/scan
- Agent performance benchmarking — use agent-eval
- Writing new features — use the appropriate workflow skill
The 12-Layer Stack
Every agent system has these layers. Any of them can corrupt the answer:
| # | Layer | What Goes Wrong |
|---|---|---|
| 1 | System prompt | Conflicting instructions, instruction bloat |
| 2 | Session history | Stale context injection from previous turns |
| 3 | Long-term memory | Pollution across sessions, old topics in new conversations |
| 4 | Distillation | Compressed artifacts re-entering as pseudo-facts |
| 5 | Active recall | Redundant re-summary layers wasting context |
| 6 | Tool selection | Wrong tool routing, model skips required tools |
| 7 | Tool execution | Hallucinated execution — claims to call but doesn't |
| 8 | Tool interpretation | Misread or ignored tool output |
| 9 | Answer shaping | Format corruption in final response |
| 10 | Platform rendering | Transport-layer mutation (UI, API, CLI mutates valid answers) |
| 11 | Hidden repair loops | Silent fallback/retry agents running second LLM pass |
| 12 | Persistence | Expired state or cached artifacts reused as live evidence |
Common Failure Patterns
1. Wrapper Regression
The base model produces correct answers, but the wrapper layers make it worse.
Symptoms:
- Model works fine in playground or direct API call, breaks in your agent
- Added a new prompt layer, existing behavior degraded
- Agent sounds confident but is confidently wrong
- "It was working before the last update"
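To falsify a wrapper regression before blaming the model, a minimal A/B probe can run the same prompts through the bare model and through the wrapped agent and report divergences. This is a sketch: `call_model` and `call_agent` are injected placeholders for your direct API call and your agent entrypoint, not a specific SDK.

```python
# A/B probe for wrapper regression: identical prompts go through the bare
# model and the wrapped agent; divergences point at the wrapper layers.
# `call_model` and `call_agent` are injected placeholders, not a real SDK.

def wrapper_regression_probe(prompts, call_model, call_agent):
    """Return (prompt, bare_answer, wrapped_answer) triples that diverge."""
    diffs = []
    for prompt in prompts:
        bare, wrapped = call_model(prompt), call_agent(prompt)
        if bare != wrapped:
            diffs.append((prompt, bare, wrapped))
    return diffs
```

If the bare model is correct on the diverging prompts, the regression lives in the wrapper layers, not the model.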
2. Memory Contamination
Old topics leak into new conversations through history, memory retrieval, or distillation.
Symptoms:
- Agent brings up unrelated past topics
- User corrections don't stick (old memory overwrites new)
- Same-session artifacts re-enter as pseudo-facts
- Memory grows without bound, degrading response quality over time
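One way to make corrections stick is an admission gate where user corrections outrank agent assertions. The sketch below assumes a plain key-value store and an illustrative `source` tag; it is not tied to any particular memory framework.

```python
from dataclasses import dataclass

@dataclass
class MemoryItem:
    key: str
    value: str
    source: str  # "user_correction" or "agent_assertion" (illustrative tags)

def admit(store: dict, item: MemoryItem) -> bool:
    """Admit a memory item unless it would let an agent assertion
    overwrite an earlier user correction."""
    existing = store.get(item.key)
    if existing and existing.source == "user_correction" and item.source != "user_correction":
        return False  # the old user correction wins over the new assertion
    store[item.key] = item
    return True
```

With this gate, "user corrections don't stick" becomes structurally impossible at the admission boundary rather than a prompt-level hope.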
3. Tool Discipline Failure
Tools are declared in the prompt but not enforced in code. The model skips them or hallucinates execution.
Symptoms:
- "Must use tool X" in prompt, but model answers without calling it
- Tool results look correct but were never actually executed
- Different tools fight over the same responsibility
- Model uses tool when it shouldn't, or skips it when it must
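The code-level counterpart of a "must use tool X" rule is a gate that rejects any answer produced without the required call. The response shape here (a dict with a `tool_calls` list of `{"name": ...}` entries) is an assumption mirroring common chat-completion APIs, and `search_docs` is a hypothetical tool name.

```python
REQUIRED_TOOLS = {"search_docs"}  # hypothetical required tool name

def enforce_required_tools(response: dict) -> None:
    """Reject an answer that skipped a required tool, instead of
    trusting the prompt text to have enforced it."""
    called = {call["name"] for call in response.get("tool_calls", [])}
    missing = REQUIRED_TOOLS - called
    if missing:
        # Fail loudly rather than let the model answer from memory.
        raise RuntimeError(f"answer rejected; required tools not called: {sorted(missing)}")
```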
4. Rendering/Transport Corruption
The agent's internal answer is correct, but the platform layer mutates it during delivery.
Symptoms:
- Logs show correct answer, user sees broken output
- Markdown rendering, JSON parsing, or streaming fragments corrupt valid responses
- Hidden fallback agent quietly replaces the answer before delivery
- Output differs between terminal and UI
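A cheap way to localize this failure is to fingerprint the answer at generation time and re-check it at the delivery boundary. The choice of sha256 and the function names below are illustrative.

```python
import hashlib

def fingerprint(text: str) -> str:
    """Stable fingerprint of an answer, taken right after generation."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def delivery_intact(internal_answer: str, delivered_answer: str) -> bool:
    # A mismatch means some layer (renderer, formatter, fallback agent)
    # mutated the answer between generation and delivery.
    return fingerprint(internal_answer) == fingerprint(delivered_answer)
```

Logging both fingerprints turns "logs show correct answer, user sees broken output" into a binary check at each hop.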
5. Hidden Agent Layers
Silent repair, retry, summarization, or recall agents run without explicit contracts.
Symptoms:
- Output changes between internal generation and user delivery
- "Auto-fix" loops run a second LLM pass the user doesn't know about
- Multiple agents modify the same output without coordination
- Answers get "smoothed" or "corrected" by invisible layers
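Hidden passes become visible once every completion call is routed through a single counting wrapper. In this sketch, `call_llm` is a placeholder for your real provider call, injected so the wrapper stays framework-agnostic.

```python
class LLMPassCounter:
    """Route every completion call through one object so hidden repair,
    retry, or summarization passes show up in the per-turn count."""

    def __init__(self, call_llm):
        self._call_llm = call_llm  # placeholder for the real provider call
        self.passes = []

    def __call__(self, prompt: str) -> str:
        output = self._call_llm(prompt)
        self.passes.append({"prompt": prompt, "output": output})
        return output
```

More than one pass per user turn, with no explicit contract naming the extra pass, is direct evidence of a hidden repair loop.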
Audit Workflow
Phase 1: Scope
Define what you're auditing:
- Target system — what agent application?
- Entrypoints — how do users interact with it?
- Model stack — which LLM(s) and providers?
- Symptoms — what does the user report?
- Time window — when did it start?
- Layers to audit — which of the 12 layers apply?
Phase 2: Evidence Collection
Gather evidence from the codebase:
- Source code — agent loop, tool router, memory admission, prompt assembly
- Logs — historical session traces, tool call records
- Config — prompt templates, tool schemas, provider settings
- Memory files — SOPs, knowledge bases, session archives
Use rg to search for anti-patterns:
Tool requirements expressed only in prompt text (not code):

```bash
rg "must.*tool|必须.*工具|required.*call" --type md
```

Tool execution without validation:

```bash
rg "tool_call|toolCall|tool_use" --type py --type ts
```

Hidden LLM calls outside main agent loop:

```bash
rg "completion|chat.create|messages.create|llm.invoke"
```

Memory admission without user-correction priority:

```bash
rg "memory.*admit|long.*term.*update|persist.*memory" --type py --type ts
```

Fallback loops that run additional LLM calls:

```bash
rg "fallback|retry.*llm|repair.*prompt|re-?prompt" --type py --type ts
```

Silent output mutation:

```bash
rg "mutate|rewrite.*response|transform.*output|shap" --type py --type ts
```
Phase 3: Failure Mapping
For each finding, document:
- Symptom — what the user sees
- Mechanism — how the wrapper causes it
- Source layer — which of the 12 layers
- Root cause — the deepest cause
- Evidence — file:line or log:row reference
- Confidence — 0.0 to 1.0
Phase 4: Fix Strategy
Default fix order (code-first, not prompt-first):
- Code-gate tool requirements — enforce in code, not just prompt text
- Remove or narrow hidden repair agents — make fallback explicit with contracts
- Reduce context duplication — same info through prompt + history + memory + distillation
- Tighten memory admission — user corrections > agent assertions
- Tighten distillation triggers — don't compress what shouldn't be compressed
- Reduce rendering mutation — pass-through, don't transform
- Convert to typed JSON envelopes — structured internal flow, not freeform prose
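The last fix (typed JSON envelopes) can be as small as a dataclass serialized at every agent boundary, so malformed output fails loudly at the boundary instead of being quietly repaired downstream. Field names here are illustrative, not a prescribed schema.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class AnswerEnvelope:
    content: str
    tools_called: list = field(default_factory=list)
    confidence: float = 0.0

def pack(env: AnswerEnvelope) -> str:
    """Serialize the envelope for transport between agents/layers."""
    return json.dumps(asdict(env))

def unpack(raw: str) -> AnswerEnvelope:
    # A malformed envelope raises here, at the boundary, instead of being
    # quietly "smoothed" by a hidden repair pass.
    return AnswerEnvelope(**json.loads(raw))
```

Freeform prose between layers invites silent reinterpretation; a typed envelope makes every mutation an explicit, inspectable step.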
Severity Model
| Level | Meaning | Action |
|---|---|---|
| Critical | Agent can confidently produce wrong operational behavior | Fix before next release |
| High | Agent frequently degrades correctness or stability | Fix this sprint |
| Medium | Correctness usually survives but output is fragile or wasteful | Plan for next cycle |
| Low | Mostly cosmetic or maintainability issues | Backlog |
Output Format
Present findings to the user in this order:
- Severity-ranked findings (most critical first)
- Architecture diagnosis (which layer corrupted what, and why)
- Ordered fix plan (code-first, not prompt-first)
Do not lead with compliments or summaries. If the system is broken, say so directly.
Quick Diagnostic Questions
When auditing an agent system, answer these:
| # | Question | If Yes → |
|---|---|---|
| 1 | Can the model skip a required tool and still answer? | Tool not code-gated |
| 2 | Does old conversation content appear in new turns? | Memory contamination |
| 3 | Is the same info in system prompt AND memory AND history? | Context duplication |
| 4 | Does the platform run a second LLM pass before delivery? | Hidden repair loop |
| 5 | Does the output differ between internal generation and user delivery? | Rendering corruption |
| 6 | Are "must use tool X" rules only in prompt text? | Tool discipline failure |
| 7 | Can the agent's own monologue become persistent memory? | Memory poisoning |
Anti-Patterns to Avoid
- Avoid blaming the model before falsifying wrapper-layer regressions.
- Avoid blaming memory without showing the contamination path.
- Do not let a clean current state erase a dirty historical incident.
- Do not treat markdown prose as a trustworthy internal protocol.
- Do not accept "must use tool" in prompt text when code never enforces it.
- Keep findings direct, evidence-backed, and severity-ranked.
Report Schema
Audits should produce structured reports following this shape:
```json
{
  "schema_version": "ecc.agent-architecture-audit.report.v1",
  "executive_verdict": {
    "overall_health": "high_risk",
    "primary_failure_mode": "string",
    "most_urgent_fix": "string"
  },
  "scope": {
    "target_name": "string",
    "model_stack": ["string"],
    "layers_to_audit": ["string"]
  },
  "findings": [
    {
      "severity": "critical|high|medium|low",
      "title": "string",
      "mechanism": "string",
      "source_layer": "string",
      "root_cause": "string",
      "evidence_refs": ["file:line"],
      "confidence": 0.0,
      "recommended_fix": "string"
    }
  ],
  "ordered_fix_plan": [
    { "order": 1, "goal": "string", "why_now": "string", "expected_effect": "string" }
  ]
}
```
Related Skills
- agent-introspection-debugging — Debug agent runtime failures (loops, timeouts, state errors)
- agent-eval — Benchmark agent performance head-to-head
- security-review — Security audit for code and configuration
- autonomous-agent-harness — Set up autonomous agent operations
- agent-harness-construction — Build agent harnesses from scratch