agent-architecture-audit


Agent Architecture Audit


A diagnostic workflow for agent systems that hide failures behind wrapper layers, stale memory, retry loops, or transport/rendering mutations.

When to Activate


MANDATORY for:
  • Releasing any agent or LLM-powered application to production
  • Shipping features with tool calling, memory, or multi-step workflows
  • Agent behavior that degrades after adding wrapper layers
  • User reports that "the agent is getting worse" or "tools are flaky"
  • A model that works in the playground but breaks inside your wrapper
  • Debugging agent behavior for more than 15 minutes without finding the root cause
Especially critical when:
  • You've added new prompt layers, tool definitions, or memory systems
  • Different agents in your system behave inconsistently
  • The model was fine yesterday but is hallucinating today
  • You suspect hidden repair/retry loops silently mutating responses
Do not use for:
  • General code debugging — use agent-introspection-debugging
  • Code review — use language-specific reviewer agents
  • Security scanning — use security-review or security-review/scan
  • Agent performance benchmarking — use agent-eval
  • Writing new features — use the appropriate workflow skill

The 12-Layer Stack


Every agent system has these layers. Any of them can corrupt the answer:
| # | Layer | What goes wrong |
|---|-------|-----------------|
| 1 | System prompt | Conflicting instructions, instruction bloat |
| 2 | Session history | Stale context injection from previous turns |
| 3 | Long-term memory | Pollution across sessions, old topics in new conversations |
| 4 | Distillation | Compressed artifacts re-entering as pseudo-facts |
| 5 | Active recall | Redundant re-summary layers wasting context |
| 6 | Tool selection | Wrong tool routing, model skips required tools |
| 7 | Tool execution | Hallucinated execution — claims to call but doesn't |
| 8 | Tool interpretation | Misread or ignored tool output |
| 9 | Answer shaping | Format corruption in final response |
| 10 | Platform rendering | Transport-layer mutation (UI, API, CLI mutates valid answers) |
| 11 | Hidden repair loops | Silent fallback/retry agents running a second LLM pass |
| 12 | Persistence | Expired state or cached artifacts reused as live evidence |
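
When tagging findings, it can help to keep the layer list in code so every finding references a layer consistently. A minimal sketch (the enum name and style are illustrative, not part of any existing codebase):

```python
from enum import Enum

class AgentLayer(Enum):
    """The 12 layers above, usable as source_layer tags on audit findings."""
    SYSTEM_PROMPT = 1
    SESSION_HISTORY = 2
    LONG_TERM_MEMORY = 3
    DISTILLATION = 4
    ACTIVE_RECALL = 5
    TOOL_SELECTION = 6
    TOOL_EXECUTION = 7
    TOOL_INTERPRETATION = 8
    ANSWER_SHAPING = 9
    PLATFORM_RENDERING = 10
    HIDDEN_REPAIR_LOOPS = 11
    PERSISTENCE = 12
```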

Common Failure Patterns


1. Wrapper Regression


The base model produces correct answers, but the wrapper layers make it worse.
Symptoms:
  • Model works fine in playground or direct API call, breaks in your agent
  • Added a new prompt layer, existing behavior degraded
  • Agent sounds confident but is confidently wrong
  • "It was working before the last update"
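
One quick falsification test for wrapper regression: run the same input through a direct model call and through the full wrapper, then diff the results. A sketch with stub callables standing in for the provider SDK and your wrapper (all names and answers are invented for illustration):

```python
def bisect_wrapper(prompt, call_model_direct, call_through_wrapper):
    """Run one prompt through the raw model and the full wrapper stack.

    If the direct answer is right and the wrapped answer differs, the
    regression lives in a wrapper layer, not the model."""
    direct = call_model_direct(prompt)
    wrapped = call_through_wrapper(prompt)
    return {
        "prompt": prompt,
        "direct": direct,
        "wrapped": wrapped,
        "diverged": direct.strip() != wrapped.strip(),
    }

# Stubs stand in for the provider SDK and the wrapper; here a stale
# memory layer in the wrapper flips the answer.
report = bisect_wrapper(
    "What is our refund window?",
    call_model_direct=lambda p: "30 days",
    call_through_wrapper=lambda p: "14 days",
)
```

A `diverged` result on a prompt the raw model answers correctly pins the fault inside your layers before you start reading prompt templates.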

2. Memory Contamination


Old topics leak into new conversations through history, memory retrieval, or distillation.
Symptoms:
  • Agent brings up unrelated past topics
  • User corrections don't stick (old memory overwrites new)
  • Same-session artifacts re-enter as pseudo-facts
  • Memory grows without bound, degrading response quality over time
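
An admission rule that lets user corrections win can be sketched like this (the topic matching is a deliberate placeholder; a real system would match on embeddings or structured keys):

```python
from dataclasses import dataclass

@dataclass
class MemoryEntry:
    text: str
    source: str  # "user_correction" or "agent_assertion"
    turn: int

def admit(store, candidate):
    """Admission rule: a user correction supersedes earlier agent assertions
    on the same topic; an agent assertion never overwrites a user correction."""
    def same_topic(a, b):
        # Placeholder: match on the first word; real systems would use
        # embeddings or structured keys.
        return a.text.split()[0] == b.text.split()[0]

    if candidate.source == "user_correction":
        kept = [e for e in store if not same_topic(e, candidate)]
        return kept + [candidate]
    if any(e.source == "user_correction" and same_topic(e, candidate) for e in store):
        return store  # reject: would shadow a user correction
    return store + [candidate]

store = []
store = admit(store, MemoryEntry("refund window is 14 days", "agent_assertion", 1))
store = admit(store, MemoryEntry("refund window is 30 days", "user_correction", 2))
store = admit(store, MemoryEntry("refund window is 14 days", "agent_assertion", 3))
```

After the third call the stale agent assertion is rejected, so the correction from turn 2 is the only surviving entry.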

3. Tool Discipline Failure


Tools are declared in the prompt but not enforced in code. The model skips them or hallucinates execution.
Symptoms:
  • "Must use tool X" in prompt, but model answers without calling it
  • Tool results look correct but were never actually executed
  • Different tools fight over the same responsibility
  • Model uses tool when it shouldn't, or skips it when it must
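
The fix is to gate the answer in code, not prose. A minimal sketch, assuming the agent loop can see which tool calls the model actually made (the tool name and response shape are hypothetical):

```python
REQUIRED_TOOLS = {"lookup_order"}  # hypothetical tool the answer must be grounded in

def gate_response(response):
    """Reject a final answer produced without the required tool calls.

    `response` is assumed to record which tools the model actually invoked,
    e.g. {"answer": "...", "tool_calls": ["lookup_order"]}."""
    missing = REQUIRED_TOOLS - set(response.get("tool_calls", []))
    if missing:
        # Fail loudly (or re-prompt) instead of trusting confident prose.
        raise RuntimeError(f"answer produced without required tools: {sorted(missing)}")
    return response

gated = gate_response({"answer": "Order 42 shipped.", "tool_calls": ["lookup_order"]})
```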

4. Rendering/Transport Corruption


The agent's internal answer is correct, but the platform layer mutates it during delivery.
Symptoms:
  • Logs show correct answer, user sees broken output
  • Markdown rendering, JSON parsing, or streaming fragments corrupt valid responses
  • Hidden fallback agent quietly replaces the answer before delivery
  • Output differs between terminal and UI
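
A cheap integration-test guard: fingerprint the internal answer before it enters rendering or transport, then compare at the point of delivery. A sketch:

```python
import hashlib

def fingerprint(text):
    """Stable fingerprint of the agent's internal answer, taken before any
    rendering or transport layer touches it."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def unmutated(internal, delivered):
    """True when the delivered payload is byte-identical to the internal
    answer; run per transport (terminal, UI, API) in integration tests."""
    return fingerprint(internal) == fingerprint(delivered)

ok = unmutated("**done**", "**done**")
bad = unmutated("**done**", "<b>done</b>")  # renderer rewrote the markdown
```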

5. Hidden Agent Layers


Silent repair, retry, summarization, or recall agents run without explicit contracts.
Symptoms:
  • Output changes between internal generation and user delivery
  • "Auto-fix" loops run a second LLM pass the user doesn't know about
  • Multiple agents modify the same output without coordination
  • Answers get "smoothed" or "corrected" by invisible layers
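
Making repair explicit can be as simple as forcing every extra LLM pass through one function that leaves an audit trail on the response. A sketch (field names are assumptions):

```python
from dataclasses import dataclass, field

@dataclass
class Delivery:
    answer: str
    repair_passes: list = field(default_factory=list)  # explicit audit trail

def repair(delivery, reason, run_llm):
    """Every second LLM pass must go through here, so the extra pass is
    recorded instead of silently replacing the answer."""
    fixed = run_llm(delivery.answer)
    delivery.repair_passes.append(
        {"reason": reason, "before": delivery.answer, "after": fixed}
    )
    delivery.answer = fixed
    return delivery

# A lambda stands in for the second LLM call that fixes malformed JSON.
d = repair(Delivery(answer="{'a':1}"), "invalid JSON", run_llm=lambda s: '{"a": 1}')
```

With this contract in place, question 4 of the quick diagnostics ("does the platform run a second LLM pass?") is answerable from logs rather than guesswork.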

Audit Workflow


Phase 1: Scope


Define what you're auditing:
  • Target system — what agent application?
  • Entrypoints — how do users interact with it?
  • Model stack — which LLM(s) and providers?
  • Symptoms — what does the user report?
  • Time window — when did it start?
  • Layers to audit — which of the 12 layers apply?

Phase 2: Evidence Collection


Gather evidence from the codebase:
  • Source code — agent loop, tool router, memory admission, prompt assembly
  • Logs — historical session traces, tool call records
  • Config — prompt templates, tool schemas, provider settings
  • Memory files — SOPs, knowledge bases, session archives
Use `rg` to search for anti-patterns:

```bash
# Tool requirements expressed only in prompt text (not code)
rg "must.*tool|必须.*工具|required.*call" --type md

# Tool execution without validation
rg "tool_call|toolCall|tool_use" --type py --type ts

# Hidden LLM calls outside main agent loop
rg "completion|chat.create|messages.create|llm.invoke"

# Memory admission without user-correction priority
rg "memory.*admit|long.*term.*update|persist.*memory" --type py --type ts

# Fallback loops that run additional LLM calls
rg "fallback|retry.*llm|repair.*prompt|re-?prompt" --type py --type ts

# Silent output mutation
rg "mutate|rewrite.*response|transform.*output|shap" --type py --type ts
```

Phase 3: Failure Mapping


For each finding, document:
  • Symptom — what the user sees
  • Mechanism — how the wrapper causes it
  • Source layer — which of the 12 layers
  • Root cause — the deepest cause
  • Evidence — file:line or log:row reference
  • Confidence — 0.0 to 1.0
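
A hypothetical finding documented in this shape (every value below, including the file reference, is invented for illustration):

```json
{
  "severity": "critical",
  "title": "Required tool not code-gated",
  "mechanism": "Prompt says the model must call the lookup tool, but the agent loop accepts any answer",
  "source_layer": "tool_execution",
  "root_cause": "Tool requirement expressed only in prompt text",
  "evidence_refs": ["agent/loop.py:142"],
  "confidence": 0.9,
  "recommended_fix": "Validate tool_calls in code before accepting the final answer"
}
```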

Phase 4: Fix Strategy


Default fix order (code-first, not prompt-first):
  1. Code-gate tool requirements — enforce in code, not just prompt text
  2. Remove or narrow hidden repair agents — make fallback explicit with contracts
  3. Reduce context duplication — same info through prompt + history + memory + distillation
  4. Tighten memory admission — user corrections > agent assertions
  5. Tighten distillation triggers — don't compress what shouldn't be compressed
  6. Reduce rendering mutation — pass-through, don't transform
  7. Convert to typed JSON envelopes — structured internal flow, not freeform prose
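
Fix 7 can be sketched as a small typed envelope that each internal hop encodes and decodes, so malformed messages fail loudly instead of being silently repaired (field names are assumptions):

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Envelope:
    """Typed internal message; layers exchange this, never freeform prose."""
    kind: str          # e.g. "tool_result", "final_answer"
    payload: str
    source_layer: str  # which of the 12 layers produced it

def encode(env):
    return json.dumps(asdict(env))

def decode(raw):
    # Malformed input fails loudly here instead of being "smoothed" downstream.
    return Envelope(**json.loads(raw))

wire = encode(Envelope(kind="final_answer", payload="30 days",
                       source_layer="answer_shaping"))
back = decode(wire)
```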

Severity Model


| Level | Meaning | Action |
|-------|---------|--------|
| critical | Agent can confidently produce wrong operational behavior | Fix before next release |
| high | Agent frequently degrades correctness or stability | Fix this sprint |
| medium | Correctness usually survives but output is fragile or wasteful | Plan for next cycle |
| low | Mostly cosmetic or maintainability issues | Backlog |

Output Format


Present findings to the user in this order:
  1. Severity-ranked findings (most critical first)
  2. Architecture diagnosis (which layer corrupted what, and why)
  3. Ordered fix plan (code-first, not prompt-first)
Do not lead with compliments or summaries. If the system is broken, say so directly.

Quick Diagnostic Questions


When auditing an agent system, answer these:
| # | Question | If yes → |
|---|----------|----------|
| 1 | Can the model skip a required tool and still answer? | Tool not code-gated |
| 2 | Does old conversation content appear in new turns? | Memory contamination |
| 3 | Is the same info in system prompt AND memory AND history? | Context duplication |
| 4 | Does the platform run a second LLM pass before delivery? | Hidden repair loop |
| 5 | Does the output differ between internal generation and user delivery? | Rendering corruption |
| 6 | Are "must use tool X" rules only in prompt text? | Tool discipline failure |
| 7 | Can the agent's own monologue become persistent memory? | Memory poisoning |

Anti-Patterns to Avoid


  • Avoid blaming the model before falsifying wrapper-layer regressions.
  • Avoid blaming memory without showing the contamination path.
  • Do not let a clean current state erase a dirty historical incident.
  • Do not treat markdown prose as a trustworthy internal protocol.
  • Do not accept "must use tool" in prompt text when code never enforces it.
  • Keep findings direct, evidence-backed, and severity-ranked.

Report Schema


Audits should produce structured reports following this shape:
```json
{
  "schema_version": "ecc.agent-architecture-audit.report.v1",
  "executive_verdict": {
    "overall_health": "high_risk",
    "primary_failure_mode": "string",
    "most_urgent_fix": "string"
  },
  "scope": {
    "target_name": "string",
    "model_stack": ["string"],
    "layers_to_audit": ["string"]
  },
  "findings": [
    {
      "severity": "critical|high|medium|low",
      "title": "string",
      "mechanism": "string",
      "source_layer": "string",
      "root_cause": "string",
      "evidence_refs": ["file:line"],
      "confidence": 0.0,
      "recommended_fix": "string"
    }
  ],
  "ordered_fix_plan": [
    { "order": 1, "goal": "string", "why_now": "string", "expected_effect": "string" }
  ]
}
```

Related Skills


  • agent-introspection-debugging — Debug agent runtime failures (loops, timeouts, state errors)
  • agent-eval — Benchmark agent performance head-to-head
  • security-review — Security audit for code and configuration
  • autonomous-agent-harness — Set up autonomous agent operations
  • agent-harness-construction — Build agent harnesses from scratch