Agent Architecture Audit
A diagnostic workflow for agent systems that hide failures behind wrapper layers, stale memory, retry loops, or transport/rendering mutations.
When to Activate
MANDATORY for:
- Releasing any agent or LLM-powered application to production
- Shipping features with tool calling, memory, or multi-step workflows
- Agent behavior degrades after adding wrapper layers
- User reports "the agent is getting worse" or "tools are flaky"
- Same model works in playground but breaks inside your wrapper
- Debugging agent behavior for more than 15 minutes without finding root cause
Especially critical when:
- You've added new prompt layers, tool definitions, or memory systems
- Different agents in your system behave inconsistently
- The model was fine yesterday but is hallucinating today
- You suspect hidden repair/retry loops silently mutating responses
Do not use for:
- General code debugging — use agent-introspection-debugging
- Code review — use language-specific reviewer agents
- Security scanning — use security-review or security-review/scan
- Agent performance benchmarking — use agent-eval
- Writing new features — use the appropriate workflow skill
The 12-Layer Stack
Every agent system has these layers. Any of them can corrupt the answer:
| # | Layer | What Goes Wrong |
|---|---|---|
| 1 | System prompt | Conflicting instructions, instruction bloat |
| 2 | Session history | Stale context injection from previous turns |
| 3 | Long-term memory | Pollution across sessions, old topics in new conversations |
| 4 | Distillation | Compressed artifacts re-entering as pseudo-facts |
| 5 | Active recall | Redundant re-summary layers wasting context |
| 6 | Tool selection | Wrong tool routing, model skips required tools |
| 7 | Tool execution | Hallucinated execution — claims to call but doesn't |
| 8 | Tool interpretation | Misread or ignored tool output |
| 9 | Answer shaping | Format corruption in final response |
| 10 | Platform rendering | Transport-layer mutation (UI, API, CLI mutates valid answers) |
| 11 | Hidden repair loops | Silent fallback/retry agents running second LLM pass |
| 12 | Persistence | Expired state or cached artifacts reused as live evidence |
Common Failure Patterns
1. Wrapper Regression
The base model produces correct answers, but the wrapper layers make it worse.
Symptoms:
- Model works fine in playground or direct API call, breaks in your agent
- Added a new prompt layer, existing behavior degraded
- Agent sounds confident but is confidently wrong
- "It was working before the last update"
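To falsify a wrapper regression before blaming the model, a minimal A/B probe can run the same prompts through the bare model and through the wrapped agent and report divergences. This is a sketch: `call_model` and `call_agent` are injected placeholders for your direct API call and your agent entrypoint, not a specific SDK.

```python
# A/B probe for wrapper regression: identical prompts go through the bare
# model and the wrapped agent; divergences point at the wrapper layers.
# `call_model` and `call_agent` are injected placeholders, not a real SDK.

def wrapper_regression_probe(prompts, call_model, call_agent):
    """Return (prompt, bare_answer, wrapped_answer) triples that diverge."""
    diffs = []
    for prompt in prompts:
        bare, wrapped = call_model(prompt), call_agent(prompt)
        if bare != wrapped:
            diffs.append((prompt, bare, wrapped))
    return diffs
```

If the bare model is correct on the diverging prompts, the regression lives in the wrapper layers, not the model.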
2. Memory Contamination
Old topics leak into new conversations through history, memory retrieval, or distillation.
Symptoms:
- Agent brings up unrelated past topics
- User corrections don't stick (old memory overwrites new)
- Same-session artifacts re-enter as pseudo-facts
- Memory grows without bound, degrading response quality over time
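One way to make corrections stick is an admission gate where user corrections outrank agent assertions. The sketch below assumes a plain key-value store and an illustrative `source` tag; it is not tied to any particular memory framework.

```python
from dataclasses import dataclass

@dataclass
class MemoryItem:
    key: str
    value: str
    source: str  # "user_correction" or "agent_assertion" (illustrative tags)

def admit(store: dict, item: MemoryItem) -> bool:
    """Admit a memory item unless it would let an agent assertion
    overwrite an earlier user correction."""
    existing = store.get(item.key)
    if existing and existing.source == "user_correction" and item.source != "user_correction":
        return False  # the old user correction wins over the new assertion
    store[item.key] = item
    return True
```

With this gate, "user corrections don't stick" becomes structurally impossible at the admission boundary rather than a prompt-level hope.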
3. Tool Discipline Failure
Tools are declared in the prompt but not enforced in code. The model skips them or hallucinates execution.
Symptoms:
- "Must use tool X" in prompt, but model answers without calling it
- Tool results look correct but were never actually executed
- Different tools fight over the same responsibility
- Model uses tool when it shouldn't, or skips it when it must
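The code-level counterpart of a "must use tool X" rule is a gate that rejects any answer produced without the required call. The response shape here (a dict with a `tool_calls` list of `{"name": ...}` entries) is an assumption mirroring common chat-completion APIs, and `search_docs` is a hypothetical tool name.

```python
REQUIRED_TOOLS = {"search_docs"}  # hypothetical required tool name

def enforce_required_tools(response: dict) -> None:
    """Reject an answer that skipped a required tool, instead of
    trusting the prompt text to have enforced it."""
    called = {call["name"] for call in response.get("tool_calls", [])}
    missing = REQUIRED_TOOLS - called
    if missing:
        # Fail loudly rather than let the model answer from memory.
        raise RuntimeError(f"answer rejected; required tools not called: {sorted(missing)}")
```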
4. Rendering/Transport Corruption
The agent's internal answer is correct, but the platform layer mutates it during delivery.
Symptoms:
- Logs show correct answer, user sees broken output
- Markdown rendering, JSON parsing, or streaming fragments corrupt valid responses
- Hidden fallback agent quietly replaces the answer before delivery
- Output differs between terminal and UI
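A cheap way to localize this failure is to fingerprint the answer at generation time and re-check it at the delivery boundary. The choice of sha256 and the function names below are illustrative.

```python
import hashlib

def fingerprint(text: str) -> str:
    """Stable fingerprint of an answer, taken right after generation."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def delivery_intact(internal_answer: str, delivered_answer: str) -> bool:
    # A mismatch means some layer (renderer, formatter, fallback agent)
    # mutated the answer between generation and delivery.
    return fingerprint(internal_answer) == fingerprint(delivered_answer)
```

Logging both fingerprints turns "logs show correct answer, user sees broken output" into a binary check at each hop.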
5. Hidden Agent Layers
Silent repair, retry, summarization, or recall agents run without explicit contracts.
Symptoms:
- Output changes between internal generation and user delivery
- "Auto-fix" loops run a second LLM pass the user doesn't know about
- Multiple agents modify the same output without coordination
- Answers get "smoothed" or "corrected" by invisible layers
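Hidden passes become visible once every completion call is routed through a single counting wrapper. In this sketch, `call_llm` is a placeholder for your real provider call, injected so the wrapper stays framework-agnostic.

```python
class LLMPassCounter:
    """Route every completion call through one object so hidden repair,
    retry, or summarization passes show up in the per-turn count."""

    def __init__(self, call_llm):
        self._call_llm = call_llm  # placeholder for the real provider call
        self.passes = []

    def __call__(self, prompt: str) -> str:
        output = self._call_llm(prompt)
        self.passes.append({"prompt": prompt, "output": output})
        return output
```

More than one pass per user turn, with no explicit contract naming the extra pass, is direct evidence of a hidden repair loop.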
Audit Workflow
Phase 1: Scope
Define what you're auditing:
- Target system — what agent application?
- Entrypoints — how do users interact with it?
- Model stack — which LLM(s) and providers?
- Symptoms — what does the user report?
- Time window — when did it start?
- Layers to audit — which of the 12 layers apply?
Phase 2: Evidence Collection
Gather evidence from the codebase:
- Source code — agent loop, tool router, memory admission, prompt assembly
- Logs — historical session traces, tool call records
- Config — prompt templates, tool schemas, provider settings
- Memory files — SOPs, knowledge bases, session archives
Use rg to search for anti-patterns:
Tool requirements expressed only in prompt text (not code):

```bash
rg "must.*tool|必须.*工具|required.*call" --type md
```

Tool execution without validation:

```bash
rg "tool_call|toolCall|tool_use" --type py --type ts
```

Hidden LLM calls outside main agent loop:

```bash
rg "completion|chat.create|messages.create|llm.invoke"
```

Memory admission without user-correction priority:

```bash
rg "memory.*admit|long.*term.*update|persist.*memory" --type py --type ts
```

Fallback loops that run additional LLM calls:

```bash
rg "fallback|retry.*llm|repair.*prompt|re-?prompt" --type py --type ts
```

Silent output mutation:

```bash
rg "mutate|rewrite.*response|transform.*output|shap" --type py --type ts
```
Phase 3: Failure Mapping
For each finding, document:
- Symptom — what the user sees
- Mechanism — how the wrapper causes it
- Source layer — which of the 12 layers
- Root cause — the deepest cause
- Evidence — file:line or log:row reference
- Confidence — 0.0 to 1.0
Phase 4: Fix Strategy
Default fix order (code-first, not prompt-first):
- Code-gate tool requirements — enforce in code, not just prompt text
- Remove or narrow hidden repair agents — make fallback explicit with contracts
- Reduce context duplication — same info through prompt + history + memory + distillation
- Tighten memory admission — user corrections > agent assertions
- Tighten distillation triggers — don't compress what shouldn't be compressed
- Reduce rendering mutation — pass-through, don't transform
- Convert to typed JSON envelopes — structured internal flow, not freeform prose
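The last fix (typed JSON envelopes) can be as small as a dataclass serialized at every agent boundary, so malformed output fails loudly at the boundary instead of being quietly repaired downstream. Field names here are illustrative, not a prescribed schema.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class AnswerEnvelope:
    content: str
    tools_called: list = field(default_factory=list)
    confidence: float = 0.0

def pack(env: AnswerEnvelope) -> str:
    """Serialize the envelope for transport between agents/layers."""
    return json.dumps(asdict(env))

def unpack(raw: str) -> AnswerEnvelope:
    # A malformed envelope raises here, at the boundary, instead of being
    # quietly "smoothed" by a hidden repair pass.
    return AnswerEnvelope(**json.loads(raw))
```

Freeform prose between layers invites silent reinterpretation; a typed envelope makes every mutation an explicit, inspectable step.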
Severity Model
| Level | Meaning | Action |
|---|---|---|
| Critical | Agent can confidently produce wrong operational behavior | Fix before next release |
| High | Agent frequently degrades correctness or stability | Fix this sprint |
| Medium | Correctness usually survives but output is fragile or wasteful | Plan for next cycle |
| Low | Mostly cosmetic or maintainability issues | Backlog |
Output Format
Present findings to the user in this order:
- Severity-ranked findings (most critical first)
- Architecture diagnosis (which layer corrupted what, and why)
- Ordered fix plan (code-first, not prompt-first)
Do not lead with compliments or summaries. If the system is broken, say so directly.
Quick Diagnostic Questions
When auditing an agent system, answer these:
| # | Question | If Yes → |
|---|---|---|
| 1 | Can the model skip a required tool and still answer? | Tool not code-gated |
| 2 | Does old conversation content appear in new turns? | Memory contamination |
| 3 | Is the same info in system prompt AND memory AND history? | Context duplication |
| 4 | Does the platform run a second LLM pass before delivery? | Hidden repair loop |
| 5 | Does the output differ between internal generation and user delivery? | Rendering corruption |
| 6 | Are "must use tool X" rules only in prompt text? | Tool discipline failure |
| 7 | Can the agent's own monologue become persistent memory? | Memory poisoning |
Anti-Patterns to Avoid
- Avoid blaming the model before falsifying wrapper-layer regressions.
- Avoid blaming memory without showing the contamination path.
- Do not let a clean current state erase a dirty historical incident.
- Do not treat markdown prose as a trustworthy internal protocol.
- Do not accept "must use tool" in prompt text when code never enforces it.
- Keep findings direct, evidence-backed, and severity-ranked.
Report Schema
Audits should produce structured reports following this shape:
```json
{
  "schema_version": "ecc.agent-architecture-audit.report.v1",
  "executive_verdict": {
    "overall_health": "high_risk",
    "primary_failure_mode": "string",
    "most_urgent_fix": "string"
  },
  "scope": {
    "target_name": "string",
    "model_stack": ["string"],
    "layers_to_audit": ["string"]
  },
  "findings": [
    {
      "severity": "critical|high|medium|low",
      "title": "string",
      "mechanism": "string",
      "source_layer": "string",
      "root_cause": "string",
      "evidence_refs": ["file:line"],
      "confidence": 0.0,
      "recommended_fix": "string"
    }
  ],
  "ordered_fix_plan": [
    { "order": 1, "goal": "string", "why_now": "string", "expected_effect": "string" }
  ]
}
```
Related Skills
- agent-introspection-debugging — Debug agent runtime failures (loops, timeouts, state errors)
- agent-eval — Benchmark agent performance head-to-head
- security-review — Security audit for code and configuration
- autonomous-agent-harness — Set up autonomous agent operations
- agent-harness-construction — Build agent harnesses from scratch