
Context Degradation Patterns

Language models exhibit predictable degradation as context grows. Understanding these patterns is essential for diagnosing failures and designing resilient systems.

Degradation Patterns


| Pattern | Cause | Symptoms |
|---|---|---|
| Lost-in-Middle | Attention mechanics | 10-40% lower recall for middle content |
| Context Poisoning | Errors compound | Tool misalignment, persistent hallucinations |
| Context Distraction | Irrelevant info | Uses wrong information for decisions |
| Context Confusion | Mixed tasks | Responses address wrong aspects |
| Context Clash | Conflicting info | Contradictory guidance derails reasoning |

Lost-in-Middle


Information at beginning and end receives reliable attention. Middle content suffers dramatically reduced recall.
Mitigation:

```markdown
[CURRENT TASK]                      # At start (high attention)
- Goal: Generate quarterly report
- Deadline: End of week

[DETAILED CONTEXT]                  # Middle (less attention)
- 50 pages of data
- Supporting evidence

[KEY FINDINGS]                      # At end (high attention)
- Revenue up 15%
- Growth in Region A
```
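The ordering above can be produced programmatically. The sketch below is illustrative: the `assemble_prompt` helper and its section names are assumptions, not from any specific library, but it follows the same start/middle/end placement.

```python
# Minimal sketch: assemble a prompt so critical content sits at the
# start and end, where attention is most reliable. The helper and
# section names are illustrative, not from any specific library.

def assemble_prompt(task: str, details: list[str], findings: list[str]) -> str:
    """Place the task first, bulky details in the middle, key findings last."""
    sections = [
        "[CURRENT TASK]\n" + task,                    # start: high attention
        "[DETAILED CONTEXT]\n" + "\n".join(details),  # middle: less attention
        "[KEY FINDINGS]\n" + "\n".join(findings),     # end: high attention
    ]
    return "\n\n".join(sections)

prompt = assemble_prompt(
    task="Generate quarterly report by end of week",
    details=["50 pages of data", "Supporting evidence"],
    findings=["Revenue up 15%", "Growth in Region A"],
)
```

The key design choice is that bulky reference material is the only thing allowed in the low-attention middle.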

Context Poisoning


Once errors enter context, they compound through repeated reference.
Entry pathways:
  1. Tool outputs with errors
  2. Retrieved docs with incorrect info
  3. Model-generated summaries with hallucinations
Symptoms:
  • Tool calls with wrong parameters
  • Strategies that take effort to undo
  • Hallucinations that persist despite correction
Recovery:
  • Truncate to before poisoning point
  • Explicitly note poisoning and re-evaluate
  • Restart with clean context
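The first recovery option, truncating to before the poisoning point, can be sketched as follows. The message format and the example turns are hypothetical; a real agent framework would supply its own history structure.

```python
# Minimal sketch of "truncate to before the poisoning point": keep
# history up to the first poisoned message, always preserving the
# system message. The dict-based message shape is illustrative.

def truncate_before_poisoning(messages: list[dict], poisoned_index: int) -> list[dict]:
    """Return history up to (not including) the first poisoned message,
    keeping at minimum the system message at index 0."""
    if poisoned_index <= 0:
        return messages[:1]  # only the system message survives
    return messages[:poisoned_index]

history = [
    {"role": "system", "content": "You are a report-writing agent."},
    {"role": "assistant", "content": "Calling fetch_sales(region='A')"},
    {"role": "tool", "content": "ERROR: revenue field corrupted"},  # poisoning enters here
    {"role": "assistant", "content": "Revenue is $-3M, retrying..."},
]
clean = truncate_before_poisoning(history, poisoned_index=2)
# clean ends at the last trusted assistant turn
```

Truncation is the bluntest of the three recovery methods; it discards good work after the poisoning point, which is why explicit re-evaluation is sometimes preferred.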

Context Distraction


Even a single irrelevant document reduces performance. Models must attend to everything—they cannot "skip" irrelevant content.
Mitigation:
  • Filter for relevance before loading
  • Use namespacing for organization
  • Access via tools instead of context
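The first mitigation, filtering for relevance before loading, can be sketched with a trivial scorer. The keyword-overlap metric below is a stand-in assumption; production systems would typically use embedding similarity instead.

```python
# Minimal sketch of relevance filtering before loading documents into
# context. Keyword overlap stands in for a real embedding-based ranker.

def relevance_score(query: str, doc: str) -> float:
    """Fraction of query terms that appear in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / max(len(q_terms), 1)

def filter_relevant(query: str, docs: list[str], threshold: float = 0.3) -> list[str]:
    """Keep only documents whose overlap with the query clears the threshold."""
    return [d for d in docs if relevance_score(query, d) >= threshold]

docs = [
    "quarterly revenue report for region a",
    "office holiday party planning notes",
]
kept = filter_relevant("quarterly revenue region a", docs)
# kept contains only the revenue report
```

Because a single distractor causes a step-function drop, the filter errs toward exclusion: borderline documents are better accessed via tools on demand than loaded preemptively.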

Degradation Thresholds


| Model | Degradation Onset | Severe Degradation |
|---|---|---|
| GPT-5.2 | ~64K tokens | ~200K tokens |
| Claude Opus 4.5 | ~100K tokens | ~180K tokens |
| Claude Sonnet 4.5 | ~80K tokens | ~150K tokens |
| Gemini 3 Pro | ~500K tokens | ~800K tokens |
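These thresholds translate directly into a compaction trigger. In the sketch below the onset numbers come from the table; the 90% safety margin is an illustrative choice, not a recommendation from any vendor.

```python
# Minimal sketch of a compaction trigger keyed to per-model degradation
# onset. Threshold values are from the table above; the margin is an
# assumed safety factor.

DEGRADATION_ONSET = {
    "GPT-5.2": 64_000,
    "Claude Opus 4.5": 100_000,
    "Claude Sonnet 4.5": 80_000,
    "Gemini 3 Pro": 500_000,
}

def should_compact(model: str, token_count: int, margin: float = 0.9) -> bool:
    """Trigger compaction before the model reaches its degradation onset."""
    onset = DEGRADATION_ONSET[model]
    return token_count >= onset * margin
```

Triggering at a margin below the onset matters because degradation is gradual: by the time symptoms are visible, the context has usually already been degrading for some time.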

The Four-Bucket Approach


| Strategy | Purpose |
|---|---|
| Write | Save context outside window |
| Select | Pull relevant context in |
| Compress | Reduce tokens, preserve info |
| Isolate | Split across sub-agents |
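The Compress bucket is the one most often automated. A minimal sketch, assuming a hypothetical `summarize` hook where a model call would go:

```python
# Minimal sketch of the Compress bucket: replace older conversation
# turns with a short summary while keeping recent turns verbatim.
# summarize() is a placeholder for an actual model call.

def summarize(messages: list[str]) -> str:
    # Placeholder: a real system would ask a model for a summary here.
    return f"[Summary of {len(messages)} earlier messages]"

def compress_history(messages: list[str], keep_recent: int = 4) -> list[str]:
    """Compress all but the most recent turns into a single summary line."""
    if len(messages) <= keep_recent:
        return messages
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [summarize(older)] + recent

msgs = [f"turn {i}" for i in range(10)]
compressed = compress_history(msgs)
```

Keeping the most recent turns verbatim is deliberate: they sit in the high-attention end of the window and usually carry the active task state.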

Counterintuitive Findings


  1. Shuffled haystacks outperform coherent ones - coherent context creates false associations
  2. A single distractor has outsized impact - degradation is a step function, not proportional
  3. Needle-question similarity matters - dissimilar content degrades recall faster

When Larger Contexts Hurt


  • Performance degrades non-linearly once the threshold is crossed
  • Cost grows rapidly with context length (attention compute scales quadratically)
  • The cognitive bottleneck remains regardless of window size

Best Practices


  1. Monitor context length and performance correlation
  2. Place critical information at beginning or end
  3. Implement compaction triggers before degradation
  4. Validate retrieved documents for accuracy
  5. Use versioning to prevent outdated info clash
  6. Segment tasks to prevent confusion
  7. Design for graceful degradation
  8. Test with progressively larger contexts
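Practice 5, using versioning to prevent clash from outdated information, can be sketched as a deduplication pass over retrieved documents. The document dict shape here is an illustrative assumption.

```python
# Minimal sketch of practice 5: version retrieved documents so stale
# copies are dropped before they can clash with current ones. The
# document dict format is illustrative.

def latest_versions(docs: list[dict]) -> list[dict]:
    """Keep only the newest version of each document id."""
    newest: dict[str, dict] = {}
    for doc in docs:
        current = newest.get(doc["id"])
        if current is None or doc["version"] > current["version"]:
            newest[doc["id"]] = doc
    return list(newest.values())

docs = [
    {"id": "pricing", "version": 1, "text": "Plan costs $10/mo"},
    {"id": "pricing", "version": 2, "text": "Plan costs $12/mo"},
]
deduped = latest_versions(docs)
```

Without this pass, both pricing statements would enter the window and produce exactly the Context Clash pattern described above.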