context-engineering


Context Engineering

Context engineering curates the smallest high-signal token set for LLM tasks. The goal: maximize reasoning quality while minimizing token usage.

When to Activate

  • Designing/debugging agent systems
  • Context limits constrain performance
  • Optimizing cost/latency
  • Building multi-agent coordination
  • Implementing memory systems
  • Evaluating agent performance
  • Developing LLM-powered pipelines

Core Principles

  1. Context quality > quantity - High-signal tokens beat exhaustive content
  2. Attention is finite - U-shaped curve favors beginning/end positions
  3. Progressive disclosure - Load information just-in-time
  4. Isolation prevents degradation - Partition work across sub-agents
  5. Measure before optimizing - Know your baseline
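
Principle 2 can be sketched in code. The snippet below is a minimal illustration, not part of the referenced scripts: it assumes each context chunk carries a hypothetical integer `priority`, and it arranges chunks so the most important ones sit at the start and end of the context while the least important ones fall in the middle.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    priority: int  # hypothetical score: higher means more important


def order_for_attention(chunks):
    """Put high-priority chunks at the edges and low-priority ones in the middle."""
    ranked = sorted(chunks, key=lambda c: c.priority, reverse=True)
    front, back = [], []
    # Alternate between the front and the back so the top two chunks
    # end up first and last, the next two just inside them, and so on.
    for i, chunk in enumerate(ranked):
        (front if i % 2 == 0 else back).append(chunk)
    return front + back[::-1]


if __name__ == "__main__":
    chunks = [Chunk("style notes", 1), Chunk("task goal", 5), Chunk("api docs", 3)]
    for chunk in order_for_attention(chunks):
        print(chunk.priority, chunk.text)
```

How priorities are assigned is left open here; in practice they might come from retrieval scores or manual tagging.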

Quick Reference

| Topic | When to Use | Reference |
| --- | --- | --- |
| Fundamentals | Understanding context anatomy, attention mechanics | context-fundamentals.md |
| Degradation | Debugging failures, lost-in-middle, poisoning | context-degradation.md |
| Optimization | Compaction, masking, caching, partitioning | context-optimization.md |
| Compression | Long sessions, summarization strategies | context-compression.md |
| Memory | Cross-session persistence, knowledge graphs | memory-systems.md |
| Multi-Agent | Coordination patterns, context isolation | multi-agent-patterns.md |
| Evaluation | Testing agents, LLM-as-Judge, metrics | evaluation.md |
| Tool Design | Tool consolidation, description engineering | tool-design.md |
| Pipelines | Project development, batch processing | project-development.md |

Key Metrics

  • Token utilization: Warning at 70%, trigger optimization at 80%
  • Token usage: Explains about 80% of the variance in agent performance
  • Multi-agent cost: ~15x single agent baseline
  • Compaction target: 50-70% reduction, <5% quality loss
  • Cache hit target: 70%+ for stable workloads
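
The utilization and compaction thresholds above can be expressed as simple checks. This is a minimal sketch under stated assumptions: the function names are hypothetical, token counts are assumed to come from whatever tokenizer the system already uses, and "quality" is assumed to be a single numeric score where higher is better.

```python
WARN_AT = 0.70      # warning threshold from the metrics above
OPTIMIZE_AT = 0.80  # optimization trigger from the metrics above


def utilization_status(tokens_used, context_limit):
    """Classify context-window utilization as 'ok', 'warn', or 'optimize'."""
    if context_limit <= 0:
        raise ValueError("context_limit must be positive")
    ratio = tokens_used / context_limit
    if ratio >= OPTIMIZE_AT:
        return "optimize"
    if ratio >= WARN_AT:
        return "warn"
    return "ok"


def compaction_meets_target(tokens_before, tokens_after,
                            quality_before, quality_after):
    """Check for at least a 50% token reduction with under 5% relative quality loss."""
    reduction = 1 - tokens_after / tokens_before
    quality_loss = (quality_before - quality_after) / quality_before
    return reduction >= 0.50 and quality_loss < 0.05


if __name__ == "__main__":
    print(utilization_status(85_000, 100_000))
    print(compaction_meets_target(100_000, 40_000, 0.90, 0.88))
```

The sketch only enforces the lower bound of the 50-70% reduction range; whether reductions beyond 70% should be flagged is a policy choice the metrics do not settle.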

Four-Bucket Strategy

  1. Write: Save context externally (scratchpads, files)
  2. Select: Pull only relevant context (retrieval, filtering)
  3. Compress: Reduce tokens while preserving info (summarization)
  4. Isolate: Split across sub-agents (partitioning)
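
The four buckets can be illustrated with a toy in-memory version. Everything below is an assumption made for illustration: the scratchpad is a plain dict, selection uses keyword overlap instead of real retrieval, compression is naive truncation standing in for summarization, and isolation simply gives each sub-task its own context list.

```python
def write(scratchpad, key, note):
    """Write: store a note outside the live context."""
    scratchpad[key] = note


def select(scratchpad, query, limit=2):
    """Select: return the notes sharing the most words with the query."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(note.lower().split())), note)
              for note in scratchpad.values()]
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [note for _, note in relevant[:limit]]


def compress(text, max_words):
    """Compress: keep the first max_words words (stand-in for summarization)."""
    words = text.split()
    if len(words) <= max_words:
        return text
    return " ".join(words[:max_words]) + " ..."


def isolate(tasks, brief):
    """Isolate: give each sub-task a separate context seeded with a short brief."""
    return [{"task": task, "context": [brief]} for task in tasks]


if __name__ == "__main__":
    pad = {}
    write(pad, "auth", "login endpoint returns 401 on expired token")
    write(pad, "ui", "button color changed to blue")
    picked = select(pad, "expired token error")
    brief = compress(" ".join(picked), max_words=5)
    print(isolate(["reproduce bug", "draft fix"], brief))
```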

Anti-Patterns

  • Exhaustive context over curated context
  • Critical info in middle positions
  • No compaction triggers before limits
  • Single agent for parallelizable tasks
  • Tools without clear descriptions

Guidelines

  1. Place critical info at beginning/end of context
  2. Implement compaction at 70-80% utilization
  3. Use sub-agents for context isolation, not role-play
  4. Design tools with 4-question framework (what, when, inputs, returns)
  5. Optimize for tokens-per-task, not tokens-per-request
  6. Validate with probe-based evaluation
  7. Monitor KV-cache hit rates in production
  8. Start minimal, add complexity only when proven necessary
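
Guideline 4 names a 4-question framework (what, when, inputs, returns) without showing its shape. One possible encoding is sketched below; the `ToolDescription` class, its field names, and the `search_docs` tool are all hypothetical and only illustrate how the four answers might be captured and checked before a tool is exposed to an agent.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolDescription:
    name: str
    what: str      # what the tool does
    when: str      # when the agent should call it
    returns: str   # what comes back
    inputs: dict = field(default_factory=dict)  # argument name -> meaning

    def missing_answers(self):
        """List which of the four questions are still unanswered."""
        answers = {"what": self.what, "when": self.when,
                   "inputs": self.inputs, "returns": self.returns}
        return [question for question, answer in answers.items() if not answer]

    def render(self):
        """Format the description as plain text for the model."""
        lines = [f"{self.name}: {self.what}", f"Use when: {self.when}", "Inputs:"]
        lines += [f"  {arg}: {meaning}" for arg, meaning in self.inputs.items()]
        lines.append(f"Returns: {self.returns}")
        return "\n".join(lines)


if __name__ == "__main__":
    tool = ToolDescription(
        name="search_docs",
        what="Searches project documentation by keyword.",
        when="The answer depends on project-specific details.",
        returns="Up to five matching passages with file paths.",
        inputs={"query": "keywords to search for"},
    )
    print(tool.missing_answers())
    print(tool.render())
```

A check like `missing_answers` could run in tests or at registration time so that tools without clear descriptions are caught early.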

Scripts

  • context_analyzer.py - Context health analysis, degradation detection
  • compression_evaluator.py - Compression quality evaluation