context-engineering
Context Engineering
Context engineering curates the smallest high-signal token set for LLM tasks. The goal: maximize reasoning quality while minimizing token usage.
When to Activate
- Designing/debugging agent systems
- Working around context limits that constrain performance
- Optimizing cost/latency
- Building multi-agent coordination
- Implementing memory systems
- Evaluating agent performance
- Developing LLM-powered pipelines
Core Principles
- Context quality > quantity - High-signal tokens beat exhaustive content
- Attention is finite - U-shaped curve favors beginning/end positions
- Progressive disclosure - Load information just-in-time
- Isolation prevents degradation - Partition work across sub-agents
- Measure before optimizing - Know your baseline
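One way to act on the U-shaped attention principle is to order context items so the most important ones land at the edges of the prompt. The sketch below is illustrative only: `order_for_attention` and the numeric priority scores are assumptions, not part of the original material.

```python
def order_for_attention(items):
    """Arrange (text, priority) pairs so high-priority text sits at the edges.

    Items are ranked by priority, then dealt alternately to the front and
    the back, which leaves the lowest-priority items in the middle.
    """
    ranked = sorted(items, key=lambda pair: pair[1], reverse=True)
    front, back = [], []
    for index, (text, _priority) in enumerate(ranked):
        if index % 2 == 0:
            front.append(text)   # 1st, 3rd, 5th ... most important
        else:
            back.append(text)    # 2nd, 4th ... most important
    # Reverse the back half so the 2nd most important item comes last.
    return front + back[::-1]
```

With five items of priority 5 down to 1, the highest-priority item comes first, the second-highest comes last, and the lowest sits in the middle.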
Quick Reference
| Topic | When to Use | Reference |
|---|---|---|
| Fundamentals | Understanding context anatomy, attention mechanics | context-fundamentals.md |
| Degradation | Debugging failures, lost-in-middle, poisoning | context-degradation.md |
| Optimization | Compaction, masking, caching, partitioning | context-optimization.md |
| Compression | Long sessions, summarization strategies | context-compression.md |
| Memory | Cross-session persistence, knowledge graphs | memory-systems.md |
| Multi-Agent | Coordination patterns, context isolation | multi-agent-patterns.md |
| Evaluation | Testing agents, LLM-as-Judge, metrics | evaluation.md |
| Tool Design | Tool consolidation, description engineering | tool-design.md |
| Pipelines | Project development, batch processing | project-development.md |
Key Metrics
- Token utilization: Warning at 70%, trigger optimization at 80%
- Token usage: Explains about 80% of agent performance variance
- Multi-agent cost: ~15x the single-agent baseline
- Compaction target: 50-70% reduction, <5% quality loss
- Cache hit target: 70%+ for stable workloads
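The utilization and compaction thresholds above can be expressed as two small checks. This is a minimal sketch: the function names are hypothetical, and treating "50-70% reduction" as "at least 50%" is an interpretation, not something the list specifies.

```python
WARN_AT = 0.70      # warning threshold from the metrics above
OPTIMIZE_AT = 0.80  # optimization trigger from the metrics above


def utilization_status(tokens_used, context_limit):
    """Classify context-window utilization as 'ok', 'warn', or 'optimize'."""
    if context_limit <= 0:
        raise ValueError("context_limit must be positive")
    ratio = tokens_used / context_limit
    if ratio >= OPTIMIZE_AT:
        return "optimize"
    if ratio >= WARN_AT:
        return "warn"
    return "ok"


def compaction_meets_target(tokens_before, tokens_after, quality_loss):
    """Check a compaction run against the stated target.

    quality_loss is a fraction (0.03 means 3%); how it is measured is
    left to the evaluation method and is not defined here.
    """
    if tokens_before <= 0:
        raise ValueError("tokens_before must be positive")
    reduction = 1 - tokens_after / tokens_before
    return reduction >= 0.50 and quality_loss < 0.05
```

For example, `utilization_status(75_000, 100_000)` falls in the warning band, while a compaction from 1,000 to 400 tokens with 3% quality loss meets the target.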
Four-Bucket Strategy
- Write: Save context externally (scratchpads, files)
- Select: Pull only relevant context (retrieval, filtering)
- Compress: Reduce tokens while preserving info (summarization)
- Isolate: Split across sub-agents (partitioning)
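The first two buckets, Write and Select, can be sketched with a tiny in-memory scratchpad. Everything here is hypothetical: the `Scratchpad` class is invented for illustration, and word overlap is only a stand-in for whatever retrieval method a real system would use.

```python
class Scratchpad:
    """Hypothetical external store: notes live outside the prompt."""

    def __init__(self):
        self._notes = []

    def write(self, note):
        # Write: persist context externally instead of keeping it in the prompt.
        self._notes.append(note)

    def select(self, query, limit=2):
        # Select: return only notes that share words with the query,
        # best match first. Notes with no overlap are never returned.
        query_words = set(query.lower().split())
        scored = []
        for note in self._notes:
            overlap = len(query_words & set(note.lower().split()))
            if overlap > 0:
                scored.append((overlap, note))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [note for _overlap, note in scored[:limit]]
```

Only the selected notes would be placed back into the prompt, so unrelated notes cost no tokens on that request.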
Anti-Patterns
- Exhaustive context over curated context
- Critical info in middle positions
- No compaction trigger set below the context limit
- Single agent for parallelizable tasks
- Tools without clear descriptions
Guidelines
- Place critical info at beginning/end of context
- Implement compaction at 70-80% utilization
- Use sub-agents for context isolation, not role-play
- Design tools with the 4-question framework (what, when, inputs, returns)
- Optimize for tokens-per-task, not tokens-per-request
- Validate with probe-based evaluation
- Monitor KV-cache hit rates in production
- Start minimal, add complexity only when proven necessary
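The 4-question framework for tool design can be captured as a small record that refuses to leave a question unanswered. This is a sketch under assumptions: `ToolSpec`, its field names, and the rendered description format are illustrative and do not come from tool-design.md.

```python
from dataclasses import dataclass


@dataclass
class ToolSpec:
    """Hypothetical record holding one answer per framework question."""

    name: str
    what: str     # what the tool does
    when: str     # when the agent should call it
    inputs: str   # what arguments it expects
    returns: str  # what it gives back

    def missing_answers(self):
        # List the framework questions that were left blank.
        answers = {
            "what": self.what,
            "when": self.when,
            "inputs": self.inputs,
            "returns": self.returns,
        }
        return [question for question, text in answers.items() if not text.strip()]

    def description(self):
        # Render a single description string for the model to read.
        missing = self.missing_answers()
        if missing:
            raise ValueError(f"unanswered questions: {', '.join(missing)}")
        return (
            f"{self.name}: {self.what} "
            f"Use when: {self.when} "
            f"Inputs: {self.inputs} "
            f"Returns: {self.returns}"
        )
```

Calling `description()` on a spec with a blank field raises an error, which makes an unclear tool description fail early rather than reach the agent.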
Scripts
- context_analyzer.py - Context health analysis, degradation detection
- compression_evaluator.py - Compression quality evaluation