Context Management Skill
Overview
This skill teaches you how to proactively manage your conversation context in AMCP to avoid LLM API errors caused by exceeding context window limits. Context management is critical when:
- Working on long coding sessions with many files
- Processing large tool outputs (e.g., `grep` results, file reads)
- Running multi-step debugging sessions
- Reviewing or refactoring large codebases
Understanding Context Windows
Different LLM models have different context window sizes:
| Model Family | Context Window |
|---|---|
| GPT-4 Turbo / GPT-4o | 128,000 tokens |
| GPT-4.1 | 1,000,000 tokens |
| Claude 3.5 Sonnet | 200,000 tokens |
| DeepSeek V3 | 64,000 tokens |
| Gemini 2.0 Flash | 1,000,000 tokens |
| Qwen 2.5 | 128,000 tokens |
AMCP automatically detects most model context windows. For unknown models, it uses 32,000 tokens as the default.
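The detection step can be pictured as a prefix lookup with a fixed fallback. This is an illustrative sketch under that assumption — the table and function names below are hypothetical, not AMCP's actual internals:

```python
# Hypothetical sketch of model -> context-window detection with a fallback.
# The dictionary and function names are illustrative, not AMCP internals.
CONTEXT_WINDOWS = {
    "gpt-4-turbo": 128_000,
    "gpt-4o": 128_000,
    "gpt-4.1": 1_000_000,
    "claude-3-5-sonnet": 200_000,
    "deepseek-chat": 64_000,
    "gemini-2.0-flash": 1_000_000,
    "qwen-2.5": 128_000,
}

DEFAULT_CONTEXT_WINDOW = 32_000  # used when the model is unknown

def detect_context_window(model: str) -> int:
    """Return the context window for a known model, else the default."""
    for prefix, window in CONTEXT_WINDOWS.items():
        if model.startswith(prefix):
            return window
    return DEFAULT_CONTEXT_WINDOW
```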
Key Metrics
When managing context, track:
- Current tokens: Estimated size of current conversation history
- Threshold tokens: When compaction should trigger (default: 70% of context)
- Target tokens: What to aim for after compaction (default: 30% of context)
- Safety margin: Reserved for response generation (default: 10%)
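Concretely, the default ratios work out as follows on a 128,000-token window (plain arithmetic from the metrics above):

```python
# Worked example: the default ratios applied to a 128,000-token window.
context_window = 128_000
threshold_ratio = 0.7
target_ratio = 0.3
safety_margin = 0.1

threshold_tokens = round(context_window * threshold_ratio)    # compaction triggers here
target_tokens = round(context_window * target_ratio)          # post-compaction goal
usable_tokens = round(context_window * (1 - safety_margin))   # headroom left for responses

print(threshold_tokens, target_tokens, usable_tokens)  # 89600 38400 115200
```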
How AMCP Compacts Context

AMCP's `SmartCompactor` automatically compresses context when it exceeds the threshold. In agent.py (line 500-505):

```python
compactor = SmartCompactor(client, model)
if compactor.should_compact(history_to_add):
    history_to_add, _ = compactor.compact(history_to_add)
```

**This happens automatically during conversation!** You don't need to trigger it manually.

Compaction Strategies
AMCP supports four strategies (configurable via `CompactionConfig`):

- SUMMARY (default): Uses the LLM to create an intelligent summary of old messages
  - Best for: Long sessions where earlier context is important
  - Preserves: Errors, working solutions, current task state, file paths
- TRUNCATE: Simple removal of old messages, keeping the first and last few
  - Best for: Fast compaction when context is less important
  - Fastest option
- SLIDING_WINDOW: Keeps only the most recent messages that fit in the target
  - Best for: Sessions where only recent context matters
  - Very efficient
- HYBRID: Combines a sliding window with a summary of removed content
  - Best for: Balance between summary quality and speed
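To make the SLIDING_WINDOW idea concrete, here is a simplified sketch. It is illustrative only, not AMCP's implementation, and `rough_tokens` is a stand-in estimator:

```python
# Simplified sketch of a sliding-window compactor: keep the most recent
# messages whose combined size fits within the target budget.
# Illustrative only -- not AMCP's actual implementation.

def rough_tokens(message: dict) -> int:
    """Stand-in estimator: roughly 4 characters per token."""
    return max(1, len(message.get("content", "")) // 4)

def sliding_window(messages: list[dict], target_tokens: int) -> list[dict]:
    kept: list[dict] = []
    total = 0
    # Walk backwards from the newest message; stop once the budget fills.
    for msg in reversed(messages):
        cost = rough_tokens(msg)
        if total + cost > target_tokens:
            break
        kept.append(msg)
        total += cost
    kept.reverse()  # restore chronological order
    return kept
```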
Best Practices for Context Management

1. Read Files Selectively

Instead of reading entire large files:

```python
# BAD: Reads entire file
read_file(path="src/large_module.py")

# GOOD: Read specific sections
read_file(path="src/large_module.py", mode="indentation", offset=100, limit=50)

# GOOD: Read specific line ranges
read_file(path="src/large_module.py", mode="slice", ranges=["1-50", "200-250"])
```

**Use `read_file` in indentation mode** - it intelligently captures code blocks around your target, providing context without excessive content.

2. Use Grep with Context Limits
```python
# BAD: Returns all matches with full context
grep(pattern="function", paths=["src/"])

# GOOD: Limited context
grep(pattern="function", paths=["src/"], context=2)
```

3. Iterate Over Small Batches
When processing multiple files, instead of processing all files at once:

```python
for file in files:
    # Process one file at a time
    result = process_file(file)
    # Save intermediate results
```

4. Clear Conversation When Starting New Tasks
After completing a complex task, consider suggesting:

"The conversation history is getting long. If you are starting a new, unrelated task, consider clearing the history or starting a new session to reduce context."
5. Be Strategic with Tool Calls

- Avoid redundant tool calls (e.g., reading the same file multiple times)
- Use `grep` to find what you need before reading files
- Process results incrementally rather than all at once
Monitoring Context Usage

You can check context usage programmatically:

```python
from amcp import SmartCompactor

# Create compactor
compactor = SmartCompactor(client, model="gpt-4-turbo")

# Get detailed usage info
usage = compactor.get_token_usage(messages)
print(f"Current: {usage['current_tokens']:,} tokens")
print(f"Usage: {usage['usage_ratio']:.1%} of context")
print(f"Headroom: {usage['headroom_tokens']:,} tokens")
print(f"Should compact: {usage['should_compact']}")
```
Configuration

Context compaction is configured via `CompactionConfig`:

```python
from amcp import SmartCompactor, CompactionConfig, CompactionStrategy

config = CompactionConfig(
    strategy=CompactionStrategy.SUMMARY,
    threshold_ratio=0.7,          # Compact at 70% usage
    target_ratio=0.3,             # Aim for 30% after compaction
    preserve_last=6,              # Keep last 6 user/assistant messages
    preserve_tool_results=True,   # Preserve recent tool results
    max_tool_results=10,          # Max tool results to preserve
    min_tokens_to_compact=5000,   # Don't compact tiny contexts
    safety_margin=0.1,            # 10% margin for responses
)
```

You can configure compaction in `~/.config/amcp/config.toml`:

```toml
[chat]
model = "deepseek-chat"

[compaction]
strategy = "summary"  # summary, truncate, sliding_window, hybrid
threshold_ratio = 0.7
target_ratio = 0.3
```

Automatic Summary Structure
When using the SUMMARY or HYBRID strategies, the summary follows this structure:

```xml
<current_task>
What we're working on now - be specific about files and goals
</current_task>

<completed>
- Task 1: Brief outcome + key changes made
- Task 2: Brief outcome + key changes made
</completed>

<code_state>
Key files and their current state - signatures + key logic only
Include file paths that were modified
</code_state>

<important>
Any crucial context: errors, decisions made, constraints, blockers
</important>
```

Token Estimation
AMCP provides accurate token estimation:

```python
from amcp import estimate_tokens

tokens = estimate_tokens(messages)
```

- Uses the `tiktoken` library when available (recommended)
- Falls back to character-based estimation (4 chars ≈ 1 token)
- Accounts for message role overhead
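The fallback path can be sketched as follows. The 4-chars-per-token ratio comes from the text above; the per-message overhead constant is an illustrative assumption, not AMCP's actual value:

```python
# Sketch of the character-based fallback estimation described above.
# The 4-chars-per-token ratio is from the text; the 4-token per-message
# role overhead is an assumed illustrative constant.

def estimate_tokens_fallback(messages: list[dict], overhead_per_message: int = 4) -> int:
    total = 0
    for msg in messages:
        total += len(msg.get("content", "")) // 4  # ~4 chars per token
        total += overhead_per_message              # role/formatting overhead
    return total
```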
Common Issues and Solutions

Issue 1: "Context length exceeded" API Error

Symptom: The LLM API returns an error about the context window being exceeded.

Causes:
- Large tool outputs (e.g., `grep` or `find` results)
- Reading many large files
- Long multi-step task conversations

Solutions:
- Let AMCP's auto-compaction handle it (it will compact automatically)
- Reduce tool output size (use `grep --limit`, `read_file` with ranges)
- Process files in batches, not all at once
- Suggest clearing conversation history if starting a new task
Issue 2: Context Losing Important Information

Symptom: After compaction, the agent forgets critical details.

Causes:
- SUMMARY strategy missed important context
- Old critical messages were removed

Solutions:
- Use `preserve_last` to keep more recent messages
- Consider the HYBRID strategy (keeps sliding window + summary)
- Manually restate critical context in your prompt
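Combining the first two suggestions, a configuration along these lines retains more recent context. It uses the `CompactionConfig` fields shown in the Configuration section; the exact values here are illustrative:

```python
from amcp import CompactionConfig, CompactionStrategy

# Illustrative settings that keep more recent context after compaction:
# HYBRID keeps a sliding window of recent messages plus a summary of
# removed content, and a higher preserve_last retains more messages.
config = CompactionConfig(
    strategy=CompactionStrategy.HYBRID,
    preserve_last=12,            # keep more recent user/assistant messages
    preserve_tool_results=True,  # keep recent tool outputs verbatim
)
```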
Issue 3: Summary Losing Code Details

Symptom: The agent forgets specific code changes after compaction.

Solutions:
- Use `preserve_tool_results=True` (default) to keep recent tool outputs
- Increase `max_tool_results` (default: 10)
- Review the summary and add missing details
Progressive Disclosure

AMCP uses progressive disclosure to manage skill instructions. In skills.py (line 297):

```python
skills_summary = skill_manager.build_skills_summary()
```

This means:
- You only get a compact summary of all skills initially
- Full skill content is available when needed
- Reduces initial context overhead

Memory System vs Context
AMCP separates:
- Conversation Context: The messages sent to the LLM (limited by the context window)
- Persistent Memory: Long-term storage in `.amcp/memory/` (unlimited)

Use the memory system to store important information that should persist long-term:

```python
memory(action="write", content="# Project Notes\n- Uses PostgreSQL database")
```

This information is NOT in the conversation context - it's stored separately and retrieved when relevant.
Practical Workflow

For complex tasks with large context:

1. Start: Use `grep` or `find` to locate relevant files

   ```python
   grep(pattern="class User", paths=["src/"])
   ```

2. Explore: Read specific sections using indentation mode

   ```python
   read_file(path="src/models/user.py", mode="indentation", offset=1)
   ```

3. Process: Work incrementally, one file at a time

4. Monitor: If context gets large, compaction happens automatically

5. Persist: Save important findings to memory if needed for future sessions

   ```python
   memory(action="append", content="Found authentication bug in src/auth.py:45")
   ```
Event Monitoring

AMCP emits events when compaction occurs:

```python
from amcp import get_event_bus, EventType

@get_event_bus().on(EventType.CONTEXT_COMPACTED)
async def on_compaction(event):
    data = event.data
    print(f"Context compacted: {data['original_tokens']} -> {data['compacted_tokens']}")
```

Key Takeaways
- Auto-compaction happens automatically - you don't need to trigger it
- Read files selectively - use indentation mode or ranges
- Monitor context usage - be aware of large tool outputs
- Persist important info to memory - for long-term retention
- Process incrementally - handle large tasks in steps
- Configure `CompactionConfig` as needed - adjust for your use case
When to Suggest Clearing Context

If you notice:
- The conversation is very long (>50 messages)
- You're starting a completely different task
- Context from earlier sessions is no longer relevant

Then suggest:

"The conversation history is getting long. If you are starting a new, unrelated task, consider clearing the history or starting a new session to reduce context."

This is proactive context management!