context-management


Context Management Skill


Overview


This skill teaches you how to proactively manage your conversation context in AMCP to avoid LLM API errors caused by exceeding context window limits. Context management is critical when:
  • Working on long coding sessions with many files
  • Processing large tool outputs (e.g., `grep` results, file reads)
  • Running multi-step debugging sessions
  • Reviewing or refactoring large codebases

Understanding Context Windows


Different LLM models have different context window sizes:
| Model Family | Context Window |
|---|---|
| GPT-4 Turbo / GPT-4o | 128,000 tokens |
| GPT-4.1 | 1,000,000 tokens |
| Claude 3.5 Sonnet | 200,000 tokens |
| DeepSeek V3 | 64,000 tokens |
| Gemini 2.0 Flash | 1,000,000 tokens |
| Qwen 2.5 | 128,000 tokens |
AMCP automatically detects most model context windows. For unknown models, it uses 32,000 tokens as the default.

Key Metrics


When managing context, track:
  • Current tokens: Estimated size of current conversation history
  • Threshold tokens: When compaction should trigger (default: 70% of context)
  • Target tokens: What to aim for after compaction (default: 30% of context)
  • Safety margin: Reserved for response generation (default: 10%)
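For a concrete 128,000-token window, the defaults above work out as follows (a quick sketch of the arithmetic, not AMCP internals):

```python
context_window = 128_000

threshold_tokens = int(context_window * 0.7)      # compaction triggers here
target_tokens = int(context_window * 0.3)         # aim for this after compaction
usable_tokens = int(context_window * (1 - 0.1))   # 10% safety margin reserved

print(threshold_tokens, target_tokens, usable_tokens)  # 89600 38400 115200
```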

How AMCP Compacts Context


AMCP's `SmartCompactor` automatically compresses context when it exceeds the threshold. In `agent.py` (lines 500-505):

```python
compactor = SmartCompactor(client, model)
if compactor.should_compact(history_to_add):
    history_to_add, _ = compactor.compact(history_to_add)
```

**This happens automatically during conversation!** You don't need to trigger it manually.
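The decision itself is simple to reason about: compare the current token estimate with the threshold, skipping tiny histories. A minimal sketch of that check (illustrative only; not AMCP's actual `should_compact` implementation):

```python
def should_compact(current_tokens, context_window,
                   threshold_ratio=0.7, min_tokens=5000):
    """Compact when usage crosses the threshold, but never for tiny contexts."""
    if current_tokens < min_tokens:
        return False
    return current_tokens >= context_window * threshold_ratio

print(should_compact(90_000, 128_000))  # -> True  (above 70% of 128k)
print(should_compact(4_000, 128_000))   # -> False (below the min_tokens floor)
```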

Compaction Strategies


AMCP supports four strategies (configurable via `CompactionConfig`):
  1. SUMMARY (default): Uses LLM to create intelligent summary of old messages
    • Best for: Long sessions where earlier context is important
    • Preserves: Errors, working solutions, current task state, file paths
  2. TRUNCATE: Simple removal of old messages, keeping first and last few
    • Best for: Fast compaction when context is less important
    • Fastest option
  3. SLIDING_WINDOW: Keeps only the most recent messages that fit in target
    • Best for: Sessions where only recent context matters
    • Very efficient
  4. HYBRID: Combines sliding window with summary of removed content
    • Best for: Balance between summary quality and speed
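To make the SLIDING_WINDOW idea concrete, here is a toy sketch (not AMCP's implementation) of keeping only the most recent messages that fit a target budget:

```python
def sliding_window(messages, target_tokens, estimate):
    """Keep the most recent messages whose combined estimate fits target_tokens."""
    kept, total = [], 0
    for msg in reversed(messages):        # walk newest -> oldest
        cost = estimate(msg)
        if total + cost > target_tokens:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))           # restore chronological order

# Toy estimate: ~4 characters per token; each message below costs ~100 tokens,
# so a 250-token budget keeps only the two most recent messages.
msgs = ["a" * 400, "b" * 400, "c" * 400]
recent = sliding_window(msgs, target_tokens=250, estimate=lambda m: len(m) // 4)
```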

Best Practices for Context Management


1. Read Files Selectively


Instead of reading entire large files:

```python
# BAD: Reads entire file
read_file(path="src/large_module.py")

# GOOD: Read specific sections
read_file(path="src/large_module.py", mode="indentation", offset=100, limit=50)

# GOOD: Read specific line ranges
read_file(path="src/large_module.py", mode="slice", ranges=["1-50", "200-250"])
```

**Use `read_file` in indentation mode** - it intelligently captures code blocks around your target, providing context without excessive content.

2. Use Grep with Context Limits


```python
# BAD: Returns all matches with full context
grep(pattern="function", paths=["src/"])

# GOOD: Limited context
grep(pattern="function", paths=["src/"], context=2)
```

3. Iterate Over Small Batches


When processing multiple files, instead of processing all files at once:

```python
for file in files:
    # Process one file at a time
    result = process_file(file)
    # Save intermediate results
```

4. Clear Conversation When Starting New Tasks


After completing a complex task, consider suggesting:
"The conversation history is getting long. If you're starting a new, unrelated task, consider clearing the history or starting a new session to reduce context."

5. Be Strategic with Tool Calls


  • Avoid redundant tool calls (e.g., reading the same file multiple times)
  • Use `grep` to find what you need before reading files
  • Process results incrementally rather than all at once

Monitoring Context Usage


You can check context usage programmatically:

```python
from amcp import SmartCompactor

# Create compactor
compactor = SmartCompactor(client, model="gpt-4-turbo")

# Get detailed usage info
usage = compactor.get_token_usage(messages)
print(f"Current: {usage['current_tokens']:,} tokens")
print(f"Usage: {usage['usage_ratio']:.1%} of context")
print(f"Headroom: {usage['headroom_tokens']:,} tokens")
print(f"Should compact: {usage['should_compact']}")
```

Configuration


Context compaction is configured via `CompactionConfig`:

```python
from amcp import SmartCompactor, CompactionConfig, CompactionStrategy

config = CompactionConfig(
    strategy=CompactionStrategy.SUMMARY,
    threshold_ratio=0.7,  # Compact at 70% usage
    target_ratio=0.3,     # Aim for 30% after compaction
    preserve_last=6,      # Keep last 6 user/assistant messages
    preserve_tool_results=True,  # Preserve recent tool results
    max_tool_results=10,  # Max tool results to preserve
    min_tokens_to_compact=5000,  # Don't compact tiny contexts
    safety_margin=0.1,    # 10% margin for responses
)
```

You can configure compaction in `~/.config/amcp/config.toml`:

```toml
[chat]
model = "deepseek-chat"

[compaction]
strategy = "summary"  # summary, truncate, sliding_window, hybrid
threshold_ratio = 0.7
target_ratio = 0.3
```

Automatic Summary Structure


When using SUMMARY or HYBRID strategies, the summary follows this structure:

```xml
<current_task>
What we're working on now - be specific about files and goals
</current_task>

<completed>
- Task 1: Brief outcome + key changes made
- Task 2: Brief outcome + key changes made
</completed>

<code_state>
Key files and their current state - signatures + key logic only
Include file paths that were modified
</code_state>

<important>
Any crucial context: errors, decisions made, constraints, blockers
</important>
```

Token Estimation


AMCP provides accurate token estimation:

```python
from amcp import estimate_tokens

tokens = estimate_tokens(messages)
```
  • Uses the `tiktoken` library when available (recommended)
  • Falls back to character-based estimation (4 chars ≈ 1 token)
  • Accounts for message role overhead
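The character-based fallback is easy to picture. A toy sketch of the heuristic (illustrative; `rough_token_estimate` is not an AMCP function):

```python
def rough_token_estimate(messages, chars_per_token=4, role_overhead=4):
    """Approximate token count: ~4 characters per token, plus a small
    per-message allowance for role/formatting overhead."""
    total = 0
    for msg in messages:
        total += len(msg.get("content", "")) // chars_per_token
        total += role_overhead
    return total

msgs = [{"role": "user", "content": "x" * 400}]
print(rough_token_estimate(msgs))  # -> 104  (100 content tokens + 4 overhead)
```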

Common Issues and Solutions


Issue 1: "Context length exceeded" API Error


Symptom: LLM API returns an error about the context window being exceeded.
Causes:
  • Large tool outputs (e.g., `grep` or `find` results)
  • Reading many large files
  • Long multi-step task conversations
Solutions:
  1. Let AMCP's auto-compaction handle it (it will compact automatically)
  2. Reduce tool output size (use `grep --limit`, `read_file` with ranges)
  3. Process files in batches, not all at once
  4. Suggest clearing conversation history if starting a new task

Issue 2: Context Losing Important Information


Symptom: After compaction, the agent forgets critical details.
Causes:
  • SUMMARY strategy missed important context
  • Old critical messages were removed
Solutions:
  1. Use `preserve_last` to keep more recent messages
  2. Consider the HYBRID strategy (keeps a sliding window + summary)
  3. Manually restate critical context in your prompt

Issue 3: Summary Losing Code Details


Symptom: The agent forgets specific code changes after compaction.
Solutions:
  1. Use `preserve_tool_results=True` (the default) to keep recent tool outputs
  2. Increase `max_tool_results` (default: 10)
  3. Review the summary and add missing details

Progressive Disclosure


AMCP uses progressive disclosure to manage skill instructions. In `skills.py` (line 297):

```python
skills_summary = skill_manager.build_skills_summary()
```

This means:
- You only get a compact summary of all skills initially
- Full skill content is available when needed
- Reduces initial context overhead

Memory System vs Context


AMCP separates:
  1. Conversation Context: The messages sent to the LLM (limited by the context window)
  2. Persistent Memory: Long-term storage in `.amcp/memory/` (unlimited)
Use the memory system to store important information that should persist long-term:

```python
memory(action="write", content="# Project Notes\n- Uses PostgreSQL database")
```

This information is NOT in the conversation context - it's stored separately and retrieved when relevant.

Practical Workflow


For complex tasks with large context:
  1. Start: Use `grep` or `find` to locate relevant files
     ```python
     grep(pattern="class User", paths=["src/"])
     ```
  2. Explore: Read specific sections using indentation mode
     ```python
     read_file(path="src/models/user.py", mode="indentation", offset=1)
     ```
  3. Process: Work incrementally, one file at a time
  4. Monitor: If context gets large, compaction happens automatically
  5. Persist: Save important findings to memory if needed for future sessions
     ```python
     memory(action="append", content="Found authentication bug in src/auth.py:45")
     ```

Event Monitoring


AMCP emits events when compaction occurs:

```python
from amcp import get_event_bus, EventType

@get_event_bus().on(EventType.CONTEXT_COMPACTED)
async def on_compaction(event):
    data = event.data
    print(f"Context compacted: {data['original_tokens']} -> {data['compacted_tokens']}")
```

Key Takeaways


  1. Auto-compaction happens automatically - you don't need to trigger it
  2. Read files selectively - use indentation mode or ranges
  3. Monitor context usage - be aware of large tool outputs
  4. Persist important info to memory - for long-term retention
  5. Process incrementally - handle large tasks in steps
  6. Configure as needed - adjust `CompactionConfig` for your use case

When to Suggest Clearing Context


If you notice:
  • The conversation is very long (>50 messages)
  • You're starting a completely different task
  • Context from earlier sessions is no longer relevant
Then suggest:
"The conversation history is getting long. If you're starting a new, unrelated task, consider clearing the history or starting a new session to reduce context."
This is proactive context management!