optimize-agent-docs

Agent Knowledge Optimizer

Transform accumulated documentation into a retrieval-optimized knowledge system.

Core Principle

File organization is a human concern. Agents don't browse—they search and load. Optimize for:

- Discovery: What knowledge exists?
- Relevance: Is it needed for this task?
- Efficiency: What's the minimum to load?

Workflow

Phase 1: Knowledge Extraction

Inventory all agent documentation:

```bash
# Find all agent doc sources
find . -maxdepth 2 -name "*.md" -path "*/.claude/*" -o \
  -name "*.md" -path "*/.codex/*" -o \
  -name "*.md" -path "*/.cursor/*" -o \
  -name "CLAUDE.md" -o -name "AGENTS.md" -o -name "INSTRUCTIONS.md"
```


For each file, extract:
- Discrete facts (single pieces of actionable information)
- Instructions (procedures, rules, constraints)
- Context triggers (when is this knowledge needed?)
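
When the inventory needs to feed later phases, the same selection rules can be expressed in a script. The following is a minimal Python sketch, not part of the original workflow: the function names `is_agent_doc` and `find_agent_docs` are hypothetical, and the directory and file names are the ones used in the `find` command.

```python
from pathlib import Path, PurePosixPath

# Directory and file names taken from the find command in this phase.
AGENT_DIRS = {".claude", ".codex", ".cursor"}
ROOT_DOC_NAMES = {"CLAUDE.md", "AGENTS.md", "INSTRUCTIONS.md"}


def is_agent_doc(rel_path: str, max_depth: int = 2) -> bool:
    """Return True if a path relative to the repo root looks like an agent doc."""
    parts = PurePosixPath(rel_path).parts
    if not parts or len(parts) > max_depth:
        return False
    name = parts[-1]
    if name in ROOT_DOC_NAMES:
        return True
    # Any Markdown file that sits inside one of the agent directories.
    return name.endswith(".md") and any(p in AGENT_DIRS for p in parts[:-1])


def find_agent_docs(root: str = ".") -> list[str]:
    """Walk root and return the sorted relative paths of agent docs."""
    base = Path(root)
    rel_paths = (
        p.relative_to(base).as_posix() for p in base.rglob("*.md") if p.is_file()
    )
    return sorted(p for p in rel_paths if is_agent_doc(p))
```

Keeping the predicate separate from the directory walk makes the selection rules easy to check without touching the filesystem.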

Phase 2: Chunk Analysis

Break content into retrieval units: the smallest self-contained pieces of information that make sense on their own.

Good chunk:

```markdown
## Adding API Endpoints
1. Create handler in src/handlers/
2. Register route in src/routes.rs
3. Add OpenAPI spec to docs/api.yaml
```


Bad chunk (too coupled):

```markdown
See the API section for endpoint patterns, but first read the auth docs, which reference the middleware guide...
```


Score each chunk:
- **Self-contained?** Can agent act on this without loading more?
- **Task-specific?** Clear when this is needed?
- **Information-dense?** High signal per token?
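
The three questions can be approximated with simple heuristics. The sketch below is one possible scoring pass, under stated assumptions: the cross-reference phrases, the filler word list, and the rule that a task-specific chunk starts with a heading are all hypothetical choices made for illustration, not criteria defined by this workflow.

```python
import re
from dataclasses import dataclass

# Assumption: phrases like these suggest the chunk depends on other documents.
CROSS_REFERENCE = re.compile(
    r"\b(see|refer to|first read|as described in)\b", re.IGNORECASE
)
# Assumption: a small list of filler words that add tokens without adding signal.
FILLER = {"the", "a", "an", "you", "should", "want", "when", "to", "of", "that", "is"}


@dataclass
class ChunkScore:
    self_contained: bool
    task_specific: bool
    density: float  # share of words that are not filler, between 0 and 1


def score_chunk(text: str) -> ChunkScore:
    """Score one chunk against the three questions using rough heuristics."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    words = re.findall(r"[A-Za-z0-9_./-]+", text)
    signal = [w for w in words if w.lower() not in FILLER]
    return ChunkScore(
        self_contained=not CROSS_REFERENCE.search(text),
        # Assumption: a heading on the first line names the task.
        task_specific=bool(lines) and lines[0].startswith("#"),
        density=len(signal) / len(words) if words else 0.0,
    )
```

Heuristics like these only flag candidates for review; a chunk that passes can still be unclear to an agent.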

Phase 3: Build Knowledge Manifest

Generate `.claude/KNOWLEDGE.md`, a lightweight index the agent reads first:

```markdown
# Knowledge Manifest

## Task → Knowledge Map

| When working on... | Load | Key terms |
|--------------------|------|-----------|
| API endpoints | references/api.md | route, handler, endpoint |
| Authentication | references/auth.md | token, session, login |
| Database changes | references/schema.md | migration, model, query |
| Testing | references/testing.md | spec, fixture, mock |
| Deployment | references/deploy.md | release, staging, prod |

## Quick Reference

### Build Commands
- `npm run dev`: Start dev server (port 3000)
- `npm test`: Run test suite
- `npm run build`: Production build

### Key Paths
- Handlers: `src/handlers/`
- Routes: `src/routes.ts`
- Tests: `tests/`

### Critical Rules
- Never commit .env files
- All PRs require tests
- Use conventional commits
```

The manifest contains:
1. **Task→Knowledge map**: What to load for what context
2. **Quick reference**: High-frequency facts (no file loading needed)
3. **Critical rules**: Must-know constraints (always relevant)
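
To illustrate how the Task→Knowledge map narrows what gets loaded, here is a minimal Python sketch of a lookup over the example rows. The function name `files_to_load` and the exact-word matching rule are assumptions for illustration; a real agent would likely match more loosely (plurals, synonyms).

```python
import re

# Rows copied from the example manifest: file to load -> key terms.
TASK_MAP = {
    "references/api.md": {"route", "handler", "endpoint"},
    "references/auth.md": {"token", "session", "login"},
    "references/schema.md": {"migration", "model", "query"},
    "references/testing.md": {"spec", "fixture", "mock"},
    "references/deploy.md": {"release", "staging", "prod"},
}


def files_to_load(task_description: str) -> list[str]:
    """Return the files whose key terms appear as whole words in the task."""
    words = set(re.findall(r"[a-z]+", task_description.lower()))
    return sorted(path for path, terms in TASK_MAP.items() if words & terms)
```

For example, `files_to_load("Add a login endpoint")` selects the API and auth references and leaves the other three files unloaded.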

Phase 4: Compile Optimized Artifacts

Transform verbose source docs into dense, agent-optimized versions.

Compression techniques:

| Source (verbose) | Compiled (dense) |
|------------------|------------------|
| "When you want to add a new endpoint, you should first create a handler function..." | `New endpoint: handler → route → spec` |
| Long prose paragraphs | Structured tables |
| Repeated information | Single source of truth |
| Examples with explanation | Just the pattern |
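
Repeated information is the easiest of these to detect mechanically. The following is a rough Python sketch, assuming that a "repeat" means the same line (ignoring case and spacing) appearing in more than one file; the function name `find_repeated_lines` and the 20-character minimum are hypothetical.

```python
from collections import defaultdict


def find_repeated_lines(docs: dict[str, str], min_length: int = 20) -> dict[str, list[str]]:
    """Map each normalized line to the files it appears in, if more than one."""
    seen: dict[str, set[str]] = defaultdict(set)
    for name, text in docs.items():
        for line in text.splitlines():
            # Normalize whitespace and case so near-identical lines match.
            normalized = " ".join(line.split()).lower()
            if len(normalized) >= min_length:
                seen[normalized].add(name)
    return {line: sorted(names) for line, names in seen.items() if len(names) > 1}
```

Each reported line is a candidate for a single canonical location; paraphrased duplicates will not be caught by this approach.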

Output structure:

```
.claude/
├── CLAUDE.md              # Human-readable, can stay verbose
├── KNOWLEDGE.md           # Agent manifest (generated)
└── compiled/              # Agent-optimized versions (generated)
    ├── api.md             # Dense API reference
    ├── patterns.md        # Code patterns as templates
    └── rules.md           # All constraints in one place
```

Phase 5: Generate Retrieval Hints

Add grep-friendly markers throughout compiled docs:

```markdown
<!-- @task:new-endpoint @load:api,routes -->
## Adding Endpoints

## Authentication Flow

## Test Patterns
```


These markers enable:
```bash
# Find relevant sections for a task
grep -l "@task:new-endpoint" .claude/compiled/*.md
```
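
The same markers can be read programmatically. Below is a minimal Python sketch that extracts the `@task` and `@load` fields from a compiled doc; the regular expression assumes the exact comment layout shown in this phase, and the function name `parse_markers` is hypothetical.

```python
import re

# Assumes markers of the form: <!-- @task:NAME @load:A,B -->
MARKER = re.compile(
    r"<!--\s*@task:(?P<task>[\w-]+)\s+@load:(?P<load>[\w,-]+)\s*-->"
)


def parse_markers(markdown: str) -> dict[str, list[str]]:
    """Map each @task name to the list of names in its @load field."""
    return {m["task"]: m["load"].split(",") for m in MARKER.finditer(markdown)}
```

Called on a doc that contains the marker above, this returns `{"new-endpoint": ["api", "routes"]}`.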

Phase 6: Validation

Test the optimized system:

1. **Coverage check**: Every fact from the source exists in the compiled output
2. **Retrieval test**: Can common tasks be served with minimal loading?
3. **Density check**: Are the compiled versions smaller than the sources?

```bash
# Compare sizes
wc -l .claude/references/*.md  # Source
wc -l .claude/compiled/*.md    # Compiled (should be smaller)
```
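
The coverage and density checks can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: a "fact" is treated as a literal substring that must appear in the compiled text, and density is compared by line count, as `wc -l` does. The function names are hypothetical.

```python
def missing_facts(source_facts: list[str], compiled_text: str) -> list[str]:
    """Return the facts that do not appear (case-insensitively) in the compiled text."""
    haystack = compiled_text.lower()
    return [fact for fact in source_facts if fact.lower() not in haystack]


def is_denser(source_text: str, compiled_text: str) -> bool:
    """Return True if the compiled text has fewer lines than the source."""
    return len(compiled_text.splitlines()) < len(source_text.splitlines())
```

Substring matching is a weak proxy for coverage, since a compiled doc may restate a fact in different words, so an empty result from `missing_facts` is worth a manual spot check.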

Manifest Format

The `KNOWLEDGE.md` manifest follows this structure:

```markdown
# Knowledge Manifest

## Task Context Map

| Context | Load | Search |
|---------|------|--------|
| [task description] | [file path] | [grep terms] |

## Always-Loaded Facts

### Commands
[Most-used commands as a table]

### Paths
[Key directories and their purposes]

### Rules
[Critical constraints that always apply]

## Chunk Index

| Topic | Location | Lines | Summary |
|-------|----------|-------|---------|
| [topic] | [file:line-range] | [count] | [one-line summary] |
```
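
The Topic, Location, and Lines columns of the chunk index can be derived from headings. The sketch below is one way to do that in Python, assuming each chunk starts at a `## ` heading and runs to the next one; the function name `build_chunk_index` is hypothetical, and the one-line summary is left out because it needs human or model judgment.

```python
def build_chunk_index(file_name: str, markdown: str) -> list[tuple[str, str, int]]:
    """Return (topic, 'file:start-end', line_count) for each '## ' section."""
    lines = markdown.splitlines()
    starts = [i for i, line in enumerate(lines) if line.startswith("## ")]
    index = []
    for pos, start in enumerate(starts):
        end = starts[pos + 1] if pos + 1 < len(starts) else len(lines)
        # Drop trailing blank lines so the range covers only content.
        while end > start + 1 and not lines[end - 1].strip():
            end -= 1
        topic = lines[start][3:].strip()
        # Line numbers are 1-based and the range is inclusive.
        index.append((topic, f"{file_name}:{start + 1}-{end}", end - start))
    return index
```

Note that this simple scan would also treat a `## ` line inside a fenced code block as a heading, so docs with embedded Markdown samples need extra handling.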

Information Density Principles

Convert Prose to Structure

Before:

"The authentication system uses JWT tokens stored in httpOnly cookies. When a user logs in, the server validates credentials against the database, generates a token with a 24-hour expiry, and sets it as a cookie..."

After:

```markdown
## Auth Flow
- Method: JWT in httpOnly cookie
- Expiry: 24h
- Flow: credentials → DB validate → token → cookie
```

Eliminate Redundancy

If the same information appears in multiple places, create one canonical source and reference it:

```markdown
## Token Handling
See: Auth Flow (tokens section)
```

Prefer Tables Over Lists

Before:

```markdown
- The API endpoint for users is /api/users
- The API endpoint for posts is /api/posts
- The API endpoint for comments is /api/comments
```

After:

```markdown
| Resource | Endpoint |
|----------|----------|
| Users | /api/users |
| Posts | /api/posts |
| Comments | /api/comments |
```

Use Patterns Over Examples

Before:

````markdown
To create a user handler:
```javascript
export async function createUser(req, res) {
  const { name, email } = req.body;
  const user = await db.users.create({ name, email });
  res.json(user);
}
```
````

After:

```markdown
Handler pattern: `export async function {action}{Resource}(req, res)`
Body: Extract params → DB operation → Return result
```

Output Checklist

After optimization, verify:

- `KNOWLEDGE.md` exists and is under 100 lines
- Task→knowledge mappings cover common workflows
- Quick reference has most-used facts
- Compiled docs are denser than sources
- No orphaned knowledge (everything indexed)
- Retrieval hints enable grep-based discovery
- Original source docs untouched (human reference)
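
Two of these items lend themselves to an automated check: the manifest size limit and the absence of orphaned knowledge. The following Python sketch assumes that a compiled file counts as indexed when its path appears anywhere in the manifest text; the function name `check_manifest` and the message wording are hypothetical.

```python
def check_manifest(
    manifest_text: str, compiled_files: list[str], max_lines: int = 100
) -> list[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    line_count = len(manifest_text.splitlines())
    # The checklist asks for a manifest that is under the limit, not at it.
    if line_count >= max_lines:
        problems.append(f"manifest has {line_count} lines (limit: under {max_lines})")
    # A compiled file never mentioned in the manifest is orphaned knowledge.
    problems += [
        f"not indexed: {name}" for name in compiled_files if name not in manifest_text
    ]
    return problems
```

The remaining items, such as whether the mappings cover common workflows, still need a human or agent review.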
