optimize-agent-docs

Agent Knowledge Optimizer

Transform accumulated documentation into a retrieval-optimized knowledge system.

Core Principle

File organization is a human concern. Agents don't browse—they search and load. Optimize for:
  • Discovery: What knowledge exists?
  • Relevance: Is it needed for this task?
  • Efficiency: What's the minimum to load?

Workflow

Phase 1: Knowledge Extraction

Inventory all agent documentation:

```bash
# Find all agent doc sources
find . -maxdepth 2 -name "*.md" -path "*/.claude/*" -o \
  -name "*.md" -path "*/.codex/*" -o \
  -name "*.md" -path "*/.cursor/*" -o \
  -name "CLAUDE.md" -o -name "AGENTS.md" -o -name "INSTRUCTIONS.md"
```

For each file, extract:
  • Discrete facts (single pieces of actionable information)
  • Instructions (procedures, rules, constraints)
  • Context triggers (when is this knowledge needed?)
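
If the inventory needs to feed later steps, the same search can be sketched in Python. This is a minimal sketch, not part of the original workflow: the directory and file names are taken from the `find` command, while the function name `find_agent_docs` and the `max_depth` parameter are assumptions made for illustration.

```python
import os
from pathlib import Path

# Directory and file names mirror the find command; the function name is hypothetical.
AGENT_DIRS = {".claude", ".codex", ".cursor"}
AGENT_FILES = {"CLAUDE.md", "AGENTS.md", "INSTRUCTIONS.md"}


def find_agent_docs(root, max_depth=2):
    """Return agent doc paths (relative to root) at most max_depth levels deep."""
    base = Path(root)
    found = []
    for dirpath, dirnames, filenames in os.walk(base):
        rel_dir = Path(dirpath).relative_to(base)
        if len(rel_dir.parts) + 1 >= max_depth:
            dirnames[:] = []  # files below this level would exceed max_depth
        in_agent_dir = any(part in AGENT_DIRS for part in rel_dir.parts)
        for name in filenames:
            if name.endswith(".md") and (in_agent_dir or name in AGENT_FILES):
                found.append(rel_dir / name)
    return sorted(found)
```

Calling `find_agent_docs(".")` from the repository root returns the relative paths to read during extraction.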

Phase 2: Chunk Analysis

Break content into retrieval units: the smallest self-contained pieces of information that make sense alone.

Good chunk:

```markdown
## Adding API Endpoints

1. Create handler in src/handlers/
2. Register route in src/routes.rs
3. Add OpenAPI spec to docs/api.yaml
```

Bad chunk (too coupled):

```markdown
See the API section for endpoint patterns, but first read the auth docs, which reference the middleware guide...
```

Score each chunk:
  • Self-contained? Can the agent act on this without loading more?
  • Task-specific? Is it clear when this is needed?
  • Information-dense? Is the signal per token high?
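
The scoring questions above can be approximated with simple heuristics. The sketch below is illustrative only: the list of cross-reference phrases, the function names, and the use of a word count as a density proxy are all assumptions, and a heading inside a fenced code block would be split incorrectly.

```python
import re

# Phrases that suggest a chunk depends on other documents (an assumed, illustrative list).
CROSS_REFERENCE = re.compile(r"\b(see the|first read|refer to|as described in)\b", re.IGNORECASE)


def split_chunks(markdown):
    """Split a markdown document into chunks, starting a new chunk at each heading."""
    chunks, current = [], []
    for line in markdown.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [chunk for chunk in chunks if chunk]


def score_chunk(chunk):
    """Return rough answers to the three scoring questions for one chunk."""
    return {
        "self_contained": CROSS_REFERENCE.search(chunk) is None,
        "has_heading": chunk.startswith("#"),  # proxy for "clear when this is needed"
        "words": len(chunk.split()),           # proxy for size; lower is denser
    }
```

Chunks flagged as not self-contained are candidates for inlining the referenced facts rather than pointing at another document.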

Phase 3: Build Knowledge Manifest

Generate .claude/KNOWLEDGE.md, a lightweight index the agent reads first:

```markdown
# Knowledge Manifest

## Task → Knowledge Map

| When working on... | Load | Key terms |
|--------------------|------|-----------|
| API endpoints | references/api.md | route, handler, endpoint |
| Authentication | references/auth.md | token, session, login |
| Database changes | references/schema.md | migration, model, query |
| Testing | references/testing.md | spec, fixture, mock |
| Deployment | references/deploy.md | release, staging, prod |

## Quick Reference

### Build Commands

- `npm run dev`: Start dev server (port 3000)
- `npm test`: Run test suite
- `npm run build`: Production build

### Key Paths

- Handlers: `src/handlers/`
- Routes: `src/routes.ts`
- Tests: `tests/`

## Critical Rules

- Never commit .env files
- All PRs require tests
- Use conventional commits
```

The manifest contains:
  1. Task→Knowledge map: What to load for what context
  2. Quick reference: High-frequency facts (no file loading needed)
  3. Critical rules: Must-know constraints (always relevant)
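
To show how the Task→Knowledge map might be generated and consulted, here is a small sketch. The row data comes from the example manifest; the function names `render_task_map` and `files_to_load` and the exact-word matching rule are assumptions, not part of the skill itself.

```python
import re


def render_task_map(rows):
    """Render (task, file, key_terms) rows as the manifest's markdown table."""
    lines = ["| When working on... | Load | Key terms |", "|---|---|---|"]
    for task, path, terms in rows:
        lines.append(f"| {task} | {path} | {', '.join(terms)} |")
    return "\n".join(lines)


def files_to_load(task_description, rows):
    """Return the files whose key terms appear as whole words in the task description."""
    words = set(re.findall(r"[a-z]+", task_description.lower()))
    return [path for _, path, terms in rows if words & set(terms)]


ROWS = [
    ("API endpoints", "references/api.md", ["route", "handler", "endpoint"]),
    ("Authentication", "references/auth.md", ["token", "session", "login"]),
]
```

With these rows, `files_to_load("Add a new route handler", ROWS)` selects only `references/api.md`. Matching is on exact words, so plural forms such as "routes" would need their own key term or a stemming step.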

Phase 4: Compile Optimized Artifacts

Transform verbose source docs into dense, agent-optimized versions.
Compression techniques:
| Source (verbose) | Compiled (dense) |
|------------------|------------------|
| "When you want to add a new endpoint, you should first create a handler function..." | New endpoint: handler → route → spec |
| Long prose paragraphs | Structured tables |
| Repeated information | Single source of truth |
| Examples with explanation | Just the pattern |
Output structure:
.claude/
├── CLAUDE.md              # Human-readable, can stay verbose
├── KNOWLEDGE.md           # Agent manifest (generated)
└── compiled/              # Agent-optimized versions (generated)
    ├── api.md             # Dense API reference
    ├── patterns.md        # Code patterns as templates
    └── rules.md           # All constraints in one place

Phase 5: Generate Retrieval Hints

Add grep-friendly markers throughout compiled docs:

```markdown
<!-- @task:new-endpoint @load:api,routes -->
## Adding Endpoints

<!-- @task:fix-auth @load:auth,middleware -->
## Authentication Flow

<!-- @task:write-test @load:testing -->
## Test Patterns
```

These markers enable:

```bash
# Find relevant sections for a task
grep -l "@task:new-endpoint" .claude/compiled/*.md
```
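
Beyond `grep`, the markers can be parsed into a lookup table. The following is a sketch under the assumption that every marker follows the `@task:... @load:...` comment layout shown in the examples; the regular expression and the function name `index_markers` are illustrative.

```python
import re
from collections import defaultdict

# Matches comments of the form <!-- @task:NAME @load:a,b --> (layout assumed from the examples).
MARKER = re.compile(r"<!--\s*@task:(\S+)\s+@load:(\S+)\s*-->")


def index_markers(text):
    """Map each task name to the list of names in its @load field."""
    index = defaultdict(list)
    for task, load in MARKER.findall(text):
        index[task].extend(load.split(","))
    return dict(index)
```

Running this over each compiled doc gives a task-to-load mapping that can be checked against the manifest's task map.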

Phase 6: Validation

Test the optimized system:
  1. Coverage check: Every fact from the source exists in the compiled output
  2. Retrieval test: Can common tasks be served with minimal loading?
  3. Density check: Are the compiled versions smaller than the sources?

```bash
# Compare sizes
wc -l .claude/references/*.md  # Source
wc -l .claude/compiled/*.md    # Compiled (should be smaller)
```
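
The density check can be turned into a pass/fail result rather than two numbers to compare by eye. This is a rough sketch: the function names are hypothetical, and it counts lines with `splitlines()`, which can differ from `wc -l` by one for a file that lacks a trailing newline.

```python
from pathlib import Path


def total_lines(directory):
    """Sum the line counts of all .md files directly inside directory."""
    return sum(
        len(path.read_text(encoding="utf-8").splitlines())
        for path in Path(directory).glob("*.md")
    )


def density_report(source_dir, compiled_dir):
    """Compare total line counts of the source and compiled doc directories."""
    source, compiled = total_lines(source_dir), total_lines(compiled_dir)
    return {"source_lines": source, "compiled_lines": compiled, "smaller": compiled < source}
```

For the layout used in this document, the call would be `density_report(".claude/references", ".claude/compiled")`.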

Manifest Format

The KNOWLEDGE.md manifest follows this structure:

```markdown
# Knowledge Manifest
<!-- Auto-generated. Source: .claude/references/, CLAUDE.md -->

## Task Context Map
<!-- What to load based on current work -->
| Context | Load | Search |
|---------|------|--------|
| [task description] | [file path] | [grep terms] |

## Always-Loaded Facts
<!-- High-frequency, never needs file lookup -->

### Commands
[Most-used commands as a table]

### Paths
[Key directories and their purposes]

### Rules
[Critical constraints that always apply]

## Chunk Index
<!-- What exists and where -->
| Topic | Location | Lines | Summary |
|-------|----------|-------|---------|
| [topic] | [file:line-range] | [count] | [one-line summary] |
```
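
The Topic, Location, and Lines columns of the Chunk Index can be derived mechanically from headings. The sketch below assumes one chunk per markdown heading and uses a hypothetical function name, `chunk_index`; the Summary column still has to be written by hand or by the agent.

```python
def chunk_index(filename, text):
    """Return (topic, file:line-range, line_count) for each heading-delimited chunk."""
    lines = text.splitlines()
    starts = [i for i, line in enumerate(lines) if line.startswith("#")]
    rows = []
    for n, start in enumerate(starts):
        # A chunk ends on the line before the next heading, or at the end of the file.
        end = starts[n + 1] - 1 if n + 1 < len(starts) else len(lines) - 1
        topic = lines[start].lstrip("#").strip()
        rows.append((topic, f"{filename}:{start + 1}-{end + 1}", end - start + 1))
    return rows
```

Line numbers are 1-based so the ranges can be passed directly to tools that read a slice of a file.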

Information Density Principles

Convert Prose to Structure

Before:
"The authentication system uses JWT tokens stored in httpOnly cookies. When a user logs in, the server validates credentials against the database, generates a token with a 24-hour expiry, and sets it as a cookie..."

After:

```markdown
## Auth Flow

- Method: JWT in httpOnly cookie
- Expiry: 24h
- Flow: credentials → DB validate → token → cookie
```

Eliminate Redundancy

If the same information appears in multiple places, create one canonical source and reference it:

```markdown
## Token Handling

See: Auth Flow (tokens section)
```
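
Finding the repeated information in the first place can be partly automated. The sketch below flags lines that occur in more than one file after normalizing case and whitespace; the function name, the `min_length` threshold, and the line-level granularity are all assumptions, and paraphrased duplicates would not be caught.

```python
from collections import defaultdict


def find_repeated_lines(docs, min_length=20):
    """docs maps filename -> text. Return normalized lines that occur in two or more files."""
    seen = defaultdict(set)
    for name, text in docs.items():
        for line in text.splitlines():
            key = " ".join(line.lower().split())  # normalize case and whitespace
            if len(key) >= min_length:            # skip short lines such as list markers
                seen[key].add(name)
    return {line: sorted(files) for line, files in seen.items() if len(files) > 1}
```

Each reported line is a candidate for keeping in one canonical file and replacing elsewhere with a reference.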

Prefer Tables Over Lists

Before:

```markdown
- The API endpoint for users is /api/users
- The API endpoint for posts is /api/posts
- The API endpoint for comments is /api/comments
```

After:

```markdown
| Resource | Endpoint |
|----------|----------|
| Users | /api/users |
| Posts | /api/posts |
| Comments | /api/comments |
```

Use Patterns Over Examples

Before:

````markdown
To create a user handler:
```javascript
export async function createUser(req, res) {
  const { name, email } = req.body;
  const user = await db.users.create({ name, email });
  res.json(user);
}
```
````

After:

```markdown
Handler pattern: `export async function {action}{Resource}(req, res)`
Body: Extract params → DB operation → Return result
```

Output Checklist

After optimization, verify:
  • KNOWLEDGE.md exists and is under 100 lines
  • Task→knowledge mappings cover common workflows
  • Quick reference has most-used facts
  • Compiled docs are denser than sources
  • No orphaned knowledge (everything indexed)
  • Retrieval hints enable grep-based discovery
  • Original source docs untouched (human reference)
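
Two of the checklist items (manifest size and orphaned knowledge) lend themselves to an automated check. This is a minimal sketch assuming the `.claude/KNOWLEDGE.md` and `.claude/compiled/` layout described earlier; the function name is hypothetical, and "indexed" is approximated as "the file name appears somewhere in the manifest".

```python
from pathlib import Path


def check_knowledge_system(claude_dir, max_lines=100):
    """Return a list of checklist problems; an empty list means both checks pass."""
    root = Path(claude_dir)
    manifest = root / "KNOWLEDGE.md"
    if not manifest.is_file():
        return ["KNOWLEDGE.md is missing"]
    text = manifest.read_text(encoding="utf-8")
    problems = []
    line_count = len(text.splitlines())
    if line_count >= max_lines:
        problems.append(f"KNOWLEDGE.md has {line_count} lines (limit is under {max_lines})")
    compiled = root / "compiled"
    docs = sorted(compiled.glob("*.md")) if compiled.is_dir() else []
    for doc in docs:
        if doc.name not in text:  # orphaned: compiled but never referenced
            problems.append(f"compiled/{doc.name} is not indexed in KNOWLEDGE.md")
    return problems
```

The remaining checklist items, such as whether the task mappings cover common workflows, still need a manual review.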