functional-area-resolver
Functional-Area Resolver: Pattern for Compressing Routing Tables
Problem
Routing files (RESOLVER.md, AGENTS.md) grow as skills are added. Each skill
gets its own row (trigger -> skill path). At ~200+ skills this hits 25-30KB,
eating context budget that should go to actual work.
Solution: Functional-Area Dispatchers
Replace N rows per area with one entry per functional area. Each entry
lists all sub-skills it can dispatch to in a `(dispatcher for: ...)` clause.
Before (270 rows, 25KB)
- Creating/enriching a person or company page -> `enrich`
- Fix broken citations in brain pages -> `citation-fixer`
- Publish/share a brain page as link -> `brain-publish`
- Generate PDF from brain page -> `brain-pdf`
- Read a book through lens of a problem -> `strategic-reading`
- Personalized book analysis -> `book-mirror`
- Brain integrity -> `brain-librarian`
...
After (13 rows, 13KB)
- **Brain & knowledge**: create/enrich/search/export brain pages, filing,
citations, publishing, book analysis, strategic reading, concept synthesis,
archive mining -> `brain-ops` (dispatcher for: enrich, query, brain-pdf,
brain-publish, brain-export, brain-librarian, citation-fixer, book-mirror,
strategic-reading, concept-synthesis, archive-crawler, ...)
Why It Works
The LLM doesn't need one row per sub-skill. It needs:
- Area recognition: "this is about brain pages" -> Brain & Knowledge
- Sub-skill visibility: the `(dispatcher for: ...)` list shows what's available
- The skill file itself: once the LLM reads `brain-ops/SKILL.md`, it has full routing detail
This is a two-layer dispatch: routing file routes to the area, the area
skill routes to the specific sub-skill. Each layer does one job well.
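The two layers can be sketched as two lookups. This is a minimal illustration, not gbrain's implementation: the `AreaEntry` shape, `routeToArea`, `routeWithinArea`, and the keyword matching (which stands in for the LLM's judgment) are all hypothetical.

```typescript
// Hypothetical shapes for illustration; not gbrain's actual data model.
interface AreaEntry {
  area: string;
  triggers: string[];  // broad phrases that identify the area
  dispatcher: string;  // skill whose SKILL.md holds the detailed routing
  subSkills: string[]; // contents of the (dispatcher for: ...) clause
}

const routingFile: AreaEntry[] = [
  {
    area: "Brain & knowledge",
    triggers: ["brain page", "citation", "book"],
    dispatcher: "brain-ops",
    subSkills: ["enrich", "brain-pdf", "citation-fixer", "book-mirror"],
  },
];

// Layer 1: the routing file only has to recognize the area.
function routeToArea(intent: string, entries: AreaEntry[]): AreaEntry | undefined {
  const text = intent.toLowerCase();
  return entries.find((e) => e.triggers.some((t) => text.indexOf(t) !== -1));
}

// Layer 2: the dispatcher's own table (a keyword map standing in for
// brain-ops/SKILL.md) picks the specific sub-skill.
const brainOpsTable: { [keyword: string]: string } = {
  pdf: "brain-pdf",
  citation: "citation-fixer",
};

function routeWithinArea(intent: string, entry: AreaEntry): string {
  const text = intent.toLowerCase();
  for (const keyword of Object.keys(brainOpsTable)) {
    const skill = brainOpsTable[keyword];
    if (text.indexOf(keyword) !== -1 && entry.subSkills.indexOf(skill) !== -1) {
      return skill;
    }
  }
  return entry.dispatcher; // no specific match: stay on the dispatcher itself
}
```

In this sketch an intent such as "Fix broken citations in brain pages" first resolves to the `brain-ops` area and then to `citation-fixer`.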
A/B Eval Results
Three resolver architectures were tested across three Anthropic frontier models
(Opus 4.7, Sonnet 4.6, Haiku 4.5) on real production AGENTS.md content:
20 hand-authored training fixtures + 5 held-out blind fixtures, with n=3 seeded
repeats per (fixture, variant). Two scoring rules were used: STRICT (the predicted
slug exactly equals the expected slug) and LENIENT (the predicted slug is in the same
dispatcher area as the expected one). Both matter:
- STRICT measures: "does the LLM return the exact slug?"
- LENIENT measures: "does the LLM land in the right area, even if it picks a
more-specific sub-skill from `(dispatcher for: ...)`?" This is closer to production behavior: an agent that lands in `gmail` for an email intent succeeds even if the resolver entry said `executive-assistant`.
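The difference between the two rules can be made concrete with a small sketch. The `areas` map and the function names below are hypothetical and are not taken from the harness; they only illustrate the definitions given above.

```typescript
// Hypothetical area map: dispatcher slug -> sub-skills from its
// (dispatcher for: ...) clause. Not the harness's real data structure.
const areas: { [dispatcher: string]: string[] } = {
  "executive-assistant": ["gmail"],
  "brain-ops": ["enrich", "query", "citation-fixer"],
};

// Returns the dispatcher that owns a slug, or undefined if the slug is unknown.
function areaOf(slug: string): string | undefined {
  for (const dispatcher of Object.keys(areas)) {
    if (dispatcher === slug || areas[dispatcher].indexOf(slug) !== -1) {
      return dispatcher;
    }
  }
  return undefined;
}

// STRICT: the predicted slug must equal the expected slug exactly.
function scoreStrict(predicted: string, expected: string): boolean {
  return predicted === expected;
}

// LENIENT: predicted and expected only need to share a dispatcher area.
function scoreLenient(predicted: string, expected: string): boolean {
  const area = areaOf(predicted);
  return area !== undefined && area === areaOf(expected);
}
```

Under these definitions, predicting `gmail` when `executive-assistant` was expected fails STRICT but passes LENIENT.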
Training corpus (n=20, 3 seeds × 3 variants × 3 models, LENIENT)
| Variant | Opus 4.7 | Sonnet 4.6 | Haiku 4.5 | Size |
|---|---|---|---|---|
| baseline (270 bullet rows) | 81.7% ± 7.2% | 86.7% ± 7.2% | 73.3% ± 7.2% | 25KB |
| functional-areas (this pattern) | 98.3% ± 7.2% | 100% ± 0% | 88.3% ± 7.2% | 13KB |
| resolver-of-resolvers (no dispatcher clause) | 63.3% ± 14.3% | 41.7% ± 7.2% | 65.0% ± 12.4% | 10KB |
Held-out blind corpus (n=5, 3 seeds, LENIENT)
| Variant | Opus 4.7 | Sonnet 4.6 | Haiku 4.5 |
|---|---|---|---|
| baseline | 100% ± 0% | 100% ± 0% | 100% ± 0% |
| functional-areas | 100% ± 0% | 100% ± 0% | 100% ± 0% |
| resolver-of-resolvers | 100% ± 0% | 73.3% ± 28.7% | 100% ± 0% |
What the data shows
- Functional-areas beats baseline on training across all three models (+13 to +17pp) while being 48% smaller (13KB vs 25KB). Held-out is saturated at 100% for both, so any difference there is within the margin of error.
- The `(dispatcher for: ...)` clause is the load-bearing signal. resolver-of-resolvers strips that clause and collapses to 41.7% on Sonnet: the catastrophic failure case the original PR predicted, now observed.
- The pattern works because the LLM can drill into the dispatcher list. Most "STRICT failures" are the LLM picking a more-specific sub-skill (`gmail` instead of `executive-assistant`). That's the pattern working as designed. STRICT scoring under-counts; LENIENT scoring reflects production agent behavior.
- The pattern's value varies with model tier. Compression gain (functional-areas vs baseline, training, LENIENT) is +17pp on Opus, +13pp on Sonnet, +15pp on Haiku. Sonnet shows the cleanest separation between functional-areas and resolver-of-resolvers (100% vs 41.7%), which suggests model capacity affects how much the dispatcher signal matters.
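The headline figures follow directly from the training table. The sketch below only redoes that arithmetic; the variable and function names are made up for illustration.

```typescript
// LENIENT training accuracy (%) copied from the training-corpus table.
const baselineAcc = { opus: 81.7, sonnet: 86.7, haiku: 73.3 };
const functionalAcc = { opus: 98.3, sonnet: 100, haiku: 88.3 };

// Percentage-point gain of functional-areas over baseline, rounded to 1 decimal.
function gainPp(after: number, before: number): number {
  return Math.round((after - before) * 10) / 10;
}

const gains = {
  opus: gainPp(functionalAcc.opus, baselineAcc.opus),
  sonnet: gainPp(functionalAcc.sonnet, baselineAcc.sonnet),
  haiku: gainPp(functionalAcc.haiku, baselineAcc.haiku),
};

// Size reduction when going from 25KB to 13KB, as a whole percentage.
const sizeReductionPct = Math.round((1 - 13 / 25) * 100);
```

The gains come out at 16.6, 13.3, and 15 percentage points, which round to the +17, +13, and +15pp quoted above, and the size reduction is 48%.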
Reproduce
```bash
cd evals/functional-area-resolver
node harness.mjs --model opus    # ~225 LLM calls, ~$1.70 at Opus pricing
node harness.mjs --model sonnet  # ~$1.00
node harness.mjs --model haiku   # ~$0.30
node rescore.mjs baseline-runs/2026-05-11-opus-4-7.jsonl  # zero-cost re-score
```
Receipts (model, prompt_template_hash, fixtures_hash, harness_sha, ts):
`evals/functional-area-resolver/baseline-runs/2026-05-11-{opus-4-7,sonnet-4-6,haiku-4-5}.jsonl`.
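The call count and Opus cost quoted in the comments can be sanity-checked with simple arithmetic. The decomposition used here (25 fixtures × 3 variants × 3 seeds) is an inference from the eval description, not something the harness documents, and the per-call figure is the ~$0.0076 Opus estimate given in Step 6. Function names are hypothetical.

```typescript
// Assumed decomposition: (20 training + 5 held-out fixtures) x variants x seeds.
function estimateCalls(fixtures: number, variants: number, seeds: number): number {
  return fixtures * variants * seeds;
}

// Per-call Opus cost taken from the Step 6 estimate (~$0.0076 per call).
const OPUS_COST_PER_CALL_USD = 0.0076;

// Total cost in dollars, rounded to cents.
function estimateCostUsd(calls: number, costPerCall: number): number {
  return Math.round(calls * costPerCall * 100) / 100;
}

// Full three-variant run.
const fullRunCalls = estimateCalls(20 + 5, 3, 3);
const fullRunCost = estimateCostUsd(fullRunCalls, OPUS_COST_PER_CALL_USD);

// Single-variant run, as used when verifying one edited file.
const singleVariantCalls = estimateCalls(20 + 5, 1, 3);
const singleVariantCost = estimateCostUsd(singleVariantCalls, OPUS_COST_PER_CALL_USD);
```

Under these assumptions the full run is 225 calls at about $1.71, and a single-variant run is 75 calls at about $0.57, in line with the figures quoted in this document.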
Methodology caveats
- Production prompt matters. With a naive "return the skill slug" prompt
(no instruction about `(dispatcher for: ...)`), every compression variant collapses to ~30-60% on Opus. The dispatcher-aware prompt is in `evals/functional-area-resolver/harness-runner.ts:PROMPT_TEMPLATE`. Use it as the template for your agent's harness; without it, compression breaks.
- Training corpus and variants were authored in the same release. Held-out corpus was written before the variants and never adjusted; this mitigates but does not eliminate overfitting.
- Confidence intervals are computed via the t-distribution across n=3 seeded repeats. Keep the small n=3 in mind: wide CIs mean the underlying sample is noisy.
- Single-vendor result. All three models are Anthropic. Cross-vendor verification (Gemini, GPT) is a v0.33.x follow-up.
- Held-out blind set is small (n=5). It is saturated at 100% across most cells, so the harness can't distinguish between "100%" and "95% with one nondeterministic miss." Expanding to ≥20 is a v0.33.x follow-up.
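To see why n=3 produces wide intervals, here is a minimal sketch of a 95% t-interval over three per-seed accuracies. It assumes the harness uses the standard sample-standard-deviation formula; the function name and the input values are illustrative and are not taken from the receipts.

```typescript
// Two-sided 95% critical value of Student's t with 2 degrees of freedom (n = 3).
const T_CRIT_DF2 = 4.303;

// Returns the mean and the 95% CI half-width for exactly three per-seed accuracies.
function meanAndHalfWidth(runs: number[]): { mean: number; halfWidth: number } {
  if (runs.length !== 3) throw new Error("this sketch hard-codes n = 3");
  const mean = runs.reduce((sum, x) => sum + x, 0) / runs.length;
  const sumSq = runs.reduce((sum, x) => sum + (x - mean) * (x - mean), 0);
  const sampleSd = Math.sqrt(sumSq / (runs.length - 1));
  const halfWidth = (T_CRIT_DF2 * sampleSd) / Math.sqrt(runs.length);
  return { mean, halfWidth };
}

// Illustrative per-seed accuracies (%): one miss out of 20 fixtures in one seed.
const ci = meanAndHalfWidth([95, 100, 100]);
```

With these inputs the result is roughly 98.3 ± 7.2, the same shape as several cells in the training table: a single miss in a single seed is enough to produce a ±7pp interval.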
Prior work and citations
The pattern is a static-prompt analog of hierarchical agent routing, a
2024-2025 research direction:
- AnyTool (arXiv:2402.04253) showed that a
meta-agent → category-agent → tool-agent hierarchy on 16K APIs beats flat
retrieval by +35.4pp. The `(dispatcher for: ...)` clause is the meta-agent's view collapsed into a single LLM pass.
- RAG-MCP (arXiv:2505.03275) reports 49.2% prompt-token reduction at 3.2× accuracy gain via embedding-based pre-retrieval. The token-reduction story matches ours (48% smaller), via a different mechanism (RAG vs static dispatcher).
- Anthropic Agent Skills (engineering blog) promotes progressive disclosure: frontmatter (~80 tokens) always loaded, SKILL.md body loaded on match. This skill applies the same principle at the routing-table level, not the per-skill body level.
The 2025-2026 literature has no published benchmark for static-prompt
hierarchical routing (every published hierarchical scheme resolves the
hierarchy at runtime via a second LLM call). Our finding, that the
hierarchy can be inlined into a single-LLM-pass dispatcher list and retain
routing accuracy, is the open contribution. See
`evals/functional-area-resolver/README.md` for methodology details.
How To Compress
Step 1: Preconditions
Refuse to compress if either gate fails:
- Source routing file is under 12KB (compression overhead exceeds benefit).
- `git status` shows uncommitted changes to the routing file (the compressor's edit would entangle with whatever the user was doing).
If a user wants to override either gate, they ask explicitly with `--force`.
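The two gates and the `--force` override can be expressed as a small pure function. This is a sketch under assumptions: `GateInput` and `failedGates` are hypothetical names, 12KB is taken to mean 12 × 1024 bytes, and how the caller obtains the `git status` result is left out.

```typescript
const MIN_SIZE_BYTES = 12 * 1024; // the 12KB gate, assuming 1KB = 1024 bytes

interface GateInput {
  sizeBytes: number;              // size of the routing file
  hasUncommittedChanges: boolean; // e.g. derived from `git status` on that file
  force: boolean;                 // the user explicitly passed --force
}

// Returns the list of failed gates; an empty list means compression may proceed.
function failedGates(input: GateInput): string[] {
  if (input.force) return [];
  const failures: string[] = [];
  if (input.sizeBytes < MIN_SIZE_BYTES) failures.push("file under 12KB");
  if (input.hasUncommittedChanges) failures.push("uncommitted changes");
  return failures;
}
```

Returning the failed gates rather than a boolean lets the caller tell the user which precondition blocked the run.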
Step 2: When to compress which file
GBrain workspaces often have TWO routing files merged at runtime (per
`src/core/check-resolvable.ts` v0.31.7): `skills/RESOLVER.md` and a sibling
`../AGENTS.md`. Choose which to compress:
- Only one is fat (>12KB): compress that one; leave the small one alone.
- Both are fat: compress them separately, in order: AGENTS.md first (usually the larger one in OpenClaw-style deployments), then RESOLVER.md.
- Only the usually-smaller file is fat (rare): same rule, compress it.
If the deployment uses only one routing file, this section is a no-op:
compress that one.
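The selection rule reduces to "keep the files over 12KB, AGENTS.md first". A minimal sketch, with a hypothetical `RoutingFile` shape and function name, and 12KB again assumed to be 12 × 1024 bytes:

```typescript
interface RoutingFile {
  path: string;      // e.g. "../AGENTS.md" or "skills/RESOLVER.md"
  sizeBytes: number;
}

const FAT_THRESHOLD_BYTES = 12 * 1024; // ">12KB", assuming 1KB = 1024 bytes

// Returns the paths to compress, with AGENTS.md first when both files qualify.
function compressionOrder(files: RoutingFile[]): string[] {
  const rank = (p: string): number => (p.indexOf("AGENTS.md") !== -1 ? 0 : 1);
  return files
    .filter((f) => f.sizeBytes > FAT_THRESHOLD_BYTES) // filter copies, so the input is not reordered
    .sort((a, b) => rank(a.path) - rank(b.path))
    .map((f) => f.path);
}
```

An empty result means neither file is large enough to be worth compressing.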
Step 3: Identify functional areas
Group skills by domain. Typical areas (adjust per deployment):
- Brain & Knowledge — brain-ops as dispatcher
- Content Ingestion — ingest as dispatcher
- Calendar & Scheduling — google-calendar as dispatcher
- Email & Comms — executive-assistant as dispatcher
- Research & Investigation — perplexity-research as dispatcher
- X/Twitter & Social — x-ingest as dispatcher
- Places & Travel — checkin as dispatcher
- Product & Building — acp-coding as dispatcher
- Infrastructure — healthcheck as dispatcher
- Tasks & Logistics — daily-task-manager as dispatcher
- People & Contacts — google-contacts as dispatcher
Step 4: Build the area entry format
Each area entry follows this template:
- **{Area Name}**: {comma-separated trigger phrases} -> `{dispatcher-skill}`
(dispatcher for: {comma-separated sub-skill names})
Rules:
- Trigger phrases should be broad enough to catch intent ("brain pages, enrich, search, filing, citations, book analysis")
- Sub-skill list should be comprehensive — this is how the LLM knows what's available
- The dispatcher skill file should have its own internal routing table
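As a sketch of the template, the functions below render an entry and parse it back. They are hypothetical helpers, not part of gbrain, and they simplify the template by keeping the `(dispatcher for: ...)` clause on the same line as the arrow. The parser accepts both the ASCII `->` and the Unicode `→` arrow.

```typescript
interface AreaEntry {
  area: string;
  triggers: string[];
  dispatcher: string;
  subSkills: string[];
}

// Renders one entry in the area-entry template, using the ASCII arrow.
function formatAreaEntry(e: AreaEntry): string {
  return (
    "- **" + e.area + "**: " + e.triggers.join(", ") +
    " -> `" + e.dispatcher + "` (dispatcher for: " + e.subSkills.join(", ") + ")"
  );
}

// Parses a single-line entry, accepting either "->" or the Unicode arrow.
// Returns null when the line does not follow the template.
function parseAreaEntry(line: string): AreaEntry | null {
  const m = /^- \*\*(.+?)\*\*: (.+?) (?:->|\u2192) `([^`]+)` \(dispatcher for: (.+)\)$/.exec(line);
  if (m === null) return null;
  const split = (s: string): string[] => s.split(",").map((p) => p.trim());
  return { area: m[1], triggers: split(m[2]), dispatcher: m[3], subSkills: split(m[4]) };
}

const sampleEntry: AreaEntry = {
  area: "Brain & knowledge",
  triggers: ["brain pages", "citations", "book analysis"],
  dispatcher: "brain-ops",
  subSkills: ["enrich", "citation-fixer", "book-mirror"],
};
const sampleLine = formatAreaEntry(sampleEntry);
```

Round-tripping an entry through `formatAreaEntry` and `parseAreaEntry` is a cheap way to catch a malformed clause before running the evals.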
Step 5: Keep always-on entries separate
Gates and always-on entries (acknowledge, multi-user, entity-detector, etc.)
stay as individual rows — they're checked on every message, not dispatched.
Step 6 (MANDATORY): Verify routing accuracy
Run two gates before committing the compressed file. Do NOT commit if either
fails.
Gate 1: Structural verification. Confirms your `routing-eval.jsonl`
fixtures still resolve to the right skills under the compressed routing file.
Run from the workspace whose routing file you just edited:
```bash
gbrain routing-eval --json
```
If accuracy on your fixtures drops below 95%, revert and tune the area
entries before re-running.
Gate 2: LLM A/B verification on YOUR edited file. Confirms a frontier
LLM can still drill into the dispatcher list and reach sub-skills under
your specific compression. Requires a gbrain repo checkout because the
harness lives there. Copy your edited routing file into the harness's
variants directory, then invoke the harness with `--variants` pointing
at it:
```bash
# In your agent workspace, identify the routing file you just compressed.
EDITED=/path/to/your/AGENTS.md   # or skills/RESOLVER.md, whichever you edited

# In your gbrain repo checkout:
cd /path/to/gbrain/evals/functional-area-resolver
TMP=$(mktemp -d)/variants && mkdir -p "$TMP"
cp "$EDITED" "$TMP/my-edit.md"

# Run the harness against your file (sequential, ~75 calls × $0.0076 ≈ $0.57 on Opus).
ANTHROPIC_API_KEY=... node harness.mjs --variants-dir "$TMP" --variants my-edit \
  --model opus --parallel 3 --yes
```
The harness uses gbrain's bundled fixture set, so this verifies "did the LLM
land in the right sub-skill for routing intents the gbrain-bundled fixtures
cover" — a regression check on shared skills, not a full re-eval of YOUR
fixture set. For full eval coverage, mirror this skill's
`fixtures.jsonl` + `fixtures-held-out.jsonl` setup with intents specific
to your skills.
If the lenient (same-area) score on your variant drops below 95%, revert the
compression and tune. Common causes:
- A sub-skill was omitted from the `(dispatcher for: ...)` list.
- Trigger phrases for an area are too narrow (LLM can't recognize intent).
- Areas were collapsed too aggressively (too few areas — see Anti-Patterns).
- ASCII `->` vs Unicode `→` mismatch — the harness now accepts both, but
earlier versions only matched Unicode. Pin gbrain to v0.32.3.0+.
Common false negatives on the harness eval (NOT bugs in your compression):
- The gbrain-bundled fixtures target skill names like `enrich`, `query`,
`gmail`, `executive-assistant`. If your routing file doesn't expose
those skills at all, expect strict-scoring failures on those fixtures.
Lenient scoring stays accurate for any sub-skill present in your
`(dispatcher for: ...)` lists.
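The most mechanical of the common causes, a sub-skill omitted from a `(dispatcher for: ...)` list, can be checked before spending any LLM calls. The sketch below is a hypothetical helper, not part of the harness; it assumes each dispatcher slug is wrapped in backticks immediately before its clause, as in the area-entry template.

```typescript
// Collects every slug reachable from the compressed file: dispatchers plus
// everything named in their (dispatcher for: ...) clauses.
function reachableSlugs(compressed: string): string[] {
  const found: string[] = [];
  const entry = /`([^`]+)`\s*\(dispatcher for: ([^)]*)\)/g;
  let m: RegExpExecArray | null = entry.exec(compressed);
  while (m !== null) {
    found.push(m[1]);
    for (const part of m[2].split(",")) {
      const slug = part.trim();
      if (slug !== "" && slug !== "...") found.push(slug); // skip the elision marker
    }
    m = entry.exec(compressed);
  }
  return found;
}

// Returns skills from the original routing file that the compressed file no longer names.
function omittedSkills(originalSlugs: string[], compressed: string): string[] {
  const reachable = reachableSlugs(compressed);
  return originalSlugs.filter((s) => reachable.indexOf(s) === -1);
}

const compressedSample =
  "- **Brain & knowledge**: brain pages, citations -> `brain-ops`\n" +
  "(dispatcher for: enrich, brain-pdf,\ncitation-fixer, ...)";
const missing = omittedSkills(
  ["enrich", "brain-pdf", "citation-fixer", "book-mirror"],
  compressedSample
);
```

Here `missing` contains only `book-mirror`, the one skill the sample clause leaves out.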
Step 7: Review the diff before committing
Show the user the proposed edit (or the actual git diff) and wait for
explicit approval before staging. Same convention as `skills/book-mirror/SKILL.md`.
Contract
This skill guarantees:
- Routing matches the canonical triggers in the frontmatter.
- Compression is only performed when the preconditions in Step 1 pass (file ≥12KB AND clean working tree, or `--force`).
- The mandatory verification gate in Step 6 fires on the user's edited file, not on sample variants. The user runs `gbrain routing-eval --json` AND the gbrain-repo harness (`node harness.mjs --variants-dir <tmp> --variants my-edit`) before committing the compressed file.
- Privacy contract preserved: no fork-specific filesystem path literals (server-side brain home, OpenClaw fork home) leak into the compressed output.
The full behavior contract is documented in the body sections above; this section exists for the conformance test.
Output Format
The compressed routing file follows the area-entry template documented in Step 4 ("Build the area entry format"). Each entry: ``- **{Area Name}**: {trigger phrases} -> `{dispatcher-skill}` (dispatcher for: {sub-skill list})``. The dispatcher arrow may be either ASCII `->` (default in this template) or Unicode `→` (used in some production deployments); the gbrain harness accepts both.
Anti-Patterns
- Resolver-of-resolvers with pipe tables. Tested and failed (see eval table). The LLM picks area names from the table instead of drilling into sub-skills.
- Removing sub-skill names. Without the `(dispatcher for: ...)` list, the LLM can't route to specific sub-skills. The list is the routing signal.
- Too few areas. Collapsing to <5 areas makes each area too broad. 12-15 areas is the sweet spot.
- Too many areas. Defeats the purpose. If you have 50 areas, just keep individual rows.
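The area-count guidance can be turned into a simple lint. This is only a sketch: the function and verdict names are hypothetical, and treating 50 or more as the "too many" cutoff is an assumption, since the text gives 50 as an example rather than a hard limit.

```typescript
type AreaCountVerdict = "too-few" | "ok" | "sweet-spot" | "too-many";

// Classifies the number of functional areas in a compressed routing file.
function checkAreaCount(count: number): AreaCountVerdict {
  if (count < 5) return "too-few";                       // each area becomes too broad
  if (count >= 50) return "too-many";                    // assumed cutoff; keep individual rows instead
  if (count >= 12 && count <= 15) return "sweet-spot";   // the recommended range
  return "ok";
}
```

Counts that land in the "ok" band (for example 11, as in the typical-areas list in Step 3) are not flagged; only the two extremes are treated as anti-patterns.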
Maintenance
When adding a new skill:
- Identify its functional area.
- Add the skill name to that area's `(dispatcher for: ...)` list.
- Update the area's skill file with routing detail.
- Run the routing eval (Step 6) to verify.
When adding a new functional area:
- Create the dispatcher skill with internal routing.
- Add the area entry to the routing file.
- Run the routing eval (Step 6) to verify.
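The "add the skill name to the area's list" step can be sketched as a string edit on the routing file. `addSubSkill` is a hypothetical helper and `slack-digest` is a made-up skill name; the sketch assumes every dispatcher entry carries its own `(dispatcher for: ...)` clause and simply appends the new name at the end of that clause.

```typescript
// Adds a sub-skill to the (dispatcher for: ...) clause of the entry that routes
// to `dispatcher`. Returns the text unchanged if the skill is already listed or
// the dispatcher entry cannot be found.
function addSubSkill(routing: string, dispatcher: string, skill: string): string {
  const clauseStart = "(dispatcher for:";
  const entryIdx = routing.indexOf("`" + dispatcher + "`");
  if (entryIdx === -1) return routing;
  const openIdx = routing.indexOf(clauseStart, entryIdx);
  if (openIdx === -1) return routing;
  const closeIdx = routing.indexOf(")", openIdx);
  if (closeIdx === -1) return routing;
  const listed = routing
    .slice(openIdx + clauseStart.length, closeIdx)
    .split(",")
    .map((s) => s.trim());
  if (listed.indexOf(skill) !== -1) return routing; // already present: no-op
  return routing.slice(0, closeIdx) + ", " + skill + routing.slice(closeIdx);
}

const routingSample =
  "- **Email & comms**: email, inbox -> `executive-assistant` (dispatcher for: gmail)";
const updated = addSubSkill(routingSample, "executive-assistant", "slack-digest");
```

The function is idempotent, so running it twice for the same skill leaves the file unchanged; the dispatcher's own skill file still needs its routing detail updated by hand.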
Changelog
v1.0.0 — 2026-05-11
- Initial version. Pattern shipped in gbrain v0.32.3.0 with a held-out A/B
eval (see `evals/functional-area-resolver/`).
- Skill renamed from `compress-agents-md` to `functional-area-resolver` pre-release; the contribution is the pattern, not the filename.