functional-area-resolver

Functional-Area Resolver — Pattern for Compressing Routing Tables

Problem

Routing files (RESOLVER.md, AGENTS.md) grow as skills are added. Each skill gets its own row (trigger -> skill path). At ~200+ skills this hits 25-30KB, eating context budget that should go to actual work.

Solution: Functional-Area Dispatchers

Replace N rows per area with one entry per functional area. Each entry lists all sub-skills it can dispatch to in a `(dispatcher for: ...)` clause.

Before (270 rows, 25KB)

- Creating/enriching a person or company page -> `enrich`
- Fix broken citations in brain pages -> `citation-fixer`
- Publish/share a brain page as link -> `brain-publish`
- Generate PDF from brain page -> `brain-pdf`
- Read a book through lens of a problem -> `strategic-reading`
- Personalized book analysis -> `book-mirror`
- Brain integrity -> `brain-librarian`
...

After (13 rows, 13KB)

- **Brain & knowledge**: create/enrich/search/export brain pages, filing,
  citations, publishing, book analysis, strategic reading, concept synthesis,
  archive mining -> `brain-ops` (dispatcher for: enrich, query, brain-pdf,
  brain-publish, brain-export, brain-librarian, citation-fixer, book-mirror,
  strategic-reading, concept-synthesis, archive-crawler, ...)

Why It Works

The LLM doesn't need one row per sub-skill. It needs:
  1. Area recognition: "this is about brain pages" -> Brain & Knowledge
  2. Sub-skill visibility: the `(dispatcher for: ...)` list shows what's available
  3. The skill file itself: once the LLM reads `brain-ops/SKILL.md`, it has full routing detail
This is a two-layer dispatch: the routing file routes to the area, and the area skill routes to the specific sub-skill. Each layer does one job well.

A/B Eval Results

Three resolver architectures were tested across three Anthropic frontier models (Opus 4.7, Sonnet 4.6, Haiku 4.5) on real production AGENTS.md content: 20 hand-authored training fixtures plus 5 held-out blind fixtures, with n=3 seeded repeats per (fixture, variant). Two scoring rules were used: STRICT (predicted slug exactly equals expected) and LENIENT (predicted slug is in the same dispatcher area as expected). Both matter:
  • STRICT measures: "does the LLM return the exact slug?"
  • LENIENT measures: "does the LLM land in the right area, even if it picks a more-specific sub-skill from `(dispatcher for: ...)`?" This is closer to production behavior: an agent that lands in `gmail` for an email intent succeeds even if the resolver entry said `executive-assistant`.
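The two scoring rules can be sketched in a few lines of Python. This is a minimal illustration, not the harness's actual code: the `AREAS` map and the function names are hypothetical, and the map only contains enough entries to reproduce the `gmail` / `executive-assistant` example above.

```python
# Hypothetical area map: dispatcher slug -> sub-skills named in its
# "(dispatcher for: ...)" clause. Not taken from the real routing file.
AREAS = {
    "executive-assistant": ["gmail"],
    "brain-ops": ["enrich", "query", "citation-fixer"],
}

def area_of(slug, areas):
    """Return the dispatcher slug whose area contains `slug`, or None if unknown."""
    for dispatcher, sub_skills in areas.items():
        if slug == dispatcher or slug in sub_skills:
            return dispatcher
    return None

def strict_match(predicted, expected):
    """STRICT: the predicted slug must equal the expected slug exactly."""
    return predicted == expected

def lenient_match(predicted, expected, areas):
    """LENIENT: the predicted slug must fall in the same area as the expected slug."""
    if predicted == expected:
        return True
    area = area_of(predicted, areas)
    return area is not None and area == area_of(expected, areas)

print(strict_match("gmail", "executive-assistant"))          # False
print(lenient_match("gmail", "executive-assistant", AREAS))  # True
```

Under these assumptions a `gmail` prediction for an `executive-assistant` fixture counts as a STRICT miss and a LENIENT hit, which is the gap the two rules are meant to expose.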

Training corpus (n=20, 3 seeds × 3 variants × 3 models, LENIENT)

| Variant | Opus 4.7 | Sonnet 4.6 | Haiku 4.5 | Size |
|---|---|---|---|---|
| baseline (270 bullet rows) | 81.7% ± 7.2% | 86.7% ± 7.2% | 73.3% ± 7.2% | 25KB |
| functional-areas (this pattern) | 98.3% ± 7.2% | 100% ± 0% | 88.3% ± 7.2% | 13KB |
| resolver-of-resolvers (no dispatcher clause) | 63.3% ± 14.3% | 41.7% ± 7.2% | 65.0% ± 12.4% | 10KB |

Held-out blind corpus (n=5, 3 seeds, LENIENT)

| Variant | Opus 4.7 | Sonnet 4.6 | Haiku 4.5 |
|---|---|---|---|
| baseline | 100% ± 0% | 100% ± 0% | 100% ± 0% |
| functional-areas | 100% ± 0% | 100% ± 0% | 100% ± 0% |
| resolver-of-resolvers | 100% ± 0% | 73.3% ± 28.7% | 100% ± 0% |

What the data shows

  1. Functional-areas beats baseline on training across all three models (+13 to +17pp) at roughly half the size (13KB vs 25KB, 48% smaller). Held-out is saturated at 100% for both, so any difference there is within the margin of error.
  2. The `(dispatcher for: ...)` clause is the load-bearing signal. resolver-of-resolvers strips that clause and collapses to 41.7% on Sonnet, the catastrophic failure case the original PR predicted, now observed.
  3. The pattern works because the LLM can drill into the dispatcher list. Most "STRICT failures" are the LLM picking a more-specific sub-skill (`gmail` instead of `executive-assistant`). That's the pattern working as designed. STRICT scoring under-counts; LENIENT scoring reflects production agent behavior.
  4. The pattern's value varies with model tier. Compression gain (functional-areas vs baseline, training, LENIENT) is +17pp on Opus, +13pp on Sonnet, +15pp on Haiku. Sonnet shows the cleanest separation between functional-areas and resolver-of-resolvers (100% vs 41.7%), which suggests model capacity affects how much the dispatcher signal matters.

Reproduce

```bash
cd evals/functional-area-resolver
node harness.mjs --model opus    # ~225 LLM calls, ~$1.70 at Opus pricing
node harness.mjs --model sonnet  # ~$1.00
node harness.mjs --model haiku   # ~$0.30
node rescore.mjs baseline-runs/2026-05-11-opus-4-7.jsonl  # zero-cost re-score
```
Receipts (model, prompt_template_hash, fixtures_hash, harness_sha, ts): `evals/functional-area-resolver/baseline-runs/2026-05-11-{opus-4-7,sonnet-4-6,haiku-4-5}.jsonl`.
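The call counts and costs quoted in this document appear consistent with a simple product of fixtures, seeds, and variants. The Python sketch below checks that arithmetic. It assumes every run covers both corpora (20 training + 5 held-out fixtures) and uses the per-call Opus figure of $0.0076 quoted in Step 6; the function name is hypothetical.

```python
def llm_calls(fixtures, seeds, variants):
    """Number of LLM calls for one harness run: one call per (fixture, seed, variant)."""
    return fixtures * seeds * variants

OPUS_COST_PER_CALL = 0.0076  # USD, the per-call figure quoted in Step 6

full_run = llm_calls(fixtures=20 + 5, seeds=3, variants=3)        # all three variants
single_variant = llm_calls(fixtures=20 + 5, seeds=3, variants=1)  # one edited file

print(full_run, round(full_run * OPUS_COST_PER_CALL, 2))              # 225 1.71
print(single_variant, round(single_variant * OPUS_COST_PER_CALL, 2))  # 75 0.57
```

Under these assumptions the full run comes to 225 calls and about $1.71, matching the "~225 LLM calls, ~$1.70" comment above, and a single-variant run comes to 75 calls and about $0.57.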

Methodology caveats

  • Production prompt matters. With a naive "return the skill slug" prompt (no instruction about `(dispatcher for: ...)`), every compression variant collapses to ~30-60% on Opus. The dispatcher-aware prompt is in `evals/functional-area-resolver/harness-runner.ts:PROMPT_TEMPLATE`. Use it as the template for your agent's harness; without it, compression breaks.
  • Training corpus and variants were authored in the same release. The held-out corpus was written before the variants and never adjusted; this mitigates but does not eliminate overfitting.
  • Confidence intervals are computed via the t-distribution across n=3 seeded repeats. Treat n=3 as a lower bound: wide CIs mean the underlying sample is noisy.
  • Single-vendor result. All three models are Anthropic. Cross-vendor verification (Gemini, GPT) is a v0.33.x follow-up.
  • Held-out blind set is small (n=5). It is saturated at 100% across most cells, so the harness can't distinguish between "100%" and "95% with one nondeterministic miss." Expanding to ≥20 is a v0.33.x follow-up.
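To make the confidence-interval caveat concrete, here is a minimal Python sketch of a t-distribution interval over three seeded repeats. The 95% confidence level is an assumption (the text above does not state the level), the per-seed scores are illustrative values rather than the recorded run data, and the function name is hypothetical.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t with 2 degrees of freedom
# (n=3 repeats). The 95% level is an assumption, not stated in the source.
T_CRIT_95_DF2 = 4.303

def t_confidence_interval(scores, t_crit=T_CRIT_95_DF2):
    """Return (mean, half_width) for a list of per-seed accuracy scores in percent."""
    mean = statistics.mean(scores)
    std_err = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, t_crit * std_err

# Illustrative per-seed accuracies on a 20-fixture corpus (NOT the recorded runs):
# 16/20, 16/20 and 17/20 correct.
mean, half_width = t_confidence_interval([80.0, 80.0, 85.0])
print(f"{mean:.1f}% ± {half_width:.1f}%")  # 81.7% ± 7.2%
```

With only two degrees of freedom the critical value is large, so a single fixture flipping between seeds is enough to produce an interval several points wide. That is why wide intervals here say more about n=3 than about the variant.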

Prior work and citations

The pattern is a static-prompt analog of hierarchical agent routing, a 2024-2025 research direction:
  • AnyTool (arXiv:2402.04253) showed that a meta-agent → category-agent → tool-agent hierarchy on 16K APIs beats flat retrieval by +35.4pp. The `(dispatcher for: ...)` clause is the meta-agent's view collapsed into a single LLM pass.
  • RAG-MCP (arXiv:2505.03275) reports a 49.2% prompt-token reduction with a 3.2× accuracy gain via embedding-based pre-retrieval. The token-reduction story matches ours (48% smaller), via a different mechanism (RAG vs static dispatcher).
  • Anthropic Agent Skills (engineering blog) promotes progressive disclosure: frontmatter (~80 tokens) always loaded, SKILL.md body loaded on match. This skill applies the same principle at the routing-table level, not the per-skill body level.
The 2025-2026 literature has no published benchmark for static-prompt hierarchical routing (every published hierarchical scheme resolves the hierarchy at runtime via a second LLM call). Our finding, that the hierarchy can be inlined into a single-LLM-pass dispatcher list and retain routing accuracy, is the open contribution. See `evals/functional-area-resolver/README.md` for methodology details.

How To Compress

Step 1: Preconditions

Refuse to compress if either of these conditions holds:
  • The source routing file is under 12KB (compression overhead exceeds benefit).
  • `git status` shows uncommitted changes to the routing file (the compressor's edit would entangle with whatever the user was doing).
If a user wants to override either gate, they must ask explicitly with `--force`.
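The two gates can be sketched as a small Python check. This is one possible shape, not the skill's actual implementation: the function names are hypothetical, 12KB is assumed to mean 12 × 1024 bytes, and a dirty file is detected by asking `git status --porcelain` about that one path.

```python
import subprocess
from pathlib import Path

MIN_SIZE_BYTES = 12 * 1024  # the 12KB threshold, assuming 1KB = 1024 bytes

def precondition_failures(size_bytes, porcelain_output, force=False):
    """Return the reasons to refuse compression (an empty list means proceed)."""
    if force:
        return []  # an explicit --force overrides both gates
    failures = []
    if size_bytes < MIN_SIZE_BYTES:
        failures.append("routing file is under 12KB")
    if porcelain_output.strip():
        failures.append("routing file has uncommitted changes")
    return failures

def check_routing_file(path, force=False):
    """Collect the file size and git state for `path`, then apply the gates."""
    status = subprocess.run(
        ["git", "status", "--porcelain", "--", str(path)],
        capture_output=True, text=True, check=True,
    ).stdout
    return precondition_failures(Path(path).stat().st_size, status, force)
```

Keeping the decision in a pure function (`precondition_failures`) and the I/O in a thin wrapper makes the gate logic easy to test without a git repository.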

Step 2: When to compress which file

GBrain workspaces often have TWO routing files merged at runtime (per `src/core/check-resolvable.ts`, v0.31.7): `skills/RESOLVER.md` and a sibling `../AGENTS.md`. Choose which to compress:
  • Only one is fat (>12KB): compress that one; leave the small one alone.
  • Both are fat: compress them separately, in order: AGENTS.md first (usually the larger one in OpenClaw-style deployments), then RESOLVER.md.
  • Only the usually smaller one is fat (rare): same rule; compress it.
If the deployment uses only one routing file, this section is a no-op: compress that one.
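The selection rule reduces to "compress every fat file, AGENTS.md first". A minimal Python sketch follows. The function name and the size-dictionary input are hypothetical, and 12KB is assumed to mean 12 × 1024 bytes. Note that the sketch applies the size threshold even when only one routing file exists, which mirrors the Step 1 gate.

```python
FAT_THRESHOLD_BYTES = 12 * 1024  # ">12KB", assuming 1KB = 1024 bytes

def files_to_compress(sizes):
    """Given {filename: size_in_bytes}, return the files to compress, in order."""
    order = ["AGENTS.md", "RESOLVER.md"]  # AGENTS.md goes first when both are fat
    return [name for name in order if sizes.get(name, 0) > FAT_THRESHOLD_BYTES]

print(files_to_compress({"AGENTS.md": 26_000, "RESOLVER.md": 14_000}))
# ['AGENTS.md', 'RESOLVER.md']
print(files_to_compress({"AGENTS.md": 4_000, "RESOLVER.md": 14_000}))
# ['RESOLVER.md']
```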

Step 3: Identify functional areas

Group skills by domain. Typical areas (adjust per deployment):
  • Brain & Knowledge — brain-ops as dispatcher
  • Content Ingestion — ingest as dispatcher
  • Calendar & Scheduling — google-calendar as dispatcher
  • Email & Comms — executive-assistant as dispatcher
  • Research & Investigation — perplexity-research as dispatcher
  • X/Twitter & Social — x-ingest as dispatcher
  • Places & Travel — checkin as dispatcher
  • Product & Building — acp-coding as dispatcher
  • Infrastructure — healthcheck as dispatcher
  • Tasks & Logistics — daily-task-manager as dispatcher
  • People & Contacts — google-contacts as dispatcher

Step 4: Build the area entry format

Each area entry follows this template:
- **{Area Name}**: {comma-separated trigger phrases} -> `{dispatcher-skill}`
  (dispatcher for: {comma-separated sub-skill names})
Rules:
  • Trigger phrases should be broad enough to catch intent ("brain pages, enrich, search, filing, citations, book analysis")
  • Sub-skill list should be comprehensive — this is how the LLM knows what's available
  • The dispatcher skill file should have its own internal routing table
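The template above is regular enough to parse and render mechanically. The Python sketch below shows one way to do that, accepting both the ASCII `->` and Unicode `→` arrows. The regular expression and the function names are illustrative assumptions, not the parser gbrain actually uses.

```python
import re

# Matches one area entry. Whitespace, including the line break before the
# dispatcher clause, is matched loosely; both arrow styles are accepted.
ENTRY_RE = re.compile(
    r"-\s+\*\*(?P<area>[^*]+)\*\*:\s*(?P<triggers>.+?)\s*(?:->|→)\s*"
    r"`(?P<dispatcher>[^`]+)`\s*\(dispatcher for:\s*(?P<subs>[^)]*)\)",
    re.DOTALL,
)

def split_list(text):
    """Split a comma-separated list and drop empty items."""
    return [item.strip() for item in text.split(",") if item.strip()]

def parse_area_entry(entry):
    """Parse one area entry into its four template fields."""
    match = ENTRY_RE.search(entry)
    if match is None:
        raise ValueError("not a valid area entry")
    return {
        "area": match["area"].strip(),
        "triggers": split_list(match["triggers"]),
        "dispatcher": match["dispatcher"],
        "sub_skills": split_list(match["subs"]),
    }

def render_area_entry(area, triggers, dispatcher, sub_skills):
    """Render the four fields back into the two-line template."""
    return (f"- **{area}**: {', '.join(triggers)} -> `{dispatcher}`\n"
            f"  (dispatcher for: {', '.join(sub_skills)})")

entry = render_area_entry(
    "Brain & knowledge", ["brain pages", "enrich", "citations"],
    "brain-ops", ["enrich", "query", "brain-pdf"],
)
print(parse_area_entry(entry)["sub_skills"])  # ['enrich', 'query', 'brain-pdf']
```

A round trip through `render_area_entry` and `parse_area_entry` is a cheap way to confirm that a hand-edited entry still carries its full sub-skill list.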

Step 5: Keep always-on entries separate

Gates and always-on entries (acknowledge, multi-user, entity-detector, etc.) stay as individual rows — they're checked on every message, not dispatched.

Step 6 (MANDATORY): Verify routing accuracy

Run two gates before committing the compressed file. Do NOT commit if either fails.
Gate 1: Structural verification. Confirms your `routing-eval.jsonl` fixtures still resolve to the right skills under the compressed routing file. Run from the workspace whose routing file you just edited:
```bash
gbrain routing-eval --json
```
If accuracy on your fixtures drops below 95%, revert and tune the area entries before re-running.
Gate 2: LLM A/B verification on YOUR edited file. Confirms a frontier LLM can still drill into the dispatcher list and reach sub-skills under your specific compression. Requires a gbrain repo checkout because the harness lives there. Copy your edited routing file into the harness's variants directory, then invoke the harness with `--variants` pointing at it:
```bash
# In your agent workspace, identify the routing file you just compressed.
EDITED=/path/to/your/AGENTS.md  # or skills/RESOLVER.md, whichever you edited

# In your gbrain repo checkout:
cd /path/to/gbrain/evals/functional-area-resolver
TMP=$(mktemp -d)/variants && mkdir -p "$TMP"
cp "$EDITED" "$TMP/my-edit.md"

# Run the harness against your file (sequential, ~75 calls × $0.0076 ≈ $0.57 on Opus).
ANTHROPIC_API_KEY=... node harness.mjs --variants-dir "$TMP" --variants my-edit \
  --model opus --parallel 3 --yes
```

The harness uses gbrain's bundled fixture set, so this verifies "did the LLM
land in the right sub-skill for routing intents the gbrain-bundled fixtures
cover" — a regression check on shared skills, not a full re-eval of YOUR
fixture set. For full eval coverage, mirror this skill's
`fixtures.jsonl` + `fixtures-held-out.jsonl` setup with intents specific
to your skills.

If the lenient (same-area) score on your variant drops below 95%, revert the
compression and tune. Common causes:
- A sub-skill was omitted from the `(dispatcher for: ...)` list.
- Trigger phrases for an area are too narrow (LLM can't recognize intent).
- Areas were collapsed too aggressively (too few areas — see Anti-Patterns).
- ASCII `->` vs Unicode `→` mismatch — the harness now accepts both, but
  earlier versions only matched Unicode. Pin gbrain to v0.32.3.0+.

Common false negatives on the harness eval (NOT bugs in your compression):
- The gbrain-bundled fixtures target skill names like `enrich`, `query`,
  `gmail`, `executive-assistant`. If your routing file doesn't expose
  those skills at all, expect strict-scoring failures on those fixtures.
  Lenient scoring stays accurate for any sub-skill present in your
  `(dispatcher for: ...)` lists.

Step 7: Review the diff before committing

Show the user the proposed edit (or the actual git diff) and wait for explicit approval before staging. Same convention as `skills/book-mirror/SKILL.md`.

Contract

This skill guarantees:
  • Routing matches the canonical triggers in the frontmatter.
  • Compression is only performed when the preconditions in Step 1 pass (file ≥12KB AND clean working tree, or `--force`).
  • The mandatory verification gate in Step 6 fires on the user's edited file, not on sample variants. The user runs `gbrain routing-eval --json` AND the gbrain-repo harness (`node harness.mjs --variants-dir <tmp> --variants my-edit`) before committing the compressed file.
  • Privacy contract preserved: no fork-specific filesystem path literals (server-side brain home, OpenClaw fork home) leak into the compressed output.
The full behavior contract is documented in the body sections above; this section exists for the conformance test.

Output Format

The compressed routing file follows the area-entry template documented in Step 4 ("Build the area entry format"). Each entry: ``- **{Area Name}**: {trigger phrases} -> `{dispatcher-skill}` (dispatcher for: {sub-skill list})``. The dispatcher arrow may be either ASCII `->` (default in this template) or Unicode `→` (used in some production deployments); the gbrain harness accepts both.

Anti-Patterns

  • Resolver-of-resolvers with pipe tables. Tested and failed (see eval table). The LLM picks area names from the table instead of drilling into sub-skills.
  • Removing sub-skill names. Without the `(dispatcher for: ...)` list, the LLM can't route to specific sub-skills. The list is the routing signal.
  • Too few areas. Collapsing to <5 areas makes each area too broad. 12-15 areas is the sweet spot.
  • Too many areas. Defeats the purpose. If you have 50 areas, just keep individual rows.
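Several of these anti-patterns can be caught by a simple lint pass before running the evals. The Python sketch below is one possible check, not part of gbrain: the function name is hypothetical, it assumes area entries start with `- **Area Name**:` as in the Step 4 template, and it treats fewer than 5 or at least 50 areas as the warning thresholds mentioned above. Always-on rows that do not use the bold-name format are ignored.

```python
import re

ENTRY_START = re.compile(r"^- \*\*[^*]+\*\*:", re.MULTILINE)
# A dispatcher clause with at least one non-blank item in it.
DISPATCHER_CLAUSE = re.compile(r"\(dispatcher for:\s*[^)\s][^)]*\)")

def lint_routing_file(text, min_areas=5, max_areas=50):
    """Return a list of warnings for the anti-patterns listed above."""
    starts = [m.start() for m in ENTRY_START.finditer(text)]
    warnings = []
    if len(starts) < min_areas:
        warnings.append(f"too few areas ({len(starts)}): each area becomes too broad")
    if len(starts) >= max_areas:
        warnings.append(f"too many areas ({len(starts)}): keep individual rows instead")
    # Each entry spans from its own start to the start of the next entry.
    for begin, end in zip(starts, starts[1:] + [len(text)]):
        entry = text[begin:end]
        if not DISPATCHER_CLAUSE.search(entry):
            name = entry.split("**")[1]
            warnings.append(f"area '{name}' has no (dispatcher for: ...) list")
    return warnings
```

A clean file with 13 well-formed entries returns an empty list; an entry missing its dispatcher clause is reported by area name.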

Maintenance

When adding a new skill:
  1. Identify its functional area.
  2. Add the skill name to that area's `(dispatcher for: ...)` list.
  3. Update the area's skill file with routing detail.
  4. Run the routing eval (Step 6) to verify.
When adding a new functional area:
  1. Create the dispatcher skill with internal routing.
  2. Add the area entry to the routing file.
  3. Run the routing eval (Step 6) to verify.
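Adding a skill to an area's dispatcher list is a small text edit that can be scripted. The Python sketch below appends a name to the list of a given dispatcher and leaves the rest of the file untouched. The function name is hypothetical and it assumes entries follow the Step 4 template with the ASCII or Unicode arrow; it does not update the dispatcher's own skill file or run the evals.

```python
import re

def add_sub_skill(routing_text, dispatcher, new_skill):
    """Append `new_skill` to the (dispatcher for: ...) list of `dispatcher`'s entry."""
    pattern = re.compile(
        r"(`" + re.escape(dispatcher) + r"`\s*\(dispatcher for:\s*)([^)]*)(\))"
    )
    match = pattern.search(routing_text)
    if match is None:
        raise ValueError(f"no area entry dispatches to {dispatcher!r}")
    existing = [s.strip() for s in match.group(2).split(",") if s.strip()]
    if new_skill in existing:
        return routing_text  # already listed; nothing to do
    if existing:
        updated = match.group(2).rstrip() + ", " + new_skill
    else:
        updated = new_skill
    # Splice the updated list back in, preserving everything around it.
    return routing_text[:match.start(2)] + updated + routing_text[match.end(2):]

text = "- **Brain & knowledge**: brain pages -> `brain-ops`\n  (dispatcher for: enrich, query)\n"
print(add_sub_skill(text, "brain-ops", "brain-pdf"))
```

The edit is idempotent, so re-running it for a skill that is already listed returns the file unchanged.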

Changelog

v1.0.0 — 2026-05-11

  • Initial version. Pattern shipped in gbrain v0.32.3.0 with a held-out A/B eval (see `evals/functional-area-resolver/`).
  • Skill renamed from `compress-agents-md` to `functional-area-resolver` pre-release; the contribution is the pattern, not the filename.