schema-author — evolve your schema pack

Non-goals (use these other skills instead)

This skill AUTHORS the schema pack (adds page types, link verbs, prefixes, flags). For these adjacent jobs, route elsewhere:

Filing one specific page → `skills/brain-taxonomist/SKILL.md`. Brain-taxonomist routes at WRITE TIME ("where does this note go?"). schema-author changes the rules at AUTHORING TIME ("what types and prefixes exist?").
Schema-check as part of EIIRP iteration → `skills/eiirp/SKILL.md` already has a schema-check phase. Don't duplicate.
Just looking up a type's settings → `gbrain schema explain <type>` directly. This skill is for CHANGING the pack, not READING from it.
Querying who knows about X → `skills/expert-routing/SKILL.md` (or `gbrain whoknows` directly). schema-author makes a type expert-routable; it does not run the query.

Convention

Convention: see conventions/brain-first.md for the lookup chain (search → query → get_page → external).

Convention: see conventions/schema-evolution.md for "when to add a type vs alias vs prefix" — the heuristic.

When to invoke

Invoke when the user (or a sibling skill) says any of:

"Add a `researcher` type to my schema"
"I have 4000 untyped pages under `meetings/`"
"My brain doesn't know that `journal-article` is a type"
"Set `paper` to be extractable"
"Propose types from what I've ingested"
"Sync the new types to backfill existing pages"

DON'T invoke for "where does THIS note go" (use brain-taxonomist) or "who knows about X" (use expert-routing / `gbrain whoknows`).

Tutorial + vision

Why this matters: `docs/what-schemas-unlock.md` covers 7 killer use cases (4000 invisible meetings made queryable, founder ops brain, research brain, legal brain, team brain, agent-as-co-curator) plus the structural argument for why types matter at query time. Read this before pitching schema authoring to a user; it's the doc that explains the difference between a pile of notes and a brain with structure.
5-minute walkthrough: `docs/schema-author-tutorial.md` walks through forking the bundled pack, adding a researcher type, syncing, and proving the T1.5 wiring via `gbrain whoknows`. Use placeholder pages so it runs against any brain without affecting real content.

Workflow

Phase 1 — Brain (know which pack is active)

gbrain schema active --json

Output gives you `pack_name`, `version`, `sha8`, `page_types_count`, and `source_tier`. If `source_tier === "default"`, the user is on bundled `gbrain-base` and any mutation will need a fork first (Phase 4).
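
A rough sketch of that check, assuming the output of `gbrain schema active --json` has been captured as text. The field names come from the paragraph above; the sample values and the `needs_fork` helper are hypothetical and not part of gbrain.

```python
import json

# Hypothetical sample of `gbrain schema active --json` output.
# Field names come from the text above; the values are invented.
SAMPLE = """
{"pack_name": "gbrain-base", "version": "1.0.0", "sha8": "abc12345",
 "page_types_count": 12, "source_tier": "default"}
"""


def needs_fork(active: dict) -> bool:
    """Return True when the active pack is the bundled default tier."""
    return active.get("source_tier") == "default"


active = json.loads(SAMPLE)
if needs_fork(active):
    print(f"{active['pack_name']} is bundled: fork before mutating (Phase 4)")
```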

Phase 2 — Assess (what does the current pack cover?)

gbrain schema stats --json

Returns per-type page counts, untyped count, and `dead_prefixes` (pack-declared prefixes with zero matching pages, which are probably mis-declarations). If coverage < 90%, there's untyped content worth typing.
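
The coverage threshold can be checked with a few lines. This is a minimal sketch that assumes coverage means typed pages divided by total pages, which matches the sample numbers in the Output Format section; `coverage` is a hypothetical helper.

```python
def coverage(aggregate: dict) -> float:
    """Fraction of pages that carry a type; 1.0 for an empty brain."""
    total = aggregate["total_pages"]
    if total == 0:
        return 1.0
    return aggregate["typed_pages"] / total


# Numbers borrowed from the stats example in the Output Format section.
aggregate = {"total_pages": 4823, "typed_pages": 4710, "untyped_pages": 113}
ratio = coverage(aggregate)
print(f"coverage = {ratio:.4f}")  # coverage = 0.9766
print("worth typing" if ratio < 0.90 else "coverage is fine")
```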

gbrain schema review-orphans --limit 50 --json

Untyped pages drilldown. Look for shared path prefixes (e.g. "12 of these are under `research/papers/`"); those are candidates for a new type.

Phase 3 — Propose (what types should the pack add?)

gbrain schema detect --json

Clusters pages by `source_path` and proposes candidate types. Heuristic only (no LLM call).

gbrain schema suggest --json

LLM-refined candidates with confidence scores. Use the top-3 hit rate as the signal for which to promote.

Phase 4 — Apply (mutate the pack)

If the active pack is bundled (`gbrain-base` or `gbrain-recommended`), fork it first:

gbrain schema fork gbrain-base mine
gbrain schema use mine

Then add the types one at a time:

gbrain schema add-type researcher \
  --primitive entity \
  --prefix people/researchers/ \
  --extractable \
  --expert

For complex multi-mutation refactors (e.g. add a type AND the link verb that points to it), agents reaching this surface over MCP can use the batched `schema_apply_mutations` op. The mutations below are written as JSONL, one per line:

{"op": "add_type", "name": "researcher", "primitive": "entity", "prefix": "people/researchers/", "extractable": true, "expert_routing": true}
{"op": "add_type", "name": "paper", "primitive": "annotation", "prefix": "research/papers/", "extractable": true}
{"op": "add_link_type", "name": "authored", "inference": {"page_type": "researcher", "target_type": "paper"}}
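
Before sending a batch like the one above, it can help to confirm that every type a link verb points at is declared either in the pack or earlier in the same batch. The sketch below is an assumption about how such a pre-check might look; `parse_batch` and `dangling_types` are hypothetical helpers, not gbrain functions.

```python
import json

BATCH = """\
{"op": "add_type", "name": "researcher", "primitive": "entity", "prefix": "people/researchers/", "extractable": true, "expert_routing": true}
{"op": "add_type", "name": "paper", "primitive": "annotation", "prefix": "research/papers/", "extractable": true}
{"op": "add_link_type", "name": "authored", "inference": {"page_type": "researcher", "target_type": "paper"}}
"""


def parse_batch(text: str) -> list[dict]:
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def dangling_types(mutations: list[dict], known: set[str]) -> set[str]:
    """Types a link verb points at that neither the pack nor the batch declares."""
    declared = known | {m["name"] for m in mutations if m["op"] == "add_type"}
    referenced: set[str] = set()
    for m in mutations:
        if m["op"] == "add_link_type":
            referenced.update(m["inference"].values())
    return referenced - declared


mutations = parse_batch(BATCH)
print(len(mutations), dangling_types(mutations, known=set()))  # 3 set()
```

Dropping the two `add_type` lines from the batch would make `dangling_types` report both `researcher` and `paper`.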

Validate before sync:

gbrain schema lint --with-db

The `--with-db` flag opts into the 2 DB-aware rules (`extractable_empty_corpus` and `mutation_count_anomaly`) that detect mis-declared types you'd otherwise discover only at runtime.

Phase 5 — Sync (backfill existing pages with the new types)

Dry-run first:

gbrain schema sync --json

Returns per-prefix `would_apply` counts + sample slugs. If the numbers look right:

gbrain schema sync --apply

Chunked UPDATE in 1000-row batches; never wedges concurrent writers. Idempotent on re-run (a second `--apply` finds nothing to backfill).
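
To illustrate why a chunked backfill is idempotent, here is a minimal sketch against an in-memory SQLite table. The table layout, the `backfill` helper, and the use of SQLite are all assumptions made for illustration; the real sync may use a different database and schema.

```python
import sqlite3


def backfill(conn, page_type, prefix, batch_size=1000):
    """Set type on untyped pages under prefix, one batch per transaction."""
    updated = 0
    while True:
        cur = conn.execute(
            "UPDATE pages SET type = ? WHERE rowid IN ("
            " SELECT rowid FROM pages"
            " WHERE type IS NULL AND substr(slug, 1, length(?)) = ?"
            " LIMIT ?)",
            (page_type, prefix, prefix, batch_size),
        )
        conn.commit()  # end the transaction so other writers can get in
        if cur.rowcount == 0:
            return updated
        updated += cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (slug TEXT PRIMARY KEY, type TEXT)")
conn.executemany(
    "INSERT INTO pages (slug) VALUES (?)",
    [(f"meetings/{i:04d}",) for i in range(2500)] + [("notes/misc",)],
)
first = backfill(conn, "meeting", "meetings/")
second = backfill(conn, "meeting", "meetings/")
print(first, second)  # 2500 0
```

Because each batch only selects rows whose type is still NULL, a second run has nothing left to match.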

Phase 6 — Verify

gbrain schema stats --json

Coverage should be ≥95% now. Spot-check the new type:

gbrain whoknows "machine learning"

If `researcher` was declared `--expert`, results should include researcher-typed pages. (The pack-aware wiring at the query path was added in v0.40.6.0; pre-v0.40.6 brains silently ignored custom expert-routed types.)

Phase 7 — Commit (preserve the change)

If the pack is in source control, commit:

cd ~/.gbrain/schema-packs/mine
git add pack.json
git commit -m "schema: add researcher + paper types + authored link"
git push

If the brain daemon is running (`gbrain serve --http`), other processes pick up the change within 1 second (stat-mtime TTL gate in `loadActivePack`; v0.40.6.0 closed the cross-process invalidation gap).
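
One way a stat-mtime TTL gate can work is sketched below: the reader re-stats the pack file at most once per TTL window and reloads only when the modification time has moved. `PackCache` is a hypothetical class written for illustration and is not the actual `loadActivePack` implementation; the injectable clock exists only to make the demo deterministic.

```python
import json
import os
import tempfile
import time


class PackCache:
    """Re-stat the pack file at most once per TTL; reload when mtime moves."""

    def __init__(self, path, ttl=1.0, clock=time.monotonic):
        self.path, self.ttl, self.clock = path, ttl, clock
        self.pack = None
        self.mtime_ns = None
        self.checked_at = None
        self.loads = 0

    def get(self):
        now = self.clock()
        if self.checked_at is not None and now - self.checked_at < self.ttl:
            return self.pack  # inside the TTL window: skip the stat call
        self.checked_at = now
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns != self.mtime_ns:
            with open(self.path, encoding="utf-8") as fh:
                self.pack = json.load(fh)
            self.mtime_ns = mtime_ns
            self.loads += 1
        return self.pack


fake_now = [0.0]
path = os.path.join(tempfile.mkdtemp(), "pack.json")
with open(path, "w", encoding="utf-8") as fh:
    json.dump({"page_types": ["person"]}, fh)

cache = PackCache(path, clock=lambda: fake_now[0])
cache.get()
with open(path, "w", encoding="utf-8") as fh:
    json.dump({"page_types": ["person", "researcher"]}, fh)
os.utime(path, ns=(10**18, 10**18))  # force a clearly different mtime
stale = cache.get()   # still inside the TTL window: old pack
fake_now[0] = 1.5     # advance the fake clock past the TTL
fresh = cache.get()   # stat sees the new mtime and reloads
```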

Outputs

Mutated pack file at `~/.gbrain/schema-packs/<name>/pack.{json,yaml}`.
Audit row in `~/.gbrain/audit/schema-mutations-YYYY-Www.jsonl` per mutation.
`pages.type` backfilled on matching rows after `sync --apply`.
Query paths (`whoknows`, `find_experts`) now route through the new expert types.
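
To locate the audit file for a given day, the `YYYY-Www` pattern can be expanded as follows. This sketch assumes the pattern uses ISO-8601 week numbering, which the source does not state; `audit_file_name` is a hypothetical helper.

```python
from datetime import date


def audit_file_name(day: date) -> str:
    """Build schema-mutations-YYYY-Www.jsonl, assuming ISO-8601 week numbering."""
    iso_year, iso_week, _ = day.isocalendar()
    return f"schema-mutations-{iso_year}-W{iso_week:02d}.jsonl"


print(audit_file_name(date(2024, 5, 23)))  # schema-mutations-2024-W21.jsonl
print(audit_file_name(date(2021, 1, 1)))   # schema-mutations-2020-W53.jsonl
```

Under ISO numbering the first days of January can belong to the last week of the previous year, as the second call shows.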

Contract

Inputs: a natural-language request that names a type / prefix / link verb / flag change, OR the result of `gbrain schema review-orphans` showing untyped pages that need a new type.

Outputs: mutated pack file at `~/.gbrain/schema-packs/<name>/pack.{json,yaml}` + an audit row in `~/.gbrain/audit/schema-mutations-YYYY-Www.jsonl` + (if `sync --apply` ran) backfilled `pages.type` on matching rows.

Side effects: invalidates the in-process pack cache + the query cache for the source. Other processes pick up the change within 1 second (stat-mtime TTL).
Idempotency: every primitive is idempotent. `add-alias` / `add-prefix` no-op on duplicate; `sync --apply` finds nothing to update on second run.
Trust: CLI = local trust (no scope check). MCP = OAuth `admin` scope (write ops). Audit log captures `actor: mcp:<clientId8>` per mutation.
Atomicity: every mutation is wrapped in `withMutation`'s atomic write (`.tmp + fsync + rename`) + per-pack `O_CREAT|O_EXCL` lock. Crash mid-write leaves the original file untouched.
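
The atomicity contract above follows a common pattern: take an exclusive lock file, write to a temporary file, flush it to disk, then rename over the original. The sketch below shows that pattern in general terms; it is not the `withMutation` source, and the `.lock` / `.tmp` file names and the `LockBusy` exception are assumptions.

```python
import json
import os
import tempfile


class LockBusy(Exception):
    """Another writer holds the pack lock."""


def write_pack_atomically(path: str, pack: dict) -> None:
    lock_path = path + ".lock"
    try:
        # O_CREAT | O_EXCL fails if the lock file already exists.
        lock_fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise LockBusy(lock_path) from None
    try:
        tmp_path = path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as fh:
            json.dump(pack, fh, indent=2)
            fh.flush()
            os.fsync(fh.fileno())   # data reaches disk before the rename
        os.replace(tmp_path, path)  # atomic swap on POSIX filesystems
    finally:
        os.close(lock_fd)
        os.unlink(lock_path)


pack_path = os.path.join(tempfile.mkdtemp(), "pack.json")
write_pack_atomically(pack_path, {"page_types": ["researcher"]})
```

If the process dies before `os.replace`, the original `pack.json` is never opened for writing, so readers keep seeing the previous version.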

Anti-Patterns

Don't mutate `gbrain-base` or `gbrain-recommended`. Fork first (`gbrain schema fork gbrain-base mine`). These are bundled packs; edits would be lost on upgrade. The mutation primitives refuse with `PACK_READONLY`.
Don't add a type for a directory you imported once for triage. Pack types are permanent decisions; one-time imports are not. See `skills/conventions/schema-evolution.md` for the <20-pages-don't-pack-codify heuristic.
Don't add `--expert` to a type with no `path_prefixes`. The `expert_routing_without_prefix` lint warns about this: expert-routed types with no prefix never match a put_page inference, so `whoknows` silently never surfaces them.
Don't promote a `schema suggest` candidate without verifying the prefix matches real content. Run `lint --with-db` before `add-type` to catch prefix collisions pre-write.
Don't conflate "filing one page" with "evolving the schema." Filing routes via `brain-taxonomist`; schema-author is for authoring the type taxonomy itself. The Non-goals section above names the boundary.
Don't skip the dry-run before `sync --apply`. Always run `sync` first to see `would_apply` counts + sample slugs. A pack prefix that matches 50,000 pages is recoverable but slow; verifying first is cheap.
Don't remove a type without checking references. `remove-type` refuses with `STILL_REFERENCED` if another type's `aliases` / `enrichable_types` / `link_types` / `frontmatter_links` references it. Break the references first; don't add `--force`.
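
The `expert_routing_without_prefix` rule can be pictured as a small check over the pack's type declarations. The sketch below is a guess at the idea rather than the real lint: the dictionary layout and the `expert_routing` / `path_prefixes` keys follow names used in this document, but the actual `pack.json` structure may differ.

```python
def expert_routing_without_prefix(page_types: dict) -> list[str]:
    """Names of expert-routed types that declare no path prefix."""
    return sorted(
        name
        for name, spec in page_types.items()
        if spec.get("expert_routing") and not spec.get("path_prefixes")
    )


# Hypothetical pack fragment; the real pack.json layout may differ.
page_types = {
    "researcher": {"expert_routing": True, "path_prefixes": ["people/researchers/"]},
    "advisor": {"expert_routing": True, "path_prefixes": []},
    "paper": {"path_prefixes": ["research/papers/"]},
}
print(expert_routing_without_prefix(page_types))  # ['advisor']
```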

Output Format

When invoked, this skill produces structured output suitable for both human + JSON consumption:

Per-mutation result (JSON):

json

{"schema_version": 1, "pack": "mine", "path": "/Users/.../pack.json", "format": "json", "prev_sha8": "a1b2c3d4", "new_sha8": "e5f6a7b8"}

Per-batch result (from
schema_apply_mutations
MCP op):

json

{"schema_version": 1, "pack": "mine", "batch_id": "batch-1716491400-abc123", "mutations_applied": 3, "results": [{...}, {...}, {...}]}

Stats JSON (per-source + aggregate + dead-prefix hints):

json

{"schema_version": 1, "pack_identity": "mine@1.0.0+abc12345", "aggregate": {"total_pages": 4823, "typed_pages": 4710, "untyped_pages": 113, "coverage": 0.9766, "by_type": [{"type": "person", "count": 2104}, ...]}, "per_source": [...], "dead_prefixes": [{"type": "researcher", "prefix": "people/researchers/"}]}

Sync dry-run JSON:

json

{"schema_version": 1, "apply": false, "pack_identity": "mine@1.0.0+abc12345", "per_prefix": [{"type": "meeting", "prefix": "meetings/", "would_apply": 4000, "sample_slugs": ["meetings/2026-01-01-foo", ...], "dead_prefix": false, "applied": 0}], "total_would_apply": 4000, "total_applied": 0}
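
A dry-run report in this shape can be sanity-checked before running `sync --apply`. The sketch below is one possible check; `review_dry_run` is a hypothetical helper and the 10,000-page warning threshold is an arbitrary assumption. The sample report reuses the values above with a single sample slug.

```python
def review_dry_run(report: dict, max_pages: int = 10_000) -> list[str]:
    """Return human-readable concerns about a sync dry-run report."""
    concerns = []
    if report["apply"]:
        concerns.append("report is from an --apply run, not a dry-run")
    total = sum(p["would_apply"] for p in report["per_prefix"])
    if total != report["total_would_apply"]:
        concerns.append("per-prefix counts do not add up to the total")
    for p in report["per_prefix"]:
        if p["would_apply"] > max_pages:
            concerns.append(f"{p['prefix']} matches {p['would_apply']} pages")
    return concerns


report = {
    "schema_version": 1,
    "apply": False,
    "pack_identity": "mine@1.0.0+abc12345",
    "per_prefix": [
        {"type": "meeting", "prefix": "meetings/", "would_apply": 4000,
         "sample_slugs": ["meetings/2026-01-01-foo"], "dead_prefix": False,
         "applied": 0},
    ],
    "total_would_apply": 4000,
    "total_applied": 0,
}
print(review_dry_run(report))  # []
```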

Human output (the agent's final summary):

One line per mutation: `Pack: <name> (<format>)` and `Sha8: <prev> → <new>`
Stats: total pages, typed %, untyped count, per-type breakdown, dead-prefix list
Sync: per-prefix `would_apply` / `applied` count + sample slugs in dry-run mode

On failure, the error envelope follows the standard `StructuredAgentError` shape from `src/core/errors.ts`: `{error, code, message, details?}`. Codes from the mutation primitives: `PACK_NOT_FOUND`, `PACK_READONLY`, `PACK_CORRUPT`, `TYPE_EXISTS`, `TYPE_NOT_FOUND`, `INVALID_PRIMITIVE`, `INVALID_RESULT`, `IO_ERROR`, `STILL_REFERENCED`, `LOCK_BUSY`.

Failure modes

`PACK_READONLY` → you tried to mutate `gbrain-base` or `gbrain-recommended`. Fork first.
`INVALID_RESULT` → the mutation would create a dangling reference or prefix collision. The pre-write lint gate caught it. Read the error message; the lint rule name names the problem.
`STILL_REFERENCED` → you tried to remove a type that another type's `aliases` / `enrichable_types` / `link_types` / `frontmatter_links` references. The error names every reference. Remove those first.
`LOCK_BUSY` → another process is mid-mutation. Wait 30s and retry, or pass `--force` if you know the holder is wedged.
`permission_denied` (MCP only) → your OAuth client doesn't have `admin` scope. Re-register with `gbrain auth register-client --scopes admin`.
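
An agent consuming these errors might map each code to the remedy listed above. The sketch below assumes the envelope has already been parsed into a dictionary; `REMEDIES` and `next_step` are hypothetical names, and the type of the `error` field in the sample envelope is a guess.

```python
REMEDIES = {
    "PACK_READONLY": "fork the bundled pack, then retry against the fork",
    "INVALID_RESULT": "read the message; the lint rule name identifies the problem",
    "STILL_REFERENCED": "remove every reference named in the error, then retry",
    "LOCK_BUSY": "wait 30s and retry",
    "permission_denied": "re-register the OAuth client with admin scope",
}


def next_step(envelope: dict) -> str:
    """Map an error envelope to the remedy described above."""
    code = envelope.get("code", "")
    return REMEDIES.get(code, f"unhandled code {code!r}: {envelope.get('message', '')}")


# Hypothetical envelope in the {error, code, message, details?} shape.
envelope = {"error": True, "code": "LOCK_BUSY", "message": "pack lock is held"}
print(next_step(envelope))  # wait 30s and retry
```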
