wren-enrich-context
Wren Enrich Context — Fill the Business-Context Gap
This skill exists because most business context never lives in a DB schema — it lives in handbooks, glossaries, finance reports, support playbooks, code comments, Slack rules-of-thumb. The agent reads those raw artifacts, finds what's missing from the Wren project, and either grills the user one question at a time (grill mode) or applies its best inferences directly and hands over an audit (auto-pilot mode) before writing back. The output lands in three (or four) sinks each project already has — no new artifact, no new tooling.
Version check
Silently fetch `https://raw.githubusercontent.com/Canner/WrenAI/main/skills/versions.json`.
Compare the `wren-enrich-context` key with this skill's version (from the frontmatter above).
If the remote version is newer, notify the user before proceeding:

> A newer version of the wren-enrich-context skill is available. Update with:
>
> ```bash
> npx skills add Canner/WrenAI --skill wren-enrich-context
> ```
>
> The CLI auto-detects your installed agent. To target a specific one, add `--agent <name>` (e.g., `claude-code`, `cursor`, `windsurf`, `cline`).

Continue regardless of update status.
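The comparison step can be sketched in shell. This is a minimal sketch under stated assumptions, not the skill's actual implementation: the helper name `remote_is_newer` is hypothetical, versions are assumed to be dotted numeric strings, and GNU `sort -V` is assumed to be available. Reading the version out of `versions.json` is left out because the file's layout is not shown here.

```shell
# Hypothetical helper: succeeds (exit 0) when the remote version is strictly newer.
# Assumes dotted numeric versions (e.g. 1.4.0) and GNU sort with -V (version sort).
remote_is_newer() {
  remote=$1
  local_version=$2
  [ "$remote" != "$local_version" ] || return 1
  # Under version ordering, the newest of the two sorts last.
  newest=$(printf '%s\n%s\n' "$remote" "$local_version" | sort -V | tail -n 1)
  [ "$newest" = "$remote" ]
}

# Example version strings, chosen for illustration only.
if remote_is_newer "1.3.0" "1.2.9"; then
  echo "A newer version of the wren-enrich-context skill is available."
fi
```

Whatever the outcome, the session continues; the check only decides whether the notice is shown.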
Hard rules — READ FIRST
Universal (apply to both modes)
- Only add, never modify existing. If you find an existing MDL description / relationship / rule that looks wrong, do not edit it. Surface it on the "please fix manually" list shown in Step 9.
- Every MDL edit must validate. Right after any MDL YAML change, run `wren context validate`. If it fails, revert that single change and feed the error back. Never leave a project in an invalid state.
- Pre-draft every proposal. Whether you're showing the draft to the user (grill) or applying it directly (auto-pilot), generate the concrete content; never lazy-ask "what should the description say?".
- Be explicit about confidence. In grill mode, open Lane 3 inference questions with "I'm guessing — ". In auto-pilot, tag every Lane 3 inference and partial Lane 2 match in the Step 9 audit with confidence (high / med / low) and source.
Grill mode only
- One question at a time. Grill relentlessly. Walk every gap top-down, resolve one decision before moving to the next. Provide a recommended answer for every question.
- Skip is final for this session. No pending queue, no nagging next round. If the user wants to revisit, they re-run the skill.
Auto-pilot mode only
- Drop into grill for three cases. Always interrupt auto-pilot and ask the user when:
  - (a) Lane 2 conflict: raw and current MDL disagree.
  - (b) High-blast-radius proposal (any lane): new cube, new view, new relationship, or new MDL metric/calculated column. These become public artifacts visible to every future agent session, so blast radius doesn't depend on whether the trigger was raw evidence (Lane 2) or inference (Lane 3).
  - (c) Lane 2 routing ambiguity: you can't confidently pick a sink (MDL / `instructions.md` / `queries.yml` / `cubes/`).

  Everything else: apply directly and log to the audit list.
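The three escalation cases amount to a small decision function. The sketch below is illustrative only: `needs_grill` and the finding-kind labels are hypothetical names invented for this example, not identifiers the skill defines.

```shell
# Hypothetical helper: exit 0 when auto-pilot must stop and ask the user.
# The kind labels are illustrative, not identifiers from the skill.
needs_grill() {
  case "$1" in
    lane2-conflict)
      return 0 ;;   # (a) raw and current MDL disagree
    new-cube|new-view|new-relationship|new-metric|new-calculated-column)
      return 0 ;;   # (b) high blast radius, regardless of lane
    lane2-routing-ambiguous)
      return 0 ;;   # (c) no confident sink
    *)
      return 1 ;;   # everything else: apply directly and log to the audit list
  esac
}

for kind in lane2-conflict column-description new-view; do
  if needs_grill "$kind"; then
    echo "$kind: ask the user"
  else
    echo "$kind: apply and log"
  fi
done
```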
Step 0 — Mode selection (before anything else)
Before touching the project or reading any file, ask the user which mode to run in. Lock the choice for the whole session — no mid-session switching; the user re-runs to change.
> Two modes for this session:
>
> a) Grill mode: I walk every gap with you, one question at a time, proposing a draft and waiting for your accept / edit / skip. You stay in the driver's seat. Best when the raw material is sensitive, when you want to learn what I don't know about your project, or when you'd rather review than re-do.
>
> b) Auto-pilot mode: I read raw + current context, make my best inferences, and apply them. I'll only stop to grill you on (1) conflicts between raw and existing MDL, (2) high-blast-radius additions like new cubes, metrics, views, or relationships, and (3) findings I can't confidently route to a sink. The session ends with a full diff + confidence-tagged inference list for you to audit.
>
> Which? (a / b)

Remember the choice as `MODE = grill | autopilot` and use it to branch Steps 6 and 9.
Preflight
Step 1 — Choose the Wren project
Always ask the user which project to enrich before doing anything else; never assume cwd. A user can have several Wren projects and an ambient `~/.wren` profile that doesn't match the one they want to augment.

Offer concrete hints in the question so the user can answer in one round-trip:

```bash
# Hint 1: does cwd look like a project?
test -f wren_project.yml && pwd

# Hint 2: does ~/.wren/config.yml point at a default project?
grep -E '^project_path:' ~/.wren/config.yml 2>/dev/null
```

Then ask:

> Which Wren project do you want me to augment?
> a) `$PWD` (current directory) ← if Hint 1 matched
> b) `<path from ~/.wren/config.yml>` ← if Hint 2 matched
> c) something else: paste the absolute path

After the user answers, lock the path in for the whole session:

```bash
# Replace <chosen-path> with the user's answer before running.
cd "<chosen-path>" || exit 1
test -f wren_project.yml || {
  echo "Error: <chosen-path> is not a Wren project (no wren_project.yml)."
  exit 1
}
wren context show >/dev/null 2>&1 || {
  echo "Error: wren context show failed; manifest may be invalid."
  exit 1
}
```

If either check fails, stop and tell the user: suggest `wren-onboarding` if it's not a project, or `wren context validate` if the manifest is broken.

From this point on, every command and file path in this skill is relative to the chosen project root. Do not switch projects mid-session; if the user wants to work a different project, end this session and re-run.
Step 2 — Detect memory availability
```bash
wren memory --help >/dev/null 2>&1
```

- Exit 0 → set `MEMORY_AVAILABLE = true`. The fourth sink (direct `wren memory store`) is open.
- Exit non-zero → set `MEMORY_AVAILABLE = false`. Skip the memory-only paths below.
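As a shell sketch, the exit-code branch could look like the following. Holding the flag in a shell variable is an assumption made for illustration; the skill only asks the agent to remember the value.

```shell
# Record whether the memory subcommand is usable in this environment.
if wren memory --help >/dev/null 2>&1; then
  MEMORY_AVAILABLE=true
else
  # Also taken when the wren CLI itself is not on PATH (non-zero exit).
  MEMORY_AVAILABLE=false
fi
echo "MEMORY_AVAILABLE=$MEMORY_AVAILABLE"
```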
Step 3 — Ensure raw/ folder exists
From the project root (cwd is already there from Step 1):

```bash
mkdir -p raw
```

If you just created it (the directory was empty or new):

> I've created `raw/` at the project root. Drop anything you think helps explain this project's business context: PDFs, glossaries, handbooks, financial reports, data dictionaries, sample queries, code with comments, screenshots of dashboards, anything.
>
> Heads-up: the contents may be sensitive. Decide for yourself whether to commit `raw/` to git; I won't touch `.gitignore`.
>
> Tell me when you've added the files and I'll start reading.

Wait for the user to confirm before continuing.
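One way to tell "just created or still empty" apart from "already populated" is sketched below. The helper name `ensure_raw_dir` and its two output labels are hypothetical; the skill itself only specifies `mkdir -p raw`.

```shell
# Hypothetical helper: create raw/ if needed and report whether the
# user still has to add files (directory is new or empty).
ensure_raw_dir() {
  mkdir -p raw || return 1
  if [ -z "$(ls -A raw)" ]; then
    echo "empty"       # show the prompt above, then wait for the user
  else
    echo "has-files"   # safe to move on to reading
  fi
}
```

The helper never touches `.gitignore`, in line with the heads-up in the prompt.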
Step 4 — Read everything
Read both sides, the raw material and the current Wren context, before forming any opinion.
Raw
Read every file under `raw/`. Use whatever capability your agent has natively (text, markdown, code, PDF). If you genuinely can't read a particular file, tell the user once which file and suggest converting it to text or pasting the relevant excerpt, then move on to the rest. Do not install extra Python packages, and do not reach for new CLI subcommands.
Current Wren context
| Source | Command |
|---|---|
| MDL (full) | |
| Project instructions | |
| Existing cubes (names) | `wren cube list` |
| Existing cubes (measures + dimensions) | `wren cube describe` for each name |
| Curated NL-SQL pairs | read `queries.yml` |
| (Memory) stored pairs | |
| (Memory) schema as text | |
The memory rows only matter when `MEMORY_AVAILABLE = true`. Reading cubes is essential before any Lane 3 metric proposal; see `references/cube_proposals.md` for the duplication guard.
Step 4.5 — Ground-truth probe (grill mode default; auto-pilot opt-out)
When raw is silent on a column's enum / null / magic / time semantics, the corresponding column-local catalog categories (#1, #3, #5, #7 in `references/gap_catalog.md`) can often be settled directly by sampling distinct values from the live DB. Read `references/gap_catalog.md` before this step; its Trigger column tells you which columns are probe candidates.

Default policy by mode:

| Mode | Default | How to override |
|---|---|---|
| Grill | Probe on. Before the first query, ask the user once: "I want to sample N columns with | User says no → skip Step 4.5 entirely; rely on Lane 2 + Lane 3 instead. |
| Auto-pilot | Probe off. The skill never queries the live DB in auto-pilot mode. | None; the user must re-run in grill mode if a probe would unblock high-confidence inferences. |

Candidate selection (no DB call yet):

A column is a probe candidate when all hold:
- Description is empty OR does not yet contain the relevant `[tag]` (catalog write format).
- Column type / name pattern matches catalog #1 (enum), #3 (NULL), #5 (magic), or #7 (time grain).
- For #3 and #7, the description also lacks event-vs-record or TZ wording.

Categories #2 (unit), #4, #6, #8, #9, #10 are not probeable: `SELECT DISTINCT` doesn't reveal units, default filters, synonyms, external mappings, currency conventions, or canonical-table preferences. Those need raw evidence or human judgment.

Probe query:

```bash
wren --sql "SELECT DISTINCT <col> FROM <model> LIMIT 30" --output json
```

For magic sentinels (catalog #5), also fetch min/max:

```bash
wren --sql "SELECT MIN(<col>) AS lo, MAX(<col>) AS hi FROM <model>" --output json
```

- **Fewer than 30 distinct values returned** → enum / sentinel / grain candidate. Draft the `[tag]` line and surface to user (grill) with confidence "med — probed values, semantics still inferred".
- **30 returned (LIMIT hit)** → cardinality too high; not an enum / sentinel candidate. Skip.
- **Query fails** (permissions, connection, large-table timeout) → do not retry. Log the failure to the audit list, surface in Step 9, continue with Lane 2 + Lane 3 only.

**Safety:**
- Probe each (model, column) at most once per session.
- Never probe a column that already has a matching `[tag]` line (Universal Rule 1).
- Probe results stay in working memory; do not write them to disk.
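The row-count rule for a successful probe can be sketched as a tiny classifier. `classify_probe` is a hypothetical helper; how the row count is extracted from the `--output json` result is not shown, because the output layout is not documented here.

```shell
# Hypothetical helper: interpret the number of rows returned by the
# SELECT DISTINCT ... LIMIT 30 probe.
# Arguments: row count, then the limit used (defaults to 30).
classify_probe() {
  count=$1
  limit=${2:-30}
  if [ "$count" -ge "$limit" ]; then
    echo "skip"        # LIMIT hit: cardinality too high for an enum / sentinel
  else
    echo "candidate"   # small value set: draft a [tag] line at "med" confidence
  fi
}
```

A failed query never reaches this function; it is logged to the audit list and not retried.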

---

Step 5 — Three gap-detection lanes (in your head, no artifact)
Hold all three lanes in working memory. Do not write a `gaps.yml`.

Before sweeping, load `references/gap_catalog.md`: the ten business-semantic categories the schema cannot carry. Each lane consumes the catalog differently: Lane 1 walks it as type-aware mechanical triggers, Lane 2 classifies each atomic raw claim into one of the 10 categories before routing, Lane 3 seeds inference prompts when a trigger fires but raw is silent.
Lane 1 — Structural coverage (mechanical)
Scan the current MDL and check:
- Every model has a non-empty `properties.description`?
- Every column has a description (at least for non-PK, non-FK ones)?
- Every model has a `primary_key`?
- Every model has at least one relationship (orphan models are suspicious)?
- `instructions.md` is more than the scaffold default?
- `queries.yml` has at least a few canonical pairs?

Plus, walk every column / model against `references/gap_catalog.md` triggers:
- For each column matching catalog #1 / #2 / #3 / #5 / #7 triggers → is the corresponding `[tag]` line present in `properties.description`?
- For each model with a soft-delete column (`deleted_at`, `is_active`, `archived_at`, etc.) → is there a `## Default filters` rule in `instructions.md` covering it (catalog #4)?
- For each lookalike table pair (e.g. `users` / `users_v3`) → is there a `## Canonical tables` rule (catalog #10)?
- For each `*_currency` / `fx_rate` / external-system ID column → is the matching `## Currency` (#9) or `## External identifiers` (#8) section present?
- Business terms in `instructions.md` or raw that don't map verbatim to model / column names → catalog #6 `## Naming conventions` rule missing.

Each unsatisfied check is a candidate. Combine with Step 4.5 probe results (if available) before moving to Lane 2.
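Because the cross-model rules live under fixed headings, the heading part of this sweep is greppable. The sketch below lists which of the five catalog headings are absent from a file. `missing_sections` is a hypothetical helper, and an absent heading is only a gap when a matching trigger (soft-delete column, lookalike tables, and so on) actually exists in the MDL.

```shell
# Hypothetical helper: print each catalog-defined heading that is absent
# from the given instructions file, one per line.
missing_sections() {
  file=$1
  for heading in "## Default filters" "## Naming conventions" \
                 "## External identifiers" "## Currency" "## Canonical tables"; do
    # -q: quiet, -x: whole-line match, -F: fixed string (no regex)
    grep -qxF -- "$heading" "$file" 2>/dev/null || echo "$heading"
  done
}
```

Calling `missing_sections instructions.md` from the project root prints nothing when all five headings are present.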
Lane 2 — Claim-diff (raw vs current context)
For each raw file, internally extract 5–15 atomic claims — single statements that could be true or false, e.g. "an order has exactly one customer", "user means type=default by default", "ARR equals MRR × 12 minus refunds". Then for each claim, classify against the current Wren context:
| Class | Meaning | Resolution outcome |
|---|---|---|
| covered | already reflected in MDL / instructions / pairs | skip |
| partial | the topic exists but the wording / scope differs | propose tightening |
| new | nothing in current context matches | route to a sink |
| conflict | raw says A, current context says B | grill the user (both modes), but do not edit existing — surface for manual fix |
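The classification table maps directly onto a lookup. A minimal sketch, with `resolve_claim` as a hypothetical helper name and the output strings paraphrasing the table's outcome column:

```shell
# Hypothetical helper: map a Lane 2 classification to its resolution outcome.
resolve_claim() {
  case "$1" in
    covered)  echo "skip" ;;
    partial)  echo "propose tightening" ;;
    new)      echo "route to a sink" ;;
    conflict) echo "grill the user; list for manual fix" ;;
    *)        echo "unknown class: $1" >&2; return 1 ;;
  esac
}
```

Note that `conflict` never results in an edit to existing content in either mode; it only produces a question and an entry on the manual-fix list.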
Lane 3 — Inference (your own guesses)
After reading raw and the current MDL, propose additions the user did not literally state in raw but that would clearly help the agent later. Examples:
- "I see `quarterly_churn` referenced five times in `finance.pdf`. No existing cube covers it. Want me to add `cubes/quarterly_churn/metadata.yml` with measure = `COUNT(*) FILTER (WHERE churned_at IS NOT NULL) / NULLIF(COUNT(*), 0)`?" (see `references/cube_proposals.md` for the YAML template and duplication guard).
- "Your support handbook keeps mentioning `core users` without defining it. Is this `users WHERE tier = 'premium'`? Want me to make a view?"
- "The data dictionary says `events.payload` is JSON but the column has no description. Let me draft one."

For any aggregation-shaped proposal (`SUM`, `COUNT`, `AVG`, "by month / by status / per customer" patterns), default to a cube. Run `wren cube list` + `wren cube describe` first to confirm no existing cube already covers the measure expression; if one does, skip the proposal and add a `queries.yml` example pointing at the existing cube instead. The full decision tree, naming rules, and validation flow live in `references/cube_proposals.md`.

In grill mode, open every Lane 3 question with "I'm guessing — ". In auto-pilot, tag the audit entry with `agent inference` so the user sees you extrapolated.
Step 6 — Resolve gaps
Branch on the `MODE` locked in Step 0.
Grill mode
Use this conversational pattern for every gap surfaced in Lanes 1–3:
> Interview the user relentlessly about every gap until we reach a shared understanding. Walk down each branch of the decision tree, resolving dependencies between decisions one-by-one. For each question, provide your recommended answer.
>
> Ask the questions one at a time.
>
> If a question can be answered by exploring the codebase or the raw files, do that instead of asking.
For every grill turn:
1. State the gap and where it came from (Lane 1 / 2 / 3, and for Lane 2 quote the raw file + a short excerpt).
2. Propose the concrete answer: draft the description, the rule, the SQL pair, the relationship.
3. Propose the sink ("I'll add this to `instructions.md` as a rule" / "I'll add this to the `users` model description in MDL").
4. Let the user accept / edit / skip.
   - On accept: write back (Step 7).
   - On edit: apply their wording, then write back.
   - On skip: drop it, move to the next gap. Do not requeue.
When the user gives a curve-ball answer ("actually we don't track that") — pivot. The goal is shared understanding, not pushing a pre-built list.
Auto-pilot mode
Process every finding from Lanes 1–3 directly, except for the three escalation cases from the auto-pilot hard rule (Hard Rule 7):
- Lane 2 conflict: drop into grill for this single question, never auto-resolve.
- High-blast-radius proposal (any lane), meaning a new cube, view, relationship, or MDL metric / calculated column: drop into grill before applying.
- Lane 2 routing ambiguous: drop into grill for the sink choice only, then auto-apply.

For everything else (Lane 1 mechanical fixes, Lane 2 unambiguous new claims, Lane 3 low-impact description tweaks):
1. Synthesize the concrete proposal (description / rule / SQL pair) using your best inference.
2. Decide the sink using the routing table (Step 7).
3. Write back.
4. Run `wren context validate` immediately after any MDL edit. On failure: revert the single change and log the revert with the error.
5. Append to the audit list: the finding, the sink, confidence tag (high / med / low), and source (raw file + excerpt for Lane 2, `agent inference` for Lane 3, `structural` for Lane 1).

Auto-pilot does not pause for confirmation on each item; the user reviews the full diff + audit list in Step 9. They are the reviewer, not the gatekeeper.
Step 7 — Routing & writeback
Decide the sink as part of the proposal (Step 6.3 in grill mode; Step 6.2 in auto-pilot), so the user can correct routing in grill mode and audit it in auto-pilot.
| Finding type | Sink | How to write |
|---|---|---|
| Schema structure / relationship / view / model or column description | MDL YAML (e.g. `models/<name>/metadata.yml`, `relationships.yml`) | Edit the YAML file directly. For catalog #1 / #2 / #3 / #5 / #7 / PII, append a `[tag]` line to `properties.description`. |
| Aggregation metric / named measure (with measures + dimensions) | `cubes/<name>/metadata.yml` | New file per cube. Default sink for any aggregation-shaped proposal (see Lane 3). |
| Default filter / implicit rule / business convention / naming convention / external mapping / currency / canonical table | `instructions.md` | Append under the catalog-specified `##` heading. |
| Canonical NL→SQL example the team should share | `queries.yml` | Append a new entry under `pairs:`. |
| Ad-hoc NL→SQL pair (user-local, not for the repo), only if `MEMORY_AVAILABLE = true` | `wren memory store` | |
Catalog-driven routing means every column-local proposal goes to the column's `properties.description` with a `[tag]` line; every cross-model rule goes to `instructions.md` under a fixed heading. This keeps re-enrichment deterministic (greppable) and avoids inventing new sink locations.
After every MDL edit
```bash
wren context validate
```

If it fails:
- Revert the single change you just made.
- Show the user the validation error (grill mode) or log it in the audit (auto-pilot).
- In grill mode, re-grill on that specific gap with the error as new context.
- In auto-pilot, mark the finding as "revert: validation failed" and move on.
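The validate-then-revert loop can be sketched as follows. Only `wren context validate` comes from this skill; the helper name `apply_with_validation`, the `VALIDATE_CMD` override, and the backup-file mechanism are assumptions made for illustration, and the sketch expects the target file to exist already.

```shell
# Hypothetical helper: replace FILE with the content of NEW_CONTENT_FILE,
# validate, and restore the previous content if validation fails.
# VALIDATE_CMD defaults to the command named in this skill; the
# backup-and-restore mechanism is an assumption, not prescribed by it.
apply_with_validation() {
  file=$1
  new_content=$2
  validate_cmd=${VALIDATE_CMD:-"wren context validate"}
  backup=$(mktemp) || return 1
  cp "$file" "$backup" || return 1
  cp "$new_content" "$file" || return 1
  # Intentionally unquoted so the command string splits into words.
  if $validate_cmd >/dev/null 2>&1; then
    rm -f "$backup"
    echo "applied"
  else
    cp "$backup" "$file"   # revert only this single change
    rm -f "$backup"
    echo "reverted"
    return 1
  fi
}
```

On `reverted`, grill mode would re-ask about that gap with the error as context, while auto-pilot would log "revert: validation failed" and continue.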
Format reminders
- `queries.yml` schema follows `wren memory dump` output: `version: 1`, `pairs:` list of `{nl, sql, source}` (use `source: enrich`).
- MDL YAML uses snake_case keys (e.g. `primary_key`, `is_calculated`, `not_null`). `wren context build` converts to camelCase for `target/mdl.json`.
- `instructions.md` is free-form markdown. Group rules by topic with headings.
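For the `queries.yml` shape described above, appending one pair might look like the sketch below. `append_pair` is a hypothetical helper; it assumes `pairs:` is the last top-level key in the file and that neither string contains a double quote or backslash. The example question and SQL are made up.

```shell
# Hypothetical helper: append one NL→SQL pair to a queries.yml-style file.
# Assumes pairs: is the last top-level key, so appended items join that list,
# and that neither string contains a double quote or backslash.
append_pair() {
  file=$1
  nl=$2
  sql=$3
  # Start a fresh file with the documented header when none exists yet.
  if [ ! -s "$file" ]; then
    printf 'version: 1\npairs:\n' > "$file"
  fi
  printf '  - nl: "%s"\n    sql: "%s"\n    source: enrich\n' "$nl" "$sql" >> "$file"
}

# Illustrative call with made-up content:
# append_pair queries.yml "count active users" "SELECT COUNT(*) FROM users WHERE is_active"
```

Existing entries are never rewritten, which keeps the helper consistent with the add-only rule.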
Step 8 — Session finalize
After Step 6 ends (user says stop in grill mode, or every finding is processed in auto-pilot):

```bash
wren context build
```

This recompiles `target/mdl.json` from the YAML edits.

If `MEMORY_AVAILABLE = true`:

```bash
wren memory index
```

This re-embeds the new schema items, the updated `instructions.md`, and the new `queries.yml` entries.
Step 9 — Summary
Both modes — common section
Print a tight session report:
```text
Wren Enrich Context — session summary (mode: <grill|autopilot>)
Added:
MDL : N model descriptions, N column descriptions, N relationships, N views
by tag: [enum]=N [unit]=N [null]=N [magic]=N [time]=N [pii]=N
cubes : N new (names: <list>) via cubes/<name>/metadata.yml
instructions.md : N new rules across sections
by section: Default filters=N | Naming conventions=N | External identifiers=N | Currency=N | Canonical tables=N
queries.yml : N new NL→SQL pairs
memory store : N ad-hoc pairs (only if MEMORY_AVAILABLE)
Probe : N columns sampled, M failed (grill mode only)
Please fix manually (we don't edit existing fields):
- models/orders/metadata.yml: existing description seems to contradict raw/glossary.pdf p.3
- relationships.yml: existing orders↔customers is MANY_TO_ONE but raw/data_dict.md p.7 says MANY_TO_MANY
- …
```
Grill mode extras
Append:

```text
Skipped this session: N gaps (re-run /wren-enrich-context to revisit)
```
Auto-pilot mode extras
Append a detailed audit so the user can sanity-check inferences:
```text
Inferred items (please review):
high | MDL model:orders.description | from raw/glossary.pdf p.2 — "Order = ..."
high | instructions.md rule | from raw/handbook.md §4 — "default tier ..."
med | MDL column:users.signup_source.desc | agent inference from raw/onboarding.md
low | queries.yml: "weekly active customers" | agent inference, no direct raw evidence
Validation:
K successful applies, M reverted after wren context validate failed:
- relationships.yml: <error> → reverted
Escalated to grill (raw vs MDL conflicts / high-impact additions):
- <count> items — see grill transcript above
```

Encourage the user to skim the audit and either accept it as-is, manually tweak low-confidence rows, or re-run in grill mode if they want to revisit interactively.
Things to avoid
- Do not write a `gaps.yml`, `state.yml`, or any other tracking artifact. The session lives entirely in conversation.
- Do not modify any existing MDL field, instructions rule, or `queries.yml` entry; only append / add. Surface mismatches on the manual-fix list.
- Do not install new Python packages (`pypdf`, `docling`, …) to read raw. Use what your agent already has; ask the user to convert files you can't open.
- Do not auto-resolve a conflict between raw and current MDL; always grill the user, in both modes.
- Do not present Lane 3 inferences as if they were quoted from raw. Open with "I'm guessing — " (grill) or tag `agent inference` (auto-pilot).
- Do not call `wren memory store` when `MEMORY_AVAILABLE = false`; write to `queries.yml` instead so the pair survives a future `wren memory index`.
- Do not commit anything to git. The user owns the commit decision.
- Do not nag about skipped questions. Skip is skip for this session (grill mode only; auto-pilot has no skip concept).
- Do not run `wren context build` after every single MDL edit; once at the end is enough. Do run `wren context validate` after every edit.
- Do not assume `raw/` was created by `wren context init`; it isn't. This skill creates it.
- Do not switch modes mid-session. The user re-runs to change mode.
- Do not append a `[tag]` line if the same category tag already exists for that column (Universal Rule 1). Surface contradictions on the manual-fix list instead.
- Do not invent new `instructions.md` section headings. Stick to the five catalog-defined headings (`## Default filters`, `## Naming conventions`, `## External identifiers`, `## Currency`, `## Canonical tables`). Anything that doesn't fit goes on the manual-fix list.
- Do not probe the live DB in auto-pilot mode. Step 4.5 is grill-only by default.
- Do not propose a cube whose measure expression already exists in another cube on the same `base_object`; write a `queries.yml` example pointing at the existing cube instead. See the `references/cube_proposals.md` duplication guard.
- Do not modify an existing cube YAML even when raw contradicts it (Universal Rule 1). Surface on the manual-fix list.
- Do not write a new cube alongside an old MDL `metrics:` entry that already covers the same logic. Surface as "consider migrating to cube" on the manual-fix list.
- Do not skip `wren cube query --cube <name> --sql-only` after creating a cube. Structural `wren context validate` doesn't catch unresolvable measure / dimension expressions.
- In auto-pilot, do not auto-apply Lane 2 conflicts or new metric / view / relationship inferences; always drop into grill for those.
See also
- `references/gap_catalog.md`: the ten business-semantic gap categories, with triggers, default sinks, and write formats. Read this before Step 4.5 and Step 5.
- `references/cube_proposals.md`: decision tree for when to propose a cube vs view vs calculated column, the cube YAML template, naming policy, duplication guard, and validation flow. Read this before any Lane 3 aggregation-shaped proposal.