wren-generate-mdl
Generate Wren MDL — CLI Agent Workflow
Version check
Silently fetch `https://raw.githubusercontent.com/Canner/wren-engine/main/skills/versions.json`.
Compare the `wren-generate-mdl` key with this skill's version (from the frontmatter above).
If the remote version is newer, notify the user before proceeding:

A newer version of the wren-generate-mdl skill is available. Update with: `npx skills add Canner/wren-engine --skill wren-generate-mdl`. The CLI auto-detects your installed agent. To target a specific one, add `--agent <name>` (e.g., `claude-code`, `cursor`, `windsurf`, `cline`).

Then continue with the workflow below regardless of update status.
Builds an MDL project by discovering database schema and converting it
into Wren's YAML project format. The agent uses whatever database tools
are available in its environment for introspection; the wren CLI handles
type normalization, validation, and build.
For memory and query workflows after setup, see the wren-usage skill.
Prerequisites
- `wren` CLI installed: `pip install wren-engine[<datasource>]`
- A working database connection (credentials available to the agent)
- A wren profile configured (`wren profile add`) or connection info ready
Phase 0 — Detect existing project
Goal: If the current directory is already inside a wren project, let the user decide how to proceed.

Check whether `wren_project.yml` exists in the current working directory (or any parent up to the repository root). If found:

- Tell the user that an existing wren project was detected and show its path.
- Ask:
  - Reset — wipe the existing project (`wren_project.yml`, `models/`, `views/`, `relationships.yml`, and `instructions.md`) and regenerate from scratch in the same directory.
  - New path — keep the existing project untouched and choose a different directory for the new project. Ask the user for the new path, then `wren context init --path <new_path>` and continue from Phase 1 using that path.

If no existing project is detected, proceed directly to Phase 1.
Phase 1 — Establish connection and scope
Goal: Confirm the agent can reach the database and agree on scope with the user.

- Verify connectivity using whichever tool is available:
  - If SQLAlchemy: test `engine.connect()`
  - If a database driver: a simple query like `SELECT 1`
  - If a wren profile exists: `wren profile debug` to check config
  - If raw SQL via wren: `wren --sql "SELECT 1"` (requires a profile or connection file)
- Ask the user:
  - Which schema(s) or dataset(s) to include (skip if only one exists)
  - Whether to include all tables or a subset
  - The datasource type for wren (e.g., `postgres`, `bigquery`, `snowflake`) — needed for the type normalization dialect
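The driver branch of the checklist above can be sketched generically; the stdlib `sqlite3` module stands in here for whichever DBAPI driver the environment actually has (psycopg, snowflake-connector-python, etc.):

```python
import sqlite3

def probe(connect) -> bool:
    """Return True if connect() yields a DBAPI connection that answers SELECT 1."""
    try:
        conn = connect()
        try:
            cur = conn.cursor()
            cur.execute("SELECT 1")
            return cur.fetchone()[0] == 1
        finally:
            conn.close()
    except Exception:
        return False

# Example with the stand-in driver:
# probe(lambda: sqlite3.connect(":memory:"))
```

Passing a zero-argument `connect` callable keeps the probe driver-agnostic: swap in `lambda: psycopg.connect(dsn)` or similar for the real datasource.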
Phase 2 — Discover schema
Goal: Collect table names, column names, column types, and constraints.
Use whatever introspection method is available. Here are common approaches
ranked by convenience:
Option A: SQLAlchemy (recommended if available)
```python
from sqlalchemy import create_engine, inspect

engine = create_engine(connection_url)
inspector = inspect(engine)

tables = inspector.get_table_names(schema="public")
for table in tables:
    columns = inspector.get_columns(table, schema="public")
    # columns → [{"name": "id", "type": INTEGER(), "nullable": False, ...}]
    pk = inspector.get_pk_constraint(table, schema="public")
    # pk → {"constrained_columns": ["id"], "name": "orders_pkey"}
    fks = inspector.get_foreign_keys(table, schema="public")
    # fks → [{"constrained_columns": ["customer_id"],
    #         "referred_table": "customers",
    #         "referred_columns": ["id"]}]
```

Option B: Database-specific driver
- psycopg / asyncpg (Postgres): query `information_schema.columns` and `information_schema.table_constraints`
- google-cloud-bigquery: `client.list_tables()` → `client.get_table()` → `table.schema`
- snowflake-connector-python: `SHOW COLUMNS IN TABLE`, `SHOW PRIMARY KEYS IN TABLE`
- clickhouse-driver: `DESCRIBE TABLE`, `system.tables`
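Whichever option is used, it helps to collapse the findings into one plain intermediate structure before Phases 3-4. A possible shape (the keys are illustrative, chosen to match the `parse_types` batch input shown later, not a wren API):

```python
def to_schema_entry(table, columns, pk, fks):
    """Normalize introspection output (SQLAlchemy-style dicts) into one record per table."""
    return {
        "table": table,
        # {"column", "raw_type"} matches the batch input expected by parse_types
        "columns": [{"column": c["name"], "raw_type": str(c["type"])} for c in columns],
        "primary_key": pk.get("constrained_columns", []),
        "foreign_keys": [
            {
                "columns": fk["constrained_columns"],
                "referred_table": fk["referred_table"],
                "referred_columns": fk["referred_columns"],
            }
            for fk in fks
        ],
    }
```

One record per table keeps Phase 3 (feed `columns` to the normalizer) and Phase 4 (emit model files and relationships) decoupled from whichever driver did the discovery.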
Option C: Raw SQL via wren
If no driver is available but a wren profile is configured, query `information_schema` through wren itself:

```bash
wren --sql "SELECT table_name FROM information_schema.tables WHERE table_schema = 'public'" -o json
wren --sql "SELECT column_name, data_type FROM information_schema.columns WHERE table_name = 'orders'" -o json
```

Note: this goes through the MDL layer, so it only works if you already have a minimal MDL or if the database exposes `information_schema` as regular tables. For bootstrapping from zero, Option A or B is preferred.

Phase 3 — Normalize types
Goal: Convert raw database types to wren-core-compatible types.

Python import (recommended for batch processing)

```python
from wren.type_mapping import parse_type, parse_types
```

Single type:

```python
normalized = parse_type("character varying(255)", "postgres")  # → "VARCHAR(255)"
```
Batch — entire table at once
```python
columns = [
    {"column": "id", "raw_type": "int8"},
    {"column": "name", "raw_type": "character varying"},
    {"column": "total", "raw_type": "numeric(10,2)"},
]
normalized_cols = parse_types(columns, dialect="postgres")
# each dict now has a "type" key with the normalized value
```

CLI (if Python import not available)
Single type:

```bash
wren utils parse-type --type "character varying(255)" --dialect postgres
# → VARCHAR(255)
```

Batch (stdin JSON):

```bash
echo '[{"column":"id","raw_type":"int8"},{"column":"name","raw_type":"character varying"}]' \
  | wren utils parse-types --dialect postgres
```

Phase 4 — Scaffold and write MDL project
Goal: Create the YAML project structure.

Step 1 — Initialize project

```bash
wren context init --path /path/to/project
```

This creates:

```text
project/
├── wren_project.yml
├── models/
├── views/
├── relationships.yml
└── instructions.md
```

IMPORTANT: `catalog` and `schema` in `wren_project.yml` are Wren Engine's internal namespace — they are NOT the database's native catalog or schema. Keep the defaults (`catalog: wren`, `schema: public`) unless you are intentionally configuring a multi-project namespace. Your database's actual catalog/schema is specified per-model in `table_reference` (see Step 2). Do not copy database catalog/schema values into `wren_project.yml`.
Step 2 — Write model files
For each table, create a YAML file under `models/`. Use snake_case naming (the build step converts to camelCase automatically).

```yaml
# models/orders/metadata.yml
name: orders
table_reference:
  catalog: ""      # database catalog (empty string if not applicable;
                   # for DuckDB, use the DB file name without extension,
                   # e.g. jaffle_shop.duckdb → catalog: jaffle_shop)
  schema: public   # database schema (this IS the DB schema)
  table: orders    # database table name
primary_key: order_id
columns:
  - name: order_id
    type: INTEGER
    not_null: true
  - name: customer_id
    type: INTEGER
  - name: total
    type: "DECIMAL(10, 2)"
  - name: status
    type: VARCHAR
    properties:
      description: "Order status: pending, shipped, delivered, cancelled"
```
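Writing these files can be mechanized from the normalized column list. A rough generator sketch (plain string assembly; deriving `not_null` from a flag on each column, and the overall field layout, follow the example above rather than any official wren API):

```python
def emit_model_yaml(name, db_schema, table, primary_key, columns):
    """Render a minimal wren model file as a YAML string."""
    lines = [
        f"name: {name}",
        "table_reference:",
        '  catalog: ""',
        f"  schema: {db_schema}",
        f"  table: {table}",
        f"primary_key: {primary_key}",
        "columns:",
    ]
    for col in columns:
        lines.append(f"  - name: {col['name']}")
        lines.append(f"    type: {col['type']}")
        if col.get("not_null"):
            lines.append("    not_null: true")
    return "\n".join(lines) + "\n"
```

Write the result to `models/<table>/metadata.yml` for each table, then let `wren context validate` catch any structural mistakes.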
Step 3 — Write relationships

From the foreign key constraints discovered in Phase 2:

```yaml
# relationships.yml
- name: orders_customers
  models:
    - orders
    - customers
  join_type: many_to_one
  condition: "orders.customer_id = customers.customer_id"
```

Join type mapping:
- FK table → PK table: `many_to_one`
- PK table → FK table: `one_to_many`
- Unique FK: `one_to_one`
- Junction table: `many_to_many`

If no foreign keys were found, infer from naming conventions:
- Column `<table>_id` or `<table_singular>_id` → likely FK to `<table>`
- Ask the user to confirm inferred relationships
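The naming-convention fallback can be sketched as a small heuristic (the naive `+ "s"` pluralization is an assumption, and every guess should still be confirmed with the user):

```python
def infer_fk_targets(columns, tables):
    """Map `<stem>_id` columns to a known table named `<stem>` or `<stem>s`."""
    guesses = {}
    for col in columns:
        if not col.endswith("_id"):
            continue
        stem = col[: -len("_id")]
        for table in tables:
            # accept exact match or naive plural (customer_id -> customers)
            if table in (stem, stem + "s"):
                guesses[col] = table
    return guesses
```

Each confirmed guess becomes a `many_to_one` entry from the FK-holding table to the referenced table.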
Step 4 — Add descriptions (optional but valuable)
Ask the user to describe:
- Each model (1-2 sentences about what the table represents)
- Key columns (especially calculated fields or non-obvious names)

These descriptions are indexed by `wren memory index` and significantly improve LLM query accuracy.

Phase 5 — Validate and build
```bash
# Validate YAML structure and integrity
wren context validate --path /path/to/project

# If strict mode is desired:
wren context validate --path /path/to/project --strict

# Build JSON manifest
wren context build --path /path/to/project

# Verify against the database
wren --sql "SELECT * FROM <model_name> LIMIT 1"
```

If validation fails, fix the reported issues and re-run. Common errors:
- Duplicate model/column names
- Missing primary key
- Relationship referencing a non-existent model
- Invalid column type (try re-running through `parse_type`)

---

Phase 6 — Initialize memory
```bash
# Index schema (generates seed NL-SQL examples automatically)
wren memory index

# Verify
wren memory status
```

After this step, `wren memory fetch` and `wren memory recall` are operational. See the **wren-usage** skill for query workflows.

---

Phase 7 — Iterate with the user
The initial MDL is a starting point. Improve it by:
- Adding calculated columns based on business logic
- Adding views for common query patterns
- Refining descriptions based on actual query usage
- Adding access control (RLAC/CLAC) if needed

Each change follows the cycle: edit YAML → `wren context validate` → `wren context build` → `wren memory index`.

Quick reference
| Task | Command / Method |
|---|---|
| Discover tables | Agent's own tools (SQLAlchemy, driver, raw SQL) |
| Discover columns + types | Agent's own tools |
| Discover constraints | Agent's own tools |
| Normalize types (Python) | `parse_type(raw_type, dialect)` |
| Normalize types (CLI) | `wren utils parse-type --type <t> --dialect <d>` |
| Normalize types (batch) | `parse_types(columns, dialect=...)` / `wren utils parse-types` |
| Scaffold project | `wren context init --path <path>` |
| Write models | Create `models/<table>/metadata.yml` |
| Write relationships | Edit `relationships.yml` |
| Validate | `wren context validate` |
| Build manifest | `wren context build` |
| Test query | `wren --sql "SELECT * FROM <model> LIMIT 1"` |
| Index memory | `wren memory index` |
Things to avoid
- Do not hardcode database-specific type strings in MDL — always normalize via `parse_type`
- Do not skip validation before build — invalid YAML produces broken manifests silently
- Do not guess column types — introspect from the actual database
- Do not write relationships without confirming join conditions — wrong conditions cause silent query errors
- Do not skip `wren memory index` after build — stale indexes degrade recall quality