wren-generate-mdl
Generate Wren MDL — CLI Agent Workflow
Version check
Silently fetch `https://raw.githubusercontent.com/Canner/wren-engine/main/skills/versions.json`.
Compare the `wren-generate-mdl` key with this skill's version (from the frontmatter above).
If the remote version is newer, notify the user before proceeding:

A newer version of the wren-generate-mdl skill is available. Update with: `npx skills add Canner/wren-engine --skill wren-generate-mdl`. The CLI auto-detects your installed agent. To target a specific one, add `--agent <name>` (e.g., `claude-code`, `cursor`, `windsurf`, `cline`).

Then continue with the workflow below regardless of update status.
Builds an MDL project by discovering database schema and converting it
into Wren's YAML project format. The agent uses whatever database tools
are available in its environment for introspection; the wren CLI handles
type normalization, validation, and build.
For memory and query workflows after setup, see the wren-usage skill.
Prerequisites
- `wren` CLI installed: `pip install wren-engine[<datasource>]`
- A working database connection (credentials available to the agent)
- A wren profile configured (`wren profile add`) or connection info ready
Phase 0 — Detect existing project
Goal: If the current directory is already inside a wren project, let the user decide how to proceed.

Check whether `wren_project.yml` exists in the current working directory (or any parent up to the repository root). If found:

- Tell the user that an existing wren project was detected and show its path.
- Ask:
  - Reset — wipe the existing project (`wren_project.yml`, `models/`, `views/`, `relationships.yml`, and `instructions.md`) and regenerate from scratch in the same directory.
  - New path — keep the existing project untouched and choose a different directory for the new project. Ask the user for the new path, then `wren context init --path <new_path>` and continue from Phase 1 using that path.

If no existing project is detected, proceed directly to Phase 1.
Phase 1 — Establish connection and scope
Goal: Confirm the agent can reach the database and agree on scope with the user.

- Verify connectivity using whichever tool is available:
  - If SQLAlchemy: test `engine.connect()`
  - If a database driver: a simple query like `SELECT 1`
  - If a wren profile exists: `wren profile debug` to check config
  - If raw SQL via wren: `wren --sql "SELECT 1"` (requires a profile or connection file)
- Ask the user:
  - Which schema(s) or dataset(s) to include (skip if only one exists)
  - Whether to include all tables or a subset
  - The datasource type for wren (e.g., `postgres`, `bigquery`, `snowflake`) — needed for the type normalization dialect
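The driver branch of the checklist above can be sketched generically; the stdlib `sqlite3` module stands in here for whichever DBAPI driver the environment actually has (psycopg, snowflake-connector-python, etc.):

```python
import sqlite3

def probe(connect) -> bool:
    """Return True if connect() yields a DBAPI connection that answers SELECT 1."""
    try:
        conn = connect()
        try:
            cur = conn.cursor()
            cur.execute("SELECT 1")
            return cur.fetchone()[0] == 1
        finally:
            conn.close()
    except Exception:
        return False

# Example with the stand-in driver:
# probe(lambda: sqlite3.connect(":memory:"))
```

Passing a zero-argument `connect` callable keeps the probe driver-agnostic: swap in `lambda: psycopg.connect(dsn)` or similar for the real datasource.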
Phase 2 — Discover schema
Goal: Collect table names, column names, column types, and constraints.
Use whatever introspection method is available. Here are common approaches
ranked by convenience:
Option A: SQLAlchemy (recommended if available)
```python
from sqlalchemy import create_engine, inspect

engine = create_engine(connection_url)
inspector = inspect(engine)

tables = inspector.get_table_names(schema="public")
for table in tables:
    columns = inspector.get_columns(table, schema="public")
    # columns → [{"name": "id", "type": INTEGER(), "nullable": False, ...}]
    pk = inspector.get_pk_constraint(table, schema="public")
    # pk → {"constrained_columns": ["id"], "name": "orders_pkey"}
    fks = inspector.get_foreign_keys(table, schema="public")
    # fks → [{"constrained_columns": ["customer_id"],
    #         "referred_table": "customers",
    #         "referred_columns": ["id"]}]
```

Option B: Database-specific driver
- psycopg / asyncpg (Postgres): query `information_schema.columns` and `information_schema.table_constraints`
- google-cloud-bigquery: `client.list_tables()` → `client.get_table()` → `table.schema`
- snowflake-connector-python: `SHOW COLUMNS IN TABLE`, `SHOW PRIMARY KEYS IN TABLE`
- clickhouse-driver: `DESCRIBE TABLE`, `system.tables`
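Whichever option is used, it helps to collapse the findings into one plain intermediate structure before Phases 3-4. A possible shape (the keys are illustrative, chosen to match the `parse_types` batch input shown later, not a wren API):

```python
def to_schema_entry(table, columns, pk, fks):
    """Normalize introspection output (SQLAlchemy-style dicts) into one record per table."""
    return {
        "table": table,
        # {"column", "raw_type"} matches the batch input expected by parse_types
        "columns": [{"column": c["name"], "raw_type": str(c["type"])} for c in columns],
        "primary_key": pk.get("constrained_columns", []),
        "foreign_keys": [
            {
                "columns": fk["constrained_columns"],
                "referred_table": fk["referred_table"],
                "referred_columns": fk["referred_columns"],
            }
            for fk in fks
        ],
    }
```

One record per table keeps Phase 3 (feed `columns` to the normalizer) and Phase 4 (emit model files and relationships) decoupled from whichever driver did the discovery.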
Option C: Raw SQL via wren
If no driver is available but a wren profile is configured, query `information_schema` through wren itself:

```bash
wren --sql "SELECT table_name FROM information_schema.tables WHERE table_schema = 'public'" -o json
wren --sql "SELECT column_name, data_type FROM information_schema.columns WHERE table_name = 'orders'" -o json
```

Note: this goes through the MDL layer, so it only works if you already have a minimal MDL or if the database exposes `information_schema` as regular tables. For bootstrapping from zero, Option A or B is preferred.

Phase 3 — Normalize types
Goal: Convert raw database types to wren-core-compatible types.

Python import (recommended for batch processing)

```python
from wren.type_mapping import parse_type, parse_types
```

Single type:

```python
normalized = parse_type("character varying(255)", "postgres")  # → "VARCHAR(255)"
```
Batch — entire table at once
```python
columns = [
    {"column": "id", "raw_type": "int8"},
    {"column": "name", "raw_type": "character varying"},
    {"column": "total", "raw_type": "numeric(10,2)"},
]
normalized_cols = parse_types(columns, dialect="postgres")
# each dict now has a "type" key with the normalized value
```

CLI (if Python import not available)
Single type:

```bash
wren utils parse-type --type "character varying(255)" --dialect postgres
# → VARCHAR(255)
```

Batch (stdin JSON):

```bash
echo '[{"column":"id","raw_type":"int8"},{"column":"name","raw_type":"character varying"}]' \
  | wren utils parse-types --dialect postgres
```

Phase 4 — Scaffold and write MDL project
Goal: Create the YAML project structure.

Step 1 — Initialize project

```bash
wren context init --path /path/to/project
```

This creates:

```text
project/
├── wren_project.yml
├── models/
├── views/
├── relationships.yml
└── instructions.md
```

IMPORTANT: `catalog` and `schema` in `wren_project.yml` are Wren Engine's internal namespace — they are NOT the database's native catalog or schema. Keep the defaults (`catalog: wren`, `schema: public`) unless you are intentionally configuring a multi-project namespace. Your database's actual catalog/schema is specified per-model in `table_reference` (see Step 2). Do not copy database catalog/schema values into `wren_project.yml`.
Step 2 — Write model files
For each table, create a YAML file under `models/`. Use snake_case naming (the build step converts to camelCase automatically).

```yaml
# models/orders/metadata.yml
name: orders
table_reference:
  catalog: ""      # database catalog (empty string if not applicable;
                   # for DuckDB, use the DB file name without extension,
                   # e.g. jaffle_shop.duckdb → catalog: jaffle_shop)
  schema: public   # database schema (this IS the DB schema)
  table: orders    # database table name
primary_key: order_id
columns:
  - name: order_id
    type: INTEGER
    not_null: true
  - name: customer_id
    type: INTEGER
  - name: total
    type: "DECIMAL(10, 2)"
  - name: status
    type: VARCHAR
    properties:
      description: "Order status: pending, shipped, delivered, cancelled"
```
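Writing these files can be mechanized from the normalized column list. A rough generator sketch (plain string assembly; deriving `not_null` from a flag on each column, and the overall field layout, follow the example above rather than any official wren API):

```python
def emit_model_yaml(name, db_schema, table, primary_key, columns):
    """Render a minimal wren model file as a YAML string."""
    lines = [
        f"name: {name}",
        "table_reference:",
        '  catalog: ""',
        f"  schema: {db_schema}",
        f"  table: {table}",
        f"primary_key: {primary_key}",
        "columns:",
    ]
    for col in columns:
        lines.append(f"  - name: {col['name']}")
        lines.append(f"    type: {col['type']}")
        if col.get("not_null"):
            lines.append("    not_null: true")
    return "\n".join(lines) + "\n"
```

Write the result to `models/<table>/metadata.yml` for each table, then let `wren context validate` catch any structural mistakes.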
Step 3 — Write relationships

From the foreign key constraints discovered in Phase 2:

```yaml
# relationships.yml
- name: orders_customers
  models:
    - orders
    - customers
  join_type: many_to_one
  condition: "orders.customer_id = customers.customer_id"
```

Join type mapping:
- FK table → PK table: `many_to_one`
- PK table → FK table: `one_to_many`
- Unique FK: `one_to_one`
- Junction table: `many_to_many`

If no foreign keys were found, infer from naming conventions:
- Column `<table>_id` or `<table_singular>_id` → likely FK to `<table>`
- Ask the user to confirm inferred relationships
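The naming-convention fallback can be sketched as a small heuristic (the naive `+ "s"` pluralization is an assumption, and every guess should still be confirmed with the user):

```python
def infer_fk_targets(columns, tables):
    """Map `<stem>_id` columns to a known table named `<stem>` or `<stem>s`."""
    guesses = {}
    for col in columns:
        if not col.endswith("_id"):
            continue
        stem = col[: -len("_id")]
        for table in tables:
            # accept exact match or naive plural (customer_id -> customers)
            if table in (stem, stem + "s"):
                guesses[col] = table
    return guesses
```

Each confirmed guess becomes a `many_to_one` entry from the FK-holding table to the referenced table.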
Step 4 — Add descriptions (optional but valuable)
Ask the user to describe:
- Each model (1-2 sentences about what the table represents)
- Key columns (especially calculated fields or non-obvious names)

These descriptions are indexed by `wren memory index` and significantly improve LLM query accuracy.

Phase 5 — Validate and build
```bash
# Validate YAML structure and integrity
wren context validate --path /path/to/project

# If strict mode is desired:
wren context validate --path /path/to/project --strict

# Build JSON manifest
wren context build --path /path/to/project

# Verify against the database
wren --sql "SELECT * FROM <model_name> LIMIT 1"
```

If validation fails, fix the reported issues and re-run. Common errors:
- Duplicate model/column names
- Missing primary key
- Relationship referencing a non-existent model
- Invalid column type (try re-running through `parse_type`)

---

Phase 6 — Initialize memory
```bash
# Index schema (generates seed NL-SQL examples automatically)
wren memory index

# Verify
wren memory status
```

After this step, `wren memory fetch` and `wren memory recall` are operational. See the **wren-usage** skill for query workflows.

---

Phase 7 — Iterate with the user
The initial MDL is a starting point. Improve it by:
- Adding calculated columns based on business logic
- Adding views for common query patterns
- Refining descriptions based on actual query usage
- Adding access control (RLAC/CLAC) if needed

Each change follows the cycle: edit YAML → `wren context validate` → `wren context build` → `wren memory index`.

Quick reference
| Task | Command / Method |
|---|---|
| Discover tables | Agent's own tools (SQLAlchemy, driver, raw SQL) |
| Discover columns + types | Agent's own tools |
| Discover constraints | Agent's own tools |
| Normalize types (Python) | `parse_type(raw_type, dialect)` |
| Normalize types (CLI) | `wren utils parse-type --type <t> --dialect <d>` |
| Normalize types (batch) | `parse_types(columns, dialect=...)` / `wren utils parse-types` |
| Scaffold project | `wren context init --path <path>` |
| Write models | Create `models/<table>/metadata.yml` |
| Write relationships | Edit `relationships.yml` |
| Validate | `wren context validate` |
| Build manifest | `wren context build` |
| Test query | `wren --sql "SELECT * FROM <model> LIMIT 1"` |
| Index memory | `wren memory index` |
Things to avoid
- Do not hardcode database-specific type strings in MDL — always normalize via `parse_type`
- Do not skip validation before build — invalid YAML produces broken manifests silently
- Do not guess column types — introspect from the actual database
- Do not write relationships without confirming join conditions — wrong conditions cause silent query errors
- Do not skip `wren memory index` after build — stale indexes degrade recall quality