wren-generate-mdl

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Generate Wren MDL — CLI Agent Workflow

生成Wren MDL —— CLI Agent工作流

Version check

版本检查

Silently fetch
https://raw.githubusercontent.com/Canner/wren-engine/main/skills/versions.json
. Compare the
wren-generate-mdl
key with this skill's version (from the frontmatter above). If the remote version is newer, notify the user before proceeding:
A newer version of the wren-generate-mdl skill is available. Update with:
npx skills add Canner/wren-engine --skill wren-generate-mdl
The CLI auto-detects your installed agent. To target a specific one, add
--agent <name>
(e.g.,
claude-code
,
cursor
,
windsurf
,
cline
).
Then continue with the workflow below regardless of update status.

Builds an MDL project by discovering database schema and converting it into Wren's YAML project format. The agent uses whatever database tools are available in its environment for introspection; the wren CLI handles type normalization, validation, and build.
For memory and query workflows after setup, see the wren-usage skill.

静默获取
https://raw.githubusercontent.com/Canner/wren-engine/main/skills/versions.json
。 将其中的
wren-generate-mdl
键与本skill的版本(来自上方的前置元数据)进行对比。 如果远程版本更新,在继续操作前通知用户:
wren-generate-mdl skill的新版本已发布。 更新命令:
npx skills add Canner/wren-engine --skill wren-generate-mdl
CLI会自动检测已安装的Agent。若要指定特定Agent,请添加
--agent <name>
(例如
claude-code
cursor
windsurf
cline
)。
无论是否更新,都继续执行下方的工作流。

通过发现数据库schema并将其转换为Wren的YAML项目格式,构建MDL项目。Agent会使用环境中可用的任何数据库工具进行自省;wren CLI负责类型标准化、验证和构建。
关于设置完成后的内存和查询工作流,请查看wren-usage skill。

Prerequisites

前置条件

  • wren
    CLI installed (
    pip install wren-engine[<datasource>]
    )
  • A working database connection (credentials available to the agent)
  • A wren profile configured (
    wren profile add
    ) or connection info ready

  • 已安装
    wren
    CLI(
    pip install wren-engine[<datasource>]
  • 可用的数据库连接(Agent可获取凭证)
  • 已配置wren配置文件(
    wren profile add
    )或准备好连接信息

Phase 0 — Detect existing project

阶段0 —— 检测现有项目

Goal: If the current directory is already inside a wren project, let the user decide how to proceed.
Check whether
wren_project.yml
exists in the current working directory (or any parent up to the repository root). If found:
  1. Tell the user that an existing wren project was detected and show its path.
  2. Ask:
    • Reset — wipe the existing project (
      models/
      ,
      views/
      ,
      relationships.yml
      ,
      instructions.md
      , and rebuild
      wren_project.yml
      ) and regenerate from scratch in the same directory.
    • New path — keep the existing project untouched and choose a different directory for the new project. Ask the user for the new path, then
      wren context init --path <new_path>
      and continue from Phase 1 using that path.
If no existing project is detected, proceed directly to Phase 1.

目标: 如果当前目录已属于wren项目,让用户决定后续操作。
检查当前工作目录(或向上至仓库根目录的任何父目录)中是否存在
wren_project.yml
。如果找到:
  1. 告知用户检测到现有wren项目并显示其路径。
  2. 询问:
    • 重置 —— 清空现有项目(
      models/
      views/
      relationships.yml
      instructions.md
      ,并重建
      wren_project.yml
      ),在同一目录从头重新生成。
    • 新路径 —— 保留现有项目不变,为新项目选择不同目录。询问用户新路径,然后执行
      wren context init --path <new_path>
      ,并使用该路径从阶段1继续。
如果未检测到现有项目,直接进入阶段1。

Phase 1 — Establish connection and scope

阶段1 —— 建立连接和确定范围

Goal: Confirm the agent can reach the database and agree on scope with the user.
  1. Verify connectivity using whichever tool is available:
    • If SQLAlchemy:
      engine.connect()
      test
    • If database driver: simple query like
      SELECT 1
    • If wren profile exists:
      wren profile debug
      to check config
    • If raw SQL via wren:
      wren --sql "SELECT 1"
      (requires profile or connection file)
  2. Ask the user:
    • Which schema(s) or dataset(s) to include (skip if only one exists)
    • Whether to include all tables or a subset
    • The datasource type for wren (e.g.,
      postgres
      ,
      bigquery
      ,
      snowflake
      ) — needed for type normalization dialect

目标: 确认Agent可连接到数据库,并与用户达成范围共识。
  1. 使用可用工具验证连通性:
    • 如果有SQLAlchemy:执行
      engine.connect()
      测试
    • 如果有数据库驱动:执行简单查询,如
      SELECT 1
    • 如果存在wren配置文件:执行
      wren profile debug
      检查配置
    • 如果通过wren使用原生SQL:执行
      wren --sql "SELECT 1"
      (需要配置文件或连接文件)
  2. 询问用户:
    • 要包含哪些schema数据集(如果仅存在一个则跳过)
    • 是否包含所有表或子集
    • wren的数据源类型(例如
      postgres
      bigquery
      snowflake
      )—— 类型标准化方言需要此信息

Phase 2 — Discover schema

阶段2 —— 发现schema

Goal: Collect table names, column names, column types, and constraints.
Use whatever introspection method is available. Here are common approaches ranked by convenience:
目标: 收集表名、列名、列类型和约束。
使用可用的自省方法。以下是按便捷性排序的常见方法:

Option A: SQLAlchemy (recommended if available)

选项A:SQLAlchemy(如果可用则推荐)

python
from sqlalchemy import create_engine, inspect

engine = create_engine(connection_url)
inspector = inspect(engine)

tables = inspector.get_table_names(schema="public")

for table in tables:
    columns = inspector.get_columns(table, schema="public")
    # columns → [{"name": "id", "type": INTEGER(), "nullable": False, ...}]

    pk = inspector.get_pk_constraint(table, schema="public")
    # pk → {"constrained_columns": ["id"], "name": "orders_pkey"}

    fks = inspector.get_foreign_keys(table, schema="public")
    # fks → [{"constrained_columns": ["customer_id"],
    #          "referred_table": "customers",
    #          "referred_columns": ["id"]}]
python
from sqlalchemy import create_engine, inspect

engine = create_engine(connection_url)
inspector = inspect(engine)

tables = inspector.get_table_names(schema="public")

for table in tables:
    columns = inspector.get_columns(table, schema="public")
    # columns → [{"name": "id", "type": INTEGER(), "nullable": False, ...}]

    pk = inspector.get_pk_constraint(table, schema="public")
    # pk → {"constrained_columns": ["id"], "name": "orders_pkey"}

    fks = inspector.get_foreign_keys(table, schema="public")
    # fks → [{"constrained_columns": ["customer_id"],
    #          "referred_table": "customers",
    #          "referred_columns": ["id"]}]

Option B: Database-specific driver

选项B:数据库特定驱动

  • psycopg / asyncpg (Postgres): Query
    information_schema.columns
    and
    information_schema.table_constraints
  • google-cloud-bigquery:
    client.list_tables()
    ,
    client.get_table()
    table.schema
  • snowflake-connector-python:
    SHOW COLUMNS IN TABLE
    ,
    SHOW PRIMARY KEYS IN TABLE
  • clickhouse-driver:
    DESCRIBE TABLE
    ,
    system.tables
  • psycopg / asyncpg(Postgres): 查询
    information_schema.columns
    information_schema.table_constraints
  • google-cloud-bigquery:
    client.list_tables()
    client.get_table()
    table.schema
  • snowflake-connector-python:
    SHOW COLUMNS IN TABLE
    SHOW PRIMARY KEYS IN TABLE
  • clickhouse-driver:
    DESCRIBE TABLE
    system.tables

Option C: Raw SQL via wren

选项C:通过wren使用原生SQL

If no driver is available but a wren profile is configured, query
information_schema
through wren itself:
bash
wren --sql "SELECT table_name FROM information_schema.tables WHERE table_schema = 'public'" -o json
wren --sql "SELECT column_name, data_type FROM information_schema.columns WHERE table_name = 'orders'" -o json
Note: this goes through the MDL layer, so it only works if you already have a minimal MDL or if the database supports
information_schema
as regular tables. For bootstrapping from zero, Option A or B is preferred.

如果没有驱动但已配置wren配置文件,通过wren本身查询
information_schema
bash
wren --sql "SELECT table_name FROM information_schema.tables WHERE table_schema = 'public'" -o json
wren --sql "SELECT column_name, data_type FROM information_schema.columns WHERE table_name = 'orders'" -o json
注意:此方法通过MDL层,因此仅在已有最小MDL或数据库支持将
information_schema
作为常规表时有效。从零开始搭建时,优先选择选项A或B。

Phase 3 — Normalize types

阶段3 —— 类型标准化

Goal: Convert raw database types to wren-core-compatible types.
目标: 将原始数据库类型转换为wren-core兼容类型。

Python import (recommended for batch processing)

Python导入(批量处理推荐)

python
from wren.type_mapping import parse_type, parse_types
python
from wren.type_mapping import parse_type, parse_types

Single type

单个类型

normalized = parse_type("character varying(255)", "postgres") # → "VARCHAR(255)"
normalized = parse_type("character varying(255)", "postgres") # → "VARCHAR(255)"

Batch — entire table at once

批量处理 —— 一次性处理整张表

columns = [ {"column": "id", "raw_type": "int8"}, {"column": "name", "raw_type": "character varying"}, {"column": "total", "raw_type": "numeric(10,2)"}, ] normalized_cols = parse_types(columns, dialect="postgres")
columns = [ {"column": "id", "raw_type": "int8"}, {"column": "name", "raw_type": "character varying"}, {"column": "total", "raw_type": "numeric(10,2)"}, ] normalized_cols = parse_types(columns, dialect="postgres")

Each dict now has a "type" key with the normalized value

每个字典现在包含一个"type"键,对应标准化后的值

undefined
undefined

CLI (if Python import not available)

CLI(如果无法使用Python导入)

Single type:
bash
wren utils parse-type --type "character varying(255)" --dialect postgres
单个类型:
bash
wren utils parse-type --type "character varying(255)" --dialect postgres

→ VARCHAR(255)

→ VARCHAR(255)


Batch (stdin JSON):
```bash
echo '[{"column":"id","raw_type":"int8"},{"column":"name","raw_type":"character varying"}]' \
  | wren utils parse-types --dialect postgres


批量处理(标准输入JSON):
```bash
echo '[{"column":"id","raw_type":"int8"},{"column":"name","raw_type":"character varying"}]' \
  | wren utils parse-types --dialect postgres

Phase 4 — Scaffold and write MDL project

阶段4 —— 搭建并写入MDL项目

Goal: Create the YAML project structure.
目标: 创建YAML项目结构。

Step 1 — Initialize project

步骤1 —— 初始化项目

bash
wren context init --path /path/to/project
This creates:
text
project/
├── wren_project.yml
├── models/
├── views/
├── relationships.yml
└── instructions.md
IMPORTANT:
catalog
and
schema
in
wren_project.yml
These are Wren Engine's internal namespace — they are NOT the database's native catalog or schema. Keep the defaults (
catalog: wren
,
schema: public
) unless you are intentionally configuring a multi-project namespace.
Your database's actual catalog/schema is specified per-model in
table_reference
(see Step 2). Do not copy database catalog/schema values into
wren_project.yml
.
bash
wren context init --path /path/to/project
此命令会创建:
text
project/
├── wren_project.yml
├── models/
├── views/
├── relationships.yml
└── instructions.md
重要提示:
wren_project.yml
中的
catalog
schema
这些是Wren Engine的内部命名空间 —— 并非数据库的原生catalog或schema。除非有意配置多项目命名空间,否则请保留默认值(
catalog: wren
schema: public
)。
数据库实际的catalog/schema是在
table_reference
中按模型指定的(见步骤2)。请勿将数据库的catalog/schema值复制到
wren_project.yml
中。

Step 2 — Write model files

步骤2 —— 写入模型文件

For each table, create a YAML file under
models/
. Use snake_case naming (the build step converts to camelCase automatically).
yaml
undefined
为每张表在
models/
下创建一个YAML文件。使用蛇形命名法(构建步骤会自动转换为驼峰命名法)。
yaml
undefined

models/orders/metadata.yml

models/orders/metadata.yml

name: orders table_reference: catalog: "" # database catalog (empty string if not applicable; # for DuckDB, use the DB file name without extension, # e.g. jaffle_shop.duckdb → catalog: jaffle_shop) schema: public # database schema (this IS the DB schema) table: orders # database table name primary_key: order_id columns:
  • name: order_id type: INTEGER not_null: true
  • name: customer_id type: INTEGER
  • name: total type: "DECIMAL(10, 2)"
  • name: status type: VARCHAR properties: description: "Order status: pending, shipped, delivered, cancelled"
undefined
name: orders table_reference: catalog: "" # 数据库catalog(若不适用则为空字符串; # 对于DuckDB,使用不带扩展名的DB文件名, # 例如 jaffle_shop.duckdb → catalog: jaffle_shop) schema: public # 数据库schema(此处为数据库实际的schema) table: orders # 数据库表名 primary_key: order_id columns:
  • name: order_id type: INTEGER not_null: true
  • name: customer_id type: INTEGER
  • name: total type: "DECIMAL(10, 2)"
  • name: status type: VARCHAR properties: description: "订单状态:pending、shipped、delivered、cancelled"
undefined

Step 3 — Write relationships

步骤3 —— 写入关系

From foreign key constraints discovered in Phase 2:
yaml
undefined
根据阶段2中发现的外键约束:
yaml
undefined

relationships.yml

relationships.yml

  • name: orders_customers models:
    • orders
    • customers join_type: many_to_one condition: "orders.customer_id = customers.customer_id"

Join type mapping:
- FK table → PK table: `many_to_one`
- PK table → FK table: `one_to_many`
- Unique FK: `one_to_one`
- Junction table: `many_to_many`

If no foreign keys were found, infer from naming conventions:
- Column `<table>_id` or `<table_singular>_id` → likely FK to `<table>`
- Ask the user to confirm inferred relationships
  • name: orders_customers models:
    • orders
    • customers join_type: many_to_one condition: "orders.customer_id = customers.customer_id"

连接类型映射:
- 外键表 → 主键表:`many_to_one`
- 主键表 → 外键表:`one_to_many`
- 唯一外键:`one_to_one`
- 关联表:`many_to_many`

如果未发现外键,可根据命名规则推断:
- 列名为`<table>_id`或`<table_singular>_id` → 可能是指向`<table>`的外键
- 请用户确认推断的关系

Step 4 — Add descriptions (optional but valuable)

步骤4 —— 添加描述(可选但有价值)

Ask the user to describe:
  • Each model (1-2 sentences about what the table represents)
  • Key columns (especially calculated fields or non-obvious names)
These descriptions are indexed by
wren memory index
and significantly improve LLM query accuracy.

请用户描述:
  • 每个模型(1-2句话说明表的用途)
  • 关键列(尤其是计算字段或名称不明确的列)
这些描述会被
wren memory index
索引,显著提升LLM查询的准确性。

Phase 5 — Validate and build

阶段5 —— 验证并构建

bash
undefined
bash
undefined

Validate YAML structure and integrity

验证YAML结构和完整性

wren context validate --path /path/to/project
wren context validate --path /path/to/project

If strict mode is desired:

如果需要严格模式:

wren context validate --path /path/to/project --strict
wren context validate --path /path/to/project --strict

Build JSON manifest

构建JSON清单

wren context build --path /path/to/project
wren context build --path /path/to/project

Verify against database

与数据库进行验证

wren --sql "SELECT * FROM <model_name> LIMIT 1"

If validation fails, fix the reported issues and re-run. Common errors:
- Duplicate model/column names
- Missing primary key
- Relationship referencing non-existent model
- Invalid column type (try re-running through `parse_type`)

---
wren --sql "SELECT * FROM <model_name> LIMIT 1"

如果验证失败,修复报告的问题后重新运行。常见错误:
- 重复的模型/列名
- 缺少主键
- 关系引用不存在的模型
- 无效的列类型(尝试重新通过`parse_type`处理)

---

Phase 6 — Initialize memory

阶段6 —— 初始化内存

bash
undefined
bash
undefined

Index schema (generates seed NL-SQL examples automatically)

索引schema(自动生成种子NL-SQL示例)

wren memory index
wren memory index

Verify

验证

wren memory status

After this step, `wren memory fetch` and `wren memory recall` are
operational. See the **wren-usage** skill for query workflows.

---
wren memory status

完成此步骤后,`wren memory fetch`和`wren memory recall`即可使用。查询工作流请查看**wren-usage** skill。

---

Phase 7 — Iterate with the user

阶段7 —— 与用户迭代优化

The initial MDL is a starting point. Improve it by:
  • Adding calculated columns based on business logic
  • Adding views for common query patterns
  • Refining descriptions based on actual query usage
  • Adding access control (RLAC/CLAC) if needed
Each change follows: edit YAML →
wren context validate
wren context build
wren memory index
.

初始MDL只是起点。可通过以下方式改进:
  • 根据业务逻辑添加计算列
  • 为常见查询模式添加视图
  • 根据实际查询使用情况优化描述
  • 如有需要添加访问控制(RLAC/CLAC)
每次更改遵循:编辑YAML →
wren context validate
wren context build
wren memory index

Quick reference

快速参考

TaskCommand / Method
Discover tablesAgent's own tools (SQLAlchemy, driver, raw SQL)
Discover columns + typesAgent's own tools
Discover constraintsAgent's own tools
Normalize types (Python)
from wren.type_mapping import parse_type
Normalize types (CLI)
wren utils parse-type --type T --dialect D
Normalize types (batch)
wren utils parse-types --dialect D < columns.json
Scaffold project
wren context init
Write modelsCreate
models/<name>/metadata.yml
Write relationshipsEdit
relationships.yml
Validate
wren context validate
Build manifest
wren context build
Test query
wren --sql "SELECT * FROM <model> LIMIT 1"
Index memory
wren memory index

任务命令/方法
发现表Agent自有工具(SQLAlchemy、驱动、原生SQL)
发现列+类型Agent自有工具
发现约束Agent自有工具
类型标准化(Python)
from wren.type_mapping import parse_type
类型标准化(CLI)
wren utils parse-type --type T --dialect D
类型标准化(批量)
wren utils parse-types --dialect D < columns.json
搭建项目
wren context init
写入模型创建
models/<name>/metadata.yml
写入关系编辑
relationships.yml
验证
wren context validate
构建清单
wren context build
测试查询
wren --sql "SELECT * FROM <model> LIMIT 1"
索引内存
wren memory index

Things to avoid

注意事项

  • Do not hardcode database-specific type strings in MDL — always normalize via
    parse_type
  • Do not skip validation before build — invalid YAML produces broken manifests silently
  • Do not guess column types — introspect from the actual database
  • Do not write relationships without confirming join conditions — wrong conditions cause silent query errors
  • Do not skip
    wren memory index
    after build — stale indexes degrade recall quality
  • 不要在MDL中硬编码数据库特定类型字符串 —— 始终通过
    parse_type
    进行标准化
  • 不要在构建前跳过验证 —— 无效YAML会静默生成损坏的清单
  • 不要猜测列类型 —— 从实际数据库自省获取
  • 不要在未确认连接条件的情况下写入关系 —— 错误的条件会导致静默查询错误
  • 不要在构建后跳过
    wren memory index
    —— 过期索引会降低召回质量