testing-dbt-models
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
Chinesedbt Testing
dbt 测试
Every model deserves at least one test. Primary keys need unique + not_null.
每个模型都至少需要一个测试。主键需要同时配置unique和not_null测试。
Workflow
工作流程
1. Study Existing Test Patterns
1. 研究现有测试模式
CRITICAL: Match the project's existing testing style before adding new tests.
bash
undefined重要提示:添加新测试前必须匹配项目现有的测试风格。
bash
undefinedFind all schema.yml files with tests
Find all schema.yml files with tests
find . -name "schema.yml" -exec grep -l "tests:" {} ;
find . -name "schema.yml" -exec grep -l "tests:" {} ;
Read existing tests to learn patterns
Read existing tests to learn patterns
cat models/staging/schema.yml | head -100
cat models/marts/schema.yml | head -100
cat models/staging/schema.yml | head -100
cat models/marts/schema.yml | head -100
Check for custom tests or dbt packages
Check for custom tests or dbt packages
ls tests/
cat packages.yml 2>/dev/null
**Extract from existing tests:**
- YAML formatting style (indentation, spacing)
- Test coverage depth (all columns vs key columns only)
- Use of custom tests (dbt_utils, dbt_expectations, custom macros)
- Description style (brief vs detailed)
- Severity levels used (warn vs error)ls tests/
cat packages.yml 2>/dev/null
**从现有测试中提取以下信息:**
- YAML格式风格(缩进、间距)
- 测试覆盖深度(所有列 vs 仅关键列)
- 自定义测试的使用情况(dbt_utils、dbt_expectations、自定义宏)
- 描述风格(简洁 vs 详细)
- 使用的严重级别(warn vs error)2. Read Model SQL
2. 阅读模型SQL文件
bash
cat models/<path>/<model_name>.sqlIdentify: primary keys, foreign keys, categorical columns, date columns, business-critical fields.
bash
cat models/<path>/<model_name>.sql识别:主键、外键、分类列、日期列、业务关键字段。
3. Check Existing Tests for This Model
3. 检查该模型的现有测试
bash
cat models/<path>/schema.yml | grep -A 50 "<model_name>"bash
cat models/<path>/schema.yml | grep -A 50 "<model_name>"or
or
find . -name "schema.yml" -exec grep -l "<model_name>" {} ;
undefinedfind . -name "schema.yml" -exec grep -l "<model_name>" {} ;
undefined4. Identify Testable Columns
4. 确定可测试列
| Column Type | Recommended Tests |
|---|---|
| Primary key | |
| Foreign key | |
| Categorical | |
| Required field | |
| Date/timestamp | |
| Boolean | |
| 列类型 | 推荐测试 |
|---|---|
| 主键 | |
| 外键 | |
| 分类列 | |
| 必填字段 | |
| 日期/时间戳 | |
| 布尔值 | |
5. Write Tests in schema.yml
5. 在schema.yml中编写测试
Match the existing style from step 1. Example format (adapt to project):
yaml
version: 2
models:
- name: model_name
description: "Brief description of what this model contains"
columns:
- name: primary_key_column
description: "Unique identifier for this record"
tests:
- unique
- not_null
- name: foreign_key_column
description: "Reference to related_model"
tests:
- not_null
- relationships:
to: ref('related_model')
field: related_key_column
- name: status
description: "Current status of the record"
tests:
- not_null
- accepted_values:
values: ['pending', 'active', 'completed', 'cancelled']
- name: created_at
description: "Timestamp when record was created"
tests:
- not_null匹配步骤1中提取的现有风格。示例格式(需根据项目调整):
yaml
version: 2
models:
- name: model_name
description: "Brief description of what this model contains"
columns:
- name: primary_key_column
description: "Unique identifier for this record"
tests:
- unique
- not_null
- name: foreign_key_column
description: "Reference to related_model"
tests:
- not_null
- relationships:
to: ref('related_model')
field: related_key_column
- name: status
description: "Current status of the record"
tests:
- not_null
- accepted_values:
values: ['pending', 'active', 'completed', 'cancelled']
- name: created_at
description: "Timestamp when record was created"
tests:
- not_null6. Run Tests
6. 运行测试
bash
undefinedbash
undefinedTest specific model
Test specific model
dbt test --select <model_name>
dbt test --select <model_name>
Test with upstream
Test with upstream
dbt test --select +<model_name>
undefineddbt test --select +<model_name>
undefined7. Fix Failing Tests
7. 修复测试失败问题
Common failures and fixes:
| Failure | Likely Cause | Fix |
|---|---|---|
| Duplicate records | Add deduplication in model |
| NULL values in source | Add COALESCE or filter |
| Orphan records | Add WHERE clause or fix upstream |
| New/unexpected values | Update accepted values list |
常见失败场景及修复方案:
| 失败类型 | 可能原因 | 修复方案 |
|---|---|---|
| 存在重复记录 | 在模型中添加去重逻辑 |
| 源数据中存在NULL值 | 添加COALESCE函数或过滤条件 |
| 存在孤立记录 | 添加WHERE子句或修复上游数据 |
| 出现新的/未预期的值 | 更新有效值列表 |
Test Types Reference
测试类型参考
Generic Tests (built-in)
通用测试(内置)
yaml
tests:
- unique
- not_null
- accepted_values:
values: ['a', 'b', 'c']
- relationships:
to: ref('other_model')
field: idyaml
tests:
- unique
- not_null
- accepted_values:
values: ['a', 'b', 'c']
- relationships:
to: ref('other_model')
field: idCustom Generic Tests
自定义通用测试
yaml
tests:
- dbt_utils.expression_is_true:
expression: "amount >= 0"
- dbt_utils.recency:
datepart: day
field: created_at
interval: 1yaml
tests:
- dbt_utils.expression_is_true:
expression: "amount >= 0"
- dbt_utils.recency:
datepart: day
field: created_at
interval: 1Singular Tests
单个测试
Create :
tests/<test_name>.sqlsql
-- tests/assert_positive_revenue.sql
select *
from {{ ref('orders') }}
where revenue < 0创建 文件:
tests/<test_name>.sqlsql
-- tests/assert_positive_revenue.sql
select *
from {{ ref('orders') }}
where revenue < 0Anti-Patterns
反模式
- Adding tests without checking existing project patterns first
- Using different YAML formatting style than existing tests
- Models without any tests
- Primary keys without both unique AND not_null
- Testing only obvious columns, ignoring business-critical ones
- Hardcoding accepted_values without confirming with stakeholders
- Adding dbt_utils tests when project doesn't use that package
- 未先检查现有项目模式就添加测试
- 使用与现有测试不同的YAML格式风格
- 模型未配置任何测试
- 主键未同时配置unique和not_null测试
- 仅测试明显列,忽略业务关键列
- 未与相关人员确认就硬编码accepted_values的值
- 项目未使用dbt_utils包却添加该包的测试