
# Database Migration

## Migration File Conventions

Every schema change must be captured in a versioned migration file. Follow these conventions:
  • Timestamped naming: Use `YYYYMMDDHHMMSS_descriptive_name` (e.g., `20240315143022_add_status_to_orders.sql`).
  • Sequential ordering: Migrations run in order. Never insert a migration before one that has already been applied.
  • One concern per file: Each migration handles a single logical change. Do not combine unrelated schema changes.
  • Descriptive names: The filename should describe the change, not the ticket number. Use verbs like `add`, `remove`, `create`, `drop`, `rename`, `alter`.
  • Idempotent when possible: Use `IF NOT EXISTS` / `IF EXISTS` guards to make re-runs safe.
```text
migrations/
  20240301100000_create_users_table.sql
  20240305120000_add_email_index_to_users.sql
  20240310090000_create_orders_table.sql
  20240315143022_add_status_to_orders.sql
```
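The timestamp convention is easy to automate. A small sketch of a filename generator; the helper name `newMigrationName` is illustrative, not part of any migration tool:

```javascript
// Hypothetical helper that builds a migration filename following the
// YYYYMMDDHHMMSS_descriptive_name convention described above.
function newMigrationName(description, date = new Date()) {
  const pad = n => String(n).padStart(2, '0');
  const stamp =
    date.getUTCFullYear() +
    pad(date.getUTCMonth() + 1) +
    pad(date.getUTCDate()) +
    pad(date.getUTCHours()) +
    pad(date.getUTCMinutes()) +
    pad(date.getUTCSeconds());
  // Lowercase, underscore-separated, verb-first description
  const slug = description.trim().toLowerCase().replace(/[^a-z0-9]+/g, '_');
  return `${stamp}_${slug}.sql`;
}

console.log(newMigrationName('add status to orders', new Date(Date.UTC(2024, 2, 15, 14, 30, 22))));
// → 20240315143022_add_status_to_orders.sql
```

Using UTC for the timestamp avoids ordering glitches when engineers in different time zones create migrations in the same window.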

## Forward and Rollback Pairs

Every migration must have a corresponding rollback. If you cannot write a safe rollback, document why and flag it for review.
```sql
-- Migration: 20240315143022_add_status_to_orders.sql

-- UP
ALTER TABLE orders ADD COLUMN status VARCHAR(50) DEFAULT 'pending';
CREATE INDEX idx_orders_status ON orders (status);

-- DOWN
DROP INDEX IF EXISTS idx_orders_status;
ALTER TABLE orders DROP COLUMN IF EXISTS status;
```

### Rollback Rules

| Scenario | Rollback Strategy |
| --- | --- |
| Add column | Drop column |
| Add index | Drop index |
| Create table | Drop table |
| Add constraint | Drop constraint |
| Rename column | Rename back |
| Drop column | Cannot auto-rollback; restore from backup or re-add the column in a new migration |
| Data backfill | Reverse backfill or accept the data state |

## Zero-Downtime Schema Changes

When your application cannot tolerate downtime during deploys, follow the expand-and-contract pattern. Never make breaking schema changes in a single step.

### Step-by-Step: Renaming a Column

Renaming a column directly (`ALTER TABLE ... RENAME COLUMN`) breaks running application instances that reference the old name. Instead:
  1. Deploy 1 — Expand: Add the new column alongside the old one.
     ```sql
     ALTER TABLE users ADD COLUMN full_name VARCHAR(255);
     ```
  2. Deploy 2 — Dual Write: Update application code to write to both columns.
     ```javascript
     // Write to both columns
     await db.query(
       'UPDATE users SET name = $1, full_name = $1 WHERE id = $2',
       [name, userId]
     );
     ```
  3. Deploy 3 — Backfill: Copy existing data from the old column to the new column.
     ```sql
     UPDATE users SET full_name = name WHERE full_name IS NULL;
     ```
  4. Deploy 4 — Cut Over: Update the application to read from and write to the new column only.
     ```javascript
     // Read from new column
     const user = await db.query('SELECT full_name FROM users WHERE id = $1', [userId]);
     ```
  5. Deploy 5 — Contract: Drop the old column after a safe observation period.
     ```sql
     ALTER TABLE users DROP COLUMN name;
     ```

### Step-by-Step: Adding a NOT NULL Constraint

Adding a `NOT NULL` constraint on an existing column with null values will fail or lock the table. Safe approach:
  1. Add a check constraint as `NOT VALID` (PostgreSQL) to avoid scanning existing rows:
     ```sql
     ALTER TABLE orders ADD CONSTRAINT orders_status_not_null
       CHECK (status IS NOT NULL) NOT VALID;
     ```
  2. Backfill any null values:
     ```sql
     UPDATE orders SET status = 'unknown' WHERE status IS NULL;
     ```
  3. Validate the constraint (acquires a lighter lock):
     ```sql
     ALTER TABLE orders VALIDATE CONSTRAINT orders_status_not_null;
     ```
  4. Optionally convert to a proper `NOT NULL` column constraint:
     ```sql
     ALTER TABLE orders ALTER COLUMN status SET NOT NULL;
     ALTER TABLE orders DROP CONSTRAINT orders_status_not_null;
     ```

## Dangerous Operations

| Operation | Risk | Safe Alternative |
| --- | --- | --- |
| `RENAME COLUMN` | Breaks queries referencing the old name | Expand-and-contract pattern (add new, migrate, drop old) |
| `CHANGE COLUMN TYPE` | May require a full table rewrite, long lock | Add a new column with the new type, backfill, swap |
| `DROP COLUMN` | Irreversible data loss | Verify no code references, back up data, then drop |
| `DROP TABLE` | Irreversible data and schema loss | Rename to `_deprecated_` first, drop after an observation period |
| `ADD NOT NULL` (to existing column) | Fails if nulls exist; full table scan | Add as nullable, backfill, then add the constraint |
| `ADD COLUMN WITH DEFAULT` (pre-PG11) | Full table rewrite on older PostgreSQL | Add a nullable column, set the default, backfill |
| `CREATE INDEX` | Blocks writes on the table (non-concurrent) | Use `CREATE INDEX CONCURRENTLY` (PostgreSQL) |
| `ADD FOREIGN KEY` | Validates all existing rows, long lock | Add as `NOT VALID`, then `VALIDATE` separately |
| `ALTER TABLE ... LOCK` | Exclusive lock blocks all access | Minimize lock duration, run during low traffic |

## Data Migration Strategies

### Backfill Scripts

For populating new columns with computed or default data:
```javascript
// Batch backfill to avoid locking and memory issues
async function backfillStatus(db, batchSize = 1000) {
  let totalUpdated = 0;
  let updated;

  do {
    const result = await db.query(`
      UPDATE orders
      SET status = 'pending'
      WHERE id IN (
        SELECT id FROM orders
        WHERE status IS NULL
        LIMIT $1
        FOR UPDATE SKIP LOCKED
      )
      RETURNING id
    `, [batchSize]);

    updated = result.rowCount;
    totalUpdated += updated;
    console.log(`Backfilled ${totalUpdated} rows...`);

    // Pause between batches to reduce load
    await new Promise(resolve => setTimeout(resolve, 100));
  } while (updated === batchSize);

  console.log(`Backfill complete. Total rows updated: ${totalUpdated}`);
}
```

### Dual-Write Pattern

When migrating data to a new structure or new system:
  1. Start writing to both old and new locations.
  2. Backfill historical data from old to new.
  3. Verify consistency between old and new.
  4. Switch reads to new location.
  5. Stop writing to old location.
  6. Decommission old structure after observation period.
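Steps 1 and 5 bracket the riskiest part, the dual-write window. A minimal sketch of a dual-write wrapper, with `oldStore` and `newStore` as stand-ins for the real locations:

```javascript
// Dual-write wrapper: every write goes to both locations so the new
// store stays current while historical data is backfilled.
// `oldStore` and `newStore` are placeholders for the actual systems.
class DualWriter {
  constructor(oldStore, newStore) {
    this.oldStore = oldStore;
    this.newStore = newStore;
  }

  async write(key, value) {
    // The old location remains the source of truth until cutover,
    // so write it first and fail loudly if it fails.
    await this.oldStore.set(key, value);
    try {
      await this.newStore.set(key, value);
    } catch (err) {
      // A failed new-store write must not break the request path;
      // log it and let the verification step (3) reconcile.
      console.error('dual-write to new store failed:', err.message);
    }
  }
}
```

Writing the old location first keeps it authoritative; discrepancies on the new side surface during the consistency check rather than failing user requests.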

### Shadow Tables

For large structural changes where you cannot modify the original table in place:
```sql
-- 1. Create the new table with desired schema
CREATE TABLE users_v2 (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  full_name VARCHAR(255) NOT NULL,
  email VARCHAR(255) NOT NULL UNIQUE,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- 2. Copy data in batches
INSERT INTO users_v2 (full_name, email, created_at)
SELECT
  COALESCE(first_name || ' ' || last_name, first_name, 'Unknown'),
  email,
  created_at
FROM users
WHERE id BETWEEN 1 AND 10000;
-- Repeat for remaining batches

-- 3. Set up triggers or dual-write for new data during migration

-- 4. Swap tables atomically
ALTER TABLE users RENAME TO users_deprecated;
ALTER TABLE users_v2 RENAME TO users;

-- 5. Drop old table after observation period
-- DROP TABLE users_deprecated;
```

## Migration Testing

### Pre-Deployment Testing Checklist

  • Migration runs successfully on a fresh database.
  • Migration runs successfully on a copy of production data.
  • Rollback runs successfully and leaves database in prior state.
  • Migration completes within acceptable time on production-sized data.
  • No exclusive locks held for more than a few seconds.
  • Application code is compatible with both before and after schema states (for zero-downtime deploys).
  • Data integrity is preserved (row counts, checksums on critical columns).

### Testing Against Production-Like Data

Always test migrations against a database with realistic data volume:

```bash
# Restore a production backup to a test environment
pg_restore --dbname=test_db production_backup.dump

# Run the migration
psql -d test_db -f migrations/20240315143022_add_status_to_orders.sql

# Verify results
psql -d test_db -c "SELECT COUNT(*) FROM orders WHERE status IS NULL;"

# Time the migration
time psql -d test_db -f migrations/20240315143022_add_status_to_orders.sql
```

### Database Seeding for Development

Maintain seed files that populate development databases with realistic test data:
```javascript
// seeds/001_users.js
exports.seed = async function(knex) {
  await knex('users').del();
  await knex('users').insert([
    { id: 1, name: 'Alice Developer', email: 'alice@example.com', role: 'admin' },
    { id: 2, name: 'Bob Tester', email: 'bob@example.com', role: 'user' },
    { id: 3, name: 'Carol Manager', email: 'carol@example.com', role: 'manager' },
  ]);
};

// seeds/002_orders.js
exports.seed = async function(knex) {
  await knex('orders').del();
  await knex('orders').insert([
    { id: 1, user_id: 1, status: 'completed', total: 99.99 },
    { id: 2, user_id: 2, status: 'pending', total: 49.50 },
    { id: 3, user_id: 1, status: 'shipped', total: 150.00 },
  ]);
};
```

### Seed Data Principles

  • Seeds are idempotent: running them twice produces the same state.
  • Seeds use fixed IDs for referential integrity.
  • Seeds cover all enum values and edge cases.
  • Seeds never contain real user data or secrets.
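The idempotency principle in miniature: seeding keyed by fixed IDs means a second run overwrites rather than duplicates. A toy sketch against an in-memory table (with Knex on PostgreSQL, `.insert(rows).onConflict('id').merge()` gives the same effect without the delete-then-insert step):

```javascript
// Idempotent seed: upsert by fixed id, so any number of runs
// leaves the table in exactly the same state.
const seedRows = [
  { id: 1, name: 'Alice Developer', role: 'admin' },
  { id: 2, name: 'Bob Tester', role: 'user' },
];

function seedUsers(table /* Map of id -> row, standing in for the users table */) {
  for (const row of seedRows) table.set(row.id, row);
  return table;
}
```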

## ORM Migration Tools Comparison

| Tool | Language | Migration Format | Rollback | Auto-Generate | Key Feature |
| --- | --- | --- | --- | --- | --- |
| Prisma Migrate | JS/TS | SQL files from schema diff | Limited (reset-based) | Yes, from `schema.prisma` | Declarative schema, drift detection |
| Knex.js | JS/TS | JavaScript files | Manual `down()` function | No (manual) | Flexible, raw SQL support |
| Alembic | Python | Python files | Manual `downgrade()` | Yes, from SQLAlchemy models | Branching support, auto-detect |
| ActiveRecord | Ruby | Ruby DSL files | Automatic for reversible `change` methods | No (manual) | Reversible DSL methods |
| Diesel | Rust | SQL files | Manual `down.sql` | Yes, from schema diff | Compile-time schema verification |
| Flyway | Java/JVM | SQL or Java files | Paid feature (undo) | No (manual) | Convention-based, polyglot |
| golang-migrate | Go | SQL files | Manual `down.sql` | No (manual) | CLI-first, driver-agnostic |

## Locking Considerations

Database schema changes acquire locks that can block application queries. Understand the lock implications:

### PostgreSQL Lock Levels

| DDL Operation | Lock Acquired | Blocks Reads | Blocks Writes | Duration |
| --- | --- | --- | --- | --- |
| `CREATE INDEX` | `ShareLock` | No | Yes | Duration of index build |
| `CREATE INDEX CONCURRENTLY` | `ShareUpdateExclusiveLock` | No | No | Longer build, but non-blocking |
| `ADD COLUMN` (nullable, no default) | `AccessExclusiveLock` | Yes | Yes | Near-instant (metadata only) |
| `ADD COLUMN` (with default, PG11+) | `AccessExclusiveLock` | Yes | Yes | Near-instant (virtual default) |
| `DROP COLUMN` | `AccessExclusiveLock` | Yes | Yes | Near-instant (marks as dropped) |
| `ALTER COLUMN TYPE` | `AccessExclusiveLock` | Yes | Yes | Full table rewrite |
| `ADD CONSTRAINT` | `AccessExclusiveLock` | Yes | Yes | Full table scan to validate |
| `ADD CONSTRAINT ... NOT VALID` | `ShareRowExclusiveLock` | No | Partially | Near-instant |
| `VALIDATE CONSTRAINT` | `ShareUpdateExclusiveLock` | No | No | Scans table, non-blocking |

### Mitigating Lock Contention

```sql
-- Set a lock timeout to fail fast instead of waiting indefinitely
SET lock_timeout = '5s';

-- Retry the operation if it times out
-- Application code should handle this and retry with backoff

-- Cancel long-running queries that block migrations
-- (pg_cancel_backend cancels the current query; pg_terminate_backend kills the session)
SELECT pg_cancel_backend(pid)
FROM pg_stat_activity
WHERE state = 'active'
  AND query_start < NOW() - INTERVAL '5 minutes'
  AND query NOT LIKE '%pg_stat_activity%';
```
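The "retry with backoff" advice above can be sketched as a wrapper around the statement that may hit `lock_timeout`; `withLockRetry` and `runStatement` are illustrative names, not a library API:

```javascript
// Retry a DDL statement that may fail on `lock_timeout`, backing off
// exponentially between attempts. `runStatement` stands in for the
// real database call; a production version would also check that the
// error actually is a lock timeout before retrying.
async function withLockRetry(runStatement, { attempts = 5, baseDelayMs = 500 } = {}) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await runStatement();
    } catch (err) {
      if (attempt >= attempts) throw err;
      const delay = baseDelayMs * 2 ** (attempt - 1);
      console.warn(`attempt ${attempt} failed (${err.message}), retrying in ${delay}ms`);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
}
```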

## Migration Deployment Checklist

Run through this checklist before applying any migration to production:

### Planning

  • Migration has been reviewed by another engineer.
  • Forward migration and rollback have been written and tested.
  • Migration has been tested against a production-sized dataset.
  • Estimated execution time is known and acceptable.
  • Lock impact has been analyzed (see locking table above).
  • Application code is compatible with schema before and after migration.

### Execution

  • Database backup taken immediately before migration.
  • Migration applied during low-traffic period (if it requires locks).
  • Migration output monitored for errors.
  • Application health verified after migration completes.
  • Row counts and data integrity spot-checked.

### Post-Migration

  • Old schema artifacts cleaned up (after observation period).
  • Rollback script archived but accessible.
  • Monitoring confirms no query performance regressions.
  • Migration documented in changelog or release notes.