clickhouse-migrations

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

ClickHouse Migrations

ClickHouse迁移

Read
posthog/clickhouse/migrations/AGENTS.md
for comprehensive patterns, cluster setup, examples, and ingestion layer details.
请阅读
posthog/clickhouse/migrations/AGENTS.md
以获取完整的模式、集群配置、示例以及摄入层细节。

Quick reference

快速参考

Migration structure

迁移结构

python
operations = [
    run_sql_with_exceptions(
        SQL_FUNCTION(),
        node_roles=[...],
        sharded=False,  # True for sharded tables
        is_alter_on_replicated_table=False  # True for ALTER on replicated tables
    ),
]
python
operations = [
    run_sql_with_exceptions(
        SQL_FUNCTION(),
        node_roles=[...],
        sharded=False,  # 分片表设为True
        is_alter_on_replicated_table=False  # 对复制表执行ALTER时设为True
    ),
]

Node roles (choose based on table type)

节点角色(根据表类型选择)

  • [NodeRole.DATA]
    : Sharded tables (data nodes only)
  • [NodeRole.DATA, NodeRole.COORDINATOR]
    : Non-sharded data tables, distributed read tables, replicated tables, views, dictionaries
  • [NodeRole.INGESTION_SMALL]
    : Writable tables, Kafka tables, materialized views on ingestion layer
  • [NodeRole.DATA]
    : 分片表(仅数据节点)
  • [NodeRole.DATA, NodeRole.COORDINATOR]
    : 非分片数据表、分布式只读表、复制表、视图、字典
  • [NodeRole.INGESTION_SMALL]
    : 可写表、Kafka表、摄入层的物化视图

Table engines quick reference

表引擎快速参考

MergeTree engines:
  • AggregatingMergeTree(table, replication_scheme=ReplicationScheme.SHARDED)
    for sharded tables
  • ReplacingMergeTree(table, replication_scheme=ReplicationScheme.REPLICATED)
    for non-sharded
  • Other variants:
    CollapsingMergeTree
    ,
    ReplacingMergeTreeDeleted
Distributed engine:
  • Sharded:
    Distributed(data_table="sharded_events", sharding_key="sipHash64(person_id)")
  • Non-sharded:
    Distributed(data_table="my_table", cluster=settings.CLICKHOUSE_SINGLE_SHARD_CLUSTER)
MergeTree引擎:
  • AggregatingMergeTree(table, replication_scheme=ReplicationScheme.SHARDED)
    用于分片表
  • ReplacingMergeTree(table, replication_scheme=ReplicationScheme.REPLICATED)
    用于非分片表
  • 其他变体:
    CollapsingMergeTree
    ,
    ReplacingMergeTreeDeleted
分布式引擎:
  • 分片:
    Distributed(data_table="sharded_events", sharding_key="sipHash64(person_id)")
  • 非分片:
    Distributed(data_table="my_table", cluster=settings.CLICKHOUSE_SINGLE_SHARD_CLUSTER)

Critical rules

重要规则

  • NEVER use
    ON CLUSTER
    clause in SQL statements
  • Always use
    IF EXISTS
    /
    IF NOT EXISTS
    clauses
  • When dropping and recreating replicated table in same migration, use
    DROP TABLE IF EXISTS ... SYNC
  • If a function generating SQL has on_cluster param, always set
    on_cluster=False
  • Use
    sharded=True
    when altering sharded tables
  • Use
    is_alter_on_replicated_table=True
    when altering non-sharded replicated tables
  • 绝对不要在SQL语句中使用
    ON CLUSTER
    子句
  • 始终使用
    IF EXISTS
    /
    IF NOT EXISTS
    子句
  • 若在同一迁移中删除并重建复制表,请使用
    DROP TABLE IF EXISTS ... SYNC
  • 如果生成SQL的函数带有on_cluster参数,请始终设置
    on_cluster=False
  • 修改分片表时使用
    sharded=True
  • 修改非分片复制表时使用
    is_alter_on_replicated_table=True

Testing

测试

Delete entry from
infi_clickhouse_orm_migrations
table to re-run a migration.
infi_clickhouse_orm_migrations
表中删除条目以重新运行迁移。