database-clickhouse-weaviate

ClickHouse and Weaviate

When to use: ClickHouse queries, Goose migrations, chdb test schema, Weaviate collections/migrations, or telemetry storage paths.

ClickHouse queries

The ClickHouse adapter stack in `packages/platform/db-clickhouse` remains SQL-oriented.
All ClickHouse queries must use parameterized bindings (`{name:Type}` syntax with `query_params`). Never interpolate user-supplied values directly into SQL strings.
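
The binding rule above can be sketched as follows. This is a minimal illustration, assuming the adapter forwards an object with `query` and `query_params` fields to the ClickHouse client (the official `@clickhouse/client` accepts this shape in `query()`). The `spans` table, its columns, and the helper names `placeholderNames` and `buildSpansQuery` are hypothetical and not taken from the repository.

```typescript
// Shape of a parameterized request. The field names mirror the
// `query` / `query_params` convention described above.
interface ParameterizedQuery {
  query: string;
  query_params: Record<string, string | number>;
}

// Collect the names used in `{name:Type}` placeholders of a SQL string.
function placeholderNames(sql: string): string[] {
  const names: string[] = [];
  const pattern = /\{([A-Za-z_][A-Za-z0-9_]*)\s*:\s*[^}]+\}/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(sql)) !== null) {
    const name = match[1];
    if (name !== undefined) {
      names.push(name);
    }
  }
  return names;
}

// Hypothetical query against a hypothetical `spans` table. User input only
// ever appears in `query_params`, never in the SQL text itself.
function buildSpansQuery(projectId: string, limit: number): ParameterizedQuery {
  const request: ParameterizedQuery = {
    query:
      "SELECT trace_id, name FROM spans " +
      "WHERE project_id = {projectId:String} " +
      "LIMIT {limit:UInt32}",
    query_params: { projectId: projectId, limit: limit },
  };
  // Fail fast if a placeholder has no matching binding.
  for (const name of placeholderNames(request.query)) {
    if (!(name in request.query_params)) {
      throw new Error("missing binding for placeholder: " + name);
    }
  }
  return request;
}
```

Because the SQL text is a constant, a hostile `projectId` value cannot change the statement; it travels to the server only as a bound parameter.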

ClickHouse migrations (Goose)

Install goose (if not already installed):

```bash
brew install goose
```

Migration files live in `packages/platform/db-clickhouse/clickhouse/migrations/`:
  • `unclustered/`: single-node deployments (local dev, default)
  • `clustered/`: distributed deployments (`CLICKHOUSE_CLUSTER_ENABLED=true`)

Goose tracks applied migrations automatically in the `goose_db_version` table. The repo also keeps `packages/platform/db-clickhouse/clickhouse/.migration-lock`, regenerated by `ch:create`, solely to force git conflicts when developers create migrations in parallel.

Migration execution safety (agents)

Same rule as Postgres: do not run `ch:*` or `ch:schema:dump` unless the user explicitly asked in this conversation.
Commands (run from repo root):

```bash
# Apply all pending migrations
pnpm --filter @platform/db-clickhouse ch:up

# Roll back last migration
pnpm --filter @platform/db-clickhouse ch:down

# Show migration status
pnpm --filter @platform/db-clickhouse ch:status

# Create a new migration (creates the next sequential file in both unclustered/ and clustered/)
pnpm --filter @platform/db-clickhouse ch:create <migration_name>

# Roll back ALL migrations (equivalent to drop)
pnpm --filter @platform/db-clickhouse ch:drop

# Reset ClickHouse volume and re-migrate (nuclear option)
pnpm --filter @platform/db-clickhouse ch:reset

# Seed sample span data
pnpm --filter @platform/db-clickhouse ch:seed
```

Creating migrations

  1. `ch:create <name>`: creates the next sequential migration (for example `00016_name.sql`) in both `unclustered/` and `clustered/`, and updates `clickhouse/.migration-lock`
  2. Fill in both files (see rules below)
  3. Commit both migration files plus `clickhouse/.migration-lock`

Migration file rules

  • Each migration is a single `.sql` file with `-- +goose Up` and `-- +goose Down` sections
  • Always include `-- +goose NO TRANSACTION` (ClickHouse does not support transactions)
  • ClickHouse migration history is append-only in this repository. Do not edit existing Goose migration files; add a new migration in both `unclustered/` and `clustered/` instead.
  • For additive changes to existing tables, prefer ordinary `ALTER TABLE` or additive projection migrations with sensible defaults unless the change truly requires a table rebuild.
  • `unclustered/`: use standard table engines (e.g. `ReplacingMergeTree`)
  • `clustered/`: add `ON CLUSTER default` and use `Replicated*` engines
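
To make the difference between the two variants concrete, the sketch below renders the same migration for `unclustered/` and `clustered/`. It is an illustration only: the `example_events` table, its columns, and the `renderMigration` helper are hypothetical, and the replicated engine is written without arguments on the assumption that the cluster configures default replica paths. Real migrations in the repository are hand-written `.sql` files.

```typescript
type Variant = "unclustered" | "clustered";

// Render one Goose migration file for a hypothetical `example_events` table.
// Both variants share the Goose annotations; only the cluster clause and the
// table engine differ.
function renderMigration(variant: Variant): string {
  const clustered = variant === "clustered";
  const onCluster = clustered ? " ON CLUSTER default" : "";
  // Assumption: engine arguments are omitted and come from server defaults.
  const engine = clustered ? "ReplicatedReplacingMergeTree" : "ReplacingMergeTree";
  return [
    "-- +goose NO TRANSACTION",
    "-- +goose Up",
    "CREATE TABLE IF NOT EXISTS example_events" + onCluster,
    "(",
    "    id String,",
    "    created_at DateTime",
    ")",
    "ENGINE = " + engine,
    "ORDER BY (id);",
    "",
    "-- +goose Down",
    "DROP TABLE IF EXISTS example_events" + onCluster + ";",
    "",
  ].join("\n");
}

// The two files that a single logical migration would consist of.
const unclusteredSql = renderMigration("unclustered");
const clusteredSql = renderMigration("clustered");
```

The `IF NOT EXISTS` and `IF EXISTS` guards keep both directions safe to re-run, which matters when a clustered migration is retried.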

Clustered migration reliability (replica lag / Code 517)

In clustered ClickHouse, replicas can temporarily lag DDL metadata propagation. A migration can fail with:
  • `code: 517`
  • `Code: 517`
  • `doesn't catchup with latest ALTER query updates`

Use these authoring rules to reduce failures:
  • Keep migrations idempotent (`IF EXISTS` / `IF NOT EXISTS`) so retries are safe.
  • Prefer additive schema changes over destructive rewrites.
  • Keep DDL batches small; avoid chaining many dependent `ALTER` statements in one migration.
  • For tightly-coupled changes on the same table in replicated clusters, prefer one `ALTER TABLE ...` with multiple actions over multiple dependent `ALTER` statements.
  • If statement B depends on metadata introduced by statement A, prefer splitting them into separate migration files.
  • Avoid coupling view rebuilds and many base-table changes in one large migration when possible.
  • Run one migration runner per environment (never concurrent `ch:up` against the same cluster).

Execution safety:
  • `packages/platform/db-clickhouse/clickhouse/scripts/up.sh` retries transient replica lag errors from `goose ... up`.
  • In clustered mode, migration sessions set `alter_sync`, `distributed_ddl_task_timeout`, and `replication_wait_for_inactive_replica_timeout` to improve DDL convergence.
  • Retry tuning env vars:
    • `CLICKHOUSE_MIGRATION_MAX_RETRIES` (default `20`)
    • `CLICKHOUSE_MIGRATION_RETRY_DELAY_SECONDS` (default `5`)
    • `CLICKHOUSE_MIGRATION_MAX_RETRY_DELAY_SECONDS` (default `30`)
  • Clustered DDL tuning env vars:
    • `CLICKHOUSE_MIGRATION_ALTER_SYNC` (default `2`)
    • `CLICKHOUSE_MIGRATION_DISTRIBUTED_DDL_TASK_TIMEOUT_SECONDS` (default `300`)
    • `CLICKHOUSE_MIGRATION_REPLICA_WAIT_TIMEOUT_SECONDS` (default `300`)
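
The retry tuning variables above suggest a capped backoff policy, which the sketch below illustrates. It is not the logic of `up.sh` (a shell script whose contents are not shown here). The env var names and defaults come from the list above; the doubling between attempts and the helper names `loadRetryConfig`, `isReplicaLagError`, and `delayBeforeRetry` are assumptions made for illustration.

```typescript
interface RetryConfig {
  maxRetries: number;
  retryDelaySeconds: number;
  maxRetryDelaySeconds: number;
}

type Env = Record<string, string | undefined>;

// Parse a positive integer from the environment, falling back to a default.
function readPositiveInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  const parsed = raw === undefined ? NaN : Number(raw);
  const valid = isFinite(parsed) && Math.floor(parsed) === parsed && parsed > 0;
  return valid ? parsed : fallback;
}

// Defaults match the values documented above.
function loadRetryConfig(env: Env): RetryConfig {
  return {
    maxRetries: readPositiveInt(env, "CLICKHOUSE_MIGRATION_MAX_RETRIES", 20),
    retryDelaySeconds: readPositiveInt(env, "CLICKHOUSE_MIGRATION_RETRY_DELAY_SECONDS", 5),
    maxRetryDelaySeconds: readPositiveInt(env, "CLICKHOUSE_MIGRATION_MAX_RETRY_DELAY_SECONDS", 30),
  };
}

// True when the output matches one of the replica-lag signatures listed above.
function isReplicaLagError(output: string): boolean {
  return (
    /code: 517/i.test(output) ||
    output.indexOf("doesn't catchup with latest ALTER query updates") !== -1
  );
}

// Delay before retry number `attempt` (1-based). Doubling per attempt is an
// assumption; the documented cap is applied on top of it.
function delayBeforeRetry(attempt: number, cfg: RetryConfig): number {
  const uncapped = cfg.retryDelaySeconds * Math.pow(2, attempt - 1);
  return Math.min(uncapped, cfg.maxRetryDelaySeconds);
}

// Retry only transient replica-lag failures, and only within the budget.
function shouldRetry(output: string, attempt: number, cfg: RetryConfig): boolean {
  return attempt <= cfg.maxRetries && isReplicaLagError(output);
}
```

Under these assumptions and the default settings, the waits would be 5, 10, and 20 seconds, then 30 seconds for each later attempt. Errors that do not match a replica-lag signature are never retried.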

Weaviate collections and migrations

Use the dedicated Weaviate package for connection and schema bootstrapping:
  • Connection API: `packages/platform/db-weaviate/src/client.ts`. `createWeaviateClient()` and `createWeaviateClientEffect()` connect and perform health checks. For the general platform pattern (Effect-first client, tagged errors, env, layer wiring), see the "Platform adapters: Effect-based clients" section of architecture-boundaries.
  • Collection definitions: `packages/platform/db-weaviate/src/collections.ts`. Define all collections in code via `defineWeaviateCollections([...])`.
  • Migration logic: `packages/platform/db-weaviate/src/migrations.ts`. It is idempotent: it checks `collections.exists()` before create and tolerates "already exists" race conditions.
  • Manual migration command: `pnpm --filter @platform/db-weaviate wv:migrate`. The entrypoint is `packages/platform/db-weaviate/src/migrate.ts`.
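
The idempotent flow described for the migration logic can be sketched as below. This is not the contents of `migrations.ts`. The `CollectionsApi` and `CollectionDefinition` interfaces are hypothetical, minimal stand-ins for the Weaviate client surface (only `exists` and `create`), and matching on the text "already exists" is an assumed way of recognizing the race.

```typescript
// Hypothetical, minimal view of a collection definition.
interface CollectionDefinition {
  name: string;
}

// Hypothetical structural view of the client calls this sketch needs.
interface CollectionsApi {
  exists(name: string): Promise<boolean>;
  create(definition: CollectionDefinition): Promise<unknown>;
}

// Assumption: a lost create race surfaces as an error mentioning "already exists".
function isAlreadyExistsError(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error);
  return /already exists/i.test(message);
}

// Create every missing collection; return the names this run created.
async function migrateCollections(
  api: CollectionsApi,
  definitions: CollectionDefinition[],
): Promise<string[]> {
  const created: string[] = [];
  for (const definition of definitions) {
    // Skip collections that are already present.
    if (await api.exists(definition.name)) {
      continue;
    }
    try {
      await api.create(definition);
      created.push(definition.name);
    } catch (error) {
      // Another runner may have created it between exists() and create().
      if (!isAlreadyExistsError(error)) {
        throw error;
      }
    }
  }
  return created;
}
```

Running the function twice against the same instance creates nothing the second time, which is the property that makes a manual `wv:migrate` safe to repeat.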

Rules

  • Do not define Weaviate collections in app/domain packages.
  • Do not add ad-hoc Weaviate migration scripts outside `packages/platform/db-weaviate`.
  • Keep collection schema changes centralized in `src/collections.ts` and rely on the package migration flow.

Weaviate migrations (agents)

Do not run `wv:migrate` unless the user explicitly asked in this conversation.