# ClickHouse and Weaviate

When to use: ClickHouse queries, Goose migrations, chdb test schema, Weaviate collections/migrations, or telemetry storage paths.
## ClickHouse queries

The ClickHouse adapter stack remains SQL-oriented in `packages/platform/db-clickhouse`. All ClickHouse queries must use parameterized bindings (the `{name:Type}` syntax with `query_params`); never interpolate user-supplied values directly into SQL strings.
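For illustration, a parameterized query might look like the following (the table and column names here are hypothetical); the client supplies the values through `query_params` instead of string interpolation:

```sql
-- Values bound separately, e.g. query_params: { traceId: 'abc123', maxRows: 100 }
SELECT span_id, duration_ns
FROM otel_spans
WHERE trace_id = {traceId:String}
ORDER BY start_time DESC
LIMIT {maxRows:UInt32}
```

ClickHouse substitutes each `{name:Type}` placeholder server-side after validating the value against the declared type, which is what makes this safe against injection.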
## ClickHouse migrations (Goose)
Install goose (if not already installed):

```bash
brew install goose
```

Migration files live in `packages/platform/db-clickhouse/clickhouse/migrations/`:

- `unclustered/`: single-node deployments (local dev, default)
- `clustered/`: distributed deployments (`CLICKHOUSE_CLUSTER_ENABLED=true`)

Goose tracks applied migrations automatically in the `goose_db_version` table. The repo also keeps `packages/platform/db-clickhouse/clickhouse/.migration-lock`, regenerated by `ch:create`, solely to force git conflicts when developers create migrations in parallel.
### Migration execution safety (agents)
Same rule as Postgres: do not run `ch:*` or `ch:schema:dump` unless the user explicitly asked in this conversation.

Commands (run from repo root):

```bash
# Apply all pending migrations
pnpm --filter @platform/db-clickhouse ch:up

# Roll back last migration
pnpm --filter @platform/db-clickhouse ch:down

# Show migration status
pnpm --filter @platform/db-clickhouse ch:status

# Create a new migration (creates the next sequential file in both unclustered/ and clustered/)
pnpm --filter @platform/db-clickhouse ch:create <migration_name>

# Roll back ALL migrations (equivalent to drop)
pnpm --filter @platform/db-clickhouse ch:drop

# Reset ClickHouse volume and re-migrate (nuclear option)
pnpm --filter @platform/db-clickhouse ch:reset

# Seed sample span data
pnpm --filter @platform/db-clickhouse ch:seed
```
### Creating migrations
- `ch:create <name>` creates the next sequential migration (for example `00016_name.sql`) in both `unclustered/` and `clustered/`, and updates `clickhouse/.migration-lock`
- Fill in both files (see rules below)
- Commit both migration files plus `clickhouse/.migration-lock`
### Migration file rules

- Each migration is a single `.sql` file with `-- +goose Up` and `-- +goose Down` sections
- Always include `-- +goose NO TRANSACTION` (ClickHouse does not support transactions)
- ClickHouse migration history is append-only in this repository. Do not edit existing Goose migration files; add a new migration in both `unclustered/` and `clustered/` instead.
- For additive changes to existing tables, prefer ordinary `ALTER TABLE` or additive projection migrations with sensible defaults unless the change truly requires a table rebuild.
- `unclustered/`: use standard table engines (e.g. `ReplacingMergeTree`)
- `clustered/`: add `ON CLUSTER default` and use `Replicated*` engines
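Under these rules, a new migration might look like the following sketch (the table name, columns, and file name are hypothetical):

```sql
-- unclustered/00016_create_span_events.sql (hypothetical example)
-- +goose NO TRANSACTION

-- +goose Up
CREATE TABLE IF NOT EXISTS span_events
(
    trace_id   String,
    event_time DateTime64(9),
    name       LowCardinality(String)
)
ENGINE = ReplacingMergeTree
ORDER BY (trace_id, event_time);

-- +goose Down
DROP TABLE IF EXISTS span_events;
```

The `clustered/` counterpart would add `ON CLUSTER default` to each statement and switch the engine to `ReplicatedReplacingMergeTree`.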
### Clustered migration reliability (replica lag / Code 517)

In clustered ClickHouse, replicas can temporarily lag DDL metadata propagation. A migration can fail with `Code: 517` ("doesn't catchup with latest ALTER query updates").

Use these authoring rules to reduce failures:

- Keep migrations idempotent (`IF EXISTS` / `IF NOT EXISTS`) so retries are safe.
- Prefer additive schema changes over destructive rewrites.
- Keep DDL batches small; avoid chaining many dependent `ALTER` statements in one migration.
- For tightly coupled changes on the same table in replicated clusters, prefer one `ALTER TABLE ...` with multiple actions over multiple dependent `ALTER` statements.
- If statement B depends on metadata introduced by statement A, prefer splitting them into separate migration files.
- Avoid coupling view rebuilds and many base-table changes in one large migration when possible.
- Run one migration runner per environment (never run `ch:up` concurrently against the same cluster).

Execution safety:

- `packages/platform/db-clickhouse/clickhouse/scripts/up.sh` retries transient replica-lag errors from `goose ... up`.
- In clustered mode, migration sessions set `alter_sync`, `distributed_ddl_task_timeout`, and `replication_wait_for_inactive_replica_timeout` to improve DDL convergence.
- Retry tuning env vars:
  - `CLICKHOUSE_MIGRATION_MAX_RETRIES` (default `20`)
  - `CLICKHOUSE_MIGRATION_RETRY_DELAY_SECONDS` (default `5`)
  - `CLICKHOUSE_MIGRATION_MAX_RETRY_DELAY_SECONDS` (default `30`)
- Clustered DDL tuning env vars:
  - `CLICKHOUSE_MIGRATION_ALTER_SYNC` (default `2`)
  - `CLICKHOUSE_MIGRATION_DISTRIBUTED_DDL_TASK_TIMEOUT_SECONDS` (default `300`)
  - `CLICKHOUSE_MIGRATION_REPLICA_WAIT_TIMEOUT_SECONDS` (default `300`)
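As a concrete sketch of the "one `ALTER` with multiple actions" and idempotency rules above (hypothetical table and columns):

```sql
-- Single ALTER carrying both additive actions; IF NOT EXISTS makes retries safe.
ALTER TABLE otel_spans
    ADD COLUMN IF NOT EXISTS service_version LowCardinality(String) DEFAULT '',
    ADD COLUMN IF NOT EXISTS region LowCardinality(String) DEFAULT '';
```

In `clustered/` the same statement would carry `ON CLUSTER default`, so the change propagates to every replica as one distributed DDL task instead of several dependent ones.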
## Weaviate collections and migrations

Use the dedicated Weaviate package for connection and schema bootstrapping:

- Connection API: `packages/platform/db-weaviate/src/client.ts`; `createWeaviateClient()` and `createWeaviateClientEffect()` connect and perform health checks. For the general platform pattern (Effect-first client, tagged errors, env, layer wiring), see architecture-boundaries, "Platform adapters: Effect-based clients".
- Collection definitions: `packages/platform/db-weaviate/src/collections.ts`; define all collections in code via `defineWeaviateCollections([...])`.
- Migration logic: `packages/platform/db-weaviate/src/migrations.ts`; idempotent: checks `collections.exists()` before create and tolerates "already exists" race conditions.
- Manual migration command: `pnpm --filter @platform/db-weaviate wv:migrate`; entrypoint is `packages/platform/db-weaviate/src/migrate.ts`.
### Rules

- Do not define Weaviate collections in app/domain packages.
- Do not add ad-hoc Weaviate migration scripts outside `packages/platform/db-weaviate`.
- Keep collection schema changes centralized in `src/collections.ts` and rely on the package migration flow.
### Weaviate migrations (agents)

Do not run `wv:migrate` unless the user explicitly asked in this conversation.