kafka-schema-review

Kafka Schema Evolution Review

Reviews schema changes for compatibility and evolution best practices. A single breaking schema change can take down every consumer of a topic.

Workflow

Copy this checklist and track your progress:
Schema Review Progress:
- [ ] Step 1: Fetch registered schemas
- [ ] Step 2: Scan codebase for schema files
- [ ] Step 3: Detect breaking changes
- [ ] Step 4: Check schema quality
- [ ] Step 5: Check schema drift
- [ ] Step 6: Generate report
  1. Fetch registered schemas from the live cluster via Lenses MCP
  2. Scan codebase for schema definition files (see references/compatibility-rules.md for file types)
  3. Detect breaking changes against compatibility rules in references/compatibility-rules.md
  4. Check schema quality against best practices
  5. Check schema drift between repo and cluster
  6. Report findings with migration guidance

Step 1: Fetch Registered Schemas

Use Lenses MCP tools to get the current state of schemas in the cluster:
  • list_topic_metadata - get all schemas registered against topics (key and value)
  • get_topic_metadata - get the current schema for a specific topic
  • get_dataset - get dataset field-level details, policies and governance metadata
  • list_datasets with schema_format filter - find all topics using a given format (AVRO, JSON, PROTOBUF)
Expected output: Map of topics to their registered schemas (key and value) with format and version info.
Validation: If no schemas are registered, note this as a governance gap and proceed with codebase-only analysis.

Step 2: Codebase Inspection

Search the codebase for schema definition files. Consult references/compatibility-rules.md for the full list of file types and search patterns.
When reviewing a PR, use git diff to identify the schema files it changes.
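The codebase scan can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: the extension set is taken from Example 1 (.avsc, .proto, .json), the skipped directory names are assumptions, and the function name find_schema_files is hypothetical. The authoritative list of file types remains references/compatibility-rules.md.

```python
from pathlib import Path

# Assumption: extensions borrowed from Example 1; the full list of file types
# lives in references/compatibility-rules.md.
SCHEMA_EXTENSIONS = {".avsc", ".proto", ".json"}
# Assumption: directories that rarely hold hand-written schemas.
SKIP_DIRS = {".git", "node_modules", "target", "build"}


def find_schema_files(root):
    """Return sorted paths under root whose extension marks a candidate schema file."""
    root = Path(root)
    matches = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        # Skip anything nested inside an ignored directory.
        if any(part in SKIP_DIRS for part in path.relative_to(root).parts):
            continue
        if path.suffix.lower() in SCHEMA_EXTENSIONS:
            matches.append(path)
    return sorted(matches)
```

Because .json also matches many files that are not schemas, the result is a candidate list; each file still needs a content check before it is treated as a schema.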

Step 3: Compatibility Checks

For each schema change, evaluate against the compatibility rules in references/compatibility-rules.md. Check backward, forward and full compatibility depending on the topic's configured compatibility level.
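To make the compatibility levels concrete, here is a minimal sketch for flat Avro record schemas parsed into Python dicts. It applies only the widely documented field-level rules: under backward compatibility an added field needs a default, and under forward compatibility a removed field must have had one. Nested records, aliases, unions and type promotions are not handled, so any type change is flagged for manual review rather than judged. The function names are hypothetical, and the rules in references/compatibility-rules.md take precedence.

```python
def field_map(schema):
    """Index a record schema's fields by name."""
    return {f["name"]: f for f in schema.get("fields", [])}


def breaking_changes(old, new, level="BACKWARD"):
    """List field-level compatibility problems between two Avro record schemas."""
    old_fields, new_fields = field_map(old), field_map(new)
    issues = []
    if level in ("BACKWARD", "FULL"):
        # The new schema must read old data, so added fields need a default.
        for name, field in new_fields.items():
            if name not in old_fields and "default" not in field:
                issues.append(f"BACKWARD: added field '{name}' has no default")
    if level in ("FORWARD", "FULL"):
        # The old schema must read new data, so removed fields need a default.
        for name, field in old_fields.items():
            if name not in new_fields and "default" not in field:
                issues.append(f"FORWARD: removed field '{name}' had no default")
    # Type changes may be legal promotions; this sketch only flags them.
    for name in sorted(old_fields.keys() & new_fields.keys()):
        if old_fields[name]["type"] != new_fields[name]["type"]:
            issues.append(f"REVIEW: type of field '{name}' changed")
    return issues
```

Passing level="FULL" runs both directions, which matches the idea that full compatibility is the combination of backward and forward.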

Step 4: Schema Quality Checks

Apply the quality checks from references/compatibility-rules.md:
  • Fields without documentation annotations
  • Missing default values on optional fields
  • Inconsistent naming conventions
  • Unused or overly generic field names
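Most of the checks above can be approximated mechanically for an Avro record schema. The sketch below is illustrative only: the set of "generic" names and the two recognised naming styles are assumptions, the function names are hypothetical, and detecting unused fields is left out because it requires knowledge of consumers that a schema file does not contain.

```python
import re

SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)+$")
CAMEL_CASE = re.compile(r"^[a-z]+[a-z0-9]*([A-Z][a-z0-9]*)+$")
# Assumption: an illustrative list of names too generic to be meaningful.
GENERIC_NAMES = {"data", "value", "info", "field", "misc"}


def naming_style(name):
    """Classify a field name, or return None for single lowercase words."""
    if SNAKE_CASE.match(name):
        return "snake_case"
    if CAMEL_CASE.match(name):
        return "camelCase"
    return None


def quality_issues(schema):
    """Return (location, message) pairs for an Avro record schema dict."""
    issues = []
    styles = set()
    for field in schema.get("fields", []):
        name, ftype = field["name"], field["type"]
        if not field.get("doc"):
            issues.append((name, "missing doc"))
        # Optional fields are modelled as unions containing "null".
        if isinstance(ftype, list) and "null" in ftype and "default" not in field:
            issues.append((name, "nullable field has no default"))
        if name.lower() in GENERIC_NAMES:
            issues.append((name, "overly generic name"))
        style = naming_style(name)
        if style:
            styles.add(style)
    if len(styles) > 1:
        mixed = ", ".join(sorted(styles))
        issues.append((schema.get("name", "?"), f"mixed naming styles: {mixed}"))
    return issues
```

Each pair maps naturally onto the [schema-file:field] entries of the report's Schema Quality section.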

Step 5: Schema Drift Detection

Compare schema files in the repo against schemas registered in the cluster:
  • Use execute_sql to sample live data and verify it matches the expected schema
  • Flag schemas in the repo that differ from what's registered
  • Flag topics with registered schemas that have no corresponding file in the repo
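The two flagging rules can be sketched as a comparison of two dictionaries, each mapping a topic name to its parsed schema. How repo files are mapped to topic names is an assumption left to the caller, and the function names are hypothetical. The comparison uses key-sorted JSON, which is not Avro's Parsing Canonical Form, so a change to a doc string also counts as drift here.

```python
import json


def canonical(schema):
    """Serialise a schema dict with sorted keys so key order does not matter."""
    return json.dumps(schema, sort_keys=True, separators=(",", ":"))


def detect_drift(repo_schemas, cluster_schemas):
    """Compare topic -> schema maps from the repo and from the cluster."""
    shared = repo_schemas.keys() & cluster_schemas.keys()
    return {
        # Present on both sides but with different content.
        "differs": sorted(
            t for t in shared
            if canonical(repo_schemas[t]) != canonical(cluster_schemas[t])
        ),
        # Registered in the cluster with no corresponding repo file.
        "missing_in_repo": sorted(cluster_schemas.keys() - repo_schemas.keys()),
        # Defined in the repo but never registered.
        "unregistered": sorted(repo_schemas.keys() - cluster_schemas.keys()),
    }
```

The "unregistered" bucket goes slightly beyond the two rules above; it is included because it falls out of the same set arithmetic and can be ignored if not wanted.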

Success Criteria

Quantitative

  • Triggers on 90% of schema-related queries (test with 10-20 varied phrasings)
  • Completes review in under 12 tool calls (MCP + codebase search)
  • 0 false positives on breaking change detection

Qualitative

  • Breaking changes include clear migration guidance
  • Schema drift is reported with both repo and cluster versions
  • Quality findings are actionable without external documentation

Examples

Example 1: Pre-merge schema review

User says: "Review the schema changes in this PR"
Actions:
  1. Run git diff to find changed .avsc, .proto or .json schema files
  2. Fetch the currently registered schema from the cluster via Lenses MCP
  3. Evaluate each change against compatibility rules
Result: Report listing any breaking changes with migration guidance

Example 2: Full schema audit

User says: "Audit all schemas in the staging environment"
Actions:
  1. Fetch all registered schemas via list_topic_metadata
  2. Scan the codebase for schema files
  3. Check for drift between repo and cluster
  4. Run quality checks on all schemas
Result: Comprehensive report covering compatibility, quality and drift

Example 3: Investigating a consumer failure

User says: "Consumers are failing to deserialise messages from orders.payment.completed"
Actions:
  1. Fetch the registered schema for that topic via get_topic_metadata
  2. Sample live data with execute_sql to see the actual message format
  3. Compare against the schema file in the repo
Result: Diagnosis of schema mismatch with remediation steps

Troubleshooting

No schemas registered in the cluster

Cause: Schema Registry is not configured or topics use schemaless formats (plain JSON, CSV).
Solution: This is a valid finding - report it as a governance gap rather than an error. Recommend adding schema registration.

Schema drift detected but intentional

Cause: The cluster schema was updated independently of the repo (e.g., via Schema Registry UI).
Solution: Report the drift and recommend syncing the repo to match the cluster as the source of truth.

Cannot sample data with execute_sql

Cause: Topic is empty, permissions are restricted or the topic uses an unsupported format.
Solution: Note the limitation in the report. Use get_topic_metadata as a fallback for schema information.

Output Format

Schema Review Report

Environment: {name}

Breaking Changes (must fix before merge)

  • [schema-file] Description of the breaking change
    Affected topics: {list}
    Migration: {guidance}

Compatibility Warnings

  • [schema-file] Description of the issue
    Recommendation: How to fix it

Schema Quality

  • [schema-file:field] Description of the quality issue
    Recommendation: How to improve it

Schema Drift

  • [topic-name] Schema in repo differs from registered schema
    Repo version: {summary} | Cluster version: {summary}

Summary

  • X breaking changes found
  • Y compatibility warnings found
  • Z quality issues found
  • Schema files scanned: N
  • Topics with drift: M
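
As an illustration of how findings might be assembled into this layout, the sketch below renders lists of pre-formatted finding strings into the report sections and derives the summary counts from them. The function name render_report, its parameters and the "None found" placeholder for empty sections are assumptions, not part of the original template.

```python
def render_report(environment, breaking, warnings, quality, drift, files_scanned):
    """Render lists of pre-formatted finding strings into the report layout."""
    sections = [
        ("Breaking Changes (must fix before merge)", breaking),
        ("Compatibility Warnings", warnings),
        ("Schema Quality", quality),
        ("Schema Drift", drift),
    ]
    lines = ["Schema Review Report", "", f"Environment: {environment}", ""]
    for title, items in sections:
        lines += [title, ""]
        # Assumption: empty sections get an explicit placeholder bullet.
        lines += [f"  • {item}" for item in items] or ["  • None found"]
        lines.append("")
    lines += [
        "Summary",
        "",
        f"  • {len(breaking)} breaking changes found",
        f"  • {len(warnings)} compatibility warnings found",
        f"  • {len(quality)} quality issues found",
        f"  • Schema files scanned: {files_scanned}",
        f"  • Topics with drift: {len(drift)}",
    ]
    return "\n".join(lines)
```

Deriving the counts from the same lists that populate the sections keeps the summary from disagreeing with the body of the report.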