kafka-topic-audit
Kafka Topic Configuration Audit
Audits all topic configurations against production best practices. Misconfigured topics are the #1 cause of Kafka data loss - engineers create topics and forget to tune them.
Workflow
Copy this checklist and track your progress:
Audit Progress:
- [ ] Step 1: Check environment health
- [ ] Step 2: Fetch all topics
- [ ] Step 3: Audit configurations against best practices
- [ ] Step 4: Check metadata completeness
- [ ] Step 5: Detect orphaned topics
- [ ] Step 6: Run consistency checks
- [ ] Step 7: Generate report

- Check environment health for a high-level summary
- Fetch all topics and their configurations
- Audit each topic against best practices (see references/audit-rules.md)
- Cross-reference metadata for completeness
- Detect orphaned topics with no consumers
- Report findings with prioritised recommendations
Step 1: Environment Overview
Use the Lenses MCP tool check_environment_health to get a quick summary:
- Broker count, topic count, consumer count, connector count
- Any existing issues flagged by Lenses
Expected output: Environment health summary with broker, topic and consumer counts.
Validation: If the environment is unhealthy or unreachable, stop and report the connection issue before proceeding.
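The validation rule above can be sketched as a small gate in Python. The HealthSummary shape is hypothetical (the real check_environment_health response may look different), and treating "no brokers reported" as unhealthy is an assumption made only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class HealthSummary:
    # Hypothetical shape; not the actual check_environment_health response.
    reachable: bool
    brokers: int = 0
    topics: int = 0
    consumers: int = 0
    connectors: int = 0
    issues: list[str] = field(default_factory=list)


def gate_on_health(summary: HealthSummary) -> tuple[bool, str]:
    """Return (proceed, message); proceed is False when the audit should stop."""
    if not summary.reachable:
        return False, "Environment unreachable: report the connection issue and stop."
    if summary.brokers == 0:
        # Assumption: zero brokers is treated as an unhealthy environment.
        return False, "Environment unhealthy: no brokers reported."
    note = (
        f"{len(summary.issues)} existing issue(s) flagged"
        if summary.issues
        else "no issues flagged"
    )
    return True, (
        f"Brokers: {summary.brokers} | Topics: {summary.topics} | "
        f"Consumers: {summary.consumers} | Connectors: {summary.connectors} ({note})"
    )
```

Returning a message alongside the flag keeps the stop reason available for the final report instead of failing silently.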
Step 2: Fetch All Topics
Use the Lenses MCP tool list_topics to retrieve all topics with their configurations in one call.
For topics that need deeper inspection, use:
- get_topic for detailed config including partitions and consumers
- get_topic_broker_configs for broker-level config overrides
- get_topic_partitions for partition-level message counts and bytes
Expected output: Full list of topics with their configurations. If zero topics are returned, report this and stop.
Step 3: Audit Configurations
For each topic, check against the thresholds in references/audit-rules.md:
- Replication factor - RF=1 is critical, RF=2 is a warning in production
- Retention policies - unbounded growth, too short or excessively long
- Partition count - single-partition bottlenecks or excessive partitions
- Compaction settings - compact without keys, delete for state topics
- Naming conventions - must follow the pattern {domain}.{entity}.{event}
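A minimal sketch of how a few of these checks might look in Python. Only the replication-factor severities come from the list above. The Topic shape, the severities for retention, partitions and naming, and the exact regex are assumptions, since the real thresholds live in references/audit-rules.md. The only Kafka behaviour relied on is that -1 means "no limit" for retention.ms and retention.bytes.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

# Assumed reading of {domain}.{entity}.{event}: three dot-separated lowercase segments.
NAME_PATTERN = re.compile(r"^[a-z0-9_-]+\.[a-z0-9_-]+\.[a-z0-9_-]+$")


@dataclass
class Topic:
    # Hypothetical shape for illustration, not the Lenses response format.
    name: str
    replication_factor: int
    partitions: int
    config: dict = field(default_factory=dict)  # e.g. {"retention.ms": "-1"}


def audit_topic(topic: Topic) -> list[tuple[str, str]]:
    """Return (severity, message) findings for one topic."""
    findings = []
    if topic.replication_factor == 1:
        findings.append(("critical", "replication factor is 1"))
    elif topic.replication_factor == 2:
        findings.append(("warning", "replication factor is 2"))

    policy = topic.config.get("cleanup.policy", "delete")
    # -1 disables the time and size limits, so a delete-policy topic never shrinks.
    no_time_limit = str(topic.config.get("retention.ms", "")) == "-1"
    no_size_limit = str(topic.config.get("retention.bytes", "-1")) == "-1"
    if policy == "delete" and no_time_limit and no_size_limit:
        findings.append(("warning", "unbounded retention"))  # severity assumed

    if topic.partitions == 1:
        findings.append(("warning", "single partition"))  # severity assumed

    if not NAME_PATTERN.match(topic.name):
        findings.append(("suggestion", "name does not follow {domain}.{entity}.{event}"))
    return findings
```

Keeping each rule as an independent check makes it straightforward to swap in the real thresholds once they are read from the rules file.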
Step 4: Metadata Completeness
Use the Lenses MCP tool list_topic_metadata to check:
- Topics missing descriptions
- Topics missing tags
- Topics without registered schemas (key or value)
Use list_datasets with filters (is_compacted, has_records) to find anomalies.
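The completeness checks could be expressed roughly as follows. The record keys (name, description, tags, key_schema, value_schema) are hypothetical and would need mapping onto whatever list_topic_metadata actually returns.

```python
from __future__ import annotations


def missing_metadata(topics: list[dict]) -> dict[str, list[str]]:
    """Map each topic name to the metadata fields it lacks (hypothetical record shape)."""
    gaps: dict[str, list[str]] = {}
    for topic in topics:
        missing = []
        if not (topic.get("description") or "").strip():
            missing.append("description")
        if not topic.get("tags"):
            missing.append("tags")
        if not topic.get("key_schema"):
            missing.append("key schema")
        if not topic.get("value_schema"):
            missing.append("value schema")
        if missing:
            gaps[topic["name"]] = missing
    return gaps
```

Topics with complete metadata are left out of the result, so an empty dict means nothing to report for this step.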
Step 5: Orphan Detection
For each topic, use list_consumer_groups_by_topic to check for active consumers.
- Warning: Topics with zero consumer groups (may be orphaned/dead)
- Suggestion: Topics with only inactive/empty consumer groups
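The two severities above could be assigned along these lines. This is a sketch that assumes each group record carries a state field using Kafka's consumer group state names (Empty and Dead being the inactive ones). Whether list_consumer_groups_by_topic exposes state in this form is not confirmed here.

```python
from __future__ import annotations

from typing import Optional

# Kafka consumer group states treated as inactive for this sketch.
INACTIVE_STATES = {"Empty", "Dead"}


def classify_orphan(groups: list[dict]) -> Optional[tuple[str, str]]:
    """Return (severity, message) for a topic's consumer groups, or None if it has active ones."""
    if not groups:
        return ("warning", "no consumer groups (may be orphaned/dead)")
    # A group with a missing or unknown state counts as active, to avoid false positives.
    if all(g.get("state") in INACTIVE_STATES for g in groups):
        return ("suggestion", "only inactive/empty consumer groups")
    return None
```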
Step 6: Consistency Checks
- Flag topics in the same domain with different retention policies
- Flag topics in the same domain with different replication factors
- Flag topics with inconsistent serialisation formats within a domain
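One way to sketch these checks is to group topics by domain and flag any domain where an attribute takes more than one value. The sketch assumes the domain is the first dot-separated segment of the topic name (following the {domain}.{entity}.{event} convention) and that topics arrive as plain dicts. The attribute names are placeholders.

```python
from __future__ import annotations

from collections import defaultdict


def inconsistent_domains(topics: list[dict], attribute: str) -> dict[str, set]:
    """Return domains whose topics disagree on the given attribute, with the values seen."""
    values_by_domain: dict[str, set] = defaultdict(set)
    for topic in topics:
        domain = topic["name"].split(".", 1)[0]  # assumed: domain is the first segment
        values_by_domain[domain].add(topic.get(attribute))
    return {d: v for d, v in values_by_domain.items() if len(v) > 1}
```

Calling it once per attribute (retention, replication factor, serialisation format) covers all three checks with the same logic.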
Success Criteria
Quantitative
- Triggers on 90% of topic-related queries (test with 10-20 varied phrasings)
- Completes full audit in under 15 MCP tool calls
- 0 failed MCP calls per run
Qualitative
- Report is actionable without follow-up questions from the user
- Consistent severity categorisation (critical/warning/suggestion) across runs
- Every finding includes a concrete remediation step
Examples
Example 1: Routine weekly audit
User says: "Run a topic audit on the staging environment"
Actions:
- Check staging environment health via Lenses MCP
- Fetch all topics and configs
- Audit each topic against rules in references/audit-rules.md
- Check metadata completeness and orphaned topics
Result: Full audit report with prioritised findings
Example 2: Pre-deployment check
User says: "Check if my topic configs are production-ready"
Actions:
- Audit all topics for RF < 3, unbounded retention, single partitions
- Flag any critical issues that would block a production deployment
Result: Report highlighting critical issues that must be fixed before go-live
Example 3: Investigate a specific topic
User says: "Is the orders.payment.completed topic configured correctly?"
Actions:
- Fetch detailed config for the specific topic using get_topic
- Check broker-level overrides with get_topic_broker_configs
- Verify metadata and consumer groups
Result: Focused report on a single topic with all findings
Troubleshooting
Lenses MCP connection failed
Cause: Environment name is incorrect or Lenses agent is offline.
Solution: Run check_environment_health first. Verify the environment name matches what list_environments returns.
No topics returned
Cause: Environment exists but has no topics or permissions are restricted.
Solution: Confirm the cluster has topics via the Lenses UI. Check that the Lenses agent has read access.
Metadata endpoint returns empty
Cause: Schema Registry is not configured or topics have no registered schemas.
Solution: This is a valid finding - report it as missing metadata rather than treating it as an error.
Output Format
Topic Audit Report
Environment: {name}
- Brokers: X | Topics: Y | Consumer groups: Z
Critical (must fix)
- [topic-name] Description of the issue
  Current: {current value} | Recommended: {recommended value}
Warning (should fix)
- [topic-name] Description of the issue
  Current: {current value} | Recommended: {recommended value}
Suggestion (consider improving)
- [topic-name] Description of the issue
  Recommendation: How to fix it
Summary
- X critical issues found
- Y warnings found
- Z suggestions found
- Topics audited: N
- Orphaned topics: M
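To keep the section order and wording identical across runs, the report could be rendered from structured findings rather than written freehand. The sketch below follows the template's section titles and summary lines. The Finding class and the render_report signature are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Finding:
    # Hypothetical container for one audit finding.
    severity: str  # "critical", "warning" or "suggestion"
    topic: str
    issue: str
    current: str = ""
    recommended: str = ""


SECTIONS = [
    ("critical", "Critical (must fix)"),
    ("warning", "Warning (should fix)"),
    ("suggestion", "Suggestion (consider improving)"),
]


def render_report(env: str, brokers: int, topics: int, groups: int,
                  findings: list[Finding], orphaned: int) -> str:
    """Render findings into the report layout, always in critical/warning/suggestion order."""
    lines = [
        "Topic Audit Report",
        f"Environment: {env}",
        f"- Brokers: {brokers} | Topics: {topics} | Consumer groups: {groups}",
    ]
    counts = {}
    for key, title in SECTIONS:
        lines.append(title)
        matching = [f for f in findings if f.severity == key]
        counts[key] = len(matching)
        for f in matching:
            lines.append(f"- [{f.topic}] {f.issue}")
            if key == "suggestion":
                lines.append(f"  Recommendation: {f.recommended}")
            else:
                lines.append(f"  Current: {f.current} | Recommended: {f.recommended}")
    lines += [
        "Summary",
        f"- {counts['critical']} critical issues found",
        f"- {counts['warning']} warnings found",
        f"- {counts['suggestion']} suggestions found",
        f"- Topics audited: {topics}",
        f"- Orphaned topics: {orphaned}",
    ]
    return "\n".join(lines)
```

Deriving the summary counts from the same list that fills the sections means the totals cannot drift from the findings shown.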