quality-auditor

Quality Auditor

Overview

Evaluates tools, frameworks, systems, and codebases against the highest industry standards across 12 weighted dimensions. Produces evidence-based scores, identifies anti-patterns, and generates prioritized improvement roadmaps. Applies extra scrutiny to AI-generated code through the verification gap protocol, ensuring velocity does not compromise integrity.
When to use: Auditing code quality, reviewing AI-generated code, scoring codebases against industry benchmarks, enforcing pre-commit quality gates, comparing tools or frameworks, assessing technical debt.
When NOT to use: Quick code reviews without scoring, style-only linting (use a linter), feature implementation, routine PR reviews that do not require a full audit.

Quick Reference

| Dimension | Weight | What to Evaluate |
| --- | --- | --- |
| Code Quality | 10% | Structure, patterns, SOLID, duplication, complexity, error handling |
| Architecture | 10% | Design, modularity, scalability, coupling/cohesion, API design |
| Documentation | 10% | Completeness, clarity, accuracy, examples, troubleshooting |
| Usability | 10% | Learning curve, installation ease, error messages, ergonomics |
| Performance | 8% | Speed, resource usage, caching, bundle size, Core Web Vitals |
| Security | 10% | OWASP Top 10, input validation, auth, secrets, dependencies |
| Testing | 8% | Coverage (unit/integration/e2e), quality, automation, organization |
| Maintainability | 8% | Technical debt, readability, refactorability, versioning |
| Developer Experience | 10% | Setup ease, debugging, tooling, hot reload, IDE integration |
| Accessibility | 8% | WCAG compliance, keyboard nav, screen readers, cognitive load |
| CI/CD | 5% | Automation, pipelines, deployment, rollback, monitoring |
| Innovation | 3% | Novel approaches, forward-thinking design, unique value |
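
The weights above sum to 100%, so an overall score can be computed as a weighted average of the 12 dimension scores. The sketch below is one plausible reading, not the skill's confirmed method: the weighted arithmetic mean is an assumption (the Audit Rubric reference may define the methodology differently), and `WEIGHTS` and `weighted_score` are hypothetical names introduced for illustration.

```python
# Weights copied from the Quick Reference table (percent of the overall score).
WEIGHTS = {
    "Code Quality": 10,
    "Architecture": 10,
    "Documentation": 10,
    "Usability": 10,
    "Performance": 8,
    "Security": 10,
    "Testing": 8,
    "Maintainability": 8,
    "Developer Experience": 10,
    "Accessibility": 8,
    "CI/CD": 5,
    "Innovation": 3,
}


def weighted_score(scores: dict[str, float]) -> float:
    """Hypothetical helper: combine per-dimension scores (1-10) into one score.

    Assumes a plain weighted arithmetic mean; refuses to run if any of the
    12 dimensions has not been scored.
    """
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"unscored dimensions: {sorted(missing)}")
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return round(total / sum(WEIGHTS.values()), 2)
```

Under this assumption, a codebase scoring 6 in every dimension except a 2 in Security would come out at 5.6 overall, since Security alone carries 10% of the total.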

Audit Phases

| Phase | Name | Purpose |
| --- | --- | --- |
| 0 | Resource Completeness | Verify registry/filesystem parity; audit fails if this fails |
| 1 | Discovery | Read docs, examine code, test system, review supporting materials |
| 2 | Evaluation | Score each dimension with evidence, strengths, and weaknesses |
| 3 | Synthesis | Executive summary, detailed scores, recommendations, risk matrix |
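
As a rough illustration of the Phase 0 gate, the sketch below treats "registry/filesystem parity" as set equality between the resources a registry declares and the resources actually found on disk. That interpretation is an assumption, as are the names `ParityReport` and `check_parity`; how the registry is stored and how files are enumerated is left to the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParityReport:
    """Hypothetical result of a Phase 0 check."""

    missing_on_disk: frozenset  # declared in the registry, absent from disk
    unregistered: frozenset     # present on disk, absent from the registry

    @property
    def passed(self) -> bool:
        # Parity holds only when neither side has extra entries.
        return not self.missing_on_disk and not self.unregistered


def check_parity(registered, on_disk) -> ParityReport:
    """Compare registry entries with filesystem entries (both iterables of names)."""
    registered, on_disk = set(registered), set(on_disk)
    return ParityReport(
        missing_on_disk=frozenset(registered - on_disk),
        unregistered=frozenset(on_disk - registered),
    )
```

A caller would stop the audit when `check_parity(...).passed` is false and report both sets as evidence before moving on to Phase 1.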

Scoring Scale

| Score | Rating | Meaning |
| --- | --- | --- |
| 10 | Exceptional | Industry-leading, sets new standards |
| 8-9 | Excellent | Exceeds expectations significantly |
| 6-7 | Good | Meets expectations with improvements needed |
| 5 | Acceptable | Below average, significant improvements needed |
| 3-4 | Poor | Major gaps and fundamental problems |
| 1-2 | Critical | Barely functional or non-functional |
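
The scale maps directly to a lookup. The helper below is a minimal sketch assuming integer scores from 1 to 10, which are the only bands the table lists; `rating_for` is a hypothetical name.

```python
def rating_for(score: int) -> str:
    """Return the rating label from the Scoring Scale for an integer score."""
    if not 1 <= score <= 10:
        raise ValueError(f"score must be between 1 and 10, got {score}")
    if score == 10:
        return "Exceptional"
    if score >= 8:
        return "Excellent"
    if score >= 6:
        return "Good"
    if score == 5:
        return "Acceptable"
    if score >= 3:
        return "Poor"
    return "Critical"
```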

Common Mistakes

| Mistake | Correct Pattern |
| --- | --- |
| Giving inflated scores without evidence | Every score must cite specific files, metrics, or code examples as evidence |
| Skipping Phase 0 resource completeness check | Always verify registry completeness first; missing resources cap the overall score at 6/10 |
| Evaluating only code quality, ignoring other dimensions | Score all 12 dimensions with their assigned weights; architecture, security, and DX each carry the same 10% weight as code quality |
| Accepting superficial "LGTM" reviews | Perform deep semantic audits checking contract integrity, security sanitization, and performance hygiene |
| Trusting AI-generated code without verification | Apply the verification gap protocol: critic agents, verifiable goals, human oversight for critical paths |
| Proceeding after audit failure without re-audit | Stop, analyze the deviation, remediate, then restart the checklist from step 1 |
| Using 10/10 scores without exceptional evidence | Reserve 10/10 for truly industry-leading work; most quality tools score 6-7 |
| Surface-level static analysis only | Combine linting with architectural fit checks, risk-based PR categorization, and context-aware validation |
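
Two of the rules above are mechanical enough to enforce in code: every score needs evidence, and missing resources cap the overall score at 6/10. The sketch below shows one way to do that. `DimensionScore`, `unsupported`, and `apply_resource_cap` are hypothetical names, and treating an empty evidence list as "no evidence" is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class DimensionScore:
    """Hypothetical record for one scored dimension."""

    name: str
    score: int
    # File paths, metrics, or code excerpts that justify the score.
    evidence: list[str] = field(default_factory=list)


def unsupported(scores: list[DimensionScore]) -> list[str]:
    """Return the names of dimensions scored without any cited evidence."""
    return [s.name for s in scores if not s.evidence]


def apply_resource_cap(
    overall: float, resources_complete: bool, cap: float = 6.0
) -> float:
    """Cap the overall score at 6/10 when the Phase 0 check found missing resources."""
    return overall if resources_complete else min(overall, cap)
```

An audit run could refuse to publish a report while `unsupported(...)` is non-empty, and pass the aggregated score through `apply_resource_cap` as the last step before reporting.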

Delegation

  • Discover codebase structure and gather audit evidence: Use Explore agent to survey file organization, dependencies, test coverage, and documentation
  • Execute targeted quality checks across dimensions: Use Task agent to run linters, security scanners, performance profilers, and accessibility audits
  • Design quality improvement roadmap: Use Plan agent to prioritize quick wins, short-term, and long-term recommendations from audit findings
For stylistic cleanup of AI-generated prose and code (emdash overuse, slop vocabulary, over-commenting, verbose naming), use the de-slopify skill.
If the usability-tester skill is available, delegate usability dimension evaluation and user flow validation to it. Otherwise, recommend:
pnpm dlx skills add oakoss/agent-skills -s usability-tester -a claude-code -y

References

  • Audit Rubric -- pass/warn/fail thresholds, weighted scoring methodology, automated vs manual checklists, score caps, report format
  • Dimension Rubrics -- detailed scoring criteria, evidence requirements, and rubric tables for all 12 dimensions
  • Audit Report Template -- structured report format, executive summary, recommendations, risk assessment
  • Anti-Patterns Guide -- code, architecture, security, testing, and process anti-patterns to identify during audits
  • Verification Gap Protocol -- AI code verification methodology, critic agents, rejection protocol, risk-based review strategies