testability-scoring
Testability Scoring
<default_to_action>
When assessing testability:
- RUN assessment against target URL
- ANALYZE all 10 principles automatically
- GENERATE HTML report with radar chart
- PRIORITIZE improvements by impact/effort
- INTEGRATE with QX Partner for holistic view
Quick Assessment:
```bash
# Run assessment on any URL
TEST_URL='https://example.com/' npx playwright test tests/testability-scoring/testability-scoring.spec.js --project=chromium --workers=1

# Or use shell script wrapper
.claude/skills/testability-scoring/scripts/run-assessment.sh https://example.com/
```

**The 10 Principles at a Glance:**
| Principle | Weight | Key Question |
|-----------|--------|--------------|
| **Observability** | 15% | Can we see what's happening? |
| **Controllability** | 15% | Can we control the application? |
| **Algorithmic Simplicity** | 10% | Are behaviors predictable? |
| **Algorithmic Transparency** | 10% | Can we understand what it does? |
| **Algorithmic Stability** | 10% | Does behavior remain consistent? |
| **Explainability** | 10% | Is the interface understandable? |
| **Unbugginess** | 10% | How error-free is it? |
| **Smallness** | 10% | Are components appropriately sized? |
| **Decomposability** | 5% | Can we test parts in isolation? |
| **Similarity** | 5% | Is the tech stack familiar? |

**Grade Scale:**
- **A (90-100)**: Excellent testability
- **B (80-89)**: Good testability
- **C (70-79)**: Adequate testability
- **D (60-69)**: Below average
- **F (0-59)**: Poor testability
</default_to_action>
Quick Reference Card
Running Assessments
| Method | Command | When to Use |
|---|---|---|
| Shell Script | | One-time assessment |
| ENV Override | | CI/CD integration |
| Config File | Update | Repeated runs |
Principle Details
High Weight (15% each)
| Principle | Measures | Indicators |
|---|---|---|
| Observability | State visibility, logging, monitoring | Console output, network tracking, error visibility |
| Controllability | Input control, state manipulation | API access, test data injection, determinism |
Medium Weight (10% each)
| Principle | Measures | Indicators |
|---|---|---|
| Simplicity | Predictable behavior | Clear I/O relationships, low complexity |
| Transparency | Understanding what system does | Visible processes, readable code |
| Stability | Consistent behavior | Change resilience, maintainability |
| Explainability | Interface understanding | Good docs, semantic structure, help text |
| Unbugginess | Error-free operation | Console errors, warnings, runtime issues |
| Smallness | Component size | Element count, script bloat, page complexity |
Low Weight (5% each)
| Principle | Measures | Indicators |
|---|---|---|
| Decomposability | Isolation testing | Component separation, modular design |
| Similarity | Technology familiarity | Standard frameworks, known patterns |
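
The weights and grade scale listed earlier can be combined into a single overall score. The following is a minimal sketch, assuming each principle is scored 0-100 and that the overall score is a rounded weighted average; the names `WEIGHTS`, `overallScore`, and `gradeFor` are hypothetical and not taken from the actual implementation.

```typescript
// Weights from the "10 Principles at a Glance" table (they sum to 100).
const WEIGHTS = {
  observability: 15,
  controllability: 15,
  algorithmicSimplicity: 10,
  algorithmicTransparency: 10,
  algorithmicStability: 10,
  explainability: 10,
  unbugginess: 10,
  smallness: 10,
  decomposability: 5,
  similarity: 5,
} as const;

type Principle = keyof typeof WEIGHTS;
type PrincipleScores = Record<Principle, number>; // each score is assumed to be 0-100

// Weighted average of the per-principle scores, rounded to a whole number.
// Rounding is an assumption of this sketch.
function overallScore(scores: PrincipleScores): number {
  let total = 0;
  for (const name of Object.keys(WEIGHTS) as Principle[]) {
    total += scores[name] * WEIGHTS[name];
  }
  return Math.round(total / 100);
}

// Letter grade following the documented grade scale.
function gradeFor(score: number): "A" | "B" | "C" | "D" | "F" {
  if (score >= 90) return "A";
  if (score >= 80) return "B";
  if (score >= 70) return "C";
  if (score >= 60) return "D";
  return "F";
}
```

Because the two high-weight principles together carry 30% of the total, a weak Observability or Controllability score pulls the grade down faster than an equally weak Decomposability or Similarity score.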
Assessment Workflow
```
1. Navigate to URL → 2. Collect Metrics → 3. Score Principles
                                                  ↓
4. Generate JSON   ← 5. Calculate Grades ← 6. Apply Weights
        ↓
7. Generate HTML Report with Radar Chart
        ↓
8. Open in Browser (auto-opens)
```
Output Files
```
tests/reports/
├── testability-results-<timestamp>.json  # Raw data
├── testability-report-<timestamp>.html   # Visual report
└── latest.json                           # Symlink
```
Integration Examples
CI/CD Integration
```yaml
# GitHub Actions
- name: Testability Assessment
  run: |
    timeout 180 .claude/skills/testability-scoring/scripts/run-assessment.sh ${{ env.APP_URL }}

- name: Upload Reports
  uses: actions/upload-artifact@v3
  with:
    name: testability-reports
    path: tests/reports/testability-*.html
```
QX Partner Integration
```typescript
// Combine testability with QX analysis
const qxAnalysis = await Task("QX Analysis", {
  target: 'https://example.com',
  integrateTestability: true
}, "qx-partner");

// Returns combined insights:
// - QX Score: 78/100
// - Testability Integration: Observability 72/100
// - Combined Insight: Low observability may mask UX issues
```
Programmatic Usage
```typescript
import { runTestabilityAssessment } from './testability';

const results = await runTestabilityAssessment('https://example.com');
console.log(`Overall: ${results.overallScore}/100 (${results.grade})`);
console.log('Recommendations:', results.recommendations);
```
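
When the programmatic result feeds a CI gate, a small threshold check may be useful. The sketch below assumes a result shape inferred from the fields used in the snippet above (`overallScore`, `grade`, `recommendations`); the `TestabilityResults` interface, the `string[]` type for recommendations, and the `checkThreshold` helper are all hypothetical.

```typescript
// Assumed result shape, inferred from the fields used in the usage snippet.
interface TestabilityResults {
  overallScore: number;
  grade: string;
  recommendations: string[]; // assumption: a list of plain-text suggestions
}

// Hypothetical helper: decide whether a run meets a minimum overall score.
function checkThreshold(
  results: TestabilityResults,
  minScore: number
): { passed: boolean; message: string } {
  const passed = results.overallScore >= minScore;
  const summary = `${results.overallScore}/100 (${results.grade})`;
  const message = passed
    ? `Testability OK: ${summary}`
    : `Testability below ${minScore}: ${summary}; ` +
      `${results.recommendations.length} recommendation(s) pending`;
  return { passed, message };
}
```

A CI step could log `message` and set a non-zero exit code when `passed` is false, so that a drop below the agreed score blocks the pipeline.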
Agent Integration
```typescript
// Run testability assessment
const assessment = await Task("Testability Assessment", {
  url: 'https://example.com',
  generateReport: true,
  openBrowser: true
}, "qe-quality-analyzer");

// Use with QX Partner for holistic analysis
const qxReport = await Task("Full QX Analysis", {
  target: 'https://example.com',
  integrateTestability: true,
  detectOracleProblems: true
}, "qx-partner");
```
Vibium Integration (Optional)
Overview
Vibium browser automation can be used alongside Playwright for enhanced testability assessment. While Playwright remains the primary engine, Vibium offers complementary capabilities for certain metrics.
Installation:
```bash
claude mcp add vibium -- npx -y vibium
```
Vibium-Enhanced Metrics
| Principle | Vibium Enhancement | Benefit |
|---|---|---|
| Observability | Auto-wait duration tracking | Measures DOM stability (30s timeout, 100ms polling) |
| Controllability | Element interaction success rate | Validates automation readiness via MCP |
| Stability | Screenshot consistency | Visual regression detection for layout stability |
| Explainability | Element attribute extraction | ARIA labels, semantic HTML validation |
When to Use Vibium
✅ USE Vibium for:
- Element stability metrics (auto-wait duration analysis)
- Visual consistency checks (screenshot comparison)
- MCP-native AI agent integration
- Lightweight Docker images (400MB vs 1.2GB)
❌ USE Playwright for:
- Console error detection (Vibium V1 lacks console API)
- Network performance metrics (BiDi network APIs coming in V2)
- Comprehensive browser coverage (Firefox, Safari)
- Production-proven stability (Vibium V1 released Dec 2024)
Hybrid Assessment Example
```typescript
// Testability assessment using both engines
const assessment = {
  // Playwright: Comprehensive metrics
  playwright: await runPlaywrightAssessment(url),

  // Vibium: Stability metrics
  vibium: {
    elementStability: await measureAutoWaitDuration(url),
    visualConsistency: await compareScreenshots(url),
    accessibilityAttributes: await extractARIALabels(url)
  }
};

// Enhanced Observability Score
const observability =
  (assessment.playwright.consoleErrors * 0.6) +
  (assessment.vibium.elementStability * 0.4);
```
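
Since Vibium is optional, the blended score needs a fallback for runs where only Playwright data exists. A minimal sketch, assuming both inputs are already normalized to 0-100 scores; the `enhancedObservability` function is hypothetical, and the 0.6 / 0.4 split simply mirrors the example above.

```typescript
// Blend a Playwright-derived score with an optional Vibium stability score.
// Both inputs are assumed to be 0-100 scores (an assumption of this sketch).
function enhancedObservability(playwrightScore: number, vibiumStability?: number): number {
  if (vibiumStability === undefined) {
    // Vibium is optional: fall back to the Playwright score alone.
    return playwrightScore;
  }
  return playwrightScore * 0.6 + vibiumStability * 0.4;
}
```

Falling back to the unblended Playwright score keeps results comparable between environments with and without Vibium installed, at the cost of the two modes not being strictly equivalent.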
Vibium MCP Tools for Testability
```typescript
// 1. Element Stability Measurement
const browser = await browser_launch();
await browser_navigate({ url });
const startTime = Date.now();
const element = await browser_find({ selector: ".critical-element" });
const autoWaitDuration = Date.now() - startTime;
// Lower duration = better stability

// 2. Visual Consistency Check
const screenshot1 = await browser_screenshot();
await browser_navigate({ url }); // Reload
const screenshot2 = await browser_screenshot();
const visualDiff = compareImages(screenshot1.png, screenshot2.png);
// Lower diff = better stability

// 3. Accessibility Attribute Extraction
const elements = await browser_find({ selector: "button, a, input" });
const ariaLabels = elements.map(el => el.attributes["aria-label"]);
const semanticScore = (ariaLabels.filter(Boolean).length / elements.length) * 100;
```
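
The ARIA coverage calculation in step 3 yields `NaN` when the selector matches no elements. The sketch below isolates that calculation with a guard; the `ElementInfo` shape and the `ariaLabelCoverage` name are hypothetical, and the actual Vibium result type may differ.

```typescript
// Minimal element shape assumed for this sketch.
interface ElementInfo {
  attributes: Record<string, string | undefined>;
}

// Percentage (0-100) of elements carrying a non-empty aria-label.
function ariaLabelCoverage(elements: ElementInfo[]): number {
  // Guard against 0/0 = NaN on pages with no matching elements.
  if (elements.length === 0) return 0;
  const labelled = elements.filter(
    el => (el.attributes["aria-label"] ?? "").trim() !== ""
  ).length;
  return (labelled / elements.length) * 100;
}
```

Returning 0 for an empty match is a design choice made here; treating it as "not applicable" and excluding it from the score would be an equally reasonable alternative.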
Migration Strategy
Current (V2.2): Hybrid approach
- Playwright: Primary engine for all 10 principles
- Vibium: Optional enhancement for stability metrics

Future (V3.0): When Vibium V2 ships
- Evaluate Vibium as primary engine if:
  - Console/Network APIs available
  - Production stability proven
  - Community adoption increases

Agent Coordination Hints
Memory Namespace
```
aqe/testability/
├── assessments/*      - Assessment results by URL
├── historical/*       - Historical scores for trend analysis
├── recommendations/*  - Improvement recommendations
├── integration/*      - QX integration data
└── vibium/*           - Vibium-specific metrics (optional)
```
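
Trend analysis over the `historical/*` entries can start with the change between the two most recent runs. This is a sketch only: the `HistoricalScore` record shape and the `latestDelta` helper are hypothetical, since the stored format is not specified here.

```typescript
// Hypothetical record shape for entries stored under aqe/testability/historical/*.
interface HistoricalScore {
  timestamp: string; // ISO 8601, e.g. "2025-01-15T10:00:00Z"
  overallScore: number;
}

// Change in overall score between the two most recent runs,
// or null when fewer than two runs are available.
function latestDelta(history: HistoricalScore[]): number | null {
  if (history.length < 2) return null;
  // Sort a copy chronologically so the caller's array is left untouched.
  const sorted = [...history].sort(
    (a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp)
  );
  const last = sorted[sorted.length - 1];
  const previous = sorted[sorted.length - 2];
  return last.overallScore - previous.overallScore;
}
```

A negative delta indicates that testability regressed since the previous assessment.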
Fleet Coordination
```typescript
const testabilityFleet = await FleetManager.coordinate({
  strategy: 'testability-assessment',
  agents: [
    'qe-quality-analyzer',  // Primary assessment
    'qx-partner',           // UX integration
    'qe-visual-tester'      // Visual validation
  ],
  topology: 'sequential'
});
```
Common Issues & Solutions
| Issue | Solution |
|---|---|
| Tests timing out | Increase timeout: |
| Partial results | Check console errors, increase network timeout |
| Report not opening | Use |
| Config not updating | Use |
| Vibium not available | Install via |
| Hybrid mode errors | Vibium is optional; assessments work without it |
Related Skills
- accessibility-testing - WCAG compliance (overlaps with Explainability)
- visual-testing-advanced - UI consistency
- performance-testing - Load time metrics
Credits & References
Framework Origin
- *Heuristics of Software Testability* by James Bach and Michael Bolton
- Available at: https://www.satisfice.com/download/heuristics-of-software-testability
Implementation
- Based on https://github.com/fndlalit/testability-scorer (contributed by @fndlalit)
- Playwright v1.49.0+ with AI capabilities (primary engine)
- Vibium v1.0+ with MCP integration (optional enhancement)
- Chart.js for radar visualizations
Vibium Resources
- GitHub: https://github.com/VibiumDev/vibium
- MCP Integration: `claude mcp add vibium -- npx -y vibium`
- Created by Jason Huggins (creator of Selenium/Appium)
Remember
Testability is an investment, not an afterthought.
Good testability:
- Reduces debugging time
- Enables faster feedback loops
- Makes defects easier to find
- Supports continuous testing
Low scores = High risk. Prioritize improvements by weight × impact.
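
One way to turn "weight × impact" into an ordering is sketched below. It assumes impact can be approximated by the gap between a principle's score and 100; that proxy, the `PrincipleResult` shape, and the `prioritize` function are assumptions of this sketch rather than the tool's documented behavior.

```typescript
interface PrincipleResult {
  name: string;
  weight: number; // percentage weight, e.g. 15 for Observability
  score: number;  // 0-100
}

// Rank principles by weight × gap, where gap = 100 - score
// (an assumed proxy for improvement impact). Highest priority first.
function prioritize(results: PrincipleResult[]): PrincipleResult[] {
  const priority = (r: PrincipleResult): number => r.weight * (100 - r.score);
  // Sort a copy so the caller's array keeps its original order.
  return [...results].sort((a, b) => priority(b) - priority(a));
}
```

Under this proxy, a moderately low score on a 15% principle can outrank a very low score on a 5% principle, which matches the intent of weighting improvements by their effect on the overall grade.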