skill-test
Skill Test
Validates `.claude/skills/*/SKILL.md` files for structural compliance and behavioral correctness. No external dependencies: runs entirely within the existing skill/hook/template architecture.
Four modes:
| Mode | Command | Purpose | Token Cost |
|---|---|---|---|
| `static` | `/skill-test static [name\|all]` | Structural linter: 7 compliance checks per skill | Low (~1k/skill) |
| `spec` | `/skill-test spec [name]` | Behavioral verifier: evaluates assertions in test spec | Medium (~5k/skill) |
| `category` | `/skill-test category [name\|all]` | Category rubric: checks skill against its category-specific metrics | Low (~2k/skill) |
| `audit` | `/skill-test audit` | Coverage report: skills, agent specs, last test dates | Low (~3k total) |
Phase 1: Parse Arguments
Determine mode from the first argument:
- `static [name]` → run 7 structural checks on one skill
- `static all` → run 7 structural checks on all skills (Glob `.claude/skills/*/SKILL.md`)
- `spec [name]` → read skill + test spec, evaluate assertions
- `category [name]` → run category-specific rubric from `CCGS Skill Testing Framework/quality-rubric.md`
- `category all` → run category rubric for every skill that has a `category:` in catalog
- `audit` (or no argument) → read catalog, list all skills and agents, show coverage
If the argument is unrecognized, output usage and stop. A missing argument selects `audit`, as listed above.
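The dispatch above can be sketched as a small parser. This is only an illustration: `parse_mode` and the `USAGE` text are hypothetical names, and it assumes that an empty argument list falls through to `audit` and that a mode given without its required target is treated as unrecognized.

```python
from typing import Optional, Tuple

# Hypothetical usage text; the skill does not specify the exact wording.
USAGE = "Usage: /skill-test [static|spec|category|audit] [name|all]"

def parse_mode(args: list) -> Tuple[str, Optional[str]]:
    """Return (mode, target) for a /skill-test invocation, or raise ValueError(USAGE)."""
    if not args:
        return ("audit", None)  # assumption: no argument falls back to audit
    mode = args[0]
    target = args[1] if len(args) > 1 else None
    if mode == "audit":
        return ("audit", None)
    if mode in ("static", "category") and target:
        return (mode, target)  # target is a skill name or "all"
    if mode == "spec" and target and target != "all":
        return (mode, target)  # spec mode takes a single skill name
    raise ValueError(USAGE)
```

A caller would catch `ValueError`, print the usage text, and stop.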
Phase 2A: Static Mode — Structural Linter
For each skill being tested, read its `SKILL.md` fully and run all 7 checks:
Check 1 — Required Frontmatter Fields
The file must contain all of these in the YAML frontmatter block:
- `name:`
- `description:`
- `argument-hint:`
- `user-invocable:`
- `allowed-tools:`
FAIL if any are absent.
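One way to approximate Check 1 without a YAML library is to scan the top-level keys of the frontmatter block. The sketch below assumes the frontmatter is delimited by `---` lines at the very top of the file; `missing_frontmatter_fields` is a hypothetical helper, not part of the skill.

```python
import re

REQUIRED_FIELDS = ["name", "description", "argument-hint", "user-invocable", "allowed-tools"]

def missing_frontmatter_fields(skill_text: str) -> list:
    """Return the required field names absent from the leading frontmatter block."""
    match = re.match(r"^---\s*\n(.*?)\n---\s*(?:\n|$)", skill_text, re.DOTALL)
    frontmatter = match.group(1) if match else ""
    # Only unindented "key: value" lines count as top-level fields.
    present = {
        line.split(":", 1)[0].strip()
        for line in frontmatter.splitlines()
        if ":" in line and not line.startswith((" ", "\t", "#"))
    }
    return [field for field in REQUIRED_FIELDS if field not in present]
```

An empty return value corresponds to PASS; any listed name corresponds to FAIL.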
Check 2 — Multiple Phases
The skill must have ≥2 numbered phase headings. Look for patterns like:
- `## Phase N` or `## Phase N:`
- `## N.` (numbered top-level sections)
- At least 2 distinct `##` headings if phases aren't explicitly numbered
FAIL if fewer than 2 phase-like headings are found.
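The heading patterns above could be counted with a couple of regular expressions. This sketch takes one reading of the rule: count numbered headings when any exist, otherwise fall back to distinct `##` headings. Both function names are hypothetical.

```python
import re

def count_phase_headings(skill_text: str) -> int:
    """Count phase-like `##` headings, falling back to distinct `##` headings."""
    # Matches level-2 headings only; `###` and deeper are ignored.
    headings = re.findall(r"^##[ \t]+(.+?)[ \t]*$", skill_text, re.MULTILINE)
    numbered = [h for h in headings if re.match(r"(Phase\s+\w+\b|\d+\.)", h)]
    return len(numbered) if numbered else len(set(headings))

def check_multiple_phases(skill_text: str) -> str:
    """Return PASS when at least 2 phase-like headings are found, else FAIL."""
    return "PASS" if count_phase_headings(skill_text) >= 2 else "FAIL"
```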
Check 3 — Verdict Keywords
The skill must contain at least one of: `PASS`, `FAIL`, `CONCERNS`, `APPROVED`, `BLOCKED`, `COMPLETE`, `READY`, `COMPLIANT`, `NON-COMPLIANT`.
FAIL if none are present.
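A keyword scan for Check 3 might look like the following. Matching whole uppercase words is an assumption made here so that `PASS` does not fire on `PASSWORD`; the check itself only says "contain". `find_verdict_keywords` is a hypothetical name.

```python
import re

VERDICT_KEYWORDS = ["PASS", "FAIL", "CONCERNS", "APPROVED", "BLOCKED",
                    "COMPLETE", "READY", "COMPLIANT", "NON-COMPLIANT"]

def find_verdict_keywords(skill_text: str) -> list:
    """Return the verdict keywords that appear as standalone uppercase words."""
    found = []
    for keyword in VERDICT_KEYWORDS:
        # The lookarounds stop COMPLIANT matching inside NON-COMPLIANT
        # and PASS matching inside PASSWORD.
        pattern = r"(?<![A-Za-z-])" + re.escape(keyword) + r"(?![A-Za-z])"
        if re.search(pattern, skill_text):
            found.append(keyword)
    return found
```

An empty list corresponds to FAIL for this check.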
Check 4 — Collaborative Protocol Language
The skill must contain ask-before-write language. Look for:
- `"May I write"` (canonical form)
- `"before writing"` or `"approval"` near file-write instructions
- `"ask"` + `"write"` in close proximity (within same section)
WARN if absent (some read-only skills legitimately skip this).
FAIL if `allowed-tools` includes `Write` or `Edit` but no ask-before-write language is found.
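The WARN/FAIL split in Check 4 depends on whether the skill can write files at all. A rough sketch, assuming case-insensitive matching, treating a `##` section as the unit of "close proximity", and simplifying "near file-write instructions" to "anywhere in the file"; the function name and the list-of-tools parameter are hypothetical.

```python
import re

def check_collaborative_protocol(skill_text: str, allowed_tools: list) -> str:
    """Return PASS, WARN, or FAIL for ask-before-write language."""
    lowered = skill_text.lower()
    has_language = (
        "may i write" in lowered
        or "before writing" in lowered
        or "approval" in lowered  # simplification: proximity to write steps is not checked
    )
    if not has_language:
        # "ask" and "write" inside the same `##` section count as close proximity.
        for section in re.split(r"^##\s", lowered, flags=re.MULTILINE):
            if re.search(r"\bask\b", section) and re.search(r"\bwrite\b", section):
                has_language = True
                break
    if has_language:
        return "PASS"
    can_write = any(tool in ("Write", "Edit") for tool in allowed_tools)
    return "FAIL" if can_write else "WARN"
```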
Check 5 — Next-Step Handoff
The skill must end with a recommended next action or follow-up path. Look for:
- A final section mentioning another skill (e.g., `/story-done`, `/gate-check`)
- "Recommended next" or "next step" phrasing
- A "Follow-Up" or "After this" section
WARN if absent.
Check 6 — Fork Context Complexity
If frontmatter contains `context: fork`, the skill should have ≥5 phase headings (`##` level or numbered Phase N headers). Fork context is for complex multi-phase skills; simple skills should not use it.
WARN if `context: fork` is set but fewer than 5 phases found.
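Check 6 reduces to a conditional heading count. The sketch below assumes the frontmatter has already been parsed into a dict (the parsing step is not shown) and counts every `##` heading as a phase; `check_fork_context` is a hypothetical name.

```python
import re

def check_fork_context(frontmatter: dict, skill_text: str) -> str:
    """WARN when `context: fork` is set on a skill with fewer than 5 `##` headings."""
    if frontmatter.get("context") != "fork":
        return "PASS"  # the check only applies to fork-context skills
    phase_count = len(re.findall(r"^##[ \t]+\S", skill_text, re.MULTILINE))
    return "PASS" if phase_count >= 5 else "WARN"
```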
Check 7 — Argument Hint Plausibility
WARN if `argument-hint` is `""` or if the documented modes don't match the hint.
Static Mode Output Format
For a single skill:
=== Skill Static Check: /[name] ===
Check 1 — Frontmatter Fields: PASS
Check 2 — Multiple Phases: PASS (7 phases found)
Check 3 — Verdict Keywords: PASS (PASS, FAIL, CONCERNS)
Check 4 — Collaborative Protocol: PASS ("May I write" found)
Check 5 — Next-Step Handoff: WARN (no follow-up section found)
Check 6 — Fork Context Complexity: PASS (8 phases, context: fork set)
Check 7 — Argument Hint: PASS
Verdict: WARNINGS (1 warning, 0 failures)
Recommended: Add a "Follow-Up Actions" section at the end of the skill.
For `static all`, produce a summary table then list any non-compliant skills:
=== Skill Static Check: All 52 Skills ===
Skill | Result | Issues
-----------------------|--------------|-------
gate-check | COMPLIANT |
design-review | COMPLIANT |
story-readiness | WARNINGS | Check 5: no handoff
...
Summary: 48 COMPLIANT, 3 WARNINGS, 1 NON-COMPLIANT
Aggregate Verdict: N WARNINGS / N FAILURES
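The per-skill verdict line in the sample output can be derived from the seven check results. The mapping below (any FAIL gives NON-COMPLIANT, otherwise any WARN gives WARNINGS, otherwise COMPLIANT) is inferred from the examples rather than stated by the skill, and `static_verdict` is a hypothetical helper.

```python
def static_verdict(results: dict) -> str:
    """Fold per-check results (PASS/WARN/FAIL) into the one-line verdict."""
    warnings = sum(1 for r in results.values() if r == "WARN")
    failures = sum(1 for r in results.values() if r == "FAIL")
    if failures:
        label = "NON-COMPLIANT"  # assumption: any failure makes the skill non-compliant
    elif warnings:
        label = "WARNINGS"
    else:
        label = "COMPLIANT"
    warn_word = "warning" if warnings == 1 else "warnings"
    fail_word = "failure" if failures == 1 else "failures"
    return f"Verdict: {label} ({warnings} {warn_word}, {failures} {fail_word})"
```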
Phase 2B: Spec Mode — Behavioral Verifier
Step 1 — Locate Files
Find skill at `.claude/skills/[name]/SKILL.md`.
Look up the spec path from `CCGS Skill Testing Framework/catalog.yaml`; use the `spec:` field for the matching skill entry.
If either is missing:
- Missing skill: "Skill '[name]' not found in `.claude/skills/`."
- Missing spec path in catalog: "No spec path set for '[name]' in catalog.yaml."
- Spec file not found at path: "Spec file missing at [path]. Run `/skill-test audit` to see coverage gaps."
Step 2 — Read Both Files
Read the skill file and test spec file completely.
Step 3 — Evaluate Assertions
For each Test Case in the spec:
- Read the Fixture description (assumed state of project files)
- Read the Expected behavior steps
- Read each Assertion checkbox
For each assertion, evaluate whether the skill's written instructions, if
followed correctly given the fixture state, would satisfy it. This is a
Claude-evaluated reasoning check, not code execution.
Mark each assertion:
- PASS — skill instructions clearly satisfy this assertion
- PARTIAL — skill instructions partially address it, but with ambiguity
- FAIL — skill instructions would NOT satisfy this assertion given the fixture
For Protocol Compliance assertions (always present):
- Check whether the skill requires "May I write" before file writes
- Check whether the skill presents findings before requesting approval
- Check whether the skill ends with a recommended next step
- Check whether the skill avoids auto-creating files without approval
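The assertion marks themselves come from reasoning, not code, but rolling them up into a case verdict is mechanical. This sketch assumes the worst mark wins (the sample report shows one FAIL producing a FAIL case) and that a case with no assertions counts as PASS; both are assumptions, and `case_verdict` is a hypothetical name.

```python
SEVERITY = {"PASS": 0, "PARTIAL": 1, "FAIL": 2}

def case_verdict(marks: list) -> str:
    """Return the worst mark among a test case's assertions (PASS if there are none)."""
    return max(marks, key=SEVERITY.__getitem__, default="PASS")
```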
Step 4 — Build Report
=== Skill Spec Test: /[name] ===
Date: [date]
Spec: CCGS Skill Testing Framework/skills/[category]/[name].md
Case 1: [Happy Path — name]
Fixture: [summary]
Assertions:
[PASS] [assertion text]
[FAIL] [assertion text]
Reason: The skill's Phase 3 says "..." but the fixture state means "..."
Case Verdict: FAIL
Case 2: [Edge Case — name]
...
Case Verdict: PASS
Protocol Compliance:
[PASS] Uses "May I write" before file writes
[PASS] Presents findings before asking approval
[WARN] No explicit next-step handoff at end
Overall Verdict: FAIL (1 case failed, 1 warning)
Step 5 — Offer to Write Results
"May I write these results to `CCGS Skill Testing Framework/results/skill-test-spec-[name]-[date].md` and update `CCGS Skill Testing Framework/catalog.yaml`?"
If yes:
- Write results file to `CCGS Skill Testing Framework/results/`
- Update the skill's entry in `CCGS Skill Testing Framework/catalog.yaml`:
  - `last_spec: [date]`
  - `last_spec_result: PASS|PARTIAL|FAIL`
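Once approval is given, the results path and the catalog fields follow directly from the skill name and date. The sketch below assumes ISO dates (`YYYY-MM-DD`) for `[date]`, which the skill does not specify, and works on an in-memory dict rather than the YAML file; both helper names are hypothetical.

```python
from datetime import date

FRAMEWORK_DIR = "CCGS Skill Testing Framework"

def results_path(name: str, run_date: date) -> str:
    """Build the results file path from the naming pattern in the approval prompt."""
    return f"{FRAMEWORK_DIR}/results/skill-test-spec-{name}-{run_date.isoformat()}.md"

def record_spec_result(entry: dict, run_date: date, result: str) -> dict:
    """Return a copy of a catalog entry with the last_spec fields updated."""
    if result not in ("PASS", "PARTIAL", "FAIL"):
        raise ValueError(f"unexpected result: {result}")
    return {**entry, "last_spec": run_date.isoformat(), "last_spec_result": result}
```

Returning a copy keeps the original entry untouched until the user has confirmed the write.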
Phase 2D: Category Mode — Rubric Evaluation
Step 1 — Locate Skill and Category
Find skill at `.claude/skills/[name]/SKILL.md`.
Look up `category:` field in `CCGS Skill Testing Framework/catalog.yaml`.
If skill not found: "Skill '[name]' not found."
If no `category:` field: "No category assigned for '[name]' in catalog.yaml. Add `category: [name]` to the skill entry first."
For `category all`: collect all skills with a `category:` field and process each.
`category: utility` skills are evaluated against U1 (static checks pass) and U2 (gate mode correct if applicable) only; skip to the static mode for U1.
Step 2 — Read Rubric Section
Read `CCGS Skill Testing Framework/quality-rubric.md`.
Extract the section matching the skill's category (e.g., `### gate`, `### team`).
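Pulling one category's section out of the rubric can be done with a line scan. The sketch assumes each category lives under a `### <category>` heading and ends at the next heading of level 1 to 3; the rubric file's real layout may differ, and `extract_rubric_section` is a hypothetical name.

```python
import re

def extract_rubric_section(rubric_text: str, category: str) -> str:
    """Return the body under `### <category>` up to the next level 1-3 heading."""
    body, in_section = [], False
    for line in rubric_text.splitlines():
        if in_section and re.match(r"#{1,3}\s", line):
            break  # the next section starts here
        if in_section:
            body.append(line)
        elif line.strip() == f"### {category}":
            in_section = True
    return "\n".join(body).strip()
```

An empty result means the rubric has no section for that category.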
Step 3 — Read Skill
Read the skill's `SKILL.md` fully.
Step 4 — Evaluate Rubric Metrics
For each metric in the category's rubric table:
- Check whether the skill's written instructions clearly satisfy the criterion
- Mark PASS, FAIL, or WARN
- For FAIL/WARN, identify the exact gap in the skill text (quote the relevant section or note its absence)
Step 5 — Output Report
=== Skill Category Check: /[name] ([category]) ===
Metric G1 — Review mode read: PASS
Metric G2 — Full mode directors: FAIL
Gap: Phase 3 spawns only CD-PHASE-GATE; TD-PHASE-GATE, PR-PHASE-GATE, AD-PHASE-GATE absent
Metric G3 — Lean mode: PHASE-GATE only: PASS
Metric G4 — Solo mode: no directors: PASS
Metric G5 — No auto-advance: PASS
Verdict: FAIL (1 failure, 0 warnings)
Fix: Add TD-PHASE-GATE, PR-PHASE-GATE, and AD-PHASE-GATE to the full-mode director
panel in Phase 3.
Step 6 — Offer to Update Catalog
"May I update `CCGS Skill Testing Framework/catalog.yaml` to record this category check (`last_category`, `last_category_result`) for [name]?"
Phase 2C: Audit Mode — Coverage Report
Step 1 — Read Catalog
Read `CCGS Skill Testing Framework/catalog.yaml`. If missing, note that the catalog doesn't exist yet (first-run state).
Step 2 — Enumerate All Skills and Agents
Glob `.claude/skills/*/SKILL.md` to get the complete list of skills.
Extract skill name from each path (directory name).
Also read the `agents:` section from `CCGS Skill Testing Framework/catalog.yaml` to get the complete list of agents.
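Outside the skill runtime, the same enumeration could be reproduced with `pathlib`. A minimal sketch, assuming the project root is passed in; `list_skill_names` is a hypothetical helper.

```python
from pathlib import Path

def list_skill_names(project_root: str) -> list:
    """Glob .claude/skills/*/SKILL.md and return the sorted directory names."""
    root = Path(project_root)
    return sorted(path.parent.name for path in root.glob(".claude/skills/*/SKILL.md"))
```

Directories without a `SKILL.md` are skipped, which matches the glob in the step above.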
Step 3 — Build Skill Coverage Table
For each skill:
- Check if a spec file exists (use the `spec:` path from catalog, or glob `CCGS Skill Testing Framework/skills/*/[name].md`)
- Look up `last_static`, `last_static_result`, `last_spec`, `last_spec_result`, `last_category`, `last_category_result`, `category` from catalog (or mark as "never" / "—" if not in catalog)
- Priority comes from catalog `priority:` field (critical/high/medium/low)
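The lookups above amount to building one table row per skill with placeholders for missing data. In this sketch the catalog entry is an already-parsed dict, dates default to "never" and everything else to "—" as in the sample report, and "NO" for a missing spec is an assumption; `coverage_row` is a hypothetical name.

```python
TRACKED_FIELDS = ["last_static", "last_static_result", "last_spec",
                  "last_spec_result", "last_category", "last_category_result"]

def coverage_row(name: str, entry: dict, has_spec: bool) -> dict:
    """Build one audit-table row, filling gaps with "never" (dates) or "—" (other fields)."""
    row = {"skill": name, "has_spec": "YES" if has_spec else "NO"}
    for field in TRACKED_FIELDS:
        placeholder = "—" if field.endswith("_result") else "never"
        row[field] = entry.get(field) or placeholder
    row["category"] = entry.get("category") or "—"
    row["priority"] = entry.get("priority") or "—"
    return row
```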
Step 3b — Build Agent Coverage Table
For each agent in catalog's `agents:` section:
- Check if a spec file exists (use the `spec:` path from catalog, or glob `CCGS Skill Testing Framework/agents/*/[name].md`)
- Look up `last_spec`, `last_spec_result`, `category` from catalog
Step 4 — Output Report
=== Skill Test Coverage Audit ===
Date: [date]
SKILLS (72 total)
Specs written: 72 (100%) | Never static tested: 72 | Never category tested: 72
Skill | Cat | Has Spec | Last Static | S.Result | Last Cat | C.Result | Priority
-----------------------|----------|----------|-------------|----------|----------|----------|----------
gate-check | gate | YES | never | — | never | — | critical
design-review | review | YES | never | — | never | — | critical
...
AGENTS (49 total)
Agent specs written: 49 (100%)
Agent | Category | Has Spec | Last Spec | Result
-----------------------|------------|----------|-------------|--------
creative-director | director | YES | never | —
technical-director | director | YES | never | —
...
Top 5 Priority Gaps (skills with no spec, critical/high priority):
(none if all specs are written)
Skill coverage: 72/72 specs (100%)
Agent coverage: 49/49 specs (100%)
No file writes in audit mode.
Offer: "Would you like to run `/skill-test static all` to check structural compliance across all skills? `/skill-test category all` to run category rubric checks? Or `/skill-test spec [name]` to run a specific behavioral test?"
Phase 3: Recommended Next Steps
After any mode completes, offer contextual follow-up:
- After `static [name]`: "Run `/skill-test spec [name]` to validate behavioral correctness if a test spec exists."
- After `static all` with failures: "Address NON-COMPLIANT skills first. Run `/skill-test static [name]` individually for detailed remediation guidance."
- After `spec [name]` PASS: "Update `CCGS Skill Testing Framework/catalog.yaml` to record this pass date. Consider running `/skill-test audit` to find the next spec gap."
- After `spec [name]` FAIL: "Review the failing assertions and update the skill or the test spec to resolve the mismatch."
- After `audit`: "Start with the critical-priority gaps. Use the spec template at `CCGS Skill Testing Framework/templates/skill-test-spec.md` to create new specs."
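The follow-up rules above form a small lookup on mode, target, and result. A sketch under the assumption that combinations the list does not mention (for example `static all` without failures, or a PARTIAL spec result) produce no suggestion; `follow_up` is a hypothetical name.

```python
def follow_up(mode: str, target: str = "", result: str = "") -> str:
    """Pick the follow-up suggestion for a finished run ("" when none applies)."""
    if mode == "static" and target == "all":
        if result == "FAIL":
            return ("Address NON-COMPLIANT skills first. Run /skill-test static [name] "
                    "individually for detailed remediation guidance.")
        return ""
    if mode == "static":
        return ("Run /skill-test spec [name] to validate behavioral correctness "
                "if a test spec exists.")
    if mode == "spec" and result == "PASS":
        return ("Update CCGS Skill Testing Framework/catalog.yaml to record this pass date. "
                "Consider running /skill-test audit to find the next spec gap.")
    if mode == "spec" and result == "FAIL":
        return ("Review the failing assertions and update the skill or the test spec "
                "to resolve the mismatch.")
    if mode == "audit":
        return ("Start with the critical-priority gaps. Use the spec template at "
                "CCGS Skill Testing Framework/templates/skill-test-spec.md to create new specs.")
    return ""
```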