sdd-verify
Verifies that the implementation complies with the specs, design, and task plan.
Triggers: `/sdd-verify <change-name>`, verify implementation, quality gate, validate change, sdd verify
Step 0 — Load project context + Spec context preload
Follow `skills/_shared/sdd-phase-common.md` Section F (Project Context Load) and Section G (Spec Context Preload). Both are non-blocking.
Purpose
Verification is the quality gate before archiving. It objectively validates that what was implemented meets what was specified. It fixes nothing — it only reports.
Process
Skill Resolution
When the orchestrator launches this sub-agent, it resolves the skill path using:
1. .claude/skills/sdd-verify/SKILL.md (project-local — highest priority)
2. ~/.claude/skills/sdd-verify/SKILL.md (global catalog — fallback)

Project-local skills override the global catalog. See docs/SKILL-RESOLUTION.md for the full algorithm.
Step 1 — Load all artifacts
I read:
- The tasks artifact — what was planned:
  - `mem_search(query: "sdd/{change-name}/tasks")` → `mem_get_observation(id)`
  - If not found and Engram not reachable: tasks content passed inline from orchestrator.
- The spec artifact — what was required:
  - `mem_search(query: "sdd/{change-name}/spec")` → `mem_get_observation(id)`
  - If not found and Engram not reachable: spec content passed inline from orchestrator.
- The design artifact — how it was designed:
  - `mem_search(query: "sdd/{change-name}/design")` → `mem_get_observation(id)`
  - If not found and Engram not reachable: design content passed inline from orchestrator.
- The code files that were created/modified
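The load order above can be sketched as a search-then-fetch with an inline fallback. This is a minimal sketch, assuming the Engram MCP tools `mem_search` and `mem_get_observation` behave as described; the Python functions below are hypothetical stand-ins, not the real tool bindings:

```python
# Sketch of Step 1 artifact loading: prefer Engram, fall back to the
# inline content passed by the orchestrator when Engram is unreachable
# or the artifact is not found.
def load_artifact(kind, change_name, inline_fallback, mem_search, mem_get_observation):
    """Return artifact content for 'tasks', 'spec', or 'design'."""
    try:
        hits = mem_search(query=f"sdd/{change_name}/{kind}")
        if hits:
            return mem_get_observation(hits[0]["id"])
    except ConnectionError:
        pass  # Engram not reachable — fall through to inline content
    return inline_fallback  # content passed inline from the orchestrator

# Example with stubbed tools (only the tasks artifact is "found"):
def fake_search(query):
    return [{"id": "obs-1"}] if query.endswith("/tasks") else []

def fake_get(obs_id):
    return "## Tasks\n- [x] T1"

tasks = load_artifact("tasks", "auth-change", "inline tasks", fake_search, fake_get)
spec = load_artifact("spec", "auth-change", "inline spec", fake_search, fake_get)
print(tasks)
print(spec)  # falls back to the orchestrator-provided content
```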
Step 2 — Completeness Check (Tasks)
I count total tasks vs completed tasks:

Completeness

| Metric | Value |
|---|---|
| Total tasks | [N] |
| Completed tasks [x] | [M] |
| Incomplete tasks [ ] | [K] |
Incomplete tasks:
- [number and description of each one]
**Severity:**
- Incomplete core logic tasks → CRITICAL
- Incomplete cleanup/docs tasks → WARNING
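The tally above can be sketched as a checkbox count over the tasks content. A minimal sketch, assuming tasks use standard markdown task-list markers (`- [x]` completed, `- [ ]` incomplete):

```python
import re

# Count completed vs incomplete markdown task-list checkboxes,
# producing the numbers for the Completeness table above.
def completeness(tasks_md: str) -> dict:
    done = len(re.findall(r"^\s*- \[x\]", tasks_md, re.MULTILINE | re.IGNORECASE))
    todo = len(re.findall(r"^\s*- \[ \]", tasks_md, re.MULTILINE))
    return {"total": done + todo, "completed": done, "incomplete": todo}

sample = """\
- [x] 1. Implement AuthService.login()
- [x] 2. Add JWT middleware
- [ ] 3. Update docs
"""
print(completeness(sample))  # → {'total': 3, 'completed': 2, 'incomplete': 1}
```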
Step 3 — Correctness Check (Specs)
For EACH requirement in the spec.md files:
- I look for evidence in the code that it is implemented
- For EACH Given/When/Then scenario:
- Is the GIVEN handled? (precondition/guard)
- Is the WHEN implemented? (the action/endpoint)
- Is the THEN verifiable? (the correct result)
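The per-scenario walk above can be sketched as a Given/When/Then extractor. A minimal sketch, assuming scenarios are written as single `Given … When … Then …` lines in the spec files; real specs may spread clauses across lines:

```python
import re

# Split each Given/When/Then scenario into its three clauses, so that
# each clause can be checked against the code as described above.
def extract_gwt(spec_text: str) -> list:
    pattern = re.compile(
        r"Given (?P<given>.+?) When (?P<when>.+?) Then (?P<then>.+)", re.IGNORECASE
    )
    return [m.groupdict() for m in pattern.finditer(spec_text)]

spec = "Given a valid user When POST /login is called Then a JWT is returned"
for scenario in extract_gwt(spec):
    print(scenario["given"], "|", scenario["when"], "|", scenario["then"])
```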
Correctness (Specs)

| Requirement | Status | Notes |
|---|---|---|
| [Req 1] | ✅ Implemented | |
| [Req 2] | ⚠️ Partial | Missing 401 error scenario |
| [Req 3] | ❌ Not implemented | Endpoint /auth/refresh does not exist |

Scenario Coverage

| Scenario | Status |
|---|---|
| Successful login | ✅ Covered |
| Failed login — incorrect password | ✅ Covered |
| Failed login — user does not exist | ⚠️ Partial — implemented but no test |
| Expired token | ❌ Not covered |

Step 4 — Coherence Check (Design)
I verify that the design decisions were followed:
Coherence (Design)

| Decision | Followed? | Notes |
|---|---|---|
| Validation with Zod | ✅ Yes | |
| JWT with RS256 | ⚠️ Deviation | HS256 was used. Dev documented it in tasks. |
| Repository pattern | ✅ Yes | |

Step 5 — Testing Check
Testing

| Area | Tests Exist | Scenarios Covered |
|---|---|---|
| AuthService.login() | ✅ Yes | 3/4 scenarios |
| AuthController | ✅ Yes | Happy paths only |
| JWT Middleware | ❌ No | — |

Step 6 — Run Tests
I resolve test commands using a three-level priority model. I check config.yaml (at project root) in order:

Level 1 — config key verify_commands (highest priority — checked first):

if config.yaml (at project root) exists and has key verify_commands:
    → use the listed commands in order
    → do NOT check level 2 or run auto-detection
    → for each command:
        run the command via Bash tool
        capture exit code + stdout/stderr
        record in ## Tool Execution section with source label "verify_commands (config level 1)"
    → skip levels 2 and 3 entirely
else:
    → proceed to level 2 check

When verify_commands is present, it overrides all lower levels — it is NOT additive.
Commands are assumed non-destructive; the user is responsible for this.

Level 2 — config key verify.test_commands (checked when verify_commands is absent):

if config.yaml (at project root) exists and has key verify.test_commands:
    if verify.test_commands is not a list:
        → emit WARNING: "verify.test_commands is not a list — treating as absent"
        → proceed to level 3 (auto-detection)
    else if verify.test_commands is an empty list []:
        → treat as absent (empty list falls through — prevents silent zero-command success)
        → proceed to level 3 (auto-detection)
    else:
        → use the listed commands in order
        → do NOT run auto-detection
        → for each command:
            run the command via Bash tool
            capture exit code + stdout/stderr
            record in ## Tool Execution section with source label "verify.test_commands (config level 2)"
        → skip level 3 entirely
else:
    → proceed to level 3 (auto-detection)

Level 3 — Auto-detection (only when both verify_commands and verify.test_commands are absent or invalid — prioritized — use the first match):

| Priority | File to check | Condition | Command |
|---|---|---|---|
| 1 | | | |
| 2 | | pytest indicators present | |
| 3 | | | |
| 4 | | file exists | |
| 5 | | file exists | |
| — | none of the above | — | Skip with WARNING |
Execution:
- I execute the detected command via Bash tool
- I capture the exit code (0 = pass, non-zero = failure)
- I capture stdout/stderr output for analysis
- I record: runner name, command executed, exit code, summary of failures (if any)
Error handling:
- If the command cannot be executed (missing dependencies, command not found): I report "Test Execution: ERROR — [error message]" with status WARNING and continue to subsequent steps
- If tests run but some fail: I report the failure count and list failing test names if parseable from the output
- If no test runner is detected: I report "Test Execution: SKIPPED — no test runner detected" with status WARNING
I save the full test output for use in Step 8 (Coverage Validation) and Step 9 (Spec Compliance Matrix).
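The three-level fallthrough above can be sketched as a single resolver function. A minimal sketch: `config` stands in for the parsed config.yaml contents (an empty dict when the file is absent), and `autodetect` is a hypothetical placeholder for the level-3 table lookup:

```python
# Sketch of the three-level test-command resolution described above.
def resolve_test_commands(config: dict, autodetect) -> tuple:
    """Return (source_label, list_of_commands)."""
    # Level 1: verify_commands overrides everything — it is NOT additive.
    if "verify_commands" in config:
        return ("verify_commands (config level 1)", list(config["verify_commands"]))
    # Level 2: verify.test_commands — must be a non-empty list;
    # a non-list or empty list falls through to level 3.
    cmds = config.get("verify", {}).get("test_commands")
    if isinstance(cmds, list) and cmds:
        return ("verify.test_commands (config level 2)", list(cmds))
    # Level 3: auto-detection — first match wins, or skip with WARNING.
    detected = autodetect()
    if detected is None:
        return ("skipped", [])  # no test runner detected
    return ("auto-detection (level 3)", [detected])

print(resolve_test_commands({"verify_commands": ["npm test"]}, lambda: "pytest"))
print(resolve_test_commands({"verify": {"test_commands": []}}, lambda: "pytest"))
```

Note how the empty list in the second call falls through to auto-detection rather than succeeding with zero commands.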
Step 7 — Build & Type Check
I detect the project's build/type-check command and execute it.
Config override check — verify.build_command and verify.type_check_command (checked before auto-detection):

if config.yaml (at project root) exists and has key verify.build_command:
    if verify.build_command is not a string:
        → emit WARNING: "verify.build_command is not a string — treating as absent"
        → proceed to auto-detection for build command
    else:
        → use verify.build_command as the build/type-check command
        → skip the auto-detection table below for the build/type-check command
if config.yaml (at project root) exists and has key verify.type_check_command:
    if verify.type_check_command is not a string:
        → emit WARNING: "verify.type_check_command is not a string — treating as absent"
        → proceed to auto-detection for type check command
    else:
        → use verify.type_check_command as the type-check command
        → skip auto-detection for type check command

When either config override is present and valid, it replaces the corresponding auto-detected command.
Both overrides are independent — one can be set without the other.

Build command auto-detection (only when verify.build_command is absent or invalid — prioritized — use the first match):

| Priority | File to check | Condition | Command |
|---|---|---|---|
| 1 | | | |
| 2 | | | |
| 3 | | file exists + TypeScript in devDependencies | |
| 4 | | | |
| 5 | | file exists | |
| 6 | | file exists | |
| — | none of the above | — | Skip with INFO |
Execution:
- I execute the detected command via Bash tool
- I capture the exit code (0 = pass, non-zero = failure)
- I capture error output for analysis
- I record: command executed, exit code, error summary (if any)
Error handling:
- If the command cannot be executed: I report "Build/Type Check: ERROR — [error message]" with status WARNING and continue
- If the build fails: I report "Build/Type Check: FAILING" and include error output in the detail section
- If no build command is detected: I report "Build/Type Check: SKIPPED — no build command detected" with status INFO (not WARNING)
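The string-type guard on the overrides can be sketched as follows. A minimal sketch: `config` is the parsed config.yaml, `autodetect` is a hypothetical placeholder for the table lookup above, and the same pattern applies to verify.type_check_command:

```python
# Sketch of the verify.build_command override check described above.
# A valid string override replaces auto-detection; a non-string value
# emits a WARNING and falls back to auto-detection.
def resolve_build_command(config: dict, autodetect, warnings: list):
    override = config.get("verify", {}).get("build_command")
    if override is not None:
        if not isinstance(override, str):
            warnings.append("verify.build_command is not a string — treating as absent")
        else:
            return override  # valid override — skip auto-detection
    return autodetect()  # may return None → "Skip with INFO"

warnings = []
print(resolve_build_command({"verify": {"build_command": ["bad"]}}, lambda: None, warnings))
print(warnings)
```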
Step 8 — Coverage Validation (optional)
This step is only active when a coverage threshold is configured. It is advisory only — it never produces CRITICAL status and never blocks verification.
Process:
- I read config.yaml (at project root) and look for coverage.threshold (e.g., coverage: { threshold: 80 })
- If no threshold is configured: I skip this step entirely and report "Coverage Validation: SKIPPED — no threshold configured"
- If a threshold is configured:
a. I parse the coverage percentage from the Step 6 test output (looking for common coverage summary formats)
b. I compare the actual coverage against the configured threshold
c. I report the result:
- Actual >= threshold: "Coverage: [X]% (threshold: [Y]%) — PASS"
- Actual < threshold: "Coverage: [X]% (threshold: [Y]%) — BELOW THRESHOLD" with status WARNING
- If coverage data cannot be parsed from the test output: I report "Coverage Validation: SKIPPED — could not parse coverage from test output" with status WARNING
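The parse-and-compare logic above can be sketched as follows. A minimal sketch: the two regex patterns (a pytest-cov-style `TOTAL … 85%` line and a generic `coverage: 85%` phrase) are assumptions about common summary formats, not an exhaustive list of what runners emit:

```python
import re

# Sketch of Step 8: pull a total-coverage percentage out of the Step 6
# test output and compare it against the configured threshold.
def parse_coverage(test_output: str):
    for pattern in (r"^TOTAL\s.*?(\d+(?:\.\d+)?)%", r"coverage[:\s]+(\d+(?:\.\d+)?)%"):
        m = re.search(pattern, test_output, re.MULTILINE | re.IGNORECASE)
        if m:
            return float(m.group(1))
    return None  # → "SKIPPED — could not parse coverage from test output"

def coverage_status(actual, threshold):
    if actual is None:
        return "SKIPPED"
    # Advisory only: BELOW THRESHOLD is a WARNING, never CRITICAL.
    return "PASS" if actual >= threshold else "BELOW THRESHOLD"

out = "----------\nTOTAL      120     18    85%\n"
print(coverage_status(parse_coverage(out), 80))  # → PASS
```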
Step 9 — Spec Compliance Matrix
I produce a Spec Compliance Matrix that cross-references every Given/When/Then scenario from the change's spec files against the verification evidence.
Process:
- I read all spec content from the active persistence mode (same source as Step 1)
- For each spec file, I extract every Given/When/Then scenario
- For each scenario, I cross-reference against:
- Code implementation evidence from Step 3 (Correctness Check)
- Test results from Step 6 (Run Tests) — if tests were executed
- I assign a compliance status per scenario:
| Status | Meaning | Criteria |
|---|---|---|
| COMPLIANT | Fully implemented and verified | Code implements the scenario + test passes (or code inspection confirms correctness when no test runner exists) |
| FAILING | Implemented but test fails | Code implements the scenario + corresponding test fails |
| UNTESTED | Implemented but no test coverage | Code implements the scenario + no test covers this scenario (only when a test runner exists but no test covers it) |
| PARTIAL | Partially implemented | Code covers some but not all THEN/AND clauses of the scenario |
When no test runner exists:
- The matrix is still produced using code inspection evidence from Step 3
- Scenarios verified only by code inspection receive COMPLIANT or PARTIAL (never UNTESTED, since code evidence was checked)
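The status assignment above can be sketched as a small decision function. A minimal sketch: the boolean inputs stand in for the evidence gathered in Steps 3 and 6, and the "not implemented at all" case (which Step 3 reports separately) is folded into PARTIAL here for brevity:

```python
# Sketch of the per-scenario compliance status assignment above.
def compliance_status(implemented_fully, test_runner_exists, has_test, test_passes):
    if not implemented_fully:
        return "PARTIAL"  # some THEN/AND clauses of the scenario uncovered
    if test_runner_exists:
        if not has_test:
            return "UNTESTED"  # only possible when a runner exists
        return "COMPLIANT" if test_passes else "FAILING"
    # No test runner: code-inspection evidence alone yields COMPLIANT
    # (or PARTIAL above) — never UNTESTED.
    return "COMPLIANT"

print(compliance_status(True, True, True, False))   # → FAILING
print(compliance_status(True, False, False, False))  # → COMPLIANT
```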
Output format:
Spec Compliance Matrix

| Spec Domain | Requirement | Scenario | Status | Evidence |
|---|---|---|---|---|
| [domain] | [requirement name] | [scenario name] | COMPLIANT | [evidence description] |
| [domain] | [requirement name] | [scenario name] | FAILING | [failing test name or output] |
| [domain] | [requirement name] | [scenario name] | UNTESTED | No test coverage found |
| [domain] | [requirement name] | [scenario name] | PARTIAL | [which clauses are covered and which are not] |
The matrix MUST include scenarios from ALL spec domains affected by the change.
Step 10 — Create verify-report.md
Evidence rule — applies to every criterion in verify-report.md:

A criterion MUST only be marked [x] when:
- A tool command was run and its output confirms the criterion, OR
- The user provided an explicit evidence statement

When neither condition is met: leave [ ] with note: "Manual confirmation required — no tool output available".
Abstract reasoning or code inspection alone MUST NOT suffice to mark a criterion [x].

The ## Tool Execution section is mandatory in every verify-report.md — even when tool execution was skipped. When skipped, the section MUST still appear with: "Test Execution: SKIPPED — no test runner detected".

I persist the verify report to engram:
Call mem_save with topic_key: sdd/{change-name}/verify-report, type: architecture, project: {project}, content = full report markdown. Do NOT write any file.
If Engram MCP is not reachable: skip persistence. Return report content inline only.

Persisted artifact (compact — only what sdd-archive and the orchestrator consume):
Verification Report: [change-name]
Date: [YYYY-MM-DD]
Verdict: PASS / PASS WITH WARNINGS / FAIL
Summary
| Dimension | Status |
|---|---|
| Completeness | OK / WARNING / CRITICAL |
| Correctness | OK / WARNING / CRITICAL |
| Coherence | OK / WARNING / CRITICAL |
| Testing | OK / WARNING / CRITICAL |
| Test Execution | OK / WARNING / CRITICAL / SKIPPED |
| Build | OK / WARNING / SKIPPED |
Tool Execution
| Command | Exit Code | Result |
|---|---|---|
| [command] | [code] | [PASS/FAIL/SKIPPED] |
Issues
CRITICAL
- [issue description] [or: "None."]
WARNINGS
- [issue description] [or: "None."]
**Conversational output** (shown to user but NOT persisted):
The full detail sections — Completeness tables, Correctness requirement-by-requirement tables, Coherence decision tracking, Testing coverage tables, Spec Compliance Matrix, Coverage Validation, and SUGGESTIONS — are presented in the conversational response. This gives the user full visibility without inflating the persisted artifact.
The conversational output MUST still include all detail sections from Steps 2-9 — the user needs to see the full analysis. Only the **persisted artifact** is compact.

WARNINGS (should be resolved):
- [description] [or: "None."]
SUGGESTIONS (optional improvements):
- [description] [or: "None."]
---
Verdict Criteria
| Verdict | Condition |
|---|---|
| PASS | 0 critical, 0 warnings |
| PASS WITH WARNINGS | 0 critical, 1+ warnings |
| FAIL | 1+ critical |
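The verdict rule above reduces to a tiny function. A minimal sketch: the counts come from the CRITICAL/WARNING issues gathered across Steps 2-9, and SKIPPED/INFO statuses contribute to neither tally:

```python
# Sketch of the verdict computation from the table above.
def verdict(criticals: int, warnings: int) -> str:
    if criticals > 0:
        return "FAIL"
    return "PASS WITH WARNINGS" if warnings > 0 else "PASS"

print(verdict(0, 0), "|", verdict(0, 2), "|", verdict(1, 0))
# → PASS | PASS WITH WARNINGS | FAIL
```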
Severities
| Severity | Description | Blocks archiving |
|---|---|---|
| CRITICAL | Requirement not implemented, main scenario not covered, core task incomplete | Yes |
| WARNING | Edge case scenario without test, design deviation, pending cleanup task, test execution failure | No |
| SUGGESTION | Optional quality improvement | No |
| SKIPPED | Step preconditions not met (no test runner, no build command, no coverage config) — does NOT count toward verdict | No |
| INFO | Informational note (e.g., no build command detected) — does NOT count toward verdict | No |
Verdict calculation note: Only the original four dimensions (Completeness, Correctness, Coherence, Testing) plus Test Execution and Spec Compliance contribute CRITICAL/WARNING statuses. SKIPPED and INFO statuses from any dimension do NOT count as WARNING or CRITICAL for the verdict. This preserves identical verdict behavior for projects without test infrastructure.
Output to Orchestrator
```json
{
  "status": "ok|warning|failed",
  "summary": "Verification [change-name]: [verdict]. [N] critical, [M] warnings.",
  "artifacts": ["engram:sdd/{change-name}/verify-report"],
  "test_execution": {
    "runner": "[detected runner or null]",
    "command": "[command or null]",
    "exit_code": "[0/1/N or null]",
    "result": "PASS|FAILING|ERROR|SKIPPED"
  },
  "build_check": {
    "command": "[command or null]",
    "exit_code": "[0/1/N or null]",
    "result": "PASS|FAILING|ERROR|SKIPPED"
  },
  "compliance_matrix": {
    "total_scenarios": "[N]",
    "compliant": "[N]",
    "failing": "[N]",
    "untested": "[N]",
    "partial": "[N]"
  },
  "next_recommended": ["sdd-archive (if PASS or PASS WITH WARNINGS)"],
  "risks": ["CRITICAL: [description if any]"]
}
```

Continue with archive? Reply yes to proceed or no to pause.
(Manual: /sdd-archive <slug>)
{
"status": "ok|warning|failed",
"summary": "Verification [change-name]: [verdict]. [N] critical, [M] warnings.",
"artifacts": ["engram:sdd/{change-name}/verify-report"],
"test_execution": {
"runner": "[检测到的运行器或null]",
"command": "[命令或null]",
"exit_code": "[0/1/N或null]",
"result": "PASS|FAILING|ERROR|SKIPPED"
},
"build_check": {
"command": "[命令或null]",
"exit_code": "[0/1/N或null]",
"result": "PASS|FAILING|ERROR|SKIPPED"
},
"compliance_matrix": {
"total_scenarios": "[N]",
"compliant": "[N]",
"failing": "[N]",
"untested": "[N]",
"partial": "[N]"
},
"next_recommended": ["sdd-archive (if PASS or PASS WITH WARNINGS)"],
"risks": ["CRITICAL: [描述(若有)]"]
}是否继续归档?回复yes继续或no暂停。
(手动触发: )
/sdd-archive <slug>Rules
规则
- I ONLY report — I fix nothing during verification
- I read real code — I do not assume something works just because the file exists
- I am objective: I report what IS, not what should be
- If there are deviations documented in tasks.md, I evaluate them with context
- A FAIL is not personal — it is information for improvement
- I run tests if possible (via Bash tool): I report the actual results
- The ## Tool Execution section is mandatory in every verify-report.md — even when skipped; when skipped it MUST state "Test Execution: SKIPPED — no test runner detected"
- A criterion marked [x] MUST have verifiable evidence: tool output or an explicit user evidence statement; abstract reasoning or code inspection alone MUST NOT suffice
- Test command resolution uses a three-level priority model: level 1 (verify_commands) > level 2 (verify.test_commands) > level 3 (auto-detection); each level is only consulted when all higher levels are absent or invalid
- Empty verify.test_commands: [] falls through to auto-detection — it is NOT treated as zero-command success
- verify.build_command and verify.type_check_command override their respective auto-detected commands when present and are strings; non-string values emit a WARNING and fall back to auto-detection