sdd-verify

Verifies that the implementation complies with the specs, design, and task plan.
Triggers: /sdd-verify <change-name>, verify implementation, quality gate, validate change, sdd verify


Step 0 — Load project context + Spec context preload


Follow skills/_shared/sdd-phase-common.md Section F (Project Context Load) and Section G (Spec Context Preload). Both are non-blocking.


Purpose


Verification is the quality gate before archiving. It objectively validates that what was implemented meets what was specified. It fixes nothing — it only reports.


Process


Skill Resolution


When the orchestrator launches this sub-agent, it resolves the skill path using:
1. .claude/skills/sdd-verify/SKILL.md     (project-local — highest priority)
2. ~/.claude/skills/sdd-verify/SKILL.md   (global catalog — fallback)
Project-local skills override the global catalog. See docs/SKILL-RESOLUTION.md for the full algorithm.


Step 1 — Load all artifacts


I read:
  • The tasks artifact — what was planned:
    • mem_search(query: "sdd/{change-name}/tasks"), then mem_get_observation(id).
    • If not found and Engram not reachable: tasks content passed inline from orchestrator.
  • The spec artifact — what was required:
    • mem_search(query: "sdd/{change-name}/spec"), then mem_get_observation(id).
    • If not found and Engram not reachable: spec content passed inline from orchestrator.
  • The design artifact — how it was designed:
    • mem_search(query: "sdd/{change-name}/design"), then mem_get_observation(id).
    • If not found and Engram not reachable: design content passed inline from orchestrator.
  • The code files that were created/modified

Step 2 — Completeness Check (Tasks)


I count total tasks vs completed tasks:

Completeness

完整性

| Metric | Value |
| --- | --- |
| Total tasks | [N] |
| Completed tasks [x] | [M] |
| Incomplete tasks [ ] | [K] |

Incomplete tasks:
  • [number and description of each one]

**Severity:**

- Incomplete core logic tasks → CRITICAL
- Incomplete cleanup/docs tasks → WARNING
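The checkbox counting above can be sketched as a simple scan of the tasks artifact. This is an illustrative helper, not part of the skill; it assumes GitHub-style "- [x]" / "- [ ]" markers, which a real tasks artifact may not use:

```python
import re

def task_completeness(tasks_md: str) -> dict:
    """Count planned vs. completed checkboxes in a tasks artifact.

    Minimal sketch assuming GitHub-style "- [x]" / "- [ ]" markers.
    """
    completed = re.findall(r"^\s*[-*] \[[xX]\] (.+)$", tasks_md, re.MULTILINE)
    incomplete = re.findall(r"^\s*[-*] \[ \] (.+)$", tasks_md, re.MULTILINE)
    return {
        "total": len(completed) + len(incomplete),
        "completed": len(completed),
        "incomplete": incomplete,  # descriptions feed the report's bullet list
    }

report = task_completeness("- [x] Implement login\n- [ ] Update docs")
# report["total"] == 2, report["completed"] == 1
```

The incomplete-task descriptions are kept so the report can list them under "Incomplete tasks".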

Step 3 — Correctness Check (Specs)


For EACH requirement in the spec.md files:
  1. I look for evidence in the code that it is implemented
  2. For EACH Given/When/Then scenario:
    • Is the GIVEN handled? (precondition/guard)
    • Is the WHEN implemented? (the action/endpoint)
    • Is the THEN verifiable? (the correct result)
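Scenario extraction can be sketched as follows. This is a hypothetical helper, assuming scenarios are written as consecutive GIVEN/WHEN/THEN lines with optional AND clauses; real spec files may use richer layouts:

```python
import re

def extract_scenarios(spec_md: str) -> list:
    """Pull Given/When/Then scenarios out of a spec.md body.

    Each GIVEN line starts a new scenario; THEN and AND clauses accumulate.
    """
    scenarios = []
    current = None
    for line in spec_md.splitlines():
        m = re.match(r"\s*(GIVEN|WHEN|THEN|AND)\b\s*(.*)", line, re.IGNORECASE)
        if not m:
            continue
        kw, rest = m.group(1).upper(), m.group(2)
        if kw == "GIVEN":
            current = {"given": rest, "when": None, "then": []}
            scenarios.append(current)
        elif current and kw == "WHEN":
            current["when"] = rest
        elif current and kw in ("THEN", "AND"):
            current["then"].append(rest)
    return scenarios
```

Each extracted scenario can then be checked clause by clause against the code (precondition/guard, action/endpoint, verifiable result).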

Correctness (Specs)


| Requirement | Status | Notes |
| --- | --- | --- |
| [Req 1] | ✅ Implemented | |
| [Req 2] | ⚠️ Partial | Missing 401 error scenario |
| [Req 3] | ❌ Not implemented | Endpoint /auth/refresh does not exist |

Scenario Coverage


| Scenario | Status |
| --- | --- |
| Successful login | ✅ Covered |
| Failed login — incorrect password | ✅ Covered |
| Failed login — user does not exist | ⚠️ Partial — implemented but no test |
| Expired token | ❌ Not covered |

Step 4 — Coherence Check (Design)


I verify that the design decisions were followed:

Coherence (Design)


| Decision | Followed? | Notes |
| --- | --- | --- |
| Validation with Zod | ✅ Yes | |
| JWT with RS256 | ⚠️ Deviation | HS256 was used. Dev documented it in tasks. |
| Repository pattern | ✅ Yes | |

Step 5 — Testing Check



Testing


| Area | Tests Exist | Scenarios Covered |
| --- | --- | --- |
| AuthService.login() | ✅ Yes | 3/4 scenarios |
| AuthController | ✅ Yes | Happy paths only |
| JWT Middleware | ❌ No | |

Step 6 — Run Tests


I resolve test commands using a three-level priority model. I check config.yaml (at project root) in order:
Level 1 — verify_commands config key (highest priority — checked first):
if config.yaml (at project root) exists and has key verify_commands:
    → use the listed commands in order
    → do NOT check level 2 or run auto-detection
    → for each command:
         run the command via Bash tool
         capture exit code + stdout/stderr
         record in ## Tool Execution section with source label "verify_commands (config level 1)"
    → skip levels 2 and 3 entirely
else:
    → proceed to level 2 check
When verify_commands is present, it overrides all lower levels — it is NOT additive. Commands are assumed non-destructive; the user is responsible for this.
Level 2 — verify.test_commands config key (checked when verify_commands is absent):
if config.yaml (at project root) exists and has key verify.test_commands:
    if verify.test_commands is not a list:
        → emit WARNING: "verify.test_commands is not a list — treating as absent"
        → proceed to level 3 (auto-detection)
    else if verify.test_commands is an empty list []:
        → treat as absent (empty list falls through — prevents silent zero-command success)
        → proceed to level 3 (auto-detection)
    else:
        → use the listed commands in order
        → do NOT run auto-detection
        → for each command:
             run the command via Bash tool
             capture exit code + stdout/stderr
             record in ## Tool Execution section with source label "verify.test_commands (config level 2)"
        → skip level 3 entirely
else:
    → proceed to level 3 (auto-detection)
Level 3 — Auto-detection (only when both verify_commands and verify.test_commands are absent or invalid — prioritized — use the first match):
| Priority | File to check | Condition | Command |
| --- | --- | --- | --- |
| 1 | package.json | scripts.test exists | npm test (or yarn test if yarn.lock exists, pnpm test if pnpm-lock.yaml exists) |
| 2 | pyproject.toml / pytest.ini / setup.cfg | pytest indicators present | pytest |
| 3 | Makefile | test target exists | make test |
| 4 | build.gradle / gradlew | file exists | ./gradlew test |
| 5 | mix.exs | file exists | mix test |
| — | none of the above | | Skip with WARNING |
Execution:
  1. I execute the detected command via Bash tool
  2. I capture the exit code (0 = pass, non-zero = failure)
  3. I capture stdout/stderr output for analysis
  4. I record: runner name, command executed, exit code, summary of failures (if any)
Error handling:
  • If the command cannot be executed (missing dependencies, command not found): I report "Test Execution: ERROR — [error message]" with status WARNING and continue to subsequent steps
  • If tests run but some fail: I report the failure count and list failing test names if parseable from the output
  • If no test runner is detected: I report "Test Execution: SKIPPED — no test runner detected" with status WARNING
I save the full test output for use in Step 8 (Coverage Validation) and Step 9 (Spec Compliance Matrix).
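The three-level resolution above can be sketched as a single function. This is an illustrative sketch, not the skill's implementation; `resolve_test_commands` and `detect_auto` are hypothetical names, and treating an empty level-1 list as absent is an assumption (the spec only defines the empty-list fall-through for level 2):

```python
def resolve_test_commands(config: dict, detect_auto) -> tuple:
    """Resolve test commands via the three-level priority model.

    config is the parsed config.yaml (empty dict when absent);
    detect_auto is a callable performing level-3 auto-detection.
    """
    # Level 1: verify_commands overrides everything and is NOT additive.
    cmds = config.get("verify_commands")
    if isinstance(cmds, list) and cmds:
        return "verify_commands (config level 1)", cmds

    # Level 2: verify.test_commands — non-lists and empty lists fall through.
    cmds = config.get("verify", {}).get("test_commands")
    if isinstance(cmds, list) and cmds:
        return "verify.test_commands (config level 2)", cmds

    # Level 3: auto-detection (first matching runner wins).
    return "auto-detection (level 3)", detect_auto()

source, commands = resolve_test_commands(
    {"verify": {"test_commands": []}},  # empty list falls through to level 3
    lambda: ["npm test"],
)
# source == "auto-detection (level 3)", commands == ["npm test"]
```

The returned source label is what gets recorded in the ## Tool Execution section next to each command's exit code.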

Step 7 — Build & Type Check


I detect the project's build/type-check command and execute it.
Config override check — verify.build_command and verify.type_check_command (checked before auto-detection):
if config.yaml (at project root) exists and has key verify.build_command:
    if verify.build_command is not a string:
        → emit WARNING: "verify.build_command is not a string — treating as absent"
        → proceed to auto-detection for build command
    else:
        → use verify.build_command as the build/type-check command
        → skip the auto-detection table below for the build/type-check command

if config.yaml (at project root) exists and has key verify.type_check_command:
    if verify.type_check_command is not a string:
        → emit WARNING: "verify.type_check_command is not a string — treating as absent"
        → proceed to auto-detection for type check command
    else:
        → use verify.type_check_command as the type-check command
        → skip auto-detection for type check command
When either config override is present and valid, it replaces the corresponding auto-detected command. Both overrides are independent — one can be set without the other.
Build command auto-detection (only when verify.build_command is absent or invalid — prioritized — use the first match):
| Priority | File to check | Condition | Command |
| --- | --- | --- | --- |
| 1 | package.json | scripts.typecheck exists | npm run typecheck |
| 2 | package.json | scripts.build exists | npm run build |
| 3 | tsconfig.json | file exists + TypeScript in devDependencies | npx tsc --noEmit |
| 4 | Makefile | build target exists | make build |
| 5 | build.gradle / gradlew | file exists | ./gradlew build |
| 6 | mix.exs | file exists | mix compile --warnings-as-errors |
| — | none of the above | | Skip with INFO |
Execution:
  1. I execute the detected command via Bash tool
  2. I capture the exit code (0 = pass, non-zero = failure)
  3. I capture error output for analysis
  4. I record: command executed, exit code, error summary (if any)
Error handling:
  • If the command cannot be executed: I report "Build/Type Check: ERROR — [error message]" with status WARNING and continue
  • If the build fails: I report "Build/Type Check: FAILING" and include error output in the detail section
  • If no build command is detected: I report "Build/Type Check: SKIPPED — no build command detected" with status INFO (not WARNING)
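The auto-detection table can be sketched as a priority cascade. This is an illustrative sketch with a hypothetical function name, and it simplifies two conditions: it does not verify that TypeScript is in devDependencies (priority 3) or that the Makefile actually defines a `build` target (priority 4):

```python
import json
import os

def detect_build_command(root: str):
    """Auto-detect a build/type-check command per the Step 7 priority table.

    Returns None when nothing matches, which maps to SKIPPED with INFO.
    """
    pkg = os.path.join(root, "package.json")
    if os.path.exists(pkg):
        with open(pkg) as f:
            scripts = json.load(f).get("scripts", {})
        if "typecheck" in scripts:
            return "npm run typecheck"
        if "build" in scripts:
            return "npm run build"
    if os.path.exists(os.path.join(root, "tsconfig.json")):
        return "npx tsc --noEmit"  # simplification: devDependencies not checked
    if os.path.exists(os.path.join(root, "Makefile")):
        return "make build"        # simplification: build target not checked
    if any(os.path.exists(os.path.join(root, f)) for f in ("build.gradle", "gradlew")):
        return "./gradlew build"
    if os.path.exists(os.path.join(root, "mix.exs")):
        return "mix compile --warnings-as-errors"
    return None
```

A valid verify.build_command override would bypass this cascade entirely.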

Step 8 — Coverage Validation (optional)


This step is only active when a coverage threshold is configured. It is advisory only — it never produces CRITICAL status and never blocks verification.
Process:
  1. I read config.yaml (at project root) and look for coverage.threshold (e.g., coverage: { threshold: 80 })
  2. If no threshold is configured: I skip this step entirely and report "Coverage Validation: SKIPPED — no threshold configured"
  3. If a threshold is configured:
    a. I parse the coverage percentage from the Step 6 test output (looking for common coverage summary formats)
    b. I compare the actual coverage against the configured threshold
    c. I report the result:
    • Actual >= threshold: "Coverage: [X]% (threshold: [Y]%) — PASS"
    • Actual < threshold: "Coverage: [X]% (threshold: [Y]%) — BELOW THRESHOLD" with status WARNING
  4. If coverage data cannot be parsed from the test output: I report "Coverage Validation: SKIPPED — could not parse coverage from test output" with status WARNING
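Coverage parsing can be sketched with a few common summary patterns. This is an illustrative helper with a hypothetical name; the patterns cover a pytest-cov TOTAL line and istanbul/nyc-style summaries, and real output formats vary:

```python
import re

def parse_coverage(test_output: str):
    """Extract a total coverage percentage from Step 6 test output.

    Returns None when nothing parseable is found, which maps to
    "Coverage Validation: SKIPPED" with status WARNING.
    """
    patterns = [
        r"^TOTAL\s+\d+\s+\d+\s+([\d.]+)%",   # pytest-cov summary line
        r"All files[^\n]*?\|\s*([\d.]+)",    # istanbul/nyc table, first numeric column
        r"Statements\s*:\s*([\d.]+)%",       # istanbul text-summary
    ]
    for pat in patterns:
        m = re.search(pat, test_output, re.MULTILINE)
        if m:
            return float(m.group(1))
    return None

pct = parse_coverage("TOTAL     120     12     90%")
# pct == 90.0
```

The parsed value is then compared against coverage.threshold; this check is advisory only and never produces CRITICAL status.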

Step 9 — Spec Compliance Matrix


I produce a Spec Compliance Matrix that cross-references every Given/When/Then scenario from the change's spec files against the verification evidence.
Process:
  1. I read all spec content from the active persistence mode (same source as Step 1)
  2. For each spec file, I extract every Given/When/Then scenario
  3. For each scenario, I cross-reference against:
    • Code implementation evidence from Step 3 (Correctness Check)
    • Test results from Step 6 (Run Tests) — if tests were executed
  4. I assign a compliance status per scenario:
| Status | Meaning | Criteria |
| --- | --- | --- |
| COMPLIANT | Fully implemented and verified | Code implements the scenario + test passes (or code inspection confirms correctness when no test runner exists) |
| FAILING | Implemented but test fails | Code implements the scenario + corresponding test fails |
| UNTESTED | Implemented but no test coverage | Code implements the scenario + no test covers this scenario (only when a test runner exists but no test covers it) |
| PARTIAL | Partially implemented | Code covers some but not all THEN/AND clauses of the scenario |
When no test runner exists:
  • The matrix is still produced using code inspection evidence from Step 3
  • Scenarios verified only by code inspection receive COMPLIANT or PARTIAL (never UNTESTED, since code evidence was checked)
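The status assignment rules can be sketched as a small decision function. Illustrative only — `scenario_status` is a hypothetical name, and it assumes the scenario has at least some implementation evidence (unimplemented requirements surface in Step 3 instead):

```python
from typing import Optional

def scenario_status(fully_implemented: bool,
                    test_result: Optional[str],
                    runner_exists: bool) -> str:
    """Map one Given/When/Then scenario to a compliance status.

    test_result is "pass"/"fail" for a covering test, or None when no
    test covers the scenario.
    """
    if not fully_implemented:
        return "PARTIAL"      # some THEN/AND clauses are not covered
    if test_result == "fail":
        return "FAILING"
    if test_result == "pass":
        return "COMPLIANT"
    # No covering test: UNTESTED only applies when a runner exists;
    # with no runner, code-inspection evidence alone yields COMPLIANT.
    return "UNTESTED" if runner_exists else "COMPLIANT"

scenario_status(True, None, False)
# → "COMPLIANT" (code inspection only, no test runner)
```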
Output format:

Spec Compliance Matrix


| Spec Domain | Requirement | Scenario | Status | Evidence |
| --- | --- | --- | --- | --- |
| [domain] | [requirement name] | [scenario name] | COMPLIANT | [evidence description] |
| [domain] | [requirement name] | [scenario name] | FAILING | [failing test name or output] |
| [domain] | [requirement name] | [scenario name] | UNTESTED | No test coverage found |
| [domain] | [requirement name] | [scenario name] | PARTIAL | [which clauses are covered and which are not] |

The matrix MUST include scenarios from ALL spec domains affected by the change.

Step 10 — Create verify-report.md


Evidence rule — applies to every criterion in verify-report.md:
A criterion MUST only be marked [x] when:
  1. A tool command was run and its output confirms the criterion, OR
  2. The user provided an explicit evidence statement
When neither condition is met: leave [ ] with the note: "Manual confirmation required — no tool output available". Abstract reasoning or code inspection alone MUST NOT suffice to mark a criterion [x].
The ## Tool Execution section is mandatory in every verify-report.md — even when tool execution was skipped.
When skipped, the section MUST still appear with: "Test Execution: SKIPPED — no test runner detected".
I persist the verify report to Engram:
Call mem_save with topic_key: sdd/{change-name}/verify-report, type: architecture, project: {project}, content = full report markdown. Do NOT write any file.
If Engram MCP is not reachable: skip persistence. Return report content inline only.
Persisted artifact (compact — only what sdd-archive and the orchestrator consume):

Verification Report: [change-name]


Date: [YYYY-MM-DD] Verdict: PASS / PASS WITH WARNINGS / FAIL

Summary


| Dimension | Status |
| --- | --- |
| Completeness | OK / WARNING / CRITICAL |
| Correctness | OK / WARNING / CRITICAL |
| Coherence | OK / WARNING / CRITICAL |
| Testing | OK / WARNING / CRITICAL |
| Test Execution | OK / WARNING / CRITICAL / SKIPPED |
| Build | OK / WARNING / SKIPPED |

Tool Execution


| Command | Exit Code | Result |
| --- | --- | --- |
| [command] | [code] | [PASS/FAIL/SKIPPED] |

Issues


CRITICAL


  • [issue description] [or: "None."]

WARNINGS


  • [issue description] [or: "None."]

**Conversational output** (shown to user but NOT persisted):

The full detail sections — Completeness tables, Correctness requirement-by-requirement tables, Coherence decision tracking, Testing coverage tables, Spec Compliance Matrix, Coverage Validation, and SUGGESTIONS — are presented in the conversational response. This gives the user full visibility without inflating the persisted artifact.

The conversational output MUST still include all detail sections from Steps 2-9 — the user needs to see the full analysis. Only the **persisted artifact** is compact.

WARNINGS (should be resolved):


  • [description] [or: "None."]

SUGGESTIONS (optional improvements):


  • [description] [or: "None."]

---

Verdict Criteria


| Verdict | Condition |
| --- | --- |
| PASS | 0 critical, 0 warnings |
| PASS WITH WARNINGS | 0 critical, 1+ warnings |
| FAIL | 1+ critical |
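The verdict rules can be sketched as a small function. Illustrative only; `verdict` is a hypothetical name, and per the verdict calculation note, SKIPPED and INFO statuses should already be excluded from what counts:

```python
def verdict(statuses: dict) -> str:
    """Compute the final verdict from per-dimension statuses.

    SKIPPED and INFO never count toward the verdict, so projects
    without test infrastructure are not penalized.
    """
    counted = [s for s in statuses.values() if s in ("CRITICAL", "WARNING")]
    if "CRITICAL" in counted:
        return "FAIL"
    if "WARNING" in counted:
        return "PASS WITH WARNINGS"
    return "PASS"

verdict({"Completeness": "OK", "Testing": "WARNING", "Build": "SKIPPED"})
# → "PASS WITH WARNINGS"
```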


Severities


| Severity | Description | Blocks archiving |
| --- | --- | --- |
| CRITICAL | Requirement not implemented, main scenario not covered, core task incomplete | Yes |
| WARNING | Edge case scenario without test, design deviation, pending cleanup task, test execution failure | No |
| SUGGESTION | Optional quality improvement | No |
| SKIPPED | Step preconditions not met (no test runner, no build command, no coverage config) — does NOT count toward verdict | No |
| INFO | Informational note (e.g., no build command detected) — does NOT count toward verdict | No |
Verdict calculation note: Only the original four dimensions (Completeness, Correctness, Coherence, Testing) plus Test Execution and Spec Compliance contribute CRITICAL/WARNING statuses. SKIPPED and INFO statuses from any dimension do NOT count as WARNING or CRITICAL for the verdict. This preserves identical verdict behavior for projects without test infrastructure.


Output to Orchestrator


```json
{
  "status": "ok|warning|failed",
  "summary": "Verification [change-name]: [verdict]. [N] critical, [M] warnings.",
  "artifacts": ["engram:sdd/{change-name}/verify-report"],
  "test_execution": {
    "runner": "[detected runner or null]",
    "command": "[command or null]",
    "exit_code": "[0/1/N or null]",
    "result": "PASS|FAILING|ERROR|SKIPPED"
  },
  "build_check": {
    "command": "[command or null]",
    "exit_code": "[0/1/N or null]",
    "result": "PASS|FAILING|ERROR|SKIPPED"
  },
  "compliance_matrix": {
    "total_scenarios": "[N]",
    "compliant": "[N]",
    "failing": "[N]",
    "untested": "[N]",
    "partial": "[N]"
  },
  "next_recommended": ["sdd-archive (if PASS or PASS WITH WARNINGS)"],
  "risks": ["CRITICAL: [description if any]"]
}
```
Continue with archive? Reply yes to proceed or no to pause. (Manual: /sdd-archive <slug>)


Rules


  • I ONLY report — I fix nothing during verification
  • I read real code — I do not assume something works just because the file exists
  • I am objective: I report what IS, not what should be
  • If there are deviations documented in tasks.md, I evaluate them with context
  • A FAIL is not personal — it is information for improvement
  • I run tests if possible (via Bash tool): I report the actual results
  • The ## Tool Execution section is mandatory in every verify-report.md — even when skipped; when skipped it MUST state "Test Execution: SKIPPED — no test runner detected"
  • A criterion marked [x] MUST have verifiable evidence: tool output or an explicit user evidence statement; abstract reasoning or code inspection alone MUST NOT suffice
  • Test command resolution uses a three-level priority model: level 1 (verify_commands) > level 2 (verify.test_commands) > level 3 (auto-detection); each level is only consulted when all higher levels are absent or invalid
  • Empty verify.test_commands: [] falls through to auto-detection — it is NOT treated as zero-command success
  • verify.build_command and verify.type_check_command override their respective auto-detected commands when present as strings; non-string values emit a WARNING and fall back to auto-detection