ln-404-test-executor

Test Task Executor

Runs a single Story's final test task (labeled "tests") from implementation and execution through to the To Review state.

Purpose & Scope

Handle only tasks labeled "tests"; other tasks go to ln-401.
Follow the 11-section test task plan (E2E/Integration/Unit, infra/docs/cleanup).
Enforce risk-based constraints: Priority ≤15; E2E 2-5, Integration 0-8, Unit 0-15, total 10-28; no framework/DB/library/performance tests.
Update Linear/kanban for this task only: Todo -> In Progress -> To Review.
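
The limits above can be checked mechanically before any work starts. The following is a minimal sketch, assuming the plan is summarized as one priority value and a per-type test count; `validate_plan` and its argument shapes are hypothetical and not part of the skill itself, and "Priority ≤15" is read literally as an upper bound.

```python
# Limits copied from the constraints above; all names here are illustrative.
MAX_PRIORITY = 15
LIMITS = {"e2e": (2, 5), "integration": (0, 8), "unit": (0, 15)}
TOTAL_LIMIT = (10, 28)


def validate_plan(priority: int, counts: dict[str, int]) -> list[str]:
    """Return a list of violations; an empty list means the plan is within limits."""
    violations = []
    if priority > MAX_PRIORITY:
        violations.append(f"priority {priority} exceeds {MAX_PRIORITY}")
    # Check each test type against its own range.
    for kind, (low, high) in LIMITS.items():
        n = counts.get(kind, 0)
        if not low <= n <= high:
            violations.append(f"{kind} count {n} outside {low}-{high}")
    # Check the overall total separately, since every type can pass while the sum fails.
    total = sum(counts.get(kind, 0) for kind in LIMITS)
    low, high = TOTAL_LIMIT
    if not low <= total <= high:
        violations.append(f"total count {total} outside {low}-{high}")
    return violations
```

A non-empty result corresponds to the "stop and return with findings" rule: the executor would report the violations instead of proceeding.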

Task Storage Mode

| Aspect | Linear Mode | File Mode |
|---|---|---|
| Load task | `get_issue(task_id)` | `Read("docs/tasks/epics/.../tasks/T{NNN}-*.md")` |
| Load Story | `get_issue(parent_id)` | `Read("docs/tasks/epics/.../story.md")` |
| Update status | `update_issue(id, state)` | `Edit` the `Status:` line in file |
| Test results | Linear comment | Append to task file |

File Mode transitions: Todo → In Progress → To Review
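
As an illustration of the File Mode transition, the sketch below rewrites the `Status:` line of a task file's text and rejects any step outside Todo → In Progress → To Review. It assumes the line has the plain form `Status: <state>`; the real task file layout may differ, and `advance_status` is a hypothetical helper.

```python
import re

# Allowed File Mode transitions; "Done" is deliberately absent.
TRANSITIONS = {"Todo": "In Progress", "In Progress": "To Review"}

# Assumed line shape: "Status: <state>" on its own line.
STATUS_LINE = re.compile(r"^Status:[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def advance_status(task_text: str, new_status: str) -> str:
    """Return task_text with its Status line moved to new_status, if the move is legal."""
    match = STATUS_LINE.search(task_text)
    if match is None:
        raise ValueError("no 'Status:' line found")
    current = match.group(1)
    if TRANSITIONS.get(current) != new_status:
        raise ValueError(f"illegal transition: {current} -> {new_status}")
    # Replace only the matched line; the rest of the file is left untouched.
    return task_text[: match.start()] + f"Status: {new_status}" + task_text[match.end():]
```

Keeping the transition table explicit means an attempt to skip a state, or to set Done, fails loudly instead of silently editing the file.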

Workflow (concise)

Receive task: Get task ID from orchestrator (ln-400); fetch full test task description (Linear: get_issue; File: Read task file); read linked guides/manuals/ADRs/research; review parent Story and manual test results if provided.
Read runbook: Read `docs/project/runbook.md` to understand test environment setup, Docker commands, and test execution prerequisites. Use the exact commands from the runbook.
Validate plan: Check Priority ≤15 and test count limits; ensure focus on business flows (no infra-only tests).
Start work: Set task In Progress (Linear: update_issue; File: Edit status line); move in kanban.
Implement & run: Author/update tests per plan; reuse existing fixtures/helpers; run tests; fix failing existing tests; update infra/doc sections as required.
Complete: Ensure counts/priority still within limits; set task To Review; move in kanban; add comment summarizing coverage, commands run, and any deviations.

Critical Rules

Single-task only; no bulk updates.
Do not mark Done; ln-402 approves. Task must end in To Review.
Keep language (EN/RU) consistent with task.
No framework/library/DB/performance/load tests; focus on business logic correctness (not infrastructure throughput).
Respect limits and priority; if violated, stop and return with findings.
Do NOT commit. Leave all changes uncommitted — ln-402 reviews and commits with task ID reference.

Definition of Done

Task identified as test task and set to In Progress; kanban updated.
Plan validated (priority/limits) and guides read.
Tests implemented/updated and executed; existing failures fixed.
Docs/infra updates applied per task plan.
Task set to To Review; kanban moved; summary comment added with commands and coverage.

Test Failure Analysis Protocol

CRITICAL: When a newly written test fails, STOP and analyze BEFORE changing anything.

Step 1: Verify Test Correctness

Does test match AC requirements exactly? (Given/When/Then from Story)
Is expected value correct per business logic?

If uncertain, query `ref_search_documentation(query="[domain] expected behavior")`.

Step 2: Decision

| Test matches AC? | Action |
|---|---|
| YES | BUG IN CODE → Fix implementation, not test |
| NO | Test is wrong → Fix test assertion |
| UNCERTAIN | MANDATORY: Query MCP Ref + ask user before changing |

Step 3: Document in Linear comment "Test [name] failed. Analysis: [test correct / test wrong]. Action: [fixed code / fixed test]. Reason: [justification]"
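
The decision table and the comment template from Steps 2 and 3 can be sketched as a small lookup plus a formatter. This is only an illustration: `ACTIONS` and `failure_comment` are hypothetical names, and posting the resulting text to Linear (or appending it to the task file) is left out.

```python
# Step 2 as a lookup: answer to "Test matches AC?" -> required action.
ACTIONS = {
    "yes": "fix implementation, not test",
    "no": "fix test assertion",
    "uncertain": "query MCP Ref and ask user before changing",
}


def failure_comment(test_name: str, test_correct: bool, reason: str) -> str:
    """Build the Step 3 comment text for a resolved (YES or NO) failure."""
    analysis = "test correct" if test_correct else "test wrong"
    action = "fixed code" if test_correct else "fixed test"
    return (
        f"Test {test_name} failed. Analysis: {analysis}. "
        f"Action: {action}. Reason: {reason}"
    )
```

The formatter deliberately has no branch for the UNCERTAIN case, since that case ends with a question to the user, not a fix.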

RED FLAGS (require user confirmation):

⚠️ Changing assertion to match actual output ("make test green")
⚠️ Removing test case that "doesn't work"
⚠️ Weakening expectations (e.g., `toContain` instead of `toEqual`)

GREEN LIGHTS (safe to proceed):

✅ Fixing typo in test setup/mock data
✅ Fixing code to match AC requirements
✅ Adding missing test setup step

Test Writing Principles

1. Strict Assertions - Fail on Any Mismatch

Use exact match assertions by default:

| Strict (PREFER) | Loose (AVOID unless justified) |
|---|---|
| Exact equality check | Partial/substring match |
| Exact length check | "Has any length" check |
| Full object comparison | Partial object match |
| Exact type check | Truthy/falsy check |

WARN-level assertions are FORBIDDEN: a test either PASSES or FAILS, with no warnings.
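
To make the strict/loose contrast concrete, here is a small sketch using plain Python `assert` statements (the same idea applies to `toEqual` versus `toContain` in JavaScript test runners). `build_order_summary` is a hypothetical function under test, defined only so the example runs.

```python
def build_order_summary() -> dict:
    """Hypothetical function under test, defined here only to keep the sketch runnable."""
    return {"id": 42, "items": ["book", "pen"], "total": 17.5, "status": "paid"}


def test_order_summary_strict() -> None:
    summary = build_order_summary()

    # Strict: full object comparison fails on any extra, missing, or changed field.
    assert summary == {
        "id": 42,
        "items": ["book", "pen"],
        "total": 17.5,
        "status": "paid",
    }
    # Strict: exact length and exact type.
    assert len(summary["items"]) == 2
    assert type(summary["total"]) is float

    # Loose equivalents to avoid unless justified:
    #   assert "paid" in summary["status"]   # substring match
    #   assert summary["items"]              # "has any length" / truthy check
    #   assert "id" in summary               # partial object match
```

Each loose variant would still pass if the function started returning wrong but similar data, which is exactly the mismatch the strict form is meant to catch.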

2. Expected-Based Testing for Deterministic Output

For deterministic responses (API, transformations):

Use snapshot/golden file testing for complex deterministic output
Compare actual output vs expected reference file
Normalize dynamic data before comparison (timestamps → fixed, UUIDs → placeholder)
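
A minimal sketch of the normalization step, assuming JSON output with ISO 8601 timestamps and canonical hyphenated UUIDs. The placeholder values, regular expressions, and helper names are illustrative choices, not something the skill prescribes.

```python
import json
import re

# Assumed formats: ISO 8601 timestamps and canonical hyphenated UUIDs.
TIMESTAMP = re.compile(
    r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})?"
)
UUID = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)
FIXED_TIMESTAMP = "1970-01-01T00:00:00Z"
UUID_PLACEHOLDER = "<UUID>"


def normalize(text: str) -> str:
    """Replace dynamic values so repeated runs produce identical text."""
    text = UUID.sub(UUID_PLACEHOLDER, text)
    return TIMESTAMP.sub(FIXED_TIMESTAMP, text)


def assert_matches_golden(actual: dict, golden_text: str) -> None:
    """Serialize deterministically, normalize, then require an exact match."""
    actual_text = normalize(json.dumps(actual, indent=2, sort_keys=True))
    assert actual_text == golden_text
```

In a real test the golden text would be read from a reference file checked into the repository; sorting keys and fixing the indent keeps the serialized form stable across runs.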

3. Golden Rule

"If you know the expected value, assert the exact value."

Forbidden: Using loose assertions to "make test pass" when exact value is known.

Reference Files

Kanban format: `docs/tasks/kanban_board.md`

Version: 3.2.0 (Added Test Writing Principles: strict assertions, expected-based testing, golden rule)
Last Updated: 2026-01-15
