testing-review

Testing Review

Review the repo test suite from current reality, not stale vibes.
Use this when you want a periodic testing audit, a fresh coverage map, or a new next-batch recommendation before a breaking-change wave.

Goal

  • rerun fresh repo coverage
  • inspect test-suite health
  • score remaining files by real regression value
  • publish the next recommended batch
  • stop fake work before it starts
This workflow is audit-first. Do not implement the recommended tests unless the user explicitly asks for execution.

Inputs

  • @.agents/rules/task.mdc
  • @.agents/rules/testing.mdc

Core Rules

  • Use fresh lcov as source of truth.
  • Score files, not just packages.
  • Do not default to package sweeps once the obvious package-wide passes are spent.
  • Prefer file-ranked batches across packages when the remaining value is scattered.
  • Only recommend a package sweep when a package is still largely untouched and contains multiple top-ranked seams.
  • Lock a roadmap for the current review phase instead of re-inventing the next batch on every pass.
  • Future passes should update roadmap status in place unless the candidate set materially changes.
  • Do not permanently exclude /react. Only exclude it when the current review explicitly says so.
  • Penalize wrappers, crumbs, giant sludge files, and likely-dead code.
  • Reward deterministic transforms, queries, merge helpers, parser/serializer seams, plugin resolution, normalization, and public editor contracts.
  • Coverage is regression telemetry, not a KPI.
  • If the remaining misses are mostly low-ROI dust, say stop.

Workflow

1. Refresh Coverage

Run fresh repo coverage with a date-stamped output directory:
bash
bun test --coverage --coverage-reporter=lcov --coverage-dir=/Users/zbeyens/git/plate/.coverage-repo-YYYY-MM-DDx --reporter=dots
Capture:
  • pass/fail count
  • file count
  • runtime
  • lcov.info path
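Since lcov is the source of truth, the first practical step after the run is turning lcov.info into per-file numbers. A minimal sketch, assuming the standard tracefile records (SF for the source path, LF for lines found, LH for lines hit, end_of_record as the terminator). The function name, the row shape, and the sample paths are hypothetical, not part of the repo.

```typescript
// Minimal lcov.info reader: one summary row per source file.
// Only SF, LF, LH, and end_of_record are read; every other record is ignored.
type FileCoverage = { file: string; linesFound: number; linesHit: number; pct: number };

function parseLcov(text: string): FileCoverage[] {
  const rows: FileCoverage[] = [];
  let file = "";
  let linesFound = 0;
  let linesHit = 0;

  for (const raw of text.split("\n")) {
    const line = raw.trim();
    const record = /^(SF|LF|LH):(.*)$/.exec(line);
    if (record) {
      const key = record[1];
      const value = record[2];
      if (key === "SF") {
        // A new file section starts: reset the counters.
        file = value;
        linesFound = 0;
        linesHit = 0;
      } else if (key === "LF") {
        linesFound = Number(value);
      } else {
        linesHit = Number(value);
      }
    } else if (line === "end_of_record" && file !== "") {
      // Assumption: a file with no instrumented lines counts as fully covered.
      const pct = linesFound === 0 ? 100 : (linesHit * 100) / linesFound;
      rows.push({ file, linesFound, linesHit, pct });
      file = "";
    }
  }
  return rows;
}

// Hypothetical sample input, not real repo data.
const sample = [
  "SF:packages/core/src/a.ts",
  "LF:10",
  "LH:7",
  "end_of_record",
  "SF:packages/core/src/b.ts",
  "LF:4",
  "LH:0",
  "end_of_record",
].join("\n");

console.log(parseLcov(sample));
```

In a real pass the text would come from the lcov.info file under the date-stamped coverage directory rather than an inline sample.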

2. Inspect Suite Health

Run the fast-lane timing checks:
bash
pnpm test:profile -- --top 25
pnpm test:slowest -- --top 25
Then scan for stale suite debt:
bash
rg -n "describe\\.skip|it\\.skip|test\\.skip|xit\\(|xdescribe\\(" packages apps
rg -n "^\\s*//\\s*(describe|it|test)\\(" packages apps -g "*.spec.ts" -g "*.spec.tsx"
rg -n "from '.*\\.spec'" packages apps -g "*.spec.ts" -g "*.spec.tsx"
Only report debt that is actually worth fixing.

3. Score Remaining Files

Score every remaining packages/**/src/** file for worth-testing value.
Exclude by default:
  • test files
  • barrels
  • declaration files
  • obvious type-only files
  • generated junk
  • zero-value crumbs
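The default exclusions can be applied as a path filter before any scoring happens. The sketch below is one possible reading: every pattern (index files as barrels, a __generated__ directory, type files recognized by name) is an assumption about naming conventions, not a rule taken from the repo. Zero-value crumbs are not handled here because they need line counts and judgment, not just a path.

```typescript
// Returns true when a path should be skipped before scoring.
// All patterns are assumed naming conventions; adjust them to the actual repo.
function isExcludedByDefault(path: string): boolean {
  const name = path.split("/").pop() ?? path;
  if (/\.(spec|test)\.tsx?$/.test(name)) return true; // test files
  if (/^index\.tsx?$/.test(name)) return true; // barrels, assumed to be index files
  if (/\.d\.ts$/.test(name)) return true; // declaration files
  if (/(^|\.)types?\.ts$/.test(name)) return true; // type-only files, detected by name only
  if (/\/__generated__\//.test(path)) return true; // generated output, assumed directory name
  return false;
}

// Hypothetical paths for illustration.
console.log(isExcludedByDefault("packages/core/src/index.ts"));
console.log(isExcludedByDefault("packages/core/src/lib/merge.ts"));
```

A name-based check for type-only files will miss some and misfire on others, so treat it as a first cut and review the survivors by hand.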
Scoring should reflect:
  • seam type
  • runtime coverage
  • uncovered behavior
  • likely regression value during breaking changes
  • test ROI
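One way to make these factors concrete is a small additive score per file. This is a sketch under stated assumptions: the seam categories mirror the reward and penalty lists in Core Rules, but the numeric weights, the field names, and the rounding are all hypothetical and would need tuning against real files.

```typescript
// Seam categories taken from the reward/penalty lists in Core Rules.
type Seam =
  | "transform" | "query" | "merge" | "parser" | "plugin-resolution"
  | "normalization" | "editor-contract"
  | "wrapper" | "crumb" | "sludge" | "likely-dead";

type Candidate = {
  file: string;
  seam: Seam;
  linesFound: number;
  linesHit: number;
  atRiskInBreakingChange: boolean;
};

// Hypothetical weights: rewarded seams are positive, penalized ones negative.
const SEAM_WEIGHT: Record<Seam, number> = {
  transform: 4, query: 4, merge: 4, parser: 4, "plugin-resolution": 4,
  normalization: 4, "editor-contract": 4,
  wrapper: -3, crumb: -3, sludge: -3, "likely-dead": -4,
};

function scoreFile(c: Candidate): number {
  const uncovered = c.linesFound - c.linesHit;
  const uncoveredRatio = c.linesFound === 0 ? 0 : uncovered / c.linesFound;
  // Seam type dominates; uncovered behavior adds up to 3; breaking-change risk adds 2.
  let score = SEAM_WEIGHT[c.seam] + uncoveredRatio * 3;
  if (c.atRiskInBreakingChange) score += 2;
  return Math.max(0, Math.round(score));
}

// Hypothetical candidates for illustration.
const untestedTransform: Candidate = {
  file: "packages/core/src/lib/transform.ts", seam: "transform",
  linesFound: 40, linesHit: 0, atRiskInBreakingChange: true,
};
const untestedWrapper: Candidate = {
  file: "packages/core/src/react/Wrapper.tsx", seam: "wrapper",
  linesFound: 10, linesHit: 0, atRiskInBreakingChange: false,
};

console.log(scoreFile(untestedTransform), scoreFile(untestedWrapper));
```

The point of the design is that an uncovered wrapper still scores near zero: low coverage alone should never push a file into the batch.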
When recommending the next batch:
  • prefer the best files across packages over "do package X next"
  • call out when package totals are inflated by crumbs, wrappers, or giant low-ROI leftovers
  • say explicitly when a package sweep would be dumb
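The batch rules above, together with the package-sweep rule from Core Rules, can be sketched as a ranking step followed by a sweep check. The threshold, the batch size, and the two constants (what counts as "largely untouched" and how many top files count as "multiple") are hypothetical values chosen for illustration.

```typescript
type Scored = { file: string; pkg: string; score: number };
type PackageStats = { pkg: string; coveragePct: number };

// Hypothetical cutoffs: a package is "largely untouched" below 20% line coverage,
// and "multiple top-ranked seams" means at least 2 files in the batch.
const UNTOUCHED_PCT = 20;
const MIN_TOP_SEAMS = 2;

function recommendBatch(
  files: Scored[],
  packages: PackageStats[],
  threshold: number,
  batchSize: number,
): { batch: Scored[]; sweeps: string[] } {
  // Rank files across all packages; ties break on path so the order is stable.
  const batch = files
    .filter((f) => f.score >= threshold)
    .sort((a, b) => b.score - a.score || a.file.localeCompare(b.file))
    .slice(0, batchSize);

  // Count how many batch files each package contributes.
  const perPkg: Record<string, number> = {};
  for (const f of batch) perPkg[f.pkg] = (perPkg[f.pkg] || 0) + 1;

  // A sweep is only defensible for a largely untouched package with several top files.
  const sweeps = packages
    .filter((p) => p.coveragePct < UNTOUCHED_PCT && (perPkg[p.pkg] || 0) >= MIN_TOP_SEAMS)
    .map((p) => p.pkg);

  return { batch, sweeps };
}

// Hypothetical data: the value is scattered, so no sweep is recommended.
const scattered = recommendBatch(
  [
    { file: "packages/core/src/a.ts", pkg: "core", score: 9 },
    { file: "packages/markdown/src/b.ts", pkg: "markdown", score: 8 },
    { file: "packages/diff/src/c.ts", pkg: "diff", score: 7 },
    { file: "packages/core/src/d.ts", pkg: "core", score: 3 },
  ],
  [
    { pkg: "core", coveragePct: 80 },
    { pkg: "markdown", coveragePct: 60 },
    { pkg: "diff", coveragePct: 70 },
  ],
  5,
  3,
);

console.log(scattered.batch.map((f) => f.file), scattered.sweeps);
```

An empty sweeps list is the signal to say plainly that a package sweep is the wrong move for this pass.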

4. Write Artifacts

Write:
  • a markdown map under docs/plans/
  • a package TSV
  • a file TSV
  • a locked roadmap markdown file when this is the first meaningful pass for the current phase, or update that roadmap if it already exists
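The source does not specify the TSV layout, so the following is only a sketch of what a file TSV could look like. The column names and the row shape are assumptions; a package TSV would follow the same pattern with per-package columns.

```typescript
// Hypothetical row shape and column set for the file TSV.
type FileRow = { file: string; pkg: string; coveragePct: number; score: number };

function toFileTsv(rows: FileRow[]): string {
  const header = ["file", "package", "coverage_pct", "score"].join("\t");
  const body = rows.map((r) =>
    [r.file, r.pkg, r.coveragePct.toFixed(1), String(r.score)].join("\t"),
  );
  // One header line, one line per file, trailing newline.
  return [header].concat(body).join("\n") + "\n";
}

// Hypothetical row for illustration.
const tsv = toFileTsv([
  { file: "packages/core/src/a.ts", pkg: "core", coveragePct: 70, score: 9 },
]);

console.log(tsv);
```

Keeping the rows already sorted by score means the TSV and the markdown file ranking cannot drift apart.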
The markdown map should include:
  • fresh coverage result
  • scoring rules
  • strict next batch
  • wider next batch if still defensible
  • package ranking
  • file ranking
  • stop condition
  • clear caveats about fake-high package totals
The roadmap should include:
  • the frozen threshold for the current phase
  • the execution queue in stable order
  • explicit deferrals with reasons
  • status for each queued or deferred file
  • an update rule that says future passes mark items done, removed, or deferred instead of reshuffling the whole list
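The update rule is easiest to enforce when the roadmap is data and the only allowed operation is a status change. A minimal sketch, assuming a roadmap shape with a frozen threshold and an ordered item list; the type names, the status values beyond done/removed/deferred, and the sample phase are hypothetical.

```typescript
type Status = "queued" | "done" | "removed" | "deferred";
type RoadmapItem = { file: string; status: Status; reason?: string };
type Roadmap = { phase: string; threshold: number; items: RoadmapItem[] };

// Marks one item without reordering, adding, or dropping anything else.
// Unknown files are rejected: a changed candidate set needs a new roadmap, not a silent edit.
function updateStatus(roadmap: Roadmap, file: string, status: Status, reason?: string): Roadmap {
  const index = roadmap.items.map((item) => item.file).indexOf(file);
  if (index === -1) throw new Error(`not on the locked roadmap: ${file}`);

  const items = roadmap.items.map((item, i) =>
    i === index ? { ...item, status, reason: reason ?? item.reason } : item,
  );
  return { ...roadmap, items };
}

// Hypothetical roadmap for illustration.
const roadmap: Roadmap = {
  phase: "example-phase",
  threshold: 5,
  items: [
    { file: "packages/core/src/a.ts", status: "queued" },
    { file: "packages/core/src/b.ts", status: "queued" },
  ],
};

const nextRoadmap = updateStatus(roadmap, "packages/core/src/b.ts", "deferred", "DOM-only seam");
console.log(nextRoadmap.items);
```

Returning a new object instead of mutating makes it simple to diff the roadmap before and after a pass.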

5. Final Recommendation

Answer with:
  • what the real next batch is
  • whether to keep pushing coverage or stop
  • what should be deferred by design

Output Standard

Use blunt rankings, not mush.
Say things like:
  • core first, then markdown, then diff
  • do not do another package sweep
  • the best next files are split across packages, so do not sweep package X
  • this roadmap is locked for the current phase; future passes update status, not the whole ranking
  • this file is uncovered but not worth touching
  • stop after the >= 5 batch

Stop Conditions

Recommend stopping when the remaining misses are mostly:
  • wrappers
  • provider/store dust
  • DOM-only seams
  • giant low-ROI files
  • tiny uncovered crumbs
  • code likely to be rewritten soon
At that point, tell the user to switch from coverage work to architecture-safety work.
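The stop check can be sketched as a share of low-ROI misses among everything still uncovered. The category names follow the list above; the "worth-testing" label and the 0.8 cutoff for "mostly" are assumptions, and classifying each miss still takes human judgment.

```typescript
type MissKind =
  | "wrapper" | "provider-store" | "dom-only" | "giant-low-roi"
  | "crumb" | "rewrite-soon"
  | "worth-testing"; // hypothetical label for anything that still deserves a test

// Returns true when the remaining misses are mostly low-ROI.
// lowRoiShare is a hypothetical cutoff for "mostly".
function shouldStop(misses: MissKind[], lowRoiShare = 0.8): boolean {
  if (misses.length === 0) return true; // nothing left to cover
  const lowRoi = misses.filter((kind) => kind !== "worth-testing").length;
  return lowRoi / misses.length >= lowRoiShare;
}

// Hypothetical classification of ten remaining files.
const remaining: MissKind[] = [
  "wrapper", "wrapper", "crumb", "crumb", "crumb",
  "dom-only", "giant-low-roi", "provider-store", "rewrite-soon",
  "worth-testing",
];

console.log(shouldStop(remaining));
```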
<!-- cross-ref:start -->

See also (related skills: Testing (project-level) family)

If your issue relates to:
  • Python pytest discipline (isolation, coverage, mocks): check testing-strategy if appropriate.
  • Plate/Slate editor 3-layer strategy: check plate-testing-strategy if appropriate.
<!-- cross-ref:end -->