test-anti-patterns
Test Anti-Pattern Detection
Analyze .NET test code for anti-patterns, code smells, and quality issues that undermine test reliability, maintainability, and diagnostic value.
When to Use
- User asks to review test quality or find test smells
- User wants to know why tests are flaky or unreliable
- User asks "are my tests good?" or "what's wrong with my tests?"
- User requests a test audit or test code review
- User wants to improve existing test code
When Not to Use
- User wants to write new tests from scratch (use `writing-mstest-tests`)
- User wants to run or execute tests (use `run-tests`)
- User wants to migrate between test frameworks or versions (use migration skills)
- User wants to measure code coverage (out of scope)
Inputs
| Input | Required | Description |
|---|---|---|
| Test code | Yes | One or more test files or classes to analyze |
| Production code | No | The code under test, for context on what tests should verify |
| Specific concern | No | A focused area like "flakiness" or "naming" to narrow the review |
Workflow
Step 1: Gather the test code
Read the test files the user wants reviewed. If the user points to a directory or project, scan for all test files (files containing `[TestClass]`, `[TestMethod]`, `[Fact]`, `[Test]`, or `[Theory]` attributes).
If production code is available, read it too -- this is critical for detecting tests that are coupled to implementation details rather than behavior.
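When a directory is given, the scan in Step 1 can be approximated with a plain text search over `.cs` files. The sketch below is only an illustration: `TestFileScanner` and its method names are hypothetical, and the regular expression assumes the attributes are written unqualified (for example `[Fact]`, not `[Xunit.Fact]`) and come first in their attribute list.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: finds C# files that appear to contain tests.
public static class TestFileScanner
{
    // Matches the five attribute names from Step 1 when followed by
    // "]", "(" or ",", e.g. [Test], [Fact(Skip = "x")], [Theory, Trait("a", "b")].
    // Assumption: attributes are unqualified and first in their list.
    private static readonly Regex TestAttribute = new Regex(
        @"\[\s*(TestClass|TestMethod|Fact|Test|Theory)\s*(\(|\]|,)",
        RegexOptions.Compiled);

    // True when the source text contains at least one recognized test attribute.
    public static bool LooksLikeTestFile(string source) =>
        TestAttribute.IsMatch(source);

    // Lazily walks the directory tree and yields paths of likely test files.
    public static IEnumerable<string> FindTestFiles(string rootDirectory) =>
        Directory.EnumerateFiles(rootDirectory, "*.cs", SearchOption.AllDirectories)
                 .Where(path => LooksLikeTestFile(File.ReadAllText(path)));
}
```

A text match like this can produce false positives (an indexer such as `items[Test]`) and misses other attributes a project may use, so it is best treated as a first pass before reading the files.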
Step 2: Scan for anti-patterns
Check each test file against the anti-pattern catalog below. Report findings grouped by severity.
Critical -- Tests that give false confidence
| Anti-Pattern | What to Look For |
|---|---|
| No assertions | Test methods that execute code but never assert anything. A passing test without assertions proves nothing. |
| Swallowed exceptions | |
| Assert in catch block only | |
| Always-true assertions | |
| Commented-out assertions | Assertions that were disabled but the test still runs, giving the illusion of coverage. |
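Two of the Critical rows, "No assertions" and "Commented-out assertions", lend themselves to a rough textual check on a test method's body. The following is a minimal sketch under stated assumptions, not a real analyzer: `AssertionHeuristics` is a hypothetical name, the body is assumed to have been extracted already, and an "assertion" is assumed to look like `Assert.X(...)`, `CollectionAssert.X(...)`, or a FluentAssertions-style `.Should()` call.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical heuristics over the text of a single test method body.
public static class AssertionHeuristics
{
    // Assumed assertion shapes: "<Something>Assert.Method" or ".Should(".
    private static readonly Regex AssertionCall = new Regex(
        @"\b\w*Assert\s*\.\s*\w+|\.Should\s*\(",
        RegexOptions.Compiled);

    // Drops everything after "//" on each line so commented-out code is ignored.
    // Limitation: "//" inside a string literal (such as a URL) is also cut.
    private static string StripLineComments(string body) =>
        string.Join("\n", body.Split('\n').Select(line =>
        {
            int i = line.IndexOf("//", StringComparison.Ordinal);
            return i >= 0 ? line.Substring(0, i) : line;
        }));

    // False suggests the "No assertions" anti-pattern.
    public static bool HasLiveAssertion(string methodBody) =>
        AssertionCall.IsMatch(StripLineComments(methodBody));

    // True suggests the "Commented-out assertions" anti-pattern.
    public static bool HasCommentedOutAssertion(string methodBody) =>
        methodBody.Split('\n').Any(line =>
        {
            int i = line.IndexOf("//", StringComparison.Ordinal);
            return i >= 0 && AssertionCall.IsMatch(line.Substring(i));
        });
}
```

A body with no live assertion is not automatically a defect (the assertion may live in a helper method or a mock verification), so a hit here is a prompt to read the test, not a finding by itself.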
High -- Tests likely to cause pain
| Anti-Pattern | What to Look For |
|---|---|
| Flakiness indicators | |
| Test ordering dependency | Static mutable fields modified across tests, |
| Over-mocking | More mock setup lines than actual test logic. Verifying exact call sequences on mocks rather than outcomes. Mocking types the test owns. |
| Implementation coupling | Testing private methods via reflection, asserting on internal state, verifying exact method call counts on collaborators instead of observable behavior. |
| Broad exception assertions | |
Medium -- Maintainability and clarity issues
| Anti-Pattern | What to Look For |
|---|---|
| Poor naming | Test names like |
| Magic values | Unexplained numbers or strings in arrange/assert: |
| Duplicate tests | Three or more test methods with near-identical bodies that differ only in a single input value. Should be data-driven ( |
| Giant tests | Test methods exceeding ~30 lines or testing multiple behaviors at once. Hard to diagnose when they fail. |
| Assertion messages that repeat the assertion | |
| Missing AAA separation | Arrange, Act, Assert phases are interleaved or indistinguishable. |
Low -- Style and hygiene
| Anti-Pattern | What to Look For |
|---|---|
| Unused test infrastructure | |
| IDisposable not disposed | Test creates |
| Console.WriteLine debugging | Leftover |
| Inconsistent naming convention | Mix of naming styles in the same test class (e.g., some use |
Step 3: Calibrate severity honestly
Before reporting, re-check each finding against these severity rules:
- Critical/High: Only for issues that cause tests to give false confidence or be unreliable. A test that always passes regardless of correctness is Critical. Flaky shared state is High.
- Medium: Only for issues that actively harm maintainability -- 5+ nearly-identical tests, truly meaningless names like `Test1`.
- Low: Cosmetic naming mismatches, minor style preferences, assertion messages that could be better. When in doubt, rate Low.
- Not an issue: Separate tests for distinct boundary conditions (zero vs. negative vs. null). Explicit per-test setup instead of `[TestInitialize]` (this improves isolation). Tests that are short and clear but could theoretically be consolidated.
IMPORTANT: If the tests are well-written, say so clearly up front. Do not inflate severity to justify the review. A review that finds zero Critical/High issues and only minor Low suggestions is a valid and valuable outcome. Lead with what the tests do well.
Step 4: Report findings
Present findings in this structure:
- Summary -- Total issues found, broken down by severity (Critical / High / Medium / Low). If tests are well-written, lead with that assessment.
- Critical and High findings -- List each with:
  - The anti-pattern name
  - The specific location (file, method name, line)
  - A brief explanation of why it's a problem
  - A concrete fix (show before/after code when helpful)
- Medium and Low findings -- Summarize in a table unless the user wants full detail
- Positive observations -- Call out things the tests do well (sealed class, specific exception types, data-driven tests, clear AAA structure, proper use of fakes, good naming). Don't only report negatives.
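The report structure above can be modeled with a small data type, which makes the summary line and the severity ordering mechanical. This is a sketch under assumptions: `Severity`, `Finding`, and `ReviewReport` are hypothetical names introduced for illustration, and the summary wording is one possible format, not one prescribed by this skill.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Declared from most to least urgent so the default enum ordering sorts correctly.
public enum Severity { Critical, High, Medium, Low }

// One finding with the four fields the report asks for; Fix may be empty for Medium/Low.
public sealed record Finding(
    Severity Severity,
    string AntiPattern,
    string Location,
    string Explanation,
    string Fix = "");

public static class ReviewReport
{
    // Builds a summary line: the total, then one count per severity level.
    public static string Summarize(IReadOnlyCollection<Finding> findings)
    {
        var counts = Enum.GetValues<Severity>()
            .Select(s => $"{findings.Count(f => f.Severity == s)} {s}");
        return $"{findings.Count} issue(s): " + string.Join(", ", counts);
    }

    // Orders findings so Critical comes first and Low last.
    public static IEnumerable<Finding> BySeverity(IEnumerable<Finding> findings) =>
        findings.OrderBy(f => f.Severity);
}
```

For one Critical, one High, and one Low finding, `Summarize` returns `3 issue(s): 1 Critical, 1 High, 0 Medium, 1 Low`. Positive observations are deliberately left out of the type; they are free-form prose and do not carry a severity.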
Step 5: Prioritize recommendations
If there are many findings, recommend which to fix first:
- Critical -- Fix immediately, these tests may be giving false confidence
- High -- Fix soon, these cause flakiness or maintenance burden
- Medium/Low -- Fix opportunistically during related edits
Validation
- Every finding includes a specific location (not just a general warning)
- Every Critical/High finding includes a concrete fix
- Report covers all categories (assertions, isolation, naming, structure)
- Positive observations are included alongside problems
- Recommendations are prioritized by severity
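The first two checklist items are mechanical enough to verify in code; the remaining three (category coverage, positive observations, prioritization) still need a human or agent read-through. The sketch below assumes a hypothetical `ReportEntry` shape with severity stored as plain text; neither the type nor `ReportChecklist` comes from this skill.

```csharp
using System.Collections.Generic;

// Hypothetical flattened view of one reported finding.
public sealed record ReportEntry(string Severity, string AntiPattern, string Location, string Fix);

public static class ReportChecklist
{
    // Returns one message per violated rule; an empty list means both rules hold.
    public static IReadOnlyList<string> Violations(IEnumerable<ReportEntry> entries)
    {
        var problems = new List<string>();
        foreach (var entry in entries)
        {
            // Rule 1: every finding names a specific location.
            if (string.IsNullOrWhiteSpace(entry.Location))
                problems.Add($"{entry.AntiPattern}: missing a specific location");

            // Rule 2: every Critical/High finding carries a concrete fix.
            bool urgent = entry.Severity is "Critical" or "High";
            if (urgent && string.IsNullOrWhiteSpace(entry.Fix))
                problems.Add($"{entry.AntiPattern}: {entry.Severity} finding without a concrete fix");
        }
        return problems;
    }
}
```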
Common Pitfalls
| Pitfall | Solution |
|---|---|
| Reporting style issues as critical | Naming and formatting are Medium/Low, never Critical |
| Suggesting rewrites instead of targeted fixes | Show minimal diffs -- change the assertion, not the whole test |
| Flagging intentional design choices | If |
| Inventing false positives on clean code | If tests follow best practices, say so. A review finding "0 Critical, 0 High, 1 Low" is perfectly valid. Don't inflate findings to justify the review. |
| Flagging separate boundary tests as duplicates | Two tests for zero and negative inputs test different edge cases. Only flag as duplicates when 3+ tests have truly identical bodies differing by a single value. |
| Rating cosmetic issues as Medium | Naming mismatches (e.g., method name says |
| Ignoring the test framework | xUnit uses |
| Missing the forest for the trees | If 80% of tests have no assertions, lead with that systemic issue rather than listing every instance |