assertion-quality

Assertion Diversity Analysis

Analyze .NET test code to measure how varied and meaningful the assertions are. Produce a metrics report that reveals whether tests verify different facets of correctness — not just "output equals X" but also structure, exceptions, state transitions, side effects, and invariants.

Why Assertion Diversity Matters

Low assertion diversity signals shallow testing. Tests may pass while bugs hide in unasserted logic. Common symptoms:
| Problem | Symptom | Consequence |
|---|---|---|
| Trivial assertions | `Assert.IsNotNull(result)` only | Test passes but doesn't verify correctness |
| Single-value obsession | Always check one field or return value | Bugs in unasserted logic slip through |
| No negative assertions | Never check what shouldn't happen | Regressions sneak in through false positives |
| No state checks | Don't verify object state changes | Missed side-effects or lifecycle issues |
| No structural checks | Only assert top-level value | Bugs in nested objects go unnoticed |
| Assertion-free tests | Tests that call but don't verify | Code coverage lies; false security |
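To make the first rows of the table concrete, here is a minimal MSTest sketch contrasting a shallow test with a more diverse one. The `Cart` class and all test names are hypothetical, invented only so the example compiles, and it assumes an MSTest version that provides `Assert.ThrowsException`.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical class under test, defined here only for illustration.
public class Cart
{
    public List<string> Items { get; } = new List<string>();

    public void Add(string sku)
    {
        if (string.IsNullOrWhiteSpace(sku))
            throw new ArgumentException("SKU is required.", nameof(sku));
        Items.Add(sku);
    }
}

[TestClass]
public class CartTests
{
    // Shallow: still passes if Add silently does nothing,
    // because Items is never null.
    [TestMethod]
    public void Add_Shallow()
    {
        var cart = new Cart();
        cart.Add("A-1");
        Assert.IsNotNull(cart.Items);
    }

    // More diverse: state, collection contents, and a negative check.
    [TestMethod]
    public void Add_StoresExactlyTheGivenItem()
    {
        var cart = new Cart();
        cart.Add("A-1");

        Assert.AreEqual(1, cart.Items.Count);                 // state after mutation
        CollectionAssert.Contains(cart.Items, "A-1");         // collection contents
        CollectionAssert.DoesNotContain(cart.Items, "B-2");   // what should NOT be there
    }

    // Error path: the exception assertion is the whole point of this test.
    [TestMethod]
    public void Add_RejectsEmptySku()
    {
        var cart = new Cart();
        Assert.ThrowsException<ArgumentException>(() => cart.Add(""));
        Assert.AreEqual(0, cart.Items.Count);                 // no side effect on failure
    }
}
```

All three tests pass against this `Cart`, but only the last two would fail if `Add` stopped storing items or stopped validating its input.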

When to Use

  • User asks to evaluate assertion quality or depth
  • User asks "are my tests actually testing anything meaningful?"
  • User wants to know if test assertions are too shallow or trivial
  • User asks for assertion coverage metrics or diversity analysis
  • User suspects tests give false confidence despite passing

When Not to Use

  • User wants to write new tests (use `writing-mstest-tests`)
  • User wants to detect anti-patterns beyond assertions (use `test-anti-patterns`)
  • User wants to fix or rewrite assertions (help them directly)
  • User asks about code coverage percentages (out of scope — this analyzes assertion quality, not line coverage)

Inputs

| Input | Required | Description |
|---|---|---|
| Test code | Yes | One or more test files or a test project directory to analyze |
| Production code | No | The code under test, to evaluate whether assertions cover the important behaviors |

Workflow

Step 1: Gather the test code

Read all test files the user provides. If the user points to a directory or project, scan for all test files; see the `dotnet-test-frameworks` skill for framework-specific markers.
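As a rough illustration of the directory scan, the sketch below walks a folder and keeps `.cs` files that contain a test attribute. The marker list is an assumption made for this example (common MSTest, xUnit, and NUnit attributes); the `dotnet-test-frameworks` skill remains the authority on which markers to use. `TestFileScanner` and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class TestFileScanner
{
    // Assumed markers, for illustration only.
    private static readonly string[] Markers =
    {
        "[TestMethod", "[DataTestMethod",   // MSTest
        "[Fact", "[Theory",                 // xUnit
        "[Test]", "[TestCase("              // NUnit
    };

    // Plain substring heuristic: it can be fooled by comments or string literals.
    public static bool LooksLikeTestFile(string source) =>
        Markers.Any(m => source.Contains(m, StringComparison.Ordinal));

    // Enumerates C# files under root, skipping bin/ and obj/ build output.
    public static IEnumerable<string> FindTestFiles(string root) =>
        Directory.EnumerateFiles(root, "*.cs", SearchOption.AllDirectories)
            .Where(path => !IsBuildOutput(path))
            .Where(path => LooksLikeTestFile(File.ReadAllText(path)));

    private static bool IsBuildOutput(string path)
    {
        var parts = path.Split(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
        return parts.Contains("bin") || parts.Contains("obj");
    }
}
```

A text match like this is only a first pass; a syntax-tree based scan would be more precise but is beyond what this step needs.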

Step 2: Classify every assertion

For each test method, identify all assertions and classify them into these categories:
| Category | Examples | What it verifies |
|---|---|---|
| Equality | `Assert.AreEqual`, `Assert.Equal`, `Is.EqualTo` | Return value matches expected |
| Boolean | `Assert.IsTrue`, `Assert.IsFalse`, `Assert.True` | Condition holds |
| Null checks | `Assert.IsNull`, `Assert.IsNotNull`, `Assert.NotNull` | Presence/absence of value |
| Exception | `Assert.ThrowsException`, `Assert.Throws`, `Assert.ThrowsAsync` | Error handling behavior |
| Type checks | `Assert.IsInstanceOfType`, `Assert.IsAssignableFrom` | Runtime type correctness |
| String | `StringAssert.Contains`, `StringAssert.StartsWith`, `Assert.Matches` | Text content and format |
| Collection | `CollectionAssert.Contains`, `Assert.Contains`, `Assert.All`, `Has.Member` | Collection contents and structure |
| Comparison | `Assert.IsTrue(x > y)`, `Assert.InRange`, `Is.GreaterThan` | Ordering and magnitude |
| Approximate | `Assert.AreEqual(expected, actual, delta)`, `Is.EqualTo().Within()` | Floating-point or tolerance-based |
| Negative | `Assert.AreNotEqual`, `Assert.DoesNotContain`, `Assert.DoesNotThrow` | What should NOT happen |
| State/Side-effect | Assertions on object properties after mutation, verifying mock calls | State transitions and side effects |
| Structural/Deep | Assertions on nested properties, serialized forms, complex objects | Deep object correctness |

A single assertion can belong to multiple categories (e.g., `Assert.AreNotEqual` is both Equality and Negative).
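One possible way to represent the categories is a flags enum plus a name lookup, sketched below. `AssertionCategory` and `AssertionClassifier` are hypothetical names, and the lookup covers only the example calls from the table. Name matching alone cannot tell `Assert.AreEqual` with a delta from plain equality, or `Assert.IsTrue(x > y)` from a generic Boolean check, and the State/Side-effect and Structural/Deep categories depend on what is being asserted, so those still need a reading of the arguments and surrounding test.

```csharp
using System;
using System.Collections.Generic;

// The 12 categories as flags, so one assertion can carry several.
[Flags]
public enum AssertionCategory
{
    None = 0,
    Equality = 1 << 0,
    Boolean = 1 << 1,
    NullCheck = 1 << 2,
    Exception = 1 << 3,
    TypeCheck = 1 << 4,
    String = 1 << 5,
    Collection = 1 << 6,
    Comparison = 1 << 7,
    Approximate = 1 << 8,
    Negative = 1 << 9,
    StateSideEffect = 1 << 10,
    StructuralDeep = 1 << 11
}

public static class AssertionClassifier
{
    // First-pass lookup by call name; entries mirror the table's examples.
    private static readonly Dictionary<string, AssertionCategory> ByName = new()
    {
        ["Assert.AreEqual"] = AssertionCategory.Equality,
        ["Assert.Equal"] = AssertionCategory.Equality,
        ["Is.EqualTo"] = AssertionCategory.Equality,
        ["Assert.IsTrue"] = AssertionCategory.Boolean,
        ["Assert.IsFalse"] = AssertionCategory.Boolean,
        ["Assert.True"] = AssertionCategory.Boolean,
        ["Assert.IsNull"] = AssertionCategory.NullCheck,
        ["Assert.IsNotNull"] = AssertionCategory.NullCheck,
        ["Assert.NotNull"] = AssertionCategory.NullCheck,
        // The pitfalls table also treats exception assertions as Negative;
        // add that flag here if you follow that reading.
        ["Assert.ThrowsException"] = AssertionCategory.Exception,
        ["Assert.Throws"] = AssertionCategory.Exception,
        ["Assert.ThrowsAsync"] = AssertionCategory.Exception,
        ["Assert.IsInstanceOfType"] = AssertionCategory.TypeCheck,
        ["Assert.IsAssignableFrom"] = AssertionCategory.TypeCheck,
        ["StringAssert.Contains"] = AssertionCategory.String,
        ["StringAssert.StartsWith"] = AssertionCategory.String,
        ["Assert.Matches"] = AssertionCategory.String,
        ["CollectionAssert.Contains"] = AssertionCategory.Collection,
        ["Assert.Contains"] = AssertionCategory.Collection,
        ["Assert.All"] = AssertionCategory.Collection,
        ["Has.Member"] = AssertionCategory.Collection,
        ["Assert.InRange"] = AssertionCategory.Comparison,
        ["Is.GreaterThan"] = AssertionCategory.Comparison,
        ["Assert.AreNotEqual"] = AssertionCategory.Equality | AssertionCategory.Negative,
        ["Assert.DoesNotContain"] = AssertionCategory.Negative,
        ["Assert.DoesNotThrow"] = AssertionCategory.Negative
    };

    // Returns None for unknown names so the caller can review them manually.
    public static AssertionCategory Classify(string callName) =>
        ByName.TryGetValue(callName, out var category) ? category : AssertionCategory.None;
}
```

Returning `None` instead of guessing keeps unclassified calls visible, which matters because every assertion is supposed to end up in at least one category.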

Step 3: Compute metrics

Calculate these metrics for the test suite:

Per-test metrics

  • Assertion count: Number of assertions in each test method
  • Assertion categories: Which categories each test uses

Suite-wide metrics

  • Average assertions per test: Total assertions / total test methods
  • Assertion type spread: Number of distinct assertion categories used across the suite (out of 12)
  • Tests with zero assertions: Count and percentage of test methods with no assertions at all
  • Tests with only trivial assertions: Count and percentage of tests where every assertion is only a null check or `Assert.IsTrue(true)`. Trivial means no meaningful value verification.
  • Tests with self-referential assertions: Count and percentage of tests whose assertions compare an input to a round-tripped or identity-transformed version of itself (e.g., `Assert.AreEqual(input, Parse(input.ToString()))`) or assert a field against itself (`Assert.AreEqual(dto.Name, dto.Name)`). These are tautological: they verify the plumbing, not the behavior.
  • Tests with negative assertions: Count and percentage (target: at least 10% of tests should verify what should NOT happen)
  • Tests with exception assertions: Count and percentage
  • Tests with state/side-effect assertions: Count and percentage
  • Tests with structural/deep assertions: Count and percentage
  • Single-category tests: Count and percentage of tests that use only one assertion category
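The suite-wide counts above can be derived mechanically once every assertion has been classified. The sketch below shows one way to do that for a subset of the metrics; the record and method names are hypothetical, category labels are plain strings such as "Negative", and the `IsTrivial` flag is assumed to have been set during classification. It uses C# records, so it assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One classified assertion: its category labels and whether it was judged trivial.
public record AssertionInfo(IReadOnlyList<string> Categories, bool IsTrivial);

// One test method together with the assertions found in its body.
public record TestInfo(string Name, IReadOnlyList<AssertionInfo> Assertions);

public record SuiteMetrics(
    int TotalTests,
    double AverageAssertionsPerTest,
    int TypeSpread,
    int ZeroAssertionTests,
    int TrivialOnlyTests,
    int NegativeAssertionTests,
    int SingleCategoryTests);

public static class MetricsCalculator
{
    public static SuiteMetrics Compute(IReadOnlyList<TestInfo> tests)
    {
        int total = tests.Count;
        int assertions = tests.Sum(t => t.Assertions.Count);

        return new SuiteMetrics(
            TotalTests: total,
            // Total assertions / total test methods, guarding the empty suite.
            AverageAssertionsPerTest: total == 0 ? 0 : (double)assertions / total,
            // Distinct categories seen anywhere in the suite (out of 12).
            TypeSpread: tests.SelectMany(CategoriesOf).Distinct().Count(),
            ZeroAssertionTests: tests.Count(t => t.Assertions.Count == 0),
            // Zero-assertion tests are reported separately, so exclude them here.
            TrivialOnlyTests: tests.Count(t =>
                t.Assertions.Count > 0 && t.Assertions.All(a => a.IsTrivial)),
            NegativeAssertionTests: tests.Count(t => CategoriesOf(t).Contains("Negative")),
            SingleCategoryTests: tests.Count(t => CategoriesOf(t).Count == 1));
    }

    // Distinct category labels used by a single test.
    private static HashSet<string> CategoriesOf(TestInfo test) =>
        test.Assertions.SelectMany(a => a.Categories).ToHashSet();

    // Formats a count as a whole-number percentage of the total.
    public static string Percent(int count, int total) =>
        total == 0 ? "0%" : $"{100.0 * count / total:0}%";
}
```

Keeping zero-assertion tests out of the trivial-only count avoids reporting the same test under two headings, which helps the counts add up in the final report.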

Step 4: Apply calibration rules

Before reporting, calibrate findings:
  • Trivial means truly trivial. `Assert.IsNotNull(result)` alone is trivial. But `Assert.IsNotNull(result)` followed by `Assert.AreEqual(expected, result.Value)` is not: the null check is a guard before the real assertion. Only flag a test as "trivial" if it has no meaningful value assertions.
  • Boolean assertions checking meaningful conditions are not trivial. `Assert.IsTrue(result.IsValid)` checks a specific property, so it's a Boolean assertion, not a trivial one. `Assert.IsTrue(true)` is trivial.
  • Consider the test's intent. A test for a void method that verifies state change on a dependency is legitimate even if it only uses `Assert.IsTrue`.
  • Exception tests are inherently low-assertion-count. `Assert.ThrowsException<T>(() => ...)` may be the only assertion, and that's fine for exception-focused tests. Don't penalize them for low assertion count.
  • Don't conflate diversity with volume. A test with 20 `Assert.AreEqual` calls has high volume but low diversity. A test with one equality, one null check, and one exception assertion has low volume but good diversity.
  • Self-referential assertions are not meaningful equality checks. `Assert.AreEqual(input, roundTrip(input))` looks like a real equality assertion but is tautological when the operation under test is expected to be identity. Flag these separately from normal equality assertions. If the test's purpose is to verify a round-trip (serialize/deserialize, encode/decode), the assertion is valid, but it should be accompanied by assertions on non-trivial inputs that exercise the transformation.
  • If assertions are well-diversified, say so. A report concluding the suite has good diversity is perfectly valid.
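The distinctions in these rules are easier to see in code. The MSTest sketch below uses `System.Version` purely as a stand-in for code under test; the test names are hypothetical, and each comment states how the calibration rules would treat the test.

```csharp
using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class CalibrationExamples
{
    // Trivial: the null check is the only assertion.
    [TestMethod]
    public void Parse_OnlyNullCheck()
    {
        Version result = Version.Parse("1.2");
        Assert.IsNotNull(result);
    }

    // Not trivial: the null check is a guard before a real value assertion.
    [TestMethod]
    public void Parse_GuardThenValue()
    {
        Version result = Version.Parse("1.2");
        Assert.IsNotNull(result);
        Assert.AreEqual(2, result.Minor);
    }

    // Self-referential: passes for any Parse/ToString pair that are inverses,
    // without pinning what the text form actually is.
    [TestMethod]
    public void RoundTrip_Only()
    {
        var input = new Version(1, 2);
        Assert.AreEqual(input, Version.Parse(input.ToString()));
    }

    // Valid round-trip test: an independent expectation on the serialized
    // form accompanies the round-trip comparison.
    [TestMethod]
    public void RoundTrip_WithIndependentExpectation()
    {
        var input = new Version(1, 2, 3, 4);
        string text = input.ToString();

        Assert.AreEqual("1.2.3.4", text);
        Assert.AreEqual(input, Version.Parse(text));
    }
}
```

Under these rules the first test would be counted as trivial-only, the third flagged as self-referential, and the second and fourth left unflagged.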

Step 5: Report findings

Present the analysis in this structure:
  1. Summary Dashboard — A quick-reference table of key metrics:
    | Metric                        | Value  | Assessment |
    |-------------------------------|--------|------------|
    | Total tests                   | 25     | —          |
    | Average assertions per test   | 2.4    | Moderate   |
    | Assertion type spread         | 5/12   | Low        |
    | Tests with zero assertions    | 3 (12%)| Concerning |
    | Tests with only trivial asserts | 4 (16%)| Acceptable |
    | Tests with negative assertions | 2 (8%) | Below target |
    | Single-category tests         | 15 (60%)| High       |
  2. Category Breakdown — For each assertion category, show:
    • How many tests use it
    • Representative examples from the code
    • Whether it's overused or underused relative to the code under test
  3. Gap Analysis — Based on the production code (if available), identify:
    • Behaviors that are tested but only with equality checks
    • Error paths with no exception assertions
    • State-changing methods with no state verification
    • Collections returned but never checked for contents
  4. Recommendations — Prioritized list of improvements:
    • Which tests would benefit most from additional assertion types
    • Which assertion categories are missing and why they matter
    • Concrete examples of assertions that could be added
  5. Assertion-free tests — If any exist, list each one with its method name and what it appears to be testing, so the user can decide whether to add assertions or mark them as intentional smoke tests.

Validation

  • Every assertion in the test suite was classified into at least one category
  • Metrics are computed correctly (counts add up)
  • Trivial-assertion tests are correctly identified (not over-flagged)
  • Exception tests are not penalized for low assertion count
  • Boolean assertions on meaningful properties are not classified as trivial
  • Recommendations are concrete (name specific test methods and suggest specific assertion types)
  • If the suite has good diversity, the report acknowledges this

Common Pitfalls

| Pitfall | Solution |
|---|---|
| Penalizing exception tests for low assertion count | Exception assertions are complete on their own; skip count warnings for these |
| Flagging null checks before value checks as trivial | Only flag tests where the null check is the ONLY assertion |
| Counting `Assert.IsTrue(condition)` as trivial | Only `Assert.IsTrue(true)` or always-true conditions are trivial |
| Ignoring framework differences | MSTest uses `Assert.AreEqual`, xUnit uses `Assert.Equal`, NUnit uses `Is.EqualTo`; classify all correctly |
| Recommending diversity for diversity's sake | Only suggest adding assertion types that would catch real bugs in the code under test |
| Missing implicit assertions | `Assert.ThrowsException` is both an exception assertion and a negative assertion (verifying that calling the method has a specific failure mode) |