dotnet-performance-patterns

Performance-oriented architecture patterns for .NET applications. Covers zero-allocation coding with Span<T> and Memory<T>, buffer pooling with ArrayPool<T>, struct design for performance (readonly struct, ref struct, in parameters), sealed class devirtualization by the JIT, stack-based allocation with stackalloc, and string handling performance. Focuses on the why (performance rationale and measurement) rather than the how (language syntax).

Version assumptions: .NET 8.0+ baseline. Span<T> and Memory<T> are available from .NET Core 2.1+ but this skill targets modern usage patterns on .NET 8+.

Out of scope: C# language syntax for Span, records, pattern matching, and collection expressions -- see [skill:dotnet-csharp-modern-patterns]. Coding standards and naming conventions (including sealed class style guidance) -- see [skill:dotnet-csharp-coding-standards]. Microbenchmarking setup and measurement is owned by this epic's companion skill -- see [skill:dotnet-benchmarkdotnet]. Native AOT compilation pipeline and trimming -- see [skill:dotnet-native-aot]. Serialization format performance tradeoffs -- see [skill:dotnet-serialization]. Architecture patterns (caching, resilience, DI) -- see [skill:dotnet-architecture-patterns]. EF Core query optimization -- see [skill:dotnet-efcore-patterns].

Cross-references: [skill:dotnet-benchmarkdotnet] for measuring the impact of these patterns, [skill:dotnet-csharp-modern-patterns] for Span/Memory syntax foundation, [skill:dotnet-csharp-coding-standards] for sealed class style conventions, [skill:dotnet-native-aot] for AOT performance characteristics and trimming impact on pattern choices, [skill:dotnet-serialization] for serialization performance context.

Span<T> and Memory<T> for Zero-Allocation Scenarios

Why Span<T> Matters for Performance

`Span<T>` provides a safe, bounds-checked view over contiguous memory without allocating. It enables slicing arrays, strings, and stack memory without copying. For syntax details see [skill:dotnet-csharp-modern-patterns]; this section focuses on performance rationale.

Zero-Allocation String Processing

csharp

// BAD: Substring allocates a new string on each call
public static (string Key, string Value) ParseHeader_Allocating(string header)
{
    var colonIndex = header.IndexOf(':');
    return (header.Substring(0, colonIndex), header.Substring(colonIndex + 1).Trim());
}

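// GOOD: ReadOnlySpan<char> slicing avoids all allocations.
// Spans cannot be tuple elements (a ref struct is not a valid generic type argument),
// so the two slices are returned through out parameters.
public static void ParseHeader_ZeroAlloc(
    ReadOnlySpan<char> header,
    out ReadOnlySpan<char> key,
    out ReadOnlySpan<char> value)
{
    var colonIndex = header.IndexOf(':');
    key = header[..colonIndex];
    value = header[(colonIndex + 1)..].Trim();
}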

Performance impact: for high-throughput parsing (HTTP headers, log lines, CSV rows), Span-based slicing removes the per-call substring allocations and the GC pressure they cause. Measure with `[MemoryDiagnoser]` in [skill:dotnet-benchmarkdotnet] -- the `Allocated` column should read `0 B`.
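As a hedged illustration of where the zero-allocation form pays off, the sketch below reads an integer header value straight from the sliced span. `HeaderValues` and `TryGetContentLength` are hypothetical names introduced here, not part of any library; `int.TryParse` and the span `Equals` extension are the standard span-based overloads.

```csharp
using System;

public static class HeaderValues
{
    // Hypothetical helper: parses "Content-Length: 1234" without creating substrings.
    public static bool TryGetContentLength(ReadOnlySpan<char> header, out int length)
    {
        length = 0;
        var colonIndex = header.IndexOf(':');
        if (colonIndex < 0)
            return false; // malformed line: no key/value separator

        var key = header[..colonIndex].Trim();
        if (!key.Equals("Content-Length", StringComparison.OrdinalIgnoreCase))
            return false;

        // int.TryParse accepts a span, so the value is never materialized as a string.
        return int.TryParse(header[(colonIndex + 1)..].Trim(), out length);
    }
}
```

Because both the key check and the number parse work on slices of the original input, the only memory touched is the caller's buffer.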

Memory<T> for Async and Storage Scenarios

`Span<T>` is a ref struct: it cannot be stored on the heap or kept alive across an `await` (before C# 13 it cannot be declared as a local in an async method at all). Use `Memory<T>` when you need to:

Pass buffers to async I/O methods
Store a slice reference in a field or collection
Return a memory region from a method for later consumption

csharp

public async Task<int> ReadAndProcessAsync(Stream stream, Memory<byte> buffer)
{
    var bytesRead = await stream.ReadAsync(buffer);
    var data = buffer[..bytesRead]; // Memory<T> slicing -- no allocation
    return ProcessData(data.Span);  // .Span for synchronous processing
}

private int ProcessData(ReadOnlySpan<byte> data)
{
    var sum = 0;
    foreach (var b in data)
        sum += b;
    return sum;
}
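The second use case in the list, storing a slice reference, might look like the following sketch. `FrameIndex` is a hypothetical type used only for illustration; it assumes the caller keeps the underlying buffer alive and unmodified for as long as the slices are needed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: remembers slices of a shared buffer for later processing.
public sealed class FrameIndex
{
    // ReadOnlyMemory<byte> can live in a heap collection; ReadOnlySpan<byte> cannot.
    private readonly List<ReadOnlyMemory<byte>> _frames = new();

    public int Count => _frames.Count;

    // Slice references the original buffer; no bytes are copied.
    public void Add(ReadOnlyMemory<byte> buffer, int start, int length)
        => _frames.Add(buffer.Slice(start, length));

    // .Span is taken only at the point of synchronous use.
    public int SumFrame(int index)
    {
        var sum = 0;
        foreach (var b in _frames[index].Span)
            sum += b;
        return sum;
    }
}
```

Since each stored slice aliases the source array, later writes to that array are visible through the slice, which is the tradeoff for avoiding the copy.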

ArrayPool<T> for Buffer Pooling

Why Pool Buffers

Large array allocations (>= 85,000 bytes) go directly to the Large Object Heap (LOH), which is only collected in Gen 2 GC -- expensive and causes pauses. Even smaller arrays add GC pressure in hot paths. `ArrayPool<T>` rents and returns buffers to avoid repeated allocations.

Usage Pattern

csharp

using System.Buffers;

public int ProcessLargeData(Stream source)
{
    var buffer = ArrayPool<byte>.Shared.Rent(minimumLength: 81920);
    try
    {
        var bytesRead = source.Read(buffer, 0, buffer.Length);
        // IMPORTANT: Rent may return a larger buffer than requested.
        // Reading into the whole rented array is fine, but only the first
        // bytesRead bytes hold valid data -- never process buffer.Length bytes.
        return ProcessChunk(buffer.AsSpan(0, bytesRead));
    }
    finally
    {
        // clearArray: true zeroes the buffer -- use when the buffer held sensitive data
        ArrayPool<byte>.Shared.Return(buffer, clearArray: true);
    }
}

Common Mistakes

Mistake	Impact	Fix
Using `buffer.Length` instead of requested size	Processes uninitialized bytes beyond actual data	Track requested/actual size separately
Forgetting to return the buffer	Pool exhaustion, falls back to allocation	Use try/finally or a `using` wrapper
Returning a buffer twice	Corrupts pool state	Null out the reference after return
Not clearing sensitive data	Security leak from pooled buffers	Pass `clearArray: true` to `Return`
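The `using` wrapper mentioned in the table could be sketched as below. `RentedBuffer` is a hypothetical type, not a BCL API; it tracks the requested length separately and nulls its reference before returning the array, addressing the first three mistakes together.

```csharp
using System;
using System.Buffers;

// Hypothetical wrapper: returns the rented array at most once when disposed.
public struct RentedBuffer : IDisposable
{
    private byte[]? _array;
    private readonly int _length;
    private readonly bool _clearOnReturn;

    public RentedBuffer(int minimumLength, bool clearOnReturn = false)
    {
        _array = ArrayPool<byte>.Shared.Rent(minimumLength);
        _length = minimumLength;        // remember the requested size, not _array.Length
        _clearOnReturn = clearOnReturn;
    }

    // Exposes only the requested range, never the pool's extra capacity.
    public readonly Span<byte> Span => _array.AsSpan(0, _length);

    public void Dispose()
    {
        var array = _array;
        if (array is null)
            return;                     // already returned: ignore repeated Dispose calls
        _array = null;                  // null out first so this instance cannot return twice
        ArrayPool<byte>.Shared.Return(array, _clearOnReturn);
    }
}
```

A typical call site would be `using var rented = new RentedBuffer(4096);` followed by work on `rented.Span`. One caveat of the struct design: copying the wrapper by value duplicates the array reference, so the guard only protects a single instance; a class or ref struct variant trades that risk for other costs.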

readonly struct, ref struct, and in Parameters

readonly struct -- Defensive Copy Elimination

The C# compiler must defensively copy a non-readonly struct when its members are invoked through `in` parameters, `readonly` fields, or `readonly` members, because the call might mutate the struct. Marking a struct `readonly` guarantees immutability, eliminating these copies:

csharp

// GOOD: readonly eliminates defensive copies on every access
public readonly struct Point3D
{
    public double X { get; }
    public double Y { get; }
    public double Z { get; }

    public Point3D(double x, double y, double z) => (X, Y, Z) = (x, y, z);

    // readonly struct: the compiler knows this cannot mutate, no defensive copy needed
    public double DistanceTo(in Point3D other)
    {
        var dx = X - other.X;
        var dy = Y - other.Y;
        var dz = Z - other.Z;
        return Math.Sqrt(dx * dx + dy * dy + dz * dz);
    }
}

Without `readonly`, calling a non-readonly member on a struct through an `in` parameter forces the compiler to copy the entire struct to protect against mutation. For large structs in tight loops, marking the struct `readonly` eliminates that overhead.
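The defensive copy described above is observable when the struct is mutable. In this sketch (`MutableCounter` and `DefensiveCopyDemo` are hypothetical names), a mutating method called through an `in` parameter changes only the hidden copy:

```csharp
using System;

// Hypothetical mutable struct: Increment is a non-readonly member.
public struct MutableCounter
{
    public int Value;
    public void Increment() => Value++;
}

public static class DefensiveCopyDemo
{
    // 'counter' is a read-only reference, so the compiler invokes Increment on a temporary copy.
    public static int IncrementThroughIn(in MutableCounter counter)
    {
        counter.Increment();   // mutates the temporary copy, not the caller's struct
        return counter.Value;  // still the value the caller passed in
    }
}
```

The call compiles without error, which is why these copies tend to go unnoticed until a profiler or disassembly shows them.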

ref struct -- Stack-Only Types

`ref struct` types are constrained to the stack. They cannot be boxed, stored in fields of classes or ordinary structs, or kept alive across an `await`. This enables safe wrapping of `Span<T>`:

csharp

public ref struct SpanLineEnumerator
{
    private ReadOnlySpan<char> _remaining;

    public SpanLineEnumerator(ReadOnlySpan<char> text) => _remaining = text;

    public ReadOnlySpan<char> Current { get; private set; }

    public bool MoveNext()
    {
        if (_remaining.IsEmpty)
            return false;

        var newlineIndex = _remaining.IndexOf('\n');
        if (newlineIndex == -1)
        {
            Current = _remaining;
            _remaining = default;
        }
        else
        {
            Current = _remaining[..newlineIndex];
            _remaining = _remaining[(newlineIndex + 1)..];
        }
        return true;
    }
}
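A possible extension, shown here as a self-contained sketch rather than part of the original type, is to add a `GetEnumerator` method so the enumerator works with `foreach`. `LineSplitter` and `LineCounter` are hypothetical names; the splitting logic mirrors the enumerator above.

```csharp
using System;

// Same splitting logic as above, plus GetEnumerator so foreach can consume it.
public ref struct LineSplitter
{
    private ReadOnlySpan<char> _remaining;

    public LineSplitter(ReadOnlySpan<char> text)
    {
        _remaining = text;
        Current = default;
    }

    public ReadOnlySpan<char> Current { get; private set; }

    // foreach binds to this pattern; no IEnumerable interface (and no boxing) is involved.
    public LineSplitter GetEnumerator() => this;

    public bool MoveNext()
    {
        if (_remaining.IsEmpty)
            return false;

        var newlineIndex = _remaining.IndexOf('\n');
        if (newlineIndex < 0)
        {
            Current = _remaining;
            _remaining = default;
        }
        else
        {
            Current = _remaining[..newlineIndex];
            _remaining = _remaining[(newlineIndex + 1)..];
        }
        return true;
    }
}

public static class LineCounter
{
    // Counts lines containing non-whitespace characters without allocating any strings.
    public static int CountNonEmptyLines(ReadOnlySpan<char> text)
    {
        var count = 0;
        foreach (var line in new LineSplitter(text))
        {
            if (!line.Trim().IsEmpty)
                count++;
        }
        return count;
    }
}
```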

in Parameters -- Pass-by-Reference Without Mutation

Use `in` for large readonly structs passed to methods. The `in` modifier passes by reference (avoids copying) and prevents mutation:

csharp

// in parameter: pass by reference, no copy, no mutation allowed
public static double CalculateDistance(in Point3D a, in Point3D b)
    => a.DistanceTo(in b);

When to use `in`:

Struct Size	Recommendation
<= 16 bytes	Pass by value (register-friendly, no indirection overhead)
> 16 bytes	Use `in` to avoid copy overhead
Any size, readonly struct	`in` is safe (no defensive copies)
Any size, non-readonly struct	Avoid `in` (defensive copies negate the benefit)
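To apply the size rows of the table, the struct's size can be checked with `Unsafe.SizeOf<T>()`. The sketch below is illustrative only: `SmallPair`, `Triple64`, and `PassingAdvice` are hypothetical, and the 16-byte threshold is the table's rule of thumb, not something the runtime enforces.

```csharp
using System.Runtime.CompilerServices;

// Two ints: 8 bytes, comfortably passed by value.
public readonly struct SmallPair
{
    public readonly int A;
    public readonly int B;
    public SmallPair(int a, int b) { A = a; B = b; }
}

// Three doubles: 24 bytes, a candidate for 'in'.
public readonly struct Triple64
{
    public readonly double X;
    public readonly double Y;
    public readonly double Z;
    public Triple64(double x, double y, double z) { X = x; Y = y; Z = z; }
}

public static class PassingAdvice
{
    // True when the struct exceeds the 16-byte guideline from the table.
    public static bool PreferIn<T>() where T : struct
        => Unsafe.SizeOf<T>() > 16;
}
```

Treat the result as a starting hypothesis and confirm with a benchmark, since the actual crossover depends on the platform's calling convention.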

Sealed Class Performance Rationale

JIT Devirtualization

When a class is `sealed`, the JIT can replace virtual method calls with direct calls (devirtualization) because no subclass override is possible. This enables further inlining:

csharp

// Without sealed: virtual dispatch through vtable
public class OpenService : IProcessor
{
    public virtual int Process(int x) => x * 2;
}

// With sealed: the JIT can devirtualize and inline Process when it knows the concrete type
public sealed class SealedService : IProcessor
{
    public int Process(int x) => x * 2;
}

public interface IProcessor { int Process(int x); }

Verify devirtualization with `[DisassemblyDiagnoser]` in [skill:dotnet-benchmarkdotnet]. See [skill:dotnet-csharp-coding-standards] for the project convention of defaulting to sealed classes.

Performance Impact

Devirtualization + inlining eliminates:

vtable lookup -- indirect memory access to find the method pointer
Call overhead -- the actual indirect call instruction
Inlining barrier -- a virtual call cannot be inlined unless the JIT first devirtualizes it; calls on a sealed type can be

In tight loops and hot paths, the cumulative effect is measurable. For framework/library types that are not designed for extension, always prefer `sealed`.

stackalloc for Small Stack-Based Allocations

When to Use stackalloc

`stackalloc` allocates memory on the stack, avoiding GC entirely. Use it for small, fixed-size buffers in hot paths:

csharp

public static string FormatGuid(Guid guid)
{
    // 68 chars = 136 bytes on the stack (char is 2 bytes) -- well within safe limits
    Span<char> buffer = stackalloc char[68];
    guid.TryFormat(buffer, out var charsWritten, "D");
    return new string(buffer[..charsWritten]);
}

Safety Guidelines

Guideline	Rationale
Keep allocations small (< 1 KB typical, < 4 KB absolute maximum)	Stack space is limited (~1 MB default on Windows); overflow crashes the process
Use constant or bounded sizes only	Runtime-variable sizes risk stack overflow from malicious/unexpected input
Prefer `Span<T>` assignment over raw pointer	Span provides bounds checking; raw pointers do not
Fall back to ArrayPool for large/variable sizes	Gracefully handle cases that exceed stack budget

Hybrid Pattern: stackalloc with ArrayPool Fallback

csharp

using System.Buffers;
using System.Text;

public static string ProcessData(ReadOnlySpan<byte> input)
{
    const int stackThreshold = 256;
    char[]? rented = null;

    Span<char> buffer = input.Length <= stackThreshold
        ? stackalloc char[stackThreshold]
        : (rented = ArrayPool<char>.Shared.Rent(input.Length));

    try
    {
        var written = Encoding.UTF8.GetChars(input, buffer);
        return new string(buffer[..written]);
    }
    finally
    {
        if (rented is not null)
            ArrayPool<char>.Shared.Return(rented);
    }
}

This pattern is used throughout the .NET runtime libraries and is the recommended approach for methods that handle both small and large inputs.

String Interning and StringComparison Performance

String Comparison Performance

Ordinal comparisons are significantly faster than culture-aware comparisons because they compare raw UTF-16 code units and skip culture-specific collation rules:

csharp

// FAST: ordinal comparison (compares raw UTF-16 code units)
bool isMatch = str.Equals("expected", StringComparison.Ordinal);
bool containsKey = dict.ContainsKey(key); // Dictionary<string, T> uses ordinal by default

// FAST: case-insensitive ordinal (no culture overhead)
bool isMatchIgnoreCase = str.Equals("expected", StringComparison.OrdinalIgnoreCase);

// SLOW: culture-aware comparison (Unicode normalization, linguistic rules)
bool isMatchCulture = str.Equals("expected", StringComparison.CurrentCulture);

Default guidance: Use `StringComparison.Ordinal` or `StringComparison.OrdinalIgnoreCase` for internal identifiers, dictionary keys, file paths, and protocol strings. Reserve culture-aware comparison for user-visible text sorting and display.
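Applied to dictionary keys and protocol strings, that guidance might look like the sketch below. `HeaderLookup` and its members are hypothetical names; `StringComparer.OrdinalIgnoreCase` and the `StartsWith` overload taking a `StringComparison` are standard BCL APIs.

```csharp
using System;
using System.Collections.Generic;

public static class HeaderLookup
{
    // Header names are protocol strings, so keys use ordinal, case-insensitive comparison.
    public static Dictionary<string, string> Create()
        => new(StringComparer.OrdinalIgnoreCase);

    // Explicit ordinal prefix check; no culture data is consulted.
    public static bool IsApiPath(string path)
        => path.StartsWith("/api/", StringComparison.Ordinal);
}
```

Passing the comparer once at construction keeps every later lookup consistent, so call sites do not need to normalize casing themselves.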

String Interning

The CLR interns compile-time string literals automatically. `string.Intern()` can reduce memory for runtime strings that repeat frequently:

csharp

// Intern frequently-repeated runtime strings to share a single instance
var normalized = string.Intern(headerName.ToLowerInvariant());

Caution: Interned strings are never garbage collected. Only intern strings from a bounded, known set (HTTP headers, XML element names). Never intern user input or unbounded data.
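The effect of interning can be checked with `ReferenceEquals`. In this sketch, `InternDemo` is a hypothetical helper that builds two equal strings at run time and confirms they collapse to one instance after `string.Intern`; it assumes a non-empty input.

```csharp
using System;

public static class InternDemo
{
    // Builds an equal string at run time so it is a fresh object, not a literal.
    public static string BuildAtRuntime(string text) => new string(text.AsSpan());

    public static bool SharesInstanceAfterIntern(string text)
    {
        var first = BuildAtRuntime(text);
        var second = BuildAtRuntime(text);

        // Two distinct objects with equal contents before interning...
        var distinctBefore = !ReferenceEquals(first, second);

        // ...but string.Intern maps both to the single pooled instance.
        var sharedAfter = ReferenceEquals(string.Intern(first), string.Intern(second));

        return distinctBefore && sharedAfter;
    }
}
```

The pooled instance stays alive for the rest of the process, which is why the caution above limits interning to a bounded set of values.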

Efficient String Building

Scenario	Recommended Approach	Why
2-3 concatenations	String interpolation `$"{a}{b}"`	Compiler lowers to `string.Concat` when every part is a string
Loop concatenation	`StringBuilder`	Avoids quadratic allocation
Known fixed parts	`string.Create`	Single allocation, Span-based writing
High-throughput formatting	`Span<char>` + `TryFormat`	Zero-allocation formatting

csharp

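// string.Create for single-allocation building.
// The length must be exact: any unused trailing chars would stay '\0'.
public static string FormatId(int category, int item)
{
    // Assumes non-negative inputs; a sign would need one extra char per value.
    var length = CountDigits(category) + 1 + CountDigits(item);
    return string.Create(length, (category, item), static (span, state) =>
    {
        state.category.TryFormat(span, out var catWritten);
        span[catWritten] = '-';
        state.item.TryFormat(span[(catWritten + 1)..], out _);
    });
}

// Number of decimal digits in a non-negative int.
private static int CountDigits(int value)
{
    var digits = 1;
    while ((value /= 10) != 0) digits++;
    return digits;
}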
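The last row of the table (`Span<char>` + `TryFormat`) can be sketched as a formatter that writes into a caller-supplied buffer and never creates a string. `IdFormatter.TryFormatId` is a hypothetical helper; `int.TryFormat` is the standard span-based formatting method.

```csharp
using System;

public static class IdFormatter
{
    // Writes "category-item" into destination; returns false when the buffer is too small.
    public static bool TryFormatId(int category, int item, Span<char> destination, out int charsWritten)
    {
        charsWritten = 0;

        if (!category.TryFormat(destination, out var categoryChars))
            return false;
        if (categoryChars >= destination.Length)
            return false; // no room left for the separator

        destination[categoryChars] = '-';

        if (!item.TryFormat(destination[(categoryChars + 1)..], out var itemChars))
            return false;

        charsWritten = categoryChars + 1 + itemChars;
        return true;
    }
}
```

A caller can pair this with `stackalloc char[23]`, which covers two 11-character `int` values plus the separator, and then pass `buffer[..charsWritten]` onward as a `ReadOnlySpan<char>`.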

Performance Measurement Checklist

Before applying any optimization pattern, measure first. Premature optimization without data leads to complex code with no measurable benefit.

Identify the hot path -- use [skill:dotnet-benchmarkdotnet] to establish a baseline
Measure allocations -- enable `[MemoryDiagnoser]` and check the `Allocated` column
Apply one pattern at a time -- change one thing, re-measure, compare to baseline
Check AOT impact -- if targeting Native AOT ([skill:dotnet-native-aot]), verify patterns are trim-safe
Verify with production-like data -- synthetic benchmarks can miss real-world allocation patterns
Document the tradeoff -- every optimization trades readability or flexibility for speed; record the measured gain
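For a quick sanity check before setting up a full benchmark, per-thread allocation can be sampled with `GC.GetAllocatedBytesForCurrentThread()`. The sketch below is a rough probe under stated assumptions, not a replacement for `[MemoryDiagnoser]`: `AllocationProbe` is a hypothetical helper, and it only sees allocations made on the calling thread.

```csharp
using System;

public static class AllocationProbe
{
    // Returns the managed bytes allocated on this thread by one invocation of action.
    public static long MeasureAllocatedBytes(Action action)
    {
        action(); // warm-up so one-time costs (JIT, static initialization) are not counted

        var before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }
}
```

A nonzero result on a path expected to be allocation-free is a signal to investigate with the benchmark tooling; a zero result from one run does not prove the path never allocates.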

Agent Gotchas

Measure before optimizing -- never apply Span/ArrayPool/stackalloc without a benchmark showing the allocation or latency problem. Premature optimization produces unreadable code for no measurable gain.
Do not use stackalloc with variable sizes from untrusted input -- a stack overflow terminates the process and cannot be caught by an exception handler. Always validate bounds or use the hybrid stackalloc/ArrayPool pattern.
Always mark value types `readonly struct` when they are immutable -- without `readonly`, the compiler emits defensive copies when members are invoked through `in` parameters and `readonly` fields, silently negating the performance benefit of using structs.
Return rented ArrayPool buffers in finally blocks -- forgetting to return starves the pool and causes fallback allocations that negate the benefit.
Use `StringComparison.Ordinal` for internal comparisons -- several string APIs (for example `StartsWith`, `EndsWith`, `IndexOf(string)`, and `string.Compare`) default to culture-aware comparison when the parameter is omitted, which is slower and produces surprising results for technical strings (file paths, identifiers).
Sealed classes help performance only when the JIT can see the concrete type -- if the object is accessed through an interface variable in a non-devirtualizable call site, sealing provides no benefit. Verify with `[DisassemblyDiagnoser]`.
Do not re-teach language syntax -- reference [skill:dotnet-csharp-modern-patterns] for Span/Memory syntax details. This skill focuses on when and why to use these patterns for performance.

Knowledge Sources

Performance patterns in this skill are grounded in guidance from:

Stephen Toub -- .NET Performance blog series (devblogs.microsoft.com/dotnet/author/toub). Authoritative source on Span<T>, ValueTask, ArrayPool, async internals, and runtime performance characteristics.
Stephen Cleary -- Async best practices and concurrent collections guidance. Author of Concurrency in C# Cookbook.
Nick Chapsas -- Modern .NET performance patterns and benchmarking methodology.

These sources inform the patterns and rationale presented above. This skill does not claim to represent or speak for any individual.
