codspeed-setup-harness
# Setup Harness
You are a performance engineer helping set up benchmarks and CodSpeed integration for a project. Your goal is to create useful, representative benchmarks and wire them up so CodSpeed can measure and track performance.
## Step 1: Analyze the project
Before writing any benchmark code, understand what you're working with:

- **Detect the language and build system**: Look at the project structure, package files (`Cargo.toml`, `package.json`, `pyproject.toml`, `go.mod`, `CMakeLists.txt`), and source files.
- **Identify existing benchmarks**: Check for benchmark files, `codspeed.yml`, and CI workflows mentioning CodSpeed or benchmarks.
- **Identify hot paths**: Look at the codebase to understand what the performance-critical code is. Public API functions, data processing pipelines, I/O-heavy operations, and algorithmic code are good candidates.
- **Check CodSpeed auth**: Ensure `codspeed auth login` has been run.
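The detection step can be sketched as a quick manifest scan. The file-to-ecosystem mapping below is illustrative, not exhaustive:

```python
from pathlib import Path

# Illustrative mapping from manifest files to ecosystems; extend as needed.
MANIFESTS = {
    "Cargo.toml": "Rust (cargo)",
    "package.json": "Node.js (npm/pnpm)",
    "pyproject.toml": "Python",
    "go.mod": "Go",
    "CMakeLists.txt": "C/C++ (CMake)",
}

def detect_ecosystems(root: str = ".") -> list[str]:
    """Return the ecosystems whose manifest files exist under `root`."""
    base = Path(root)
    return [eco for name, eco in MANIFESTS.items() if (base / name).is_file()]
```

A polyglot repository may match several entries at once; in that case, pick the harness for the component the user actually wants to benchmark.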
## Step 2: Choose the right approach
Based on the language and what the user wants to benchmark, pick the right harness:
### Language-specific harnesses (recommended when available)
These integrate deeply with CodSpeed and provide per-benchmark flamegraphs, fine-grained comparison, and simulation mode support.
| Language | Framework | How to set up |
|---|---|---|
| Rust | divan (recommended), criterion, bencher | Add the matching `codspeed-*-compat` crate, renamed to the framework |
| Python | pytest-codspeed | Install `pytest-codspeed` |
| Node.js | vitest (recommended), tinybench v5, benchmark.js | Install the matching `@codspeed/*` plugin |
| Go | go test -bench | No packages needed — CodSpeed instruments `go test -bench` directly |
| C/C++ | Google Benchmark | Build with CMake; CodSpeed instruments via valgrind-codspeed |
### Exec harness (universal)
For any language, or when you want to benchmark a whole program (not individual functions):

- Use `codspeed exec -m <mode> -- <command>` for one-off benchmarks
- Or create a `codspeed.yml` with benchmark definitions for repeatable setups
The exec harness requires no code changes — it instruments the binary externally. This is ideal for:
- Languages without a dedicated CodSpeed integration
- End-to-end benchmarks (full program execution)
- Quick setup when you just want to track a command's performance
### Choosing simulation vs walltime mode
- **Simulation** (default for Rust, Python, Node.js, C/C++): Deterministic CPU simulation, <1% variance, automatic flamegraphs. Best for CPU-bound code. Does not measure system calls or I/O.
- **Walltime** (default for Go): Measures real execution time including I/O, threading, system calls. Best for I/O-heavy or multi-threaded code. Requires consistent hardware (use CodSpeed Macro Runners in CI).
- **Memory**: Tracks heap allocations. Best for reducing memory usage. Supported for Rust, and for C/C++ with libc/jemalloc/mimalloc.
## Step 3: Set up the harness
### Rust with divan (recommended)
1. Add the dependency:

```bash
cargo add codspeed-divan-compat --rename divan --dev
```

2. Create a benchmark file in `benches/`:

```rust
// benches/my_bench.rs
use divan;

fn main() {
    divan::main();
}

#[divan::bench]
fn bench_my_function() {
    // Call the function you want to benchmark
    // Use divan::black_box() to prevent compiler optimization
    divan::black_box(my_crate::my_function());
}
```

3. Add to `Cargo.toml`:

```toml
[[bench]]
name = "my_bench"
harness = false
```

4. Build and run:

```bash
cargo codspeed build -m simulation --bench my_bench
codspeed run -m simulation -- cargo codspeed run --bench my_bench
```
### Rust with criterion
1. Add the dependency:

```bash
cargo add codspeed-criterion-compat --rename criterion --dev
```

2. Create a benchmark in `benches/`:

```rust
use criterion::{criterion_group, criterion_main, Criterion};

fn bench_my_function(c: &mut Criterion) {
    c.bench_function("my_function", |b| {
        b.iter(|| my_crate::my_function())
    });
}

criterion_group!(benches, bench_my_function);
criterion_main!(benches);
```

3. Add the `[[bench]]` entry to `Cargo.toml`, then build and run the same way as with divan.
### Python with pytest-codspeed
1. Install:

```bash
pip install pytest-codspeed
```

or

```bash
uv add --dev pytest-codspeed
```

2. Create benchmark tests:

```python
# tests/test_benchmarks.py
import pytest

def test_my_function(benchmark):
    result = benchmark(my_module.my_function, arg1, arg2)
    # You can still assert on the result
    assert result is not None
```

Or use the pedantic API for setup/teardown:

```python
def test_with_setup(benchmark):
    data = prepare_data()
    benchmark.pedantic(my_module.process, args=(data,), rounds=100)
```

3. Run:

```bash
codspeed run -m simulation -- pytest --codspeed
```

### Node.js with vitest (recommended)
1. Install:

```bash
npm install -D @codspeed/vitest-plugin
```

or

```bash
pnpm add -D @codspeed/vitest-plugin
```

2. Configure vitest (`vitest.config.ts`):

```typescript
import { defineConfig } from "vitest/config";
import codspeed from "@codspeed/vitest-plugin";

export default defineConfig({
  plugins: [codspeed()],
});
```

3. Create a benchmark file:

```typescript
// bench/my.bench.ts
import { bench, describe } from "vitest";

describe("my module", () => {
  bench("my function", () => {
    myFunction();
  });
});
```

4. Run:

```bash
codspeed run -m simulation -- npx vitest bench
```

### Go
No packages needed — CodSpeed instruments `go test -bench` directly.

1. Create benchmark tests:

```go
// my_test.go
func BenchmarkMyFunction(b *testing.B) {
    for i := 0; i < b.N; i++ {
        MyFunction()
    }
}
```

2. Run (walltime is the default for Go):

```bash
codspeed run -m walltime -- go test -bench . ./...
```

### C/C++ with Google Benchmark
1. Install Google Benchmark (via CMake FetchContent or a system package).

2. Create a benchmark:

```cpp
#include <benchmark/benchmark.h>

static void BM_MyFunction(benchmark::State& state) {
  for (auto _ : state) {
    MyFunction();
  }
}

BENCHMARK(BM_MyFunction);
BENCHMARK_MAIN();
```

3. Build and run with CodSpeed:

```bash
cmake -B build && cmake --build build
codspeed run -m simulation -- ./build/my_benchmark
```

### Exec harness (any language)
For benchmarking whole programs without code changes:

1. Create `codspeed.yml`:

```yaml
$schema: https://raw.githubusercontent.com/CodSpeedHQ/codspeed/refs/heads/main/schemas/codspeed.schema.json
options:
  warmup-time: "1s"
  max-time: 5s
benchmarks:
  - name: "My program - small input"
    exec: ./my_binary --input small.txt
  - name: "My program - large input"
    exec: ./my_binary --input large.txt
    options:
      max-time: 30s
```

2. Run:

```bash
codspeed run -m walltime
```

Or for a one-off:

```bash
codspeed exec -m walltime -- ./my_binary --input data.txt
```

## Step 4: Write good benchmarks
Good benchmarks are representative, isolated, and stable. Here are guidelines:

- **Benchmark real workloads**: Use realistic input data and sizes. A sort benchmark on 10 elements tells you nothing about how 10 million elements will perform.
- **Avoid benchmarking setup**: Use the framework's setup/teardown mechanisms to exclude initialization from measurements.
- **Prevent dead code elimination**: Use `black_box()` (Rust), `benchmark.pedantic()` (Python), or equivalent to ensure the compiler/runtime doesn't optimize away the work you're measuring.
- **Cover the critical path**: Benchmark the functions that matter most to your users — the ones called frequently or on the hot path.
- **Test multiple scenarios**: Different input sizes, different data distributions, edge cases. Performance characteristics often change with scale.
- **Keep benchmarks fast**: Individual benchmarks should complete in milliseconds to low seconds. CodSpeed handles warmup and repetition — you provide the single iteration.
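As a toy illustration of the warmup and setup points above (not how CodSpeed measures; its harnesses handle this for you), a naive walltime timer that excludes setup and cold-start effects would look like:

```python
import time

def measure(fn, *args, warmup=3, rounds=10):
    """Toy walltime harness: run untimed warmup iterations first, then time
    only the measurement rounds, so initialization and cold caches are
    excluded. Real harnesses (divan, pytest-codspeed, etc.) do this
    automatically; `fn` and `args` here are whatever you want to measure."""
    for _ in range(warmup):
        fn(*args)                      # warmup: not measured
    start = time.perf_counter()
    for _ in range(rounds):
        fn(*args)                      # measured work only
    return (time.perf_counter() - start) / rounds
```

For example, `measure(sorted, list(range(1000, 0, -1)))` returns the mean seconds per call for sorting a descending list. Note that the input is built once, outside the timed region, matching the "avoid benchmarking setup" guideline.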
## Step 5: Verify and run
After setting up:

1. **Run the benchmarks locally** to verify they work.

For language-specific harnesses:

```bash
cargo codspeed build -m simulation && codspeed run -m simulation -- cargo codspeed run
```

or

```bash
codspeed run -m simulation -- pytest --codspeed
```

or

```bash
codspeed run -m simulation -- npx vitest bench
```

etc.

For the exec harness:

```bash
codspeed run -m walltime
```

2. **Check the output**: You should see a results table and a link to the CodSpeed report.
3. **Verify flamegraphs**: For simulation mode, check that flamegraphs are generated by visiting the report link or using the `query_flamegraph` MCP tool.
4. **Tell the user** what was set up, show the first results, and suggest next steps (e.g., adding CI integration, running the `optimize` skill).