flamegraphs
Purpose
Guide agents through the pipeline from profiler data to SVG flamegraph, and teach interpretation of flamegraphs to drive concrete optimisation decisions.
Triggers
- "How do I generate a flamegraph from perf data?"
- "How do I read a flamegraph?"
- "The flamegraph shows a wide frame — what does that mean?"
- "How do I generate a flamegraph from Callgrind?"
- "I want to compare two flamegraphs (before/after)"
Workflow
1. Install FlameGraph tools
```bash
git clone https://github.com/brendangregg/FlameGraph
# No install needed; scripts are in the repo
export PATH=$PATH:/path/to/FlameGraph
```
2. perf → flamegraph (most common path)
```bash
# Step 1: record
perf record -F 999 -g -o perf.data ./prog
# Step 2: generate script output
perf script -i perf.data > out.perf
# Step 3: collapse stacks
stackcollapse-perf.pl out.perf > out.folded
# Step 4: generate SVG
flamegraph.pl out.folded > flamegraph.svg
# Step 5: view
xdg-open flamegraph.svg # Linux
open flamegraph.svg # macOS
```
One-liner:
```bash
perf record -F 999 -g ./prog && perf script | stackcollapse-perf.pl | flamegraph.pl > fg.svg
```
3. Differential flamegraph (before/after)
```bash
# Collect two profiles
perf record -g -o before.data ./prog_old
perf record -g -o after.data ./prog_new
# Collapse
perf script -i before.data | stackcollapse-perf.pl > before.folded
perf script -i after.data | stackcollapse-perf.pl > after.folded
# Diff (red = regressed, blue = improved)
difffolded.pl before.folded after.folded | flamegraph.pl > diff.svg
```
4. Callgrind → flamegraph
```bash
valgrind --tool=callgrind --callgrind-out-file=cg.out ./prog
# Assumes stackcollapse-callgrind.pl is on PATH; confirm your FlameGraph checkout provides it
stackcollapse-callgrind.pl cg.out | flamegraph.pl > fg.svg
```
5. Other profiler inputs
```bash
# Go pprof
go tool pprof -raw -output=prof.txt prog
stackcollapse-go.pl prof.txt | flamegraph.pl > fg.svg
# DTrace (-c 'sleep 10' ends tracing when the sleep command exits)
dtrace -x ustackframes=100 -n 'profile-99 /execname=="prog"/ { @[ustack()] = count(); }' \
  -o out.stacks -c 'sleep 10'
stackcollapse.pl out.stacks | flamegraph.pl > fg.svg
# Java (async-profiler)
async-profiler -d 30 -f out.collapsed PID
flamegraph.pl out.collapsed > fg.svg
```
6. Reading flamegraphs
A flamegraph is a call-stack visualisation:
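- X axis: share of profile samples, not the passage of time. Wider = more time on CPU; sibling frames are sorted alphabetically, so left-to-right order carries no timing information.
- Y axis: call stack depth. Taller = deeper call chain, with each frame sitting on top of its caller.
- Color: random by default (no significance), unless using differential mode.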
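To make the X-axis rule concrete, the width of a function can be estimated straight from the folded stacks that feed `flamegraph.pl`. The sketch below is a minimal illustration, assuming the usual folded format (`frame;frame;... count`). The helper name `frame_share` and the sample stacks are hypothetical. It sums over every call path that contains the frame, whereas the SVG draws one box per call path.

```bash
#!/usr/bin/env bash
# frame_share FRAME < file.folded
# Prints the percentage of samples whose stack contains FRAME (inclusive time).
# Hypothetical helper, not part of the FlameGraph repo.
frame_share() {
  awk -v frame="$1" '{
    count = $NF                 # last field is the sample count
    sub(/ [0-9]+$/, "")         # strip the count, leaving the stack
    total += count
    n = split($0, frames, ";")
    for (i = 1; i <= n; i++)
      if (frames[i] == frame) { hits += count; break }
  } END {
    if (total > 0) printf "%.1f\n", 100 * hits / total
  }'
}

# Made-up sample: 100 samples in total, 90 of them under compute.
frame_share compute <<'EOF'
main;parse;read_file 10
main;compute;hash 60
main;compute;sort 25
main;compute 5
EOF
# prints 90.0
```

Running `frame_share main` on the same input would report 100.0, which is why root frames span the full width of the graph.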
What to look for:
| Pattern | Meaning | Action |
|---|---|---|
| Wide frame near bottom | Function is on the stack (itself or via its callees) for much of the profile | Follow its children upward to see where the time actually goes |
| Wide frame with tall narrow towers | Calling many different callees | Hot dispatch; reduce call overhead |
| Very tall stack with wide base | Deep recursion | Check recursion depth; consider iterative approach |
| Plateau at the top | Leaf function with no callees | This leaf is the actual hotspot |
| Many narrow identical stacks | Many threads doing the same work | Consider parallelism or batching |
Identifying the actionable hotspot:
- Find the widest top frame (a frame with no or narrow children above it)
- That is where CPU time is actually spent
- Trace down to understand what called it and why
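The "widest top frame" can also be found without opening the SVG. The following is a rough sketch, assuming folded input in the `frame;frame;... count` format; `top_leaves` is a hypothetical helper name and the sample data is invented. It sums sample counts by leaf frame, which approximates self time per function.

```bash
#!/usr/bin/env bash
# top_leaves [file.folded]: reads folded stacks from a file or stdin and
# prints "samples leaf_function", widest first.
# Hypothetical helper, not part of the FlameGraph repo.
top_leaves() {
  awk '{
    count = $NF                 # last field is the sample count
    sub(/ [0-9]+$/, "")         # strip the count, leaving the stack
    n = split($0, frames, ";")
    self[frames[n]] += count    # the last frame is the one on CPU
  } END {
    for (f in self) print self[f], f
  }' "$@" | sort -rn
}

# Made-up sample stacks.
top_leaves <<'EOF'
main;parse;read_file 10
main;compute;hash 60
main;compute;sort 25
main;compute 5
EOF
# prints:
# 60 hash
# 25 sort
# 10 read_file
# 5 compute
```

In practice the input would be the `out.folded` file from the perf workflow, for example `top_leaves out.folded | head`.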
Differential flamegraph:
- Red frames: more time in new profile (regression)
- Blue frames: less time in new profile (improvement)
- Frames only in one profile appear solid colored
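As a cross-check on what the red and blue frames represent, the per-stack change can be computed directly from the two folded files. This is an illustrative sketch only, not how `difffolded.pl` is implemented; `folded_delta` and the sample data are hypothetical. Raw counts are only comparable when both profiles contain a similar number of samples.

```bash
#!/usr/bin/env bash
# folded_delta BEFORE AFTER: prints "delta stack" for every stack whose
# sample count changed, largest regression first.
# Hypothetical helper, not part of the FlameGraph repo.
folded_delta() {
  awk '
    {
      count = $NF               # last field is the sample count
      sub(/ [0-9]+$/, "")       # strip the count, leaving the stack
    }
    FNR == NR { before[$0] += count; next }   # first file
    { after[$0] += count }                    # second file
    END {
      for (s in before) seen[s] = 1
      for (s in after)  seen[s] = 1
      for (s in seen) {
        d = after[s] - before[s]
        if (d != 0) printf "%d %s\n", d, s
      }
    }' "$1" "$2" | sort -rn
}

# Made-up before/after profiles written to a scratch directory.
tmpdir="$(mktemp -d)"
printf '%s\n' 'main;compute;hash 60' 'main;compute;sort 25' 'main;parse 15' > "$tmpdir/before.folded"
printf '%s\n' 'main;compute;hash 20' 'main;compute;sort 40' 'main;parse 15' 'main;cache_lookup 10' > "$tmpdir/after.folded"
folded_delta "$tmpdir/before.folded" "$tmpdir/after.folded"
rm -rf "$tmpdir"
# prints:
# 15 main;compute;sort
# 10 main;cache_lookup
# -40 main;compute;hash
```

Positive deltas correspond to frames that would be drawn red (regressed) and negative deltas to frames drawn blue (improved).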
7. flamegraph.pl options
```bash
flamegraph.pl --title "My App" \
  --subtitle "Release build, workload X" \
  --width 1600 \
  --height 16 \
  --minwidth 0.5 \
  --colors java \
  out.folded > fg.svg
```
| Option | Effect |
|---|---|
| `--title` | SVG title |
| `--width` | Width in pixels |
| `--height` | Frame height in pixels |
| `--minwidth` | Omit frames narrower than N (pixels by default; reduces clutter) |
| `--colors` | Palette (for example `java`, as in the command above) |
| `--inverted` | Icicle chart (roots at top) |
| `--reverse` | Reverse stacks |
| `--cp` | Consistent palette (same frame = same color across SVGs) |
References
For tool installation, stackcollapse scripts, and palette options, see references/tools.md.
Related skills
- Use `skills/profilers/linux-perf` to collect perf data
- Use `skills/profilers/valgrind` to collect Callgrind data
- Use `skills/compilers/clang` for LLVM PGO from sampling profiles