Flamegraphs

Purpose

Guide agents through the pipeline from profiler data to SVG flamegraph, and teach interpretation of flamegraphs to drive concrete optimisation decisions.

Triggers

  • "How do I generate a flamegraph from perf data?"
  • "How do I read a flamegraph?"
  • "The flamegraph shows a wide frame — what does that mean?"
  • "How do I generate a flamegraph from Callgrind?"
  • "I want to compare two flamegraphs (before/after)"

Workflow

1. Install FlameGraph tools

```bash
git clone https://github.com/brendangregg/FlameGraph
# No install needed; scripts are in the repo
export PATH=$PATH:/path/to/FlameGraph
```

2. perf → flamegraph (most common path)

```bash
# Step 1: record
perf record -F 999 -g -o perf.data ./prog

# Step 2: generate script output
perf script -i perf.data > out.perf

# Step 3: collapse stacks
stackcollapse-perf.pl out.perf > out.folded

# Step 4: generate SVG
flamegraph.pl out.folded > flamegraph.svg

# Step 5: view
xdg-open flamegraph.svg   # Linux
open flamegraph.svg       # macOS
```

One-liner:

```bash
perf record -F 999 -g ./prog && perf script | stackcollapse-perf.pl | flamegraph.pl > fg.svg
```
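
It can help to inspect the folded file before rendering it. The folded format is plain text: one line per unique stack, frames joined by semicolons with the root first, then a space and the sample count. The sketch below uses made-up stacks (the file name `sample.folded`, the function names, and the `total_samples` helper are all hypothetical) and totals the samples with awk:

```bash
# Hypothetical folded stacks; real data comes from stackcollapse-perf.pl.
cat > sample.folded <<'EOF'
main;parse;read_line 30
main;parse;tokenize 50
main;render 20
EOF

# Sum the sample counts. The count is always the last field on the line,
# so $NF stays correct even when a frame name contains spaces.
total_samples() {
    awk '{ sum += $NF } END { print sum + 0 }' "$1"
}

total_samples sample.folded   # prints 100
```

A total that is far lower than expected for the run length usually means the recording was too short or the sampling frequency too low to trust frame widths.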

3. Differential flamegraph (before/after)

```bash
# Collect two profiles
perf record -g -o before.data ./prog_old
perf record -g -o after.data ./prog_new

# Collapse
perf script -i before.data | stackcollapse-perf.pl > before.folded
perf script -i after.data | stackcollapse-perf.pl > after.folded

# Diff (red = regressed, blue = improved)
difffolded.pl before.folded after.folded | flamegraph.pl > diff.svg
```
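
When a textual summary is more convenient than an SVG, the two folded files can also be compared directly. The following is a rough sketch, not a replacement for `difffolded.pl`: the `folded_delta` helper and the sample files are hypothetical, and it reports the change in sample count for each stack.

```bash
# Hypothetical before/after folded stacks for illustration.
cat > before.sample.folded <<'EOF'
main;parse 60
main;render 40
EOF
cat > after.sample.folded <<'EOF'
main;parse 30
main;render 45
main;log 5
EOF

# Print "delta stack" for every stack whose count changed,
# largest increase first. Positive delta = more samples in the new profile.
folded_delta() {
    awk '
        {
            count = $NF
            stack = $0
            sub(/ [0-9]+$/, "", stack)          # strip the trailing " count"
            if (FILENAME == ARGV[1]) before[stack] += count
            else                     after[stack]  += count
            seen[stack] = 1
        }
        END {
            for (s in seen) {
                d = after[s] - before[s]
                if (d != 0) printf "%d %s\n", d, s
            }
        }
    ' "$1" "$2" | sort -rn
}

folded_delta before.sample.folded after.sample.folded
```

Raw sample deltas are only comparable when both profiles ran the same workload for a similar duration at the same sampling frequency.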

4. Callgrind → flamegraph

```bash
valgrind --tool=callgrind --callgrind-out-file=cg.out ./prog
stackcollapse-callgrind.pl cg.out | flamegraph.pl > fg.svg
```

5. Other profiler inputs

```bash
# Go pprof
go tool pprof -raw -output=prof.txt prog
stackcollapse-go.pl prof.txt | flamegraph.pl > fg.svg

# DTrace (-c runs the given command and stops tracing when it exits,
# so this samples "prog" for about 10 seconds)
dtrace -x ustackframes=100 -n 'profile-99 /execname=="prog"/ { @[ustack()] = count(); }' \
    -o out.stacks -c 'sleep 10'
stackcollapse.pl out.stacks | flamegraph.pl > fg.svg

# Java (async-profiler)
async-profiler -d 30 -f out.collapsed PID
flamegraph.pl out.collapsed > fg.svg
```

6. Reading flamegraphs

A flamegraph is a call-stack visualisation:
  • X axis: time on CPU (not time sequence); wider = more time
  • Y axis: call stack depth; taller = deeper call chain
  • Color: random (no significance), except in differential mode

What to look for:

| Pattern | Meaning | Action |
|---|---|---|
| Wide frame near bottom | Function and everything it calls account for much of the run | Follow the stack upward to find which frame is wide at the top, then optimise that one |
| Wide frame with tall narrow towers | Calling many different callees | Hot dispatch; reduce call overhead |
| Very tall stack with wide base | Deep recursion | Check recursion depth; consider iterative approach |
| Plateau at the top | Leaf function with no callees | This leaf is the actual hotspot |
| Many narrow identical stacks | Many threads doing the same work | Consider parallelism or batching |

Identifying the actionable hotspot:
  1. Find the widest top frame (a frame with no children, or only narrow children, above it)
  2. That is where CPU time is actually spent
  3. Trace down to understand what called it and why

Differential flamegraph:
  • Red frames: more time in new profile (regression)
  • Blue frames: less time in new profile (improvement)
  • Frames only in one profile appear solid colored
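
The "widest top frame" search above can be approximated from the folded file without opening the SVG. This is a minimal sketch under stated assumptions: the `self_time` helper and the sample data are hypothetical, and it sums the samples in which each function is the leaf (last) frame. Unlike the flamegraph, it merges a function across all call paths, so treat the result as a starting point and use the graph for step 3.

```bash
# Hypothetical folded stacks for illustration.
cat > hot.sample.folded <<'EOF'
main;parse;read_line 30
main;parse;tokenize 50
main;render;tokenize 15
main;render 5
EOF

# Print "samples percent function" for each leaf frame, hottest first.
self_time() {
    awk '
        {
            count = $NF
            stack = $0
            sub(/ [0-9]+$/, "", stack)        # strip the trailing " count"
            n = split(stack, frames, ";")
            self[frames[n]] += count          # frames[n] is the leaf frame
            total += count
        }
        END {
            for (f in self)
                printf "%d %.1f%% %s\n", self[f], 100 * self[f] / total, f
        }
    ' "$1" | sort -rn
}

self_time hot.sample.folded
# 65 65.0% tokenize
# 30 30.0% read_line
# 5 5.0% render
```

Here `tokenize` is the plateau: it is on CPU in 65% of samples, reached from both `parse` and `render`.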

7. flamegraph.pl options

```bash
flamegraph.pl --title "My App" \
              --subtitle "Release build, workload X" \
              --width 1600 \
              --height 16 \
              --minwidth 0.5 \
              --colors java \
              out.folded > fg.svg
```

| Option | Effect |
|---|---|
| `--title` | SVG title |
| `--width` | Width in pixels |
| `--height` | Frame height in pixels |
| `--minwidth` | Omit frames narrower than N pixels (reduces clutter) |
| `--colors` | Palette: `hot` (default), `mem`, `io`, `java`, `js`, `perl`, `red`, `green`, `blue` |
| `--inverted` | Icicle chart (roots at top) |
| `--reverse` | Reverse stacks |
| `--cp` | Consistent palette (same frame = same color across SVGs) |

References

For tool installation, stackcollapse scripts, and palette options, see references/tools.md.

Related skills

  • Use `skills/profilers/linux-perf` to collect perf data
  • Use `skills/profilers/valgrind` to collect Callgrind data
  • Use `skills/compilers/clang` for LLVM PGO from sampling profiles