performance-profiling
Performance Profiling
Find where your application actually spends time before touching a line of code. Covers the full stack: Node.js CPU and memory profiling, browser flame graphs, React render profiling, and database query analysis. The discipline here is profile first, optimize second — premature optimization is not a workflow, it is a guess.
When to Use
Use for:
- Diagnosing slow Node.js applications (CPU-bound, I/O-bound, memory pressure)
- Generating and reading flame graphs to find hot code paths
- Detecting memory leaks via heap snapshots and growth trends
- Profiling React component render performance with React Profiler
- Measuring browser rendering performance (Core Web Vitals, layout thrashing, long tasks)
- Database query profiling with EXPLAIN ANALYZE
- Measuring event loop utilization and latency
NOT for:
- Infrastructure monitoring, distributed tracing, or log aggregation (use logging-observability)
- Load testing and capacity planning (a separate domain)
- Network latency analysis between services (use distributed tracing tools)
- Database schema design optimization (separate from query profiling)
Core Decision: Where Is My App Slow?
```mermaid
flowchart TD
    Start[App is slow. Where?] --> Layer{Which layer?}
    Layer -->|Backend| Backend{What kind?}
    Layer -->|Frontend/browser| Browser{What symptom?}
    Layer -->|Unknown| Measure[Instrument first — add timing logs]
    Backend -->|CPU pegged, slow responses| CPU[CPU Profiling]
    Backend -->|Memory growing, crashes| Mem[Memory / Heap Profiling]
    Backend -->|Fast CPU, slow I/O| IO{I/O type?}
    IO -->|Database queries| DB[EXPLAIN ANALYZE + query profiler]
    IO -->|Network calls| Network[Trace external calls, add timeouts]
    IO -->|File system| FS[Check event loop utilization]
    Browser -->|Slow initial load| Lighthouse[Lighthouse + bundle analysis]
    Browser -->|Janky scrolling, animations| Rendering[Chrome Performance tab — layout thrashing]
    Browser -->|Slow after interaction| React{React app?}
    React -->|Yes| ReactProfiler[React Profiler + why-did-you-render]
    React -->|No| JS[Chrome Performance — long tasks, main thread blocking]
    CPU --> FlameGraph[Generate flame graph with 0x or clinic flame]
    Mem --> HeapSnap[Take heap snapshots before/after suspected leak]
    FS --> ELU[clinic bubbles — event loop utilization]
```
Node.js: CPU Profiling
V8 Inspector (Built-in)
```bash
# Attach inspector and capture a CPU profile
node --inspect src/index.js

# Or start paused and wait for DevTools
node --inspect-brk src/index.js
```
Then open `chrome://inspect` in Chrome, click the target, go to the **Profiler** tab, and record while sending load to the server.
0x: Flame Graphs from the Terminal
```bash
npm install -g 0x

# Profile a script (runs it, generates flame graph)
0x -- node src/index.js

# Profile with a load generator running simultaneously
0x -- node src/server.js &
npx autocannon -d 30 http://localhost:3000/api/heavy
```
0x generates an interactive HTML flame graph. The **widest stacks** are where time is spent. Look for:
- Functions that appear wide near the bottom (called frequently by everything)
- Unexpected width in library code (serialization, template engines, parsers)
- Idle / `[idle]` blocks — I/O wait, not CPU (look elsewhere for those)
Clinic.js Suite
```bash
npm install -g clinic

# Doctor: overview of what is wrong
clinic doctor -- node src/server.js

# Flame: CPU flame graph (wraps 0x)
clinic flame -- node src/server.js

# Bubbles: event loop utilization
clinic bubbles -- node src/server.js
```
Clinic Doctor gives you a triage view: CPU, memory, event loop, and handles. Start here when you do not know what kind of bottleneck you have.
Event Loop Utilization (ELU)
```js
const { performance } = require('perf_hooks');

// Sample ELU every 5 seconds
let last = performance.eventLoopUtilization();
setInterval(() => {
  const current = performance.eventLoopUtilization();
  const diff = performance.eventLoopUtilization(current, last);
  console.log(`ELU: ${(diff.utilization * 100).toFixed(1)}%`);
  last = current;
}, 5000);
```
ELU above 80% means the event loop is saturated — CPU-bound work or sync blocking. ELU near 0% with slow responses means I/O wait (network, disk, database).
Node.js: Memory Profiling
Heap Snapshots
```bash
# Take heap snapshot via CLI
node --inspect src/index.js
# In chrome://inspect → Memory tab → Take Heap Snapshot
```
**Three-snapshot technique for leak detection**:
1. Snapshot after startup (baseline)
2. Snapshot after N requests (warm)
3. Snapshot after 2N requests (growth)
Compare Snapshot 3 to Snapshot 2 — objects that grew proportionally to request count are leaking.
Common Leak Patterns
Closure captures — variables captured in long-lived closures that should have been released:
```js
// LEAK: handler is registered but never removed
emitter.on('data', (chunk) => {
  processedData.push(chunk); // processedData grows unbounded
});

// FIX: remove listener when done, or use once()
emitter.once('data', handler);
// or
const handler = (chunk) => { /* ... */ };
emitter.on('data', handler);
// later:
emitter.off('data', handler);
```
Growing caches without eviction:
```js
// LEAK: cache grows forever
const cache = new Map();
app.get('/user/:id', async (req, res) => {
  if (!cache.has(req.params.id)) {
    cache.set(req.params.id, await db.getUser(req.params.id));
  }
  res.json(cache.get(req.params.id));
});

// FIX: use LRU cache with max size
const LRU = require('lru-cache');
const cache = new LRU({ max: 1000, ttl: 1000 * 60 * 5 });
```
WeakRef and FinalizationRegistry (for intentional weak references):
```js
const cache = new Map();
// One registry for the whole cache; creating a new registry per call
// would leak registries and each could be collected before it fires
const registry = new FinalizationRegistry((key) => cache.delete(key));

function cacheValue(key, obj) {
  cache.set(key, new WeakRef(obj));
  registry.register(obj, key);
}
```
Anti-Pattern: Optimizing Without Profiling
Novice: "This function looks expensive, I'll rewrite it in a more efficient algorithm."
Expert: Rewrote the wrong function. Profiling would have shown that this function is called once per startup and contributes 0.1% of runtime. The actual bottleneck was JSON serialization in the response handler, called 10,000 times per second. Optimization effort must follow measurement, never intuition.
Detection: The "optimized" code is measurably faster in microbenchmark isolation but production p99 latency is unchanged.
Anti-Pattern: Micro-Benchmarking in Isolation
Novice: Writes a benchmark comparing two sorting algorithms on an array of 1000 items, concludes Algorithm B is 2x faster, rewrites production code.
Expert: Micro-benchmarks measure JIT-compiled hot paths under artificial conditions. Real workloads have different data shapes, mixed call patterns, GC pressure, and I/O interspersed. The JIT may optimize the benchmark differently than the real call site. Profile the actual application under real load — or at minimum, profile with realistic data shapes and call patterns embedded in the actual application code path.
The test: Does your benchmark run in a tight loop 10,000 times before measuring? If yes, V8 has JIT-compiled it differently than it will compile the real code, which runs cold at startup and is called with varied inputs.
React Rendering Performance
React Profiler (DevTools)
- Open React DevTools → Profiler tab
- Click "Record"
- Perform the slow interaction
- Stop recording
- Examine the flame chart — bars represent components, width represents render time
Key columns: "Why did this render?" shows which prop or state change triggered each render.
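The same data is available programmatically via React's `<Profiler>` component; the JSX is shown as a comment (it assumes a React project), and the `onRender` callback parameters below follow the documented signature:

```js
// In a React tree (JSX, assumes react is installed):
//   <Profiler id="UserList" onRender={onRender}>
//     <UserList />
//   </Profiler>

// Called on every commit of the wrapped subtree
function onRender(id, phase, actualDuration, baseDuration, startTime, commitTime) {
  // Flag commits that blow the ~16ms frame budget
  if (actualDuration > 16) {
    console.warn(`${id} (${phase}) rendered in ${actualDuration.toFixed(1)}ms`);
  }
}
```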
why-did-you-render
```bash
npm install @welldone-software/why-did-you-render
```
```js
// src/wdyr.js (import before React)
import React from 'react';

if (process.env.NODE_ENV === 'development') {
  const whyDidYouRender = require('@welldone-software/why-did-you-render');
  whyDidYouRender(React, { trackAllPureComponents: true });
}
```
```js
// Mark a specific component for tracking
MyExpensiveComponent.whyDidYouRender = true;
```
This logs to the console every time a component re-renders with the same props — exposing unnecessary renders caused by reference equality failures.
Common React Performance Patterns
```js
// Memoize expensive components
const ExpensiveList = React.memo(({ items, onSelect }) => {
  return items.map(item => <Item key={item.id} item={item} onSelect={onSelect} />);
});

// Stable callback references — prevent re-renders downstream
const handleSelect = useCallback((id) => {
  setSelected(id);
}, []); // no deps: stable forever

// Memoize expensive computations
const sortedItems = useMemo(() => {
  return [...items].sort((a, b) => a.name.localeCompare(b.name));
}, [items]);

// Virtualize long lists
import { FixedSizeList } from 'react-window';

<FixedSizeList height={600} itemCount={items.length} itemSize={50} width="100%">
  {({ index, style }) => <Row item={items[index]} style={style} />}
</FixedSizeList>
```
Database Query Profiling
PostgreSQL EXPLAIN ANALYZE
```sql
-- Wrap any query in EXPLAIN (ANALYZE, BUFFERS) to see execution plan
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.*, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > NOW() - INTERVAL '30 days'
GROUP BY u.id;
```
Read the output bottom-up. Each node shows:
- `actual time=X..Y` — startup time to first row, total time for all rows
- `rows=N` — actual rows returned
- `loops=N` — how many times this node executed
Red flags:
- `Seq Scan` on large tables — missing index
- Estimated `rows=1000` vs actual `rows=1` — stale statistics, run `ANALYZE`
- `Hash Join` with large hash batches — memory pressure, tune `work_mem`
- `Nested Loop` on large outer result — cartesian product risk
Finding Slow Queries in Production
```sql
-- Enable pg_stat_statements extension
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top 10 slowest queries by total time
SELECT
  query,
  calls,
  total_exec_time / 1000 AS total_seconds,
  mean_exec_time AS mean_ms,
  rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```
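pg_stat_statements aggregates on the server; it can also help to log slow queries from the application side. A sketch of a generic wrapper, where the promise-returning `queryFn(sql, params)` shape is an assumption rather than any specific client's API:

```js
// Wrap a promise-returning query function to log anything over a threshold
function withSlowQueryLog(queryFn, thresholdMs = 100) {
  return async (sql, params) => {
    const start = process.hrtime.bigint();
    try {
      return await queryFn(sql, params);
    } finally {
      const ms = Number(process.hrtime.bigint() - start) / 1e6;
      if (ms > thresholdMs) {
        console.warn(`slow query ${ms.toFixed(0)}ms: ${sql}`);
      }
    }
  };
}

// Hypothetical usage with a fake client standing in for a real db.query
const query = withSlowQueryLog(async (sql) => [{ sql }], 100);
```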
Browser Profiling
See references/browser-profiling.md for the full Chrome Performance tab workflow, Core Web Vitals measurement, and layout thrashing diagnosis.
Bottleneck Classification Rules
When the user provides profiling data, classify and rank bottlenecks using these rules. Process signals in priority order — higher-priority signals override lower ones.
Priority 1: Database (check first — it's the bottleneck 70% of the time)
| Signal | Classification |
|---|---|
| Any query >500ms | Critical |
| Multiple queries >100ms per request | High |
| Query count >20 per page load | High |
Priority 2: Event Loop (Node.js-specific — the most underdiagnosed bottleneck)
| Signal | Classification |
|---|---|
| ELU >0.8 | Critical |
| ELU >0.5 with slow p99 latency | High |
| ELU <0.2 with slow responses | This is NOT a CPU problem. |
Priority 3: Memory
| Signal | Classification |
|---|---|
| Heap growth rate >10MB/min sustained | Critical |
| Heap growth proportional to request rate (resets on GC) | Medium |
| Large retained objects in heap analysis | List each suspect with its retained size; High if any single object retains >50MB. Next step: trace the retainer tree to find why it's not being collected. |
Priority 4: React Rendering (frontend)
| Signal | Classification |
|---|---|
| Component render time >16ms | High |
| >5 re-renders per user interaction | Medium |
| Large component tree (>500 components mounted) | Medium |
Priority 5: CPU (non-event-loop)
| Signal | Classification |
|---|---|
| Single function >30% of CPU profile | High |
| Flame graph shows wide, flat profile (no single hot function) | Medium |
Output Ranking
After classifying all signals, rank the bottleneck list by:
- Severity (critical first)
- Actionability (clear next step ranks higher than vague "investigate further")
- Estimated impact — "Adding an index will reduce this query from 800ms to 5ms" is more useful than "This might help"
Always include estimatedImpact as a concrete prediction: "Eliminating N+1 queries should reduce request count from 47 to 3, cutting endpoint latency by ~200ms" — not "should improve performance."
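The ranking rules can be sketched as a small comparator; the signal shape (`severity`, `actionable`, `estimatedImpact`) is an assumed structure for illustration, not a defined schema:

```js
const severityRank = { critical: 0, high: 1, medium: 2 };

// Sort classified signals: severity first, then actionability
function rankBottlenecks(signals) {
  return [...signals].sort((a, b) =>
    severityRank[a.severity] - severityRank[b.severity] ||
    Number(b.actionable) - Number(a.actionable)
  );
}

const ranked = rankBottlenecks([
  { name: 'wide flat flame graph', severity: 'medium', actionable: false },
  { name: 'query >500ms', severity: 'critical', actionable: true,
    estimatedImpact: 'index cuts this query from 800ms to 5ms' },
  { name: 'ELU 0.6 with slow p99', severity: 'high', actionable: true },
]);
```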
References
- references/node-profiling.md — Consult for detailed Node.js profiling: --inspect flags, clinic.js commands, heap snapshot analysis, event loop monitoring, stream backpressure diagnosis
- references/browser-profiling.md — Consult for browser performance: Chrome Performance tab workflow, Lighthouse CI integration, React Profiler deep-dive, Core Web Vitals measurement, layout thrashing patterns