sql-query-optimization


SQL Query Optimization Skill


Step 1 — Reproduce the Slow Query


Establish a reproducible baseline before making any changes:
  • Capture the exact query (including parameter values if possible).
  • Note the current execution time (p95 from APM, or run it manually 3–5 times).
  • Confirm the database engine (PostgreSQL, MySQL, SQLite, etc.) and version.
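The baseline step above can be sketched in code. A minimal example, assuming a local SQLite database for portability (the `orders` table, row counts, and query are illustrative; against PostgreSQL or MySQL you would use the appropriate driver instead):

```python
# Sketch: establish a reproducible timing baseline before changing anything.
# Assumes SQLite; the schema and query below are illustrative only.
import sqlite3
import statistics
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)")
conn.executemany("INSERT INTO orders (user_id) VALUES (?)",
                 [(i % 100,) for i in range(10_000)])

def time_query(sql, params, runs=5):
    """Run the query several times and return the median wall-clock seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql, params).fetchall()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

baseline = time_query("SELECT * FROM orders WHERE user_id = ?", (42,))
print(f"baseline median: {baseline * 1000:.2f} ms")
```

The median of several runs is less noisy than a single execution; in production you would take the p95 from your APM instead.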

Step 2 — Run EXPLAIN (ANALYZE, BUFFERS)


PostgreSQL:

```sql
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT ...;  -- paste the full query here
```

MySQL:

```sql
EXPLAIN FORMAT=JSON
SELECT ...;
```

Capture the full output before making any changes.
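Capturing the plan programmatically follows the same idea. A sketch using SQLite's `EXPLAIN QUERY PLAN` (an assumption for runnability; PostgreSQL's `EXPLAIN (ANALYZE, BUFFERS)` output is richer but is read the same way, as text captured before any change):

```python
# Sketch: capture the full plan text before touching anything.
# Assumes SQLite; table and query are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT)"
)

# EXPLAIN QUERY PLAN rows are (id, parent, notused, detail);
# the detail column holds the human-readable plan step.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE user_id = 42"
).fetchall()
plan = "\n".join(row[3] for row in plan_rows)
print(plan)  # a full table scan shows up as "SCAN orders"
```

Save this text alongside the baseline timing so the before/after comparison in Step 5 is concrete.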

Step 3 — Identify the Bottleneck


Read the plan top-down (outermost → innermost). Look for:

| Signal | What it means |
| --- | --- |
| `Seq Scan` on a large table | No index, or the planner chose not to use one |
| Row estimate far from actual | Stale statistics; run `ANALYZE <table>` |
| `Hash Join` with a huge hash table | May need a nested loop + index for small inputs |
| High `Buffers: shared hit` | Heavily cached; latency is CPU-bound, not I/O |
| High `Buffers: shared read` | I/O-bound; consider indexes or read replicas |
| `Bitmap Heap Scan` with many rows | A covering index may eliminate heap fetches |
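The first signal in the table, a sequential scan on a large table, can be reproduced in miniature. A sketch assuming SQLite, where a full scan appears as `SCAN` and an index lookup as `SEARCH` (SQLite's vocabulary, not PostgreSQL's; the index name is illustrative):

```python
# Sketch: the same query's plan before and after an index exists.
# Assumes SQLite; "SCAN" is the analogue of PostgreSQL's Seq Scan.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)")

def plan_for(sql):
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(r[3] for r in rows)  # detail column only

before = plan_for("SELECT * FROM orders WHERE user_id = 42")  # full scan
conn.execute("CREATE INDEX idx_orders_user_id ON orders(user_id)")
after = plan_for("SELECT * FROM orders WHERE user_id = 42")   # index search
print(before)
print(after)
```

Seeing the scan turn into an index search is exactly the plan-level change Step 4 aims for.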

Step 4 — Fix


Add an index (most common fix):

```sql
-- B-tree index for equality and range predicates
CREATE INDEX CONCURRENTLY ON orders(user_id);

-- Partial index for a common filter
CREATE INDEX CONCURRENTLY ON orders(created_at)
  WHERE status = 'pending';

-- Covering index to eliminate heap fetches
CREATE INDEX CONCURRENTLY ON orders(user_id)
  INCLUDE (status, total_amount);

-- Expression index for a function-wrapped column
CREATE INDEX CONCURRENTLY ON users(lower(email));
```

Always use `CONCURRENTLY` in production to avoid blocking writes while the index builds.
Rewrite the query (when the plan is structurally wrong):
  • Replace correlated subqueries with `JOIN` or `EXISTS`.
  • Replace `SELECT *` with only the columns you need.
  • Replace `OFFSET` pagination with keyset pagination:

    ```sql
    -- Instead of: LIMIT 20 OFFSET 10000
    WHERE id > :last_seen_id ORDER BY id LIMIT 20
    ```

  • Replace `IN (SELECT ...)` with `EXISTS (SELECT 1 FROM ...)` for large subqueries.
Update statistics (when estimates are wrong):

```sql
ANALYZE orders;           -- single table
ANALYZE;                  -- all tables (run as superuser)
```
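The keyset-pagination rewrite above can be demonstrated end to end. A sketch assuming SQLite (the `orders` table and the `page` helper are illustrative; the point is seeking past the last seen key instead of counting off an `OFFSET`):

```python
# Sketch: keyset pagination, assuming SQLite; schema is illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)")
conn.executemany("INSERT INTO orders (user_id) VALUES (?)",
                 [(i,) for i in range(1000)])

def page(last_seen_id, limit=20):
    """Seek directly past the last seen id; no OFFSET walk over skipped rows."""
    return conn.execute(
        "SELECT id FROM orders WHERE id > ? ORDER BY id LIMIT ?",
        (last_seen_id, limit),
    ).fetchall()

first = page(0)              # ids 1..20
second = page(first[-1][0])  # ids 21..40, seeded from the previous page
print(first[0][0], second[0][0])
```

Unlike `OFFSET`, the seek stays cheap no matter how deep the page is, because the engine never scans the skipped rows.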

Step 5 — Verify Improvement


Run `EXPLAIN (ANALYZE, BUFFERS)` again after the fix:
  • Confirm the plan uses the new index.
  • Compare the actual execution time to the baseline.
  • Check that row estimates are now accurate.
Run the query 3–5 times to account for cache warm-up; report the steady-state time.
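The verification loop can be sketched as one script: time the query, add the index, confirm the plan uses it, then time again. Assumes SQLite; table, row counts, and index name are illustrative:

```python
# Sketch: verify the fix. Assumes SQLite; names are illustrative.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)")
conn.executemany("INSERT INTO orders (user_id) VALUES (?)",
                 [(i % 500,) for i in range(50_000)])

def steady_state(sql, runs=5):
    """Best of several runs, so cache warm-up does not skew the number."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql).fetchall()
        best = min(best, time.perf_counter() - start)
    return best

query = "SELECT * FROM orders WHERE user_id = 123"
before = steady_state(query)

conn.execute("CREATE INDEX idx_orders_user_id ON orders(user_id)")
plan = "\n".join(
    row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
)
assert "idx_orders_user_id" in plan  # the plan actually uses the new index

after = steady_state(query)
print(f"before: {before * 1000:.2f} ms, after: {after * 1000:.2f} ms")
```

The before/after pair printed here is exactly what Step 6 asks you to record in the PR description.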

Step 6 — Document


Add a comment above the index in the migration file explaining which query it supports:

```sql
-- Supports: orders list by user, filtered on status (POST /api/users/:id/orders)
CREATE INDEX CONCURRENTLY ON orders(user_id, status);
```

State the before/after execution time in the PR description.