sql-query-optimization
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseSQL Query Optimization Skill
SQL查询优化技能
Step 1 — Reproduce the Slow Query
步骤1 — 复现慢查询
Establish a reproducible baseline before making any changes:
- Capture the exact query (including parameter values if possible).
- Note the current execution time (p95 from APM, or run it manually 3–5 times).
- Confirm the database engine (PostgreSQL, MySQL, SQLite, etc.) and version.
在进行任何更改之前,先建立可复现的基准:
- 捕获精确的查询语句(尽可能包含参数值)。
- 记录当前执行时间(来自APM的p95值,或手动运行3–5次取结果)。
- 确认数据库引擎(PostgreSQL、MySQL、SQLite等)及其版本。
Step 2 — Run EXPLAIN (ANALYZE, BUFFERS)
步骤2 — 运行EXPLAIN (ANALYZE, BUFFERS)
PostgreSQL:
sql
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT ...; -- paste the full query hereMySQL:
sql
EXPLAIN FORMAT=JSON
SELECT ...;Capture the full output before making any changes.
PostgreSQL:
sql
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT ...; -- paste the full query hereMySQL:
sql
EXPLAIN FORMAT=JSON
SELECT ...;在进行任何更改前,捕获完整的输出结果。
Step 3 — Identify the Bottleneck
步骤3 — 识别瓶颈
Read the plan top-down (outermost → innermost). Look for:
| Signal | What it means |
|---|---|
| No index — or the planner chose not to use one |
| Rows estimate far from actual | Stale statistics — run |
| May need a nested-loop + index for small inputs |
High | Heavily cached — latency is CPU-bound, not I/O |
High | I/O bound — consider indexes or read replicas |
| Covering index may eliminate heap fetches |
自上而下阅读执行计划(最外层 → 最内层)。重点关注以下内容:
| 信号 | 含义 |
|---|---|
对大表执行 | 没有索引——或者查询优化器选择不使用索引 |
| 估算行数与实际行数差距过大 | 统计信息过时——执行 |
使用 | 对于小数据集,可能需要嵌套循环+索引 |
| 缓存命中率高——延迟受CPU限制,而非I/O |
| 受I/O限制——考虑添加索引或使用只读副本 |
| 覆盖索引可消除堆读取操作 |
Step 4 — Fix
步骤4 — 修复方案
Add an index (most common fix):
sql
-- B-tree for equality and range
CREATE INDEX CONCURRENTLY ON orders(user_id);
-- Partial index for common filter
CREATE INDEX CONCURRENTLY ON orders(created_at)
WHERE status = 'pending';
-- Covering index to eliminate heap fetch
CREATE INDEX CONCURRENTLY ON orders(user_id)
INCLUDE (status, total_amount);
-- Expression index for function-wrapped column
CREATE INDEX CONCURRENTLY ON users(lower(email));Always use in production to avoid table locks.
CONCURRENTLYRewrite the query (when the plan is structurally wrong):
- Replace correlated subqueries with or
JOIN.EXISTS - Replace with specific columns needed.
SELECT * - Replace pagination with keyset pagination:
OFFSETsql-- Instead of: LIMIT 20 OFFSET 10000 WHERE id > :last_seen_id ORDER BY id LIMIT 20 - Replace with
IN (SELECT ...)for large subqueries.EXISTS (SELECT 1 FROM ...)
Update statistics (when estimates are wrong):
sql
ANALYZE orders; -- single table
ANALYZE; -- all tables (run as superuser)添加索引(最常见的修复方式):
sql
-- B-tree for equality and range
CREATE INDEX CONCURRENTLY ON orders(user_id);
-- Partial index for common filter
CREATE INDEX CONCURRENTLY ON orders(created_at)
WHERE status = 'pending';
-- Covering index to eliminate heap fetch
CREATE INDEX CONCURRENTLY ON orders(user_id)
INCLUDE (status, total_amount);
-- Expression index for function-wrapped column
CREATE INDEX CONCURRENTLY ON users(lower(email));生产环境中务必使用以避免表锁。
CONCURRENTLY重写查询语句(当执行计划结构不合理时):
- 用或
JOIN替换关联子查询。EXISTS - 用所需的特定列替换。
SELECT * - 用键集分页替换分页:
OFFSETsql-- Instead of: LIMIT 20 OFFSET 10000 WHERE id > :last_seen_id ORDER BY id LIMIT 20 - 对于大型子查询,用替换
EXISTS (SELECT 1 FROM ...)。IN (SELECT ...)
更新统计信息(当估算结果错误时):
sql
ANALYZE orders; -- single table
ANALYZE; -- all tables (run as superuser)Step 5 — Verify Improvement
步骤5 — 验证优化效果
Run again after the fix:
EXPLAIN (ANALYZE, BUFFERS)- Confirm the plan uses the new index.
- Compare actual execution time to the baseline.
- Check that row estimates are now accurate.
Run the query 3–5 times to account for cache warm-up; report the steady-state time.
修复后再次运行:
EXPLAIN (ANALYZE, BUFFERS)- 确认执行计划使用了新索引。
- 将实际执行时间与基准值对比。
- 检查行数估算是否准确。
运行查询3–5次以消除缓存预热的影响;报告稳定状态下的执行时间。
Step 6 — Document
步骤6 — 文档记录
Add a comment above the index in a migration file explaining which query it supports:
sql
-- Supports: orders list by user, filtered on status (POST /api/users/:id/orders)
CREATE INDEX CONCURRENTLY ON orders(user_id, status);State the before/after execution time in the PR description.
在迁移文件中的索引上方添加注释,说明该索引支持的查询:
sql
-- Supports: orders list by user, filtered on status (POST /api/users/:id/orders)
CREATE INDEX CONCURRENTLY ON orders(user_id, status);在PR描述中记录优化前后的执行时间。