sql-query-optimization

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

SQL Query Optimization

SQL查询优化

Status: Production Ready ✅ Last Updated: 2025-12-15 Latest Versions: PostgreSQL 17, MySQL 8.4 Dependencies: None

状态:生产就绪 ✅ 最后更新:2025-12-15 最新版本:PostgreSQL 17、MySQL 8.4 依赖项:无

Quick Start (10 Minutes)

快速入门(10分钟)

1. Identify Slow Query

1. 识别慢查询

sql
-- PostgreSQL: Enable pg_stat_statements
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Find slowest queries
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
sql
-- PostgreSQL:启用pg_stat_statements
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- 查询最慢的语句
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

2. Analyze with EXPLAIN

2. 使用EXPLAIN分析

sql
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT * FROM orders WHERE user_id = 123;

-- Look for:
-- - Seq Scan on large tables → needs index
-- - High "Rows Removed by Filter" → poor selectivity
-- - Temp read/written → increase work_mem
sql
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT * FROM orders WHERE user_id = 123;

-- 重点关注:
-- - 大表上的Seq Scan(顺序扫描)→ 需要创建索引
-- - 高"Rows Removed by Filter"值 → 筛选性差
-- - 临时表读写 → 增大work_mem

3. Create Index

3. 创建索引

sql
-- Add missing index
CREATE INDEX CONCURRENTLY idx_orders_user
ON orders(user_id);

-- Verify improvement
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE user_id = 123;
-- Execution time should drop 10-100x

sql
-- 添加缺失的索引
CREATE INDEX CONCURRENTLY idx_orders_user
ON orders(user_id);

-- 验证优化效果
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE user_id = 123;
-- 执行时间应降低10-100倍

Critical Rules

关键规则

Always Do ✓

务必遵循 ✓

RuleWhyExample
Index foreign keysJOINs need indexed columns
CREATE INDEX idx_orders_user ON orders(user_id)
Use EXPLAIN ANALYZE before productionVerify query plan is optimal
EXPLAIN (ANALYZE, BUFFERS) <query>
Select specific columnsReduces data transfer 90%
SELECT id, name FROM users
not
SELECT *
Add LIMIT to unbounded queriesPrevents memory exhaustion
SELECT * FROM logs ORDER BY id LIMIT 100
Use prepared statementsPrevents SQL injection + faster
db.query('SELECT * FROM users WHERE id = $1', [id])
Run ANALYZE after bulk operationsUpdates query planner statistics
ANALYZE table_name
Monitor pg_stat_statementsTrack query performance over timeReview daily for regressions
Use connection poolingReduces connection overhead 10x
new Pool({ max: 20 })
规则原因示例
为外键创建索引JOIN操作需要索引列
CREATE INDEX idx_orders_user ON orders(user_id)
上线前使用EXPLAIN ANALYZE验证查询计划是否最优
EXPLAIN (ANALYZE, BUFFERS) <query>
选择特定列减少90%的数据传输
SELECT id, name FROM users
而非
SELECT *
为无限制查询添加LIMIT防止内存耗尽
SELECT * FROM logs ORDER BY id LIMIT 100
使用预编译语句防止SQL注入且速度更快
db.query('SELECT * FROM users WHERE id = $1', [id])
批量操作后运行ANALYZE更新查询规划器统计信息
ANALYZE table_name
监控pg_stat_statements跟踪查询性能变化每日检查是否有性能退化
使用连接池减少10倍的连接开销
new Pool({ max: 20 })

Never Do ✗

切勿执行 ✗

Anti-PatternProblemFix
SELECT * in productionFetches unnecessary columnsSelect specific columns only
Leading wildcard LIKE '%term%'Cannot use indexUse full-text search instead
String concatenation for SQLSQL injection vulnerabilityUse parameterized queries
No LIMIT on large resultsMemory exhaustionAlways add LIMIT + pagination
N+1 queries in loopsNetwork latency × NUse JOIN or batch loading
Ignoring EXPLAIN outputDeploy slow queries to productionAlways EXPLAIN before deploy
Multiple INSERTs in loopSlow bulk operationsUse batch INSERT with multiple VALUES
OFFSET for paginationO(n) time, scans skipped rowsUse cursor-based pagination

反模式问题修复方案
生产环境中使用SELECT *获取不必要的列仅选择特定列
前缀通配符LIKE '%term%'无法使用索引改用全文搜索
字符串拼接生成SQL存在SQL注入漏洞使用参数化查询
大结果集不设LIMIT内存耗尽始终添加LIMIT + 分页
循环中出现N+1查询网络延迟×N使用JOIN或批量加载
忽略EXPLAIN输出将慢查询部署到生产环境上线前务必执行EXPLAIN
循环中多次执行INSERT批量操作缓慢使用多值批量INSERT
使用OFFSET进行分页O(n)时间复杂度,需扫描跳过的行使用基于游标分页

Top 7 Critical Errors

7大关键错误

1. Sequential Scan on Large Table

1. 大表上的顺序扫描

Symptom:
Seq Scan on orders (cost=0.00..150000.00)
on 1M+ rows Cause: No index on filter column Fix:
CREATE INDEX idx_orders_column ON orders(column)
Impact: 10-100x faster
症状:100万+行的表上出现
Seq Scan on orders (cost=0.00..150000.00)
原因:筛选列无索引 修复
CREATE INDEX idx_orders_column ON orders(column)
影响:速度提升10-100倍

2. Missing Index on Foreign Key

2. 外键缺失索引

Symptom: Slow JOINs (5+ seconds) Cause: Foreign key columns not indexed Fix:
CREATE INDEX idx_orders_user_id ON orders(user_id)
Impact: 50-500x faster JOINs
症状:JOIN操作缓慢(5秒以上) 原因:外键列未创建索引 修复
CREATE INDEX idx_orders_user_id ON orders(user_id)
影响:JOIN速度提升50-500倍

3. N+1 Query Problem

3. N+1查询问题

Symptom: 1 + N queries for N records Cause: ORM lazy loading in loop Fix: Use JOIN or eager loading:
SELECT u.*, o.* FROM users u LEFT JOIN orders o ON u.id = o.user_id
Impact: N queries → 1 query
症状:获取N条记录需要1+N次查询 原因:循环中ORM懒加载 修复:使用JOIN或预加载:
SELECT u.*, o.* FROM users u LEFT JOIN orders o ON u.id = o.user_id
影响:N次查询 → 1次查询

4. Leading Wildcard LIKE

4. 前缀通配符LIKE

Symptom:
WHERE name LIKE '%search%'
sequential scan Cause: Index cannot match middle of string Fix: Use full-text search (GIN index) or trigrams Impact: 100-1000x faster
症状
WHERE name LIKE '%search%'
触发顺序扫描 原因:索引无法匹配字符串中间部分 修复:使用全文搜索(GIN索引)或三元组索引 影响:速度提升100-1000倍

5. SELECT * in Production

5. 生产环境使用SELECT *

Symptom: High network traffic, slow responses Cause: Fetches all 50 columns instead of needed 3 Fix:
SELECT id, name, email
(explicit column list) Impact: 90% less data transfer
症状:网络流量高、响应缓慢 原因:获取全部50列而非所需的3列 修复
SELECT id, name, email
(明确列列表) 影响:数据传输减少90%

6. Missing LIMIT on Large Results

6. 大结果集缺失LIMIT

Symptom: Server out of memory, query timeout Cause: Attempting to return 5M rows Fix:
SELECT * FROM logs WHERE ... LIMIT 100
+ pagination Impact: Constant memory usage
症状:服务器内存不足、查询超时 原因:尝试返回500万行数据 修复
SELECT * FROM logs WHERE ... LIMIT 100
+ 分页 影响:内存使用恒定

7. Stale Statistics After Bulk Load

7. 批量加载后统计信息过时

Symptom: Wrong query plan chosen despite index Cause: PostgreSQL statistics outdated Fix:
ANALYZE table_name
after bulk operations Impact: Correct query plan selection
See
references/error-catalog.md
for all 12 errors with detailed solutions.

症状:虽有索引但查询计划选择错误 原因:PostgreSQL统计信息过时 修复:批量操作后执行
ANALYZE table_name
影响:选择正确的查询计划
查看
references/error-catalog.md
获取全部12种错误的详细解决方案。

Common Patterns Summary

常见模式总结

PatternUse CaseExamplePerformance
B-Tree IndexEquality, range, sort queries
CREATE INDEX idx ON t(col)
Default, best general purpose
Composite IndexMulti-column WHERE clauses
CREATE INDEX idx ON t(c1, c2)
5-50x faster than single index
Covering IndexInclude all query columns
CREATE INDEX idx ON t(c1) INCLUDE (c2)
2-10x faster (no heap fetch)
Partial IndexFilter subset of rows
CREATE INDEX idx ON t(c) WHERE status='active'
50-90% smaller index
JOIN RewriteReplace IN subquery
INNER JOIN users u ON o.user_id = u.id
5-20x faster than subquery
Batch INSERTBulk data loading
INSERT INTO t VALUES (..),(..)
10-100x faster than individual
Cursor PaginationLarge offset performance
WHERE id > last_id LIMIT 100
Constant time vs O(n)

模式使用场景示例性能表现
B-Tree索引等值、范围、排序查询
CREATE INDEX idx ON t(col)
默认选项,通用最佳选择
复合索引多列WHERE条件
CREATE INDEX idx ON t(c1, c2)
比单索引快5-50倍
覆盖索引包含查询所需所有列
CREATE INDEX idx ON t(c1) INCLUDE (c2)
快2-10倍(无需堆表读取)
部分索引筛选行子集
CREATE INDEX idx ON t(c) WHERE status='active'
索引体积减小50-90%
JOIN重写替换IN子查询
INNER JOIN users u ON o.user_id = u.id
比子查询快5-20倍
批量INSERT批量数据加载
INSERT INTO t VALUES (..),(..)
比单条INSERT快10-100倍
游标分页大OFFSET场景优化
WHERE id > last_id LIMIT 100
恒定时间复杂度 vs O(n)

Configuration Summary

配置总结

PostgreSQL Config

PostgreSQL配置

sql
-- Increase work_mem for complex queries (reloadable - no restart needed)
SET work_mem = '256MB';

-- Increase shared_buffers for better caching (25% of RAM)
ALTER SYSTEM SET shared_buffers = '8GB';

-- IMPORTANT: shared_buffers requires a full PostgreSQL server restart!
-- This setting is NOT reloadable via pg_reload_conf()
--
-- To apply shared_buffers change:
-- 1. Stop PostgreSQL:   sudo systemctl stop postgresql
-- 2. Start PostgreSQL:  sudo systemctl start postgresql
-- OR use:              sudo systemctl restart postgresql
--
-- Verify the change took effect:
-- SHOW shared_buffers;

-- Enable auto-vacuum (reloadable - can use pg_reload_conf)
ALTER SYSTEM SET autovacuum = on;

-- Reload config (ONLY works for parameters that don't require restart)
-- This will NOT reload shared_buffers - restart required for that!
SELECT pg_reload_conf();
sql
-- 为复杂查询增大work_mem(可重载 - 无需重启)
SET work_mem = '256MB';

-- 增大shared_buffers以提升缓存效果(内存的25%)
ALTER SYSTEM SET shared_buffers = '8GB';

-- 重要提示:shared_buffers需要PostgreSQL服务器完全重启!
-- 此设置无法通过pg_reload_conf()重载
--
-- 应用shared_buffers更改的步骤:
-- 1. 停止PostgreSQL:   sudo systemctl stop postgresql
-- 2. 启动PostgreSQL:  sudo systemctl start postgresql
-- 或使用:              sudo systemctl restart postgresql
--
-- 验证更改是否生效:
-- SHOW shared_buffers;

-- 启用自动清理(可重载 - 可使用pg_reload_conf)
ALTER SYSTEM SET autovacuum = on;

-- 重载配置(仅适用于无需重启的参数)
-- 此操作不会重载shared_buffers - 该参数需要重启才能生效!
SELECT pg_reload_conf();

MySQL Config

MySQL配置

ini
undefined
ini
undefined

my.cnf

my.cnf

[mysqld] innodb_buffer_pool_size = 8G # 70% of RAM max_connections = 500 slow_query_log = 1 long_query_time = 1

---
[mysqld] innodb_buffer_pool_size = 8G # 内存的70% max_connections = 500 slow_query_log = 1 long_query_time = 1

---

When to Load References

何时加载参考文档

Performance Analysis:
  • Load
    references/explain-analysis.md
    when: Reading EXPLAIN output, understanding query plans, analyzing buffer statistics, comparing PostgreSQL vs MySQL EXPLAIN
  • Load
    references/performance-monitoring.md
    when: Setting up monitoring, tracking slow queries over time, monitoring cache hit ratios, identifying bloated tables
Index Optimization:
  • Load
    references/index-strategies.md
    when: Choosing index type (B-Tree, GIN, GiST, Hash), creating composite indexes, determining column order, using covering indexes, implementing partial indexes, monitoring index usage
Query Optimization:
  • Load
    references/query-rewrites.md
    when: Rewriting slow queries, converting subqueries to JOINs, eliminating N+1 queries, implementing pagination, optimizing LIKE queries, batching operations
Systematic Process:
  • Load
    references/optimization-workflow.md
    when: Following step-by-step optimization process, creating optimization hypothesis, measuring improvements, monitoring long-term performance
Error Resolution:
  • Load
    references/error-catalog.md
    when: Debugging specific errors (sequential scans, missing indexes, N+1 queries, etc.), understanding root causes, implementing verified solutions

性能分析:
  • 当需要读取EXPLAIN输出、理解查询计划、分析缓冲区统计信息、对比PostgreSQL与MySQL的EXPLAIN时,加载
    references/explain-analysis.md
  • 当需要设置监控、跟踪慢查询变化、监控缓存命中率、识别膨胀表时,加载
    references/performance-monitoring.md
索引优化:
  • 当需要选择索引类型(B-Tree、GIN、GiST、Hash)、创建复合索引、确定列顺序、使用覆盖索引、实现部分索引、监控索引使用情况时,加载
    references/index-strategies.md
查询优化:
  • 当需要重写慢查询、将子查询转换为JOIN、消除N+1查询、实现分页、优化LIKE查询、批量操作时,加载
    references/query-rewrites.md
系统化流程:
  • 当需要遵循分步优化流程、创建优化假设、衡量改进效果、监控长期性能时,加载
    references/optimization-workflow.md
错误排查:
  • 当需要调试特定错误(顺序扫描、缺失索引、N+1查询等)、理解根本原因、实施验证解决方案时,加载
    references/error-catalog.md

Using Bundled Resources

使用捆绑资源

Templates (Copy-Paste SQL)

模板(可复制粘贴的SQL)

bash
undefined
bash
undefined

EXPLAIN query templates

EXPLAIN查询模板

templates/explain-query.sql
templates/explain-query.sql

Index creation patterns

索引创建模式

templates/index-examples.sql
templates/index-examples.sql

Query rewrite examples

查询重写示例

templates/query-rewrites.sql
templates/query-rewrites.sql

Monitoring queries

监控查询

templates/monitoring-queries.sql
undefined
templates/monitoring-queries.sql
undefined

References (Deep Dives)

参考文档(深度解析)

bash
undefined
bash
undefined

Comprehensive guides

综合指南

references/error-catalog.md # All 12 errors + solutions references/explain-analysis.md # Reading query plans references/index-strategies.md # Index types & selection references/query-rewrites.md # Before/after optimizations references/performance-monitoring.md # Long-term monitoring references/optimization-workflow.md # Systematic process

---
references/error-catalog.md # 全部12种错误+解决方案 references/explain-analysis.md # 查询计划解读 references/index-strategies.md # 索引类型与选择 references/query-rewrites.md # 优化前后对比 references/performance-monitoring.md # 长期监控 references/optimization-workflow.md # 系统化流程

---

Dependencies

依赖项

PostgreSQL Extensions:
  • pg_stat_statements
    - Query performance tracking (built-in)
  • pg_trgm
    - Trigram similarity search (optional, for fuzzy matching)
MySQL:
  • performance_schema
    - Performance monitoring (enabled by default in 8.0+)
No additional dependencies required.

PostgreSQL扩展:
  • pg_stat_statements
    - 查询性能跟踪(内置)
  • pg_trgm
    - 三元组相似度搜索(可选,用于模糊匹配)
MySQL:
  • performance_schema
    - 性能监控(8.0+默认启用)
无需额外依赖项。

Known Issues Prevention

已知问题预防

IssueSymptomPrevention
Sequential scansSeq Scan on 1M+ rowsIndex filter columns before production
Missing FK indexesSlow JOINsAlways index foreign keys
N+1 queries1+N database callsUse JOIN or eager loading
Leading wildcardsLIKE '%x%' slowUse full-text search (GIN)
SELECT * bloatHigh network trafficSelect specific columns
No LIMITMemory exhaustionAlways LIMIT unbounded queries
Stale statisticsWrong query plansANALYZE after bulk operations
Wrong index orderIndex exists but not usedMatch query pattern
Missing compositeMultiple WHERE slowCreate composite index
No connection poolHigh latencyImplement pooling (20-50 connections)
SQL injectionSecurity vulnerabilityUse prepared statements only
Temp spillsDisk I/O on sortsIncrease work_mem

问题症状预防措施
顺序扫描100万+行的表上出现顺序扫描上线前为筛选列创建索引
外键索引缺失JOIN操作缓慢始终为外键创建索引
N+1查询1+N次数据库调用使用JOIN或预加载
前缀通配符LIKE '%x%'查询缓慢使用全文搜索(GIN索引)
SELECT *冗余网络流量高选择特定列
无LIMIT限制内存耗尽始终为无限制查询添加LIMIT
统计信息过时查询计划错误批量操作后执行ANALYZE
索引顺序错误存在索引但未被使用匹配查询模式调整索引
复合索引缺失多条件WHERE查询缓慢创建复合索引
无连接池延迟高实现连接池(20-50个连接)
SQL注入安全漏洞仅使用预编译语句
临时表溢出排序时磁盘I/O增大work_mem

Complete Setup Checklist

完整设置检查清单

Production Deployment:
  • Enable pg_stat_statements or performance_schema
  • Index all foreign key columns
  • Index columns in WHERE, JOIN, ORDER BY clauses
  • Replace SELECT * with specific columns
  • Add LIMIT to all unbounded queries
  • Use prepared statements (parameterized queries)
  • Implement connection pooling (20-50 connections)
  • Configure work_mem (256MB-1GB per connection)
  • Configure shared_buffers (25% of RAM for PostgreSQL)
  • Enable slow query logging (threshold: 100-1000ms)
  • Run EXPLAIN ANALYZE on all critical queries
  • Set up daily monitoring of pg_stat_statements
  • Schedule ANALYZE after nightly bulk operations
  • Monitor cache hit ratio (target: >99%)
  • Review and drop unused indexes monthly

生产部署:
  • 启用pg_stat_statements或performance_schema
  • 为所有外键列创建索引
  • 为WHERE、JOIN、ORDER BY子句中的列创建索引
  • 用特定列替换SELECT *
  • 为所有无限制查询添加LIMIT
  • 使用预编译语句(参数化查询)
  • 实现连接池(20-50个连接)
  • 配置work_mem(每个连接256MB-1GB)
  • 配置shared_buffers(PostgreSQL为内存的25%)
  • 启用慢查询日志(阈值:100-1000ms)
  • 对所有关键查询执行EXPLAIN ANALYZE
  • 设置每日pg_stat_statements监控
  • 夜间批量操作后安排ANALYZE执行
  • 监控缓存命中率(目标:>99%)
  • 每月审核并删除未使用的索引

Production Example

生产环境示例

Before Optimization:
sql
-- Query: Fetch user orders
SELECT * FROM orders WHERE user_id = 123;

-- Performance:
-- Execution time: 2500ms
-- Seq Scan on orders (1M rows scanned)
-- Network: 50MB transferred
-- No index on user_id
After Optimization:
sql
-- Add index
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders(user_id);

-- Optimize query
SELECT id, total, status, created_at
FROM orders
WHERE user_id = 123
LIMIT 100;

-- Performance:
-- Execution time: 12ms (208x faster!)
-- Index Scan using idx_orders_user_id (100 rows)
-- Network: 50KB transferred (1000x less!)
-- Covering index with INCLUDE
Result: 208x faster execution, 1000x less data transfer

For comprehensive optimization guidance, error resolution, and production patterns, load the appropriate reference files listed in "When to Load References" above.
优化前:
sql
-- 查询:获取用户订单
SELECT * FROM orders WHERE user_id = 123;

-- 性能:
-- 执行时间:2500ms
-- 对orders表执行顺序扫描(扫描100万行)
-- 网络传输:50MB
-- user_id列无索引
优化后:
sql
-- 添加索引
CREATE INDEX CONCURRENTLY idx_orders_user_id ON orders(user_id);

-- 优化查询
SELECT id, total, status, created_at
FROM orders
WHERE user_id = 123
LIMIT 100;

-- 性能:
-- 执行时间:12ms(快208倍!)
-- 使用idx_orders_user_id执行索引扫描(扫描100行)
-- 网络传输:50KB(减少1000倍!)
-- 使用包含INCLUDE的覆盖索引
结果:执行速度快208倍,数据传输减少1000倍

如需全面的优化指导、错误排查和生产环境模式,请加载上述「何时加载参考文档」部分列出的对应参考文件。