sql-query-optimization

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

SQL Query Optimization

SQL查询优化

Overview

概述

Analyze SQL queries to identify performance bottlenecks and implement optimization techniques. Includes query analysis, indexing strategies, and rewriting patterns for improved performance.
分析SQL查询以识别性能瓶颈,并实施优化技术。内容包括查询分析、索引策略以及用于提升性能的查询重写模式。

When to Use

适用场景

  • Slow query analysis and tuning
  • Query rewriting and refactoring
  • Index utilization verification
  • Join optimization
  • Subquery optimization
  • Query plan analysis (EXPLAIN)
  • Performance baseline establishment
  • 慢查询分析与调优
  • 查询重写与重构
  • 索引利用率验证
  • 连接查询优化
  • 子查询优化
  • 查询计划分析(EXPLAIN)
  • 性能基准建立

Query Analysis Framework

查询分析框架

1. Analyze Current Performance

1. 分析当前性能

PostgreSQL:
sql
-- Analyze query plan with execution time
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT u.id, u.email, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > NOW() - INTERVAL '1 year'
GROUP BY u.id, u.email;

-- Check table statistics
SELECT * FROM pg_stats
WHERE tablename = 'users' AND attname = 'created_at';
MySQL:
sql
-- Analyze query plan
EXPLAIN FORMAT=JSON
SELECT u.id, u.email, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > DATE_SUB(NOW(), INTERVAL 1 YEAR)
GROUP BY u.id, u.email;

-- Check table size
SELECT table_name, ROUND(((data_length + index_length) / 1024 / 1024), 2) AS 'Size_MB'
FROM information_schema.tables WHERE table_schema = 'database_name';
PostgreSQL:
sql
-- Analyze query plan with execution time
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT u.id, u.email, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > NOW() - INTERVAL '1 year'
GROUP BY u.id, u.email;

-- Check table statistics
SELECT * FROM pg_stats
WHERE tablename = 'users' AND attname = 'created_at';
MySQL:
sql
-- Analyze query plan
EXPLAIN FORMAT=JSON
SELECT u.id, u.email, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > DATE_SUB(NOW(), INTERVAL 1 YEAR)
GROUP BY u.id, u.email;

-- Check table size
SELECT table_name, ROUND(((data_length + index_length) / 1024 / 1024), 2) AS 'Size_MB'
FROM information_schema.tables WHERE table_schema = 'database_name';

2. Common Optimization Patterns

2. 常见优化模式

PostgreSQL - Index Optimization:
sql
-- Create indexes for frequently filtered columns
CREATE INDEX idx_orders_user_created
ON orders(user_id, created_at DESC)
WHERE status != 'cancelled';

-- Partial indexes for filtered queries
CREATE INDEX idx_active_products
ON products(category_id)
WHERE active = true;

-- Multi-column covering indexes
CREATE INDEX idx_users_email_verified_covering
ON users(email, verified)
INCLUDE (id, name, created_at);
MySQL - Index Optimization:
sql
-- Create composite index for multi-column filtering
CREATE INDEX idx_orders_user_created
ON orders(user_id, created_at DESC);

-- Use FULLTEXT index for text search
CREATE FULLTEXT INDEX idx_products_search
ON products(name, description);

-- Prefix indexes for large VARCHAR
CREATE INDEX idx_large_text
ON large_table(text_column(100));
PostgreSQL - 索引优化:
sql
-- Create indexes for frequently filtered columns
CREATE INDEX idx_orders_user_created
ON orders(user_id, created_at DESC)
WHERE status != 'cancelled';

-- Partial indexes for filtered queries
CREATE INDEX idx_active_products
ON products(category_id)
WHERE active = true;

-- Multi-column covering indexes
CREATE INDEX idx_users_email_verified_covering
ON users(email, verified)
INCLUDE (id, name, created_at);
MySQL - 索引优化:
sql
-- Create composite index for multi-column filtering
CREATE INDEX idx_orders_user_created
ON orders(user_id, created_at DESC);

-- Use FULLTEXT index for text search
CREATE FULLTEXT INDEX idx_products_search
ON products(name, description);

-- Prefix indexes for large VARCHAR
CREATE INDEX idx_large_text
ON large_table(text_column(100));

3. Query Rewriting Techniques

3. 查询重写技巧

PostgreSQL - Window Functions:
sql
-- Inefficient: multiple passes
SELECT p.id, p.name,
  (SELECT COUNT(*) FROM orders o WHERE o.product_id = p.id) as order_count,
  (SELECT SUM(quantity) FROM order_items oi WHERE oi.product_id = p.id) as total_sold
FROM products p;

-- Optimized: single pass with window functions
SELECT DISTINCT p.id, p.name,
  COUNT(*) OVER (PARTITION BY p.id) as order_count,
  SUM(oi.quantity) OVER (PARTITION BY p.id) as total_sold
FROM products p
LEFT JOIN order_items oi ON p.id = oi.product_id;
MySQL - JOIN Optimization:
sql
-- Inefficient: JOIN after aggregation
SELECT user_id, name, total_orders
FROM (
  SELECT u.id as user_id, u.name, COUNT(o.id) as total_orders
  FROM users u
  LEFT JOIN orders o ON u.id = o.user_id
  GROUP BY u.id, u.name
) subquery
WHERE total_orders > 5;

-- Optimized: aggregate with HAVING clause
SELECT u.id, u.name, COUNT(o.id) as total_orders
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.name
HAVING COUNT(o.id) > 5;
PostgreSQL - 窗口函数:
sql
-- Inefficient: multiple passes
SELECT p.id, p.name,
  (SELECT COUNT(*) FROM orders o WHERE o.product_id = p.id) as order_count,
  (SELECT SUM(quantity) FROM order_items oi WHERE oi.product_id = p.id) as total_sold
FROM products p;

-- Optimized: single pass with window functions
SELECT DISTINCT p.id, p.name,
  COUNT(*) OVER (PARTITION BY p.id) as order_count,
  SUM(oi.quantity) OVER (PARTITION BY p.id) as total_sold
FROM products p
LEFT JOIN order_items oi ON p.id = oi.product_id;
MySQL - 连接查询优化:
sql
-- Inefficient: JOIN after aggregation
SELECT user_id, name, total_orders
FROM (
  SELECT u.id as user_id, u.name, COUNT(o.id) as total_orders
  FROM users u
  LEFT JOIN orders o ON u.id = o.user_id
  GROUP BY u.id, u.name
) subquery
WHERE total_orders > 5;

-- Optimized: aggregate with HAVING clause
SELECT u.id, u.name, COUNT(o.id) as total_orders
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.name
HAVING COUNT(o.id) > 5;

4. Batch Operations

4. 批量操作

PostgreSQL - Bulk Insert:
sql
-- Inefficient: multiple round trips
INSERT INTO users (email, name) VALUES ('user1@example.com', 'User One');
INSERT INTO users (email, name) VALUES ('user2@example.com', 'User Two');

-- Optimized: single batch
INSERT INTO users (email, name) VALUES
  ('user1@example.com', 'User One'),
  ('user2@example.com', 'User Two'),
  ('user3@example.com', 'User Three')
ON CONFLICT (email) DO UPDATE SET updated_at = NOW();
MySQL - Bulk Update:
sql
-- Optimized: bulk update with VALUES clause
UPDATE products p
JOIN (
  SELECT id, price FROM product_updates
) AS updates ON p.id = updates.id
SET p.price = updates.price;
PostgreSQL - 批量插入:
sql
-- Inefficient: multiple round trips
INSERT INTO users (email, name) VALUES ('user1@example.com', 'User One');
INSERT INTO users (email, name) VALUES ('user2@example.com', 'User Two');

-- Optimized: single batch
INSERT INTO users (email, name) VALUES
  ('user1@example.com', 'User One'),
  ('user2@example.com', 'User Two'),
  ('user3@example.com', 'User Three')
ON CONFLICT (email) DO UPDATE SET updated_at = NOW();
MySQL - 批量更新:
sql
-- Optimized: bulk update with VALUES clause
UPDATE products p
JOIN (
  SELECT id, price FROM product_updates
) AS updates ON p.id = updates.id
SET p.price = updates.price;

Performance Monitoring

性能监控

PostgreSQL - Long Running Queries:
sql
-- Find slow queries
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
WHERE mean_exec_time > 1000
ORDER BY mean_exec_time DESC
LIMIT 10;

-- Reset statistics
SELECT pg_stat_statements_reset();
MySQL - Slow Query Log:
sql
-- Enable slow query log
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 2;

-- View slow queries
SELECT * FROM mysql.slow_log
ORDER BY start_time DESC LIMIT 10;
PostgreSQL - 长时运行查询:
sql
-- Find slow queries
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
WHERE mean_exec_time > 1000
ORDER BY mean_exec_time DESC
LIMIT 10;

-- Reset statistics
SELECT pg_stat_statements_reset();
MySQL - 慢查询日志:
sql
-- Enable slow query log
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 2;

-- View slow queries
SELECT * FROM mysql.slow_log
ORDER BY start_time DESC LIMIT 10;

Key Optimization Checklist

关键优化检查清单

  • Use EXPLAIN/EXPLAIN ANALYZE before and after optimization
  • Add indexes to columns in WHERE, JOIN, and ORDER BY clauses
  • Use LIMIT when exploring large result sets
  • Avoid SELECT * when only specific columns needed
  • Use database functions instead of application-level processing
  • Batch operations to reduce network round trips
  • Partition large tables for improved query performance
  • Update statistics regularly with ANALYZE
  • 优化前后均使用EXPLAIN/EXPLAIN ANALYZE进行分析
  • 为WHERE、JOIN和ORDER BY子句中的列添加索引
  • 探索大型结果集时使用LIMIT
  • 仅需特定列时避免使用SELECT *
  • 使用数据库函数而非应用层处理
  • 采用批量操作以减少网络往返次数
  • 对大型表进行分区以提升查询性能
  • 定期使用ANALYZE更新统计信息

Common Pitfalls

常见误区

❌ Don't create indexes without testing impact ❌ Don't use LIKE with leading wildcard without full-text search ❌ Don't JOIN unnecessary tables ❌ Don't ignore ORDER BY performance impact ❌ Don't skip EXPLAIN analysis
✅ DO test query changes in development first ✅ DO monitor query performance after deployment ✅ DO update table statistics regularly ✅ DO use appropriate data types for columns ✅ DO consider materialized views for complex aggregations
❌ 不要在未测试影响的情况下创建索引 ❌ 不要在没有全文搜索的情况下使用前导通配符的LIKE查询 ❌ 不要连接不必要的表 ❌ 不要忽视ORDER BY对性能的影响 ❌ 不要跳过EXPLAIN分析
✅ 务必先在开发环境测试查询变更 ✅ 部署后务必监控查询性能 ✅ 务必定期更新表统计信息 ✅ 务必为列选择合适的数据类型 ✅ 复杂聚合查询可考虑使用物化视图

Resources

参考资源