database-indexing-strategy

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Database Indexing Strategy

数据库索引策略

Overview

概述

Design comprehensive indexing strategies to improve query performance, reduce lock contention, and maintain data integrity. Covers index types, design patterns, and maintenance procedures.
设计全面的索引策略以提升查询性能、减少锁竞争并维护数据完整性。涵盖索引类型、设计模式和维护流程。

When to Use

适用场景

  • Index creation and planning
  • Query performance optimization through indexing
  • Index type selection (B-tree, Hash, GiST, BRIN)
  • Composite and partial index design
  • Index maintenance and monitoring
  • Storage optimization with indexes
  • Full-text search index design
  • 索引创建与规划
  • 通过索引优化查询性能
  • 索引类型选择(B-tree、Hash、GiST、BRIN)
  • 复合索引与部分索引设计
  • 索引维护与监控
  • 利用索引优化存储
  • 全文搜索索引设计

Index Types and Use Cases

索引类型及适用场景

PostgreSQL Index Types

PostgreSQL 索引类型

B-tree Indexes (Default):
sql
-- Standard equality and range queries
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_created_at ON orders(created_at DESC);

-- Composite indexes for multi-column queries
CREATE INDEX idx_orders_user_status
ON orders(user_id, status)
WHERE cancelled_at IS NULL;
Hash Indexes:
sql
-- Exact match queries only
CREATE INDEX idx_product_sku USING hash ON products(sku);

-- Good for equality lookups on large text fields
CREATE INDEX idx_uuid_hash USING hash ON sessions(session_id);
BRIN Indexes (Block Range):
sql
-- For large tables with monotonically increasing columns
CREATE INDEX idx_events_timestamp USING brin ON events(created_at)
WITH (pages_per_range = 128);

-- Excellent for time-series data
CREATE INDEX idx_logs_timestamp USING brin
ON application_logs(log_timestamp);
GiST & GIN Indexes:
sql
-- GiST for spatial data and complex types
CREATE INDEX idx_locations_geom USING gist ON locations(geom);

-- GIN for JSONB and array columns
CREATE INDEX idx_products_metadata USING gin ON products(metadata);
CREATE INDEX idx_user_tags USING gin ON users(tags);
B-tree 索引(默认):
sql
-- Standard equality and range queries
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_created_at ON orders(created_at DESC);

-- Composite indexes for multi-column queries
CREATE INDEX idx_orders_user_status
ON orders(user_id, status)
WHERE cancelled_at IS NULL;
Hash 索引:
sql
-- Exact match queries only
CREATE INDEX idx_product_sku USING hash ON products(sku);

-- Good for equality lookups on large text fields
CREATE INDEX idx_uuid_hash USING hash ON sessions(session_id);
BRIN 索引(块范围索引):
sql
-- For large tables with monotonically increasing columns
CREATE INDEX idx_events_timestamp USING brin ON events(created_at)
WITH (pages_per_range = 128);

-- Excellent for time-series data
CREATE INDEX idx_logs_timestamp USING brin
ON application_logs(log_timestamp);
GiST & GIN 索引:
sql
-- GiST for spatial data and complex types
CREATE INDEX idx_locations_geom USING gist ON locations(geom);

-- GIN for JSONB and array columns
CREATE INDEX idx_products_metadata USING gin ON products(metadata);
CREATE INDEX idx_user_tags USING gin ON users(tags);

MySQL Index Types

MySQL 索引类型

B-tree Indexes:
sql
-- Standard index for most queries
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_created
ON orders(user_id, created_at);

-- Prefix indexes for large columns
CREATE INDEX idx_description_prefix
ON products(description(100));
FULLTEXT Indexes:
sql
-- Full-text search on text columns
CREATE FULLTEXT INDEX idx_products_search
ON products(name, description);

-- Query using MATCH...AGAINST
SELECT * FROM products
WHERE MATCH(name, description) AGAINST('laptop' IN BOOLEAN MODE);
Spatial Indexes:
sql
-- For geographic data
CREATE SPATIAL INDEX idx_locations
ON locations(geom);
B-tree 索引:
sql
-- Standard index for most queries
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_created
ON orders(user_id, created_at);

-- Prefix indexes for large columns
CREATE INDEX idx_description_prefix
ON products(description(100));
FULLTEXT 索引:
sql
-- Full-text search on text columns
CREATE FULLTEXT INDEX idx_products_search
ON products(name, description);

-- Query using MATCH...AGAINST
SELECT * FROM products
WHERE MATCH(name, description) AGAINST('laptop' IN BOOLEAN MODE);
空间索引:
sql
-- For geographic data
CREATE SPATIAL INDEX idx_locations
ON locations(geom);

Index Design Patterns

索引设计模式

Single Column Indexes

单列索引

PostgreSQL:
sql
-- Filtered index for active records only
CREATE INDEX idx_users_active
ON users(created_at)
WHERE deleted_at IS NULL;

-- Descending order for LIMIT queries
CREATE INDEX idx_posts_published DESC
ON posts(published_at DESC)
WHERE status = 'published';
MySQL:
sql
-- Simple equality lookup
CREATE INDEX idx_users_verified ON users(email_verified);

-- Range queries on numeric columns
CREATE INDEX idx_products_price ON products(price);
PostgreSQL:
sql
-- Filtered index for active records only
CREATE INDEX idx_users_active
ON users(created_at)
WHERE deleted_at IS NULL;

-- Descending order for LIMIT queries
CREATE INDEX idx_posts_published DESC
ON posts(published_at DESC)
WHERE status = 'published';
MySQL:
sql
-- Simple equality lookup
CREATE INDEX idx_users_verified ON users(email_verified);

-- Range queries on numeric columns
CREATE INDEX idx_products_price ON products(price);

Composite Indexes

复合索引

PostgreSQL - Optimal Ordering:
sql
-- Order: equality columns, then range, then sort
-- Query: WHERE user_id = X AND created_at > Y ORDER BY id
CREATE INDEX idx_optimal_composite
ON orders(user_id, created_at, id);

-- Covering index to eliminate table access
CREATE INDEX idx_covering_orders
ON orders(user_id, status, created_at)
INCLUDE (total, currency);
MySQL - Leftmost Prefix:
sql
-- MySQL uses leftmost prefix matching
-- Can be used by: (user_id), (user_id, status), (user_id, status, created_at)
CREATE INDEX idx_users_complex
ON users(user_id, status, created_at);

-- For queries: user_id + status + created_at
SELECT * FROM orders
WHERE user_id = 1 AND status = 'completed' AND created_at > '2024-01-01';
PostgreSQL - 最优排序:
sql
-- Order: equality columns, then range, then sort
-- Query: WHERE user_id = X AND created_at > Y ORDER BY id
CREATE INDEX idx_optimal_composite
ON orders(user_id, created_at, id);

-- Covering index to eliminate table access
CREATE INDEX idx_covering_orders
ON orders(user_id, status, created_at)
INCLUDE (total, currency);
MySQL - 最左前缀匹配:
sql
-- MySQL uses leftmost prefix matching
-- Can be used by: (user_id), (user_id, status), (user_id, status, created_at)
CREATE INDEX idx_users_complex
ON users(user_id, status, created_at);

-- For queries: user_id + status + created_at
SELECT * FROM orders
WHERE user_id = 1 AND status = 'completed' AND created_at > '2024-01-01';

Partial/Filtered Indexes

部分/过滤索引

PostgreSQL:
sql
-- Only index active products
CREATE INDEX idx_active_products
ON products(category_id)
WHERE active = true;

-- Reduce index size and improve performance
CREATE INDEX idx_not_cancelled_orders
ON orders(user_id, created_at)
WHERE status != 'cancelled';

-- Complex filter conditions
CREATE INDEX idx_vip_orders
ON orders(total DESC)
WHERE total > 10000 AND customer_type = 'vip';
PostgreSQL:
sql
-- Only index active products
CREATE INDEX idx_active_products
ON products(category_id)
WHERE active = true;

-- Reduce index size and improve performance
CREATE INDEX idx_not_cancelled_orders
ON orders(user_id, created_at)
WHERE status != 'cancelled';

-- Complex filter conditions
CREATE INDEX idx_vip_orders
ON orders(total DESC)
WHERE total > 10000 AND customer_type = 'vip';

Expression Indexes

表达式索引

PostgreSQL:
sql
-- Index on computed values
CREATE INDEX idx_users_email_lower
ON users(LOWER(email));

-- Enable case-insensitive searches
SELECT * FROM users WHERE LOWER(email) = 'john@example.com';

-- Date extraction indexes
CREATE INDEX idx_orders_year
ON orders(EXTRACT(YEAR FROM created_at));
PostgreSQL:
sql
-- Index on computed values
CREATE INDEX idx_users_email_lower
ON users(LOWER(email));

-- Enable case-insensitive searches
SELECT * FROM users WHERE LOWER(email) = 'john@example.com';

-- Date extraction indexes
CREATE INDEX idx_orders_year
ON orders(EXTRACT(YEAR FROM created_at));

Index Maintenance

索引维护

PostgreSQL Index Analysis:
sql
-- Check index size and usage
SELECT schemaname, tablename, indexname,
  pg_size_pretty(pg_relation_size(indexrelid)) as size,
  idx_scan as scans,
  idx_tup_read as tuples_read
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;

-- Find unused indexes
SELECT schemaname, tablename, indexname
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexrelname NOT LIKE 'pg_toast%';

-- Rebuild fragmented indexes
REINDEX INDEX idx_users_email;
MySQL Index Statistics:
sql
-- Check index cardinality
SELECT object_schema, object_name, count_star
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE object_schema != 'mysql'
ORDER BY count_star DESC;

-- Update table statistics
ANALYZE TABLE users;
ANALYZE TABLE orders;
PostgreSQL 索引分析:
sql
-- Check index size and usage
SELECT schemaname, tablename, indexname,
  pg_size_pretty(pg_relation_size(indexrelid)) as size,
  idx_scan as scans,
  idx_tup_read as tuples_read
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;

-- Find unused indexes
SELECT schemaname, tablename, indexname
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexrelname NOT LIKE 'pg_toast%';

-- Rebuild fragmented indexes
REINDEX INDEX idx_users_email;
MySQL 索引统计:
sql
-- Check index cardinality
SELECT object_schema, object_name, count_star
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE object_schema != 'mysql'
ORDER BY count_star DESC;

-- Update table statistics
ANALYZE TABLE users;
ANALYZE TABLE orders;

Concurrent Index Creation

并发索引创建

PostgreSQL - Non-blocking Index Creation:
sql
-- Create index without locking table (PostgreSQL 9.2+)
CREATE INDEX CONCURRENTLY idx_new_column
ON large_table(new_column);

-- Safe for production
REINDEX INDEX CONCURRENTLY idx_products_price;
MySQL - Concurrent Index Creation:
sql
-- MySQL 8.0 supports ALGORITHM=INPLACE with LOCK=NONE
ALTER TABLE users ADD INDEX idx_created (created_at),
ALGORITHM=INPLACE, LOCK=NONE;

-- Check online DDL progress
SELECT * FROM INFORMATION_SCHEMA.INNODB_TRX;
PostgreSQL - 非阻塞索引创建:
sql
-- Create index without locking table (PostgreSQL 9.2+)
CREATE INDEX CONCURRENTLY idx_new_column
ON large_table(new_column);

-- Safe for production
REINDEX INDEX CONCURRENTLY idx_products_price;
MySQL - 并发索引创建:
sql
-- MySQL 8.0 supports ALGORITHM=INPLACE with LOCK=NONE
ALTER TABLE users ADD INDEX idx_created (created_at),
ALGORITHM=INPLACE, LOCK=NONE;

-- Check online DDL progress
SELECT * FROM INFORMATION_SCHEMA.INNODB_TRX;

Performance Monitoring

性能监控

PostgreSQL - Index Performance:
sql
-- Top 10 most scanned indexes
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC LIMIT 10;

-- Indexes with high read/scan ratio
SELECT indexname, idx_scan, idx_tup_read,
  CASE WHEN idx_scan = 0 THEN 0
    ELSE ROUND(idx_tup_read::numeric / idx_scan, 2) END as efficiency
FROM pg_stat_user_indexes
WHERE idx_scan > 0
ORDER BY efficiency DESC;
MySQL - Index Statistics:
sql
-- Show table index information
SHOW INDEX FROM products;

-- Check cardinality (distribution)
SELECT * FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = 'your_database'
  AND TABLE_NAME = 'products'
ORDER BY SEQ_IN_INDEX;
PostgreSQL - 索引性能:
sql
-- Top 10 most scanned indexes
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC LIMIT 10;

-- Indexes with high read/scan ratio
SELECT indexname, idx_scan, idx_tup_read,
  CASE WHEN idx_scan = 0 THEN 0
    ELSE ROUND(idx_tup_read::numeric / idx_scan, 2) END as efficiency
FROM pg_stat_user_indexes
WHERE idx_scan > 0
ORDER BY efficiency DESC;
MySQL - 索引统计:
sql
-- Show table index information
SHOW INDEX FROM products;

-- Check cardinality (distribution)
SELECT * FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = 'your_database'
  AND TABLE_NAME = 'products'
ORDER BY SEQ_IN_INDEX;

Index Creation Checklist

索引创建检查清单

  • Identify slow queries with EXPLAIN/EXPLAIN ANALYZE
  • Check filter columns, JOIN conditions, ORDER BY clauses
  • Consider index order (equality → range → sort)
  • Use partial indexes to reduce size on large tables
  • Include columns for covering indexes
  • Monitor index usage after creation
  • Drop unused indexes to save space
  • Rebuild fragmented indexes periodically
  • 使用EXPLAIN/EXPLAIN ANALYZE识别慢查询
  • 检查过滤列、JOIN条件、ORDER BY子句
  • 考虑索引顺序(等值条件 → 范围条件 → 排序)
  • 对大表使用部分索引以减小尺寸
  • 为覆盖索引包含必要列
  • 创建后监控索引使用情况
  • 删除未使用的索引以节省空间
  • 定期重建碎片化索引

Common Mistakes

常见错误

❌ Don't create too many indexes (write performance impact) ❌ Don't create indexes without testing first ❌ Don't ignore index size and storage impact ❌ Don't forget to update table statistics after bulk operations ❌ Don't create duplicate indexes
✅ DO create indexes on foreign keys ✅ DO test index impact on INSERT/UPDATE performance ✅ DO use covering indexes for common queries ✅ DO drop unused indexes regularly ✅ DO monitor index fragmentation
❌ 不要创建过多索引(影响写入性能) ❌ 不要未经测试就创建索引 ❌ 不要忽略索引尺寸和存储影响 ❌ 批量操作后不要忘记更新表统计信息 ❌ 不要创建重复索引
✅ 要为外键创建索引 ✅ 要测试索引对INSERT/UPDATE性能的影响 ✅ 要为常用查询使用覆盖索引 ✅ 要定期删除未使用的索引 ✅ 要监控索引碎片化情况

Resources

参考资源