PostgreSQL Best Practices

Core Principles

  • Leverage PostgreSQL's advanced features for robust data modeling
  • Optimize queries using EXPLAIN ANALYZE and proper indexing strategies
  • Use native PostgreSQL data types appropriately
  • Implement proper connection pooling and resource management
  • Follow PostgreSQL-specific security best practices

Schema Design

Data Types

  • Use appropriate native types: `UUID`, `JSONB`, `ARRAY`, `INET`, `CIDR`
  • Prefer `TIMESTAMPTZ` over `TIMESTAMP` for timezone-aware applications
  • Use `TEXT` instead of `VARCHAR` when no length limit is needed
  • Consider `NUMERIC` for precise decimal calculations (financial data)
  • Use `SERIAL` or `BIGSERIAL` for auto-incrementing IDs, or `UUID` for distributed systems
```sql
CREATE TABLE orders (
    order_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    customer_id UUID NOT NULL REFERENCES customers(customer_id),
    order_data JSONB NOT NULL DEFAULT '{}',
    tags TEXT[] DEFAULT '{}',
    total_amount NUMERIC(12, 2) NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
```

Table Design

  • Always define primary keys
  • Use foreign keys with appropriate ON DELETE/UPDATE actions
  • Add NOT NULL constraints where appropriate
  • Use CHECK constraints for data validation
  • Consider partitioning for large tables
```sql
CREATE TABLE products (
    product_id SERIAL PRIMARY KEY,
    sku VARCHAR(50) NOT NULL UNIQUE,
    name TEXT NOT NULL,
    price NUMERIC(10, 2) NOT NULL CHECK (price >= 0),
    status VARCHAR(20) NOT NULL DEFAULT 'active'
        CHECK (status IN ('active', 'inactive', 'discontinued')),
    metadata JSONB DEFAULT '{}'
);
```

Partitioning

  • Use declarative partitioning for large tables (millions of rows)
  • Choose appropriate partition strategy: RANGE, LIST, or HASH
  • Create indexes on partitioned tables after partitioning
```sql
CREATE TABLE events (
    event_id BIGSERIAL,
    event_type VARCHAR(50) NOT NULL,
    payload JSONB,
    created_at TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2024_q1 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');
```
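Following the third bullet, indexes are then created on the partitioned parent; a minimal sketch (index and partition names are illustrative):

```sql
-- An index on the parent cascades to all partitions (PostgreSQL 11+)
CREATE INDEX idx_events_created_at ON events (created_at);

-- A default partition catches rows that match no defined range
CREATE TABLE events_default PARTITION OF events DEFAULT;
```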

Indexing Strategies

Index Types

  • Use B-tree indexes (default) for equality and range queries
  • Use GIN indexes for JSONB, arrays, and full-text search
  • Use GiST indexes for geometric data and range types
  • Use BRIN indexes for large, naturally ordered data
  • Consider partial indexes for filtered queries
```sql
-- B-tree index for common lookups
CREATE INDEX idx_orders_customer ON orders(customer_id);

-- GIN index for JSONB queries
CREATE INDEX idx_orders_data ON orders USING GIN (order_data);

-- Partial index for active records only
CREATE INDEX idx_active_products ON products(name) WHERE status = 'active';

-- Covering index to avoid table lookup
CREATE INDEX idx_orders_covering ON orders(customer_id)
    INCLUDE (order_date, total_amount);
```

Index Maintenance

  • Regularly run ANALYZE to update statistics
  • Use REINDEX for bloated indexes
  • Monitor index usage with `pg_stat_user_indexes`
  • Remove unused indexes to reduce write overhead
```sql
-- Check index usage
SELECT schemaname, relname, indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC;
```
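A sketch of the REINDEX bullet (the index name is illustrative; `CONCURRENTLY` requires PostgreSQL 12+):

```sql
-- Rebuild a bloated index without blocking concurrent writes (PostgreSQL 12+)
REINDEX INDEX CONCURRENTLY idx_orders_customer;

-- Or rebuild every index on a table
REINDEX TABLE CONCURRENTLY orders;
```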

Query Optimization

EXPLAIN ANALYZE

  • Always analyze query plans for slow queries
  • Look for sequential scans on large tables
  • Identify missing indexes from query plans
  • Watch for high row estimates vs actual rows
```sql
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT c.name, COUNT(o.order_id)
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE c.created_at > '2024-01-01'
GROUP BY c.customer_id, c.name;
```

Common Table Expressions (CTEs)

  • Use CTEs for complex query organization
  • Note: CTEs are optimization fences in older PostgreSQL versions
  • Use `MATERIALIZED` / `NOT MATERIALIZED` hints in PostgreSQL 12+
```sql
WITH recent_orders AS MATERIALIZED (
    SELECT customer_id, COUNT(*) AS order_count, SUM(total) AS total_spent
    FROM orders
    WHERE order_date > CURRENT_DATE - INTERVAL '30 days'
    GROUP BY customer_id
)
SELECT c.name, ro.order_count, ro.total_spent
FROM customers c
JOIN recent_orders ro ON c.customer_id = ro.customer_id
WHERE ro.total_spent > 1000;
```

Window Functions

  • Use window functions for analytics queries
  • Leverage PARTITION BY and ORDER BY for complex calculations
```sql
SELECT
    order_id,
    customer_id,
    total_amount,
    SUM(total_amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS running_total,
    ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS order_rank
FROM orders;
```

JSONB Best Practices

  • Use JSONB over JSON for better performance and indexing
  • Create GIN indexes for JSONB columns you query
  • Use containment operators (@>, <@) for efficient queries
  • Extract frequently queried fields to regular columns
```sql
-- Efficient JSONB query with GIN index
SELECT * FROM products
WHERE metadata @> '{"category": "electronics"}';

-- Extract specific fields
SELECT
    product_id,
    metadata->>'brand' AS brand,
    (metadata->>'rating')::numeric AS rating
FROM products
WHERE metadata ? 'rating';
```

Connection Management

Connection Pooling

  • Use PgBouncer or pgpool-II for connection pooling
  • Set appropriate pool sizes based on workload
  • Use transaction pooling mode for short-lived connections
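Pool sizing is easier with visibility into what connections are actually doing; a minimal server-side sketch:

```sql
-- Count connections by state to inform pool sizing
SELECT state, count(*)
FROM pg_stat_activity
WHERE datname = current_database()
GROUP BY state;

-- The hard ceiling the pool must stay under
SHOW max_connections;
```

Many idle connections relative to active ones usually indicate the application-side pool is oversized.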

Connection Settings

```sql
-- Recommended session settings
SET statement_timeout = '30s';
SET lock_timeout = '10s';
SET idle_in_transaction_session_timeout = '60s';
```

Transactions and Locking

  • Use appropriate transaction isolation levels
  • Keep transactions short to reduce lock contention
  • Use advisory locks for application-level locking
  • Monitor and resolve lock conflicts
```sql
-- Use advisory locks for application coordination
SELECT pg_advisory_lock(hashtext('resource_name'));
-- Do work
SELECT pg_advisory_unlock(hashtext('resource_name'));

-- Check for blocking queries
SELECT blocked_locks.pid AS blocked_pid,
       blocking_locks.pid AS blocking_pid,
       blocked_activity.query AS blocked_query
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_locks blocking_locks
    ON blocking_locks.locktype = blocked_locks.locktype
    AND blocking_locks.relation = blocked_locks.relation
    AND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocked_activity
    ON blocked_activity.pid = blocked_locks.pid
WHERE NOT blocked_locks.granted;
```
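For the first bullet, the isolation level can be raised per transaction; a sketch using `REPEATABLE READ` (one choice among `READ COMMITTED`, `REPEATABLE READ`, and `SERIALIZABLE`):

```sql
BEGIN ISOLATION LEVEL REPEATABLE READ;
-- All reads in this transaction see one consistent snapshot
SELECT count(*) FROM orders;
COMMIT;
```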

Maintenance

Vacuum and Analyze

  • Enable autovacuum and tune for your workload
  • Run manual VACUUM ANALYZE after bulk operations
  • Monitor table bloat
```sql
-- Check table bloat
SELECT schemaname, relname,
       n_live_tup, n_dead_tup,
       round(n_dead_tup * 100.0 / nullif(n_live_tup + n_dead_tup, 0), 2) AS dead_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;
```
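Autovacuum can also be tuned per table for hot tables; the thresholds below are illustrative starting points, not universal values:

```sql
-- Vacuum the orders table more aggressively than the global default
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.05,
    autovacuum_analyze_scale_factor = 0.02
);

-- Manual pass after a bulk load
VACUUM (ANALYZE, VERBOSE) orders;
```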

Backup Strategies

  • Use pg_dump for logical backups
  • Use pg_basebackup for physical backups
  • Implement point-in-time recovery (PITR) with WAL archiving
  • Test backup restoration regularly
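When PITR is configured, archiving health can be verified from SQL; a sketch using the `pg_stat_archiver` view (PostgreSQL 9.4+):

```sql
-- Confirm WAL archiving is succeeding and not accumulating failures
SELECT archived_count, failed_count, last_archived_wal, last_archived_time
FROM pg_stat_archiver;
```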

Security

  • Use SSL/TLS for connections
  • Implement row-level security (RLS) for multi-tenant applications
  • Use roles and GRANT/REVOKE for access control
  • Audit sensitive operations with pgAudit extension
```sql
-- Enable row-level security
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;

CREATE POLICY documents_tenant_policy ON documents
    FOR ALL
    USING (tenant_id = current_setting('app.tenant_id')::uuid);

-- Grant minimal privileges
GRANT SELECT, INSERT, UPDATE ON orders TO app_user;
GRANT USAGE ON SEQUENCE orders_order_id_seq TO app_user;
```

Monitoring

  • Monitor with pg_stat_statements extension
  • Track slow queries and optimize regularly
  • Set up alerts for replication lag, connection count, and disk usage
  • Use pg_stat_activity to monitor active queries
```sql
-- Enable pg_stat_statements
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Find slow queries
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
```
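The `pg_stat_activity` bullet can be put into practice with a query like the following (the five-minute threshold is an arbitrary example):

```sql
-- Active queries running longer than five minutes
SELECT pid, now() - query_start AS runtime, state, left(query, 80) AS query
FROM pg_stat_activity
WHERE state = 'active'
  AND now() - query_start > interval '5 minutes'
ORDER BY runtime DESC;
```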