postgresql-table-design
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChinesePostgreSQL Table Design
PostgreSQL 表设计
Core Rules
核心规则
- Define a PRIMARY KEY for reference tables (users, orders, etc.). Not always needed for time-series/event/log data. When used, prefer ; use
BIGINT GENERATED ALWAYS AS IDENTITYonly when global uniqueness/opacity is needed.UUID - Normalize first (to 3NF) to eliminate data redundancy and update anomalies; denormalize only for measured, high-ROI reads where join performance is proven problematic. Premature denormalization creates maintenance burden.
- Add NOT NULL everywhere it’s semantically required; use DEFAULTs for common values.
- Create indexes for access paths you actually query: PK/unique (auto), FK columns (manual!), frequent filters/sorts, and join keys.
- Prefer TIMESTAMPTZ for event time; NUMERIC for money; TEXT for strings; BIGINT for integer values, DOUBLE PRECISION for floats (or for exact decimal arithmetic).
NUMERIC
- 为参考表(如用户表、订单表等)定义PRIMARY KEY。时间序列/事件/日志类数据并非总是需要主键。使用主键时,优先选择;仅当需要全局唯一性或不透明性时才使用
BIGINT GENERATED ALWAYS AS IDENTITY。UUID - **先规范化(至3NF)**以消除数据冗余和更新异常;仅在连接性能被证实存在问题、且读取操作能获得高投资回报率的情况下,才进行反规范化。过早反规范化会增加维护负担。
- 在所有语义上必填的字段添加NOT NULL约束;为常用值设置DEFAULT默认值。
- 为实际查询的访问路径创建索引:主键/唯一索引(自动创建)、外键列(需手动创建!)、频繁过滤/排序的字段,以及连接键。
- 事件时间优先使用TIMESTAMPTZ;金额使用NUMERIC;字符串使用TEXT;整数值使用BIGINT;浮点数使用DOUBLE PRECISION(如需精确十进制运算则使用)。
NUMERIC
PostgreSQL “Gotchas”
PostgreSQL 常见陷阱
- Identifiers: unquoted → lowercased. Avoid quoted/mixed-case names. Convention: use for table/column names.
snake_case - Unique + NULLs: UNIQUE allows multiple NULLs. Use (PG15+) to restrict to one NULL.
UNIQUE (...) NULLS NOT DISTINCT - FK indexes: PostgreSQL does not auto-index FK columns. Add them.
- No silent coercions: length/precision overflows error out (no truncation). Example: inserting 999 into fails with error, unlike some databases that silently truncate or round.
NUMERIC(2,0) - Sequences/identity have gaps (normal; don't "fix"). Rollbacks, crashes, and concurrent transactions create gaps in ID sequences (1, 2, 5, 6...). This is expected behavior—don't try to make IDs consecutive.
- Heap storage: no clustered PK by default (unlike SQL Server/MySQL InnoDB); is one-off reorganization, not maintained on subsequent inserts. Row order on disk is insertion order unless explicitly clustered.
CLUSTER - MVCC: updates/deletes leave dead tuples; vacuum handles them—design to avoid hot wide-row churn.
- 标识符:未加引号的标识符会被转换为小写。避免使用带引号/大小写混合的名称。命名规范:表/列名使用。
snake_case - 唯一约束与NULL值:UNIQUE约束允许多个NULL值。使用(PostgreSQL 15+版本支持)来限制仅允许一个NULL值。
UNIQUE (...) NULLS NOT DISTINCT - 外键索引:PostgreSQL不会自动为外键列创建索引,需要手动添加。
- 无隐式类型转换:长度/精度溢出会直接报错(不会自动截断)。例如:将999插入字段会报错,而部分数据库会自动截断或四舍五入。
NUMERIC(2,0) - 序列/自增字段存在间隙(属于正常现象;无需“修复”)。回滚、崩溃和并发事务会导致ID序列出现间隙(如1,2,5,6...)。这是预期行为——不要尝试让ID连续。
- 堆存储:默认情况下不会对主键进行聚簇(与SQL Server/MySQL InnoDB不同);命令是一次性的重组操作,后续插入不会维持聚簇顺序。磁盘上的行顺序为插入顺序,除非显式执行聚簇操作。
CLUSTER - MVCC(多版本并发控制):更新/删除操作会留下死元组;由vacuum进程清理——设计时应避免频繁更新宽行数据导致的膨胀。
Data Types
数据类型
- IDs: preferred (
BIGINT GENERATED ALWAYS AS IDENTITYalso fine);GENERATED BY DEFAULTwhen merging/federating/used in a distributed system or for opaque IDs. Generate withUUID(preferred if using PG18+) oruuidv7()(if using an older PG version).gen_random_uuid() - Integers: prefer unless storage space is critical;
BIGINTfor smaller ranges; avoidINTEGERunless constrained.SMALLINT - Floats: prefer over
DOUBLE PRECISIONunless storage space is critical. UseREALfor exact decimal arithmetic.NUMERIC - Strings: prefer ; if length limits needed, use
TEXTinstead ofCHECK (LENGTH(col) <= n); avoidVARCHAR(n). UseCHAR(n)for binary data. Large strings/binary (>2KB default threshold) automatically stored in TOAST with compression. TOAST storage:BYTEA(no TOAST),PLAIN(compress + out-of-line),EXTENDED(out-of-line, no compress),EXTERNAL(compress, keep in-line if possible). DefaultMAINusually optimal. Control withEXTENDEDandALTER TABLE tbl ALTER COLUMN col SET STORAGE strategyfor threshold. Case-insensitive: for locale/accent handling use non-deterministic collations; for plain ASCII use expression indexes onALTER TABLE tbl SET (toast_tuple_target = 4096)(preferred unless column needs case-insensitive PK/FK/UNIQUE) orLOWER(col).CITEXT - Money: (never float).
NUMERIC(p,s) - Time: for timestamps;
TIMESTAMPTZfor date-only;DATEfor durations. AvoidINTERVAL(without timezone). UseTIMESTAMPfor transaction start time,now()for current wall-clock time.clock_timestamp() - Booleans: with
BOOLEANconstraint unless tri-state values are required.NOT NULL - Enums: for small, stable sets (e.g. US states, days of week). For business-logic-driven and evolving values (e.g. order statuses) → use TEXT (or INT) + CHECK or lookup table.
CREATE TYPE ... AS ENUM - Arrays: ,
TEXT[], etc. Use for ordered lists where you query elements. Index with GIN for containment (INTEGER[],@>) and overlap (<@) queries. Access:&&(1-indexed),arr[1](slicing). Good for tags, categories; avoid for relations—use junction tables instead. Literal syntax:arr[1:3]or'{val1,val2}'.ARRAY[val1,val2] - Range types: ,
daterange,numrangefor intervals. Support overlap (tstzrange), containment (&&), operators. Index with GiST. Good for scheduling, versioning, numeric ranges. Pick a bounds scheme and use it consistently; prefer@>(inclusive/exclusive) by default.[) - Network types: for IP addresses,
INETfor network ranges,CIDRfor MAC addresses. Support network operators (MACADDR,<<,>>).&& - Geometric types: ,
POINT,LINE,POLYGONfor 2D spatial data. Index with GiST. Consider PostGIS for advanced spatial features.CIRCLE - Text search: for full-text search documents,
TSVECTORfor search queries. IndexTSQUERYwith GIN. Always specify language:tsvectorandto_tsvector('english', col). Never use single-argument versions. This applies to both index expressions and queries.to_tsquery('english', 'query') - Domain types: for reusable custom types with validation. Enforces constraints across tables.
CREATE DOMAIN email AS TEXT CHECK (VALUE ~ '^[^@]+@[^@]+$') - Composite types: for structured data within columns. Access with
CREATE TYPE address AS (street TEXT, city TEXT, zip TEXT)syntax.(col).field - JSONB: preferred over JSON; index with GIN. Use only for optional/semi-structured attrs. ONLY use JSON if the original ordering of the contents MUST be preserved.
- Vector types: type by
vectorfor vector similarity search for embeddings.pgvector
- ID字段:优先选择(
BIGINT GENERATED ALWAYS AS IDENTITY也可);在合并/联邦/分布式系统中使用或需要不透明ID时使用GENERATED BY DEFAULT。使用UUID(PostgreSQL 18+版本优先)或uuidv7()(旧版本PostgreSQL)生成UUID。gen_random_uuid() - 整数类型:除非存储空间受限,否则优先使用;小范围数值使用
BIGINT;避免使用INTEGER,除非有明确限制。SMALLINT - 浮点类型:除非存储空间受限,否则优先使用而非
DOUBLE PRECISION。如需精确十进制运算,使用REAL。NUMERIC - 字符串类型:优先使用;如需限制长度,使用
TEXT而非CHECK (LENGTH(col) <= n);避免使用VARCHAR(n)。二进制数据使用CHAR(n)。大字符串/二进制数据(超过默认阈值2KB)会自动存储在TOAST中并压缩。TOAST存储策略:BYTEA(不使用TOAST)、PLAIN(压缩+离线存储)、EXTENDED(离线存储、不压缩)、EXTERNAL(压缩、尽可能在线存储)。默认的MAIN通常是最优选择。可通过EXTENDED和ALTER TABLE tbl ALTER COLUMN col SET STORAGE strategy调整阈值。大小写不敏感处理:针对区域/重音处理使用非确定性排序规则;针对纯ASCII,优先使用基于ALTER TABLE tbl SET (toast_tuple_target = 4096)的表达式索引(除非列需要大小写不敏感的主键/外键/唯一约束),或使用LOWER(col)类型。CITEXT - 金额类型:使用(绝不要用浮点类型)。
NUMERIC(p,s) - 时间类型:时间戳使用;仅日期使用
TIMESTAMPTZ;时长使用DATE。避免使用INTERVAL(无时区)。事务开始时间使用TIMESTAMP,当前系统时间使用now()。clock_timestamp() - 布尔类型:使用并添加
BOOLEAN约束,除非需要三态值。NOT NULL - 枚举类型:对于小型、稳定的集合(如美国各州、星期几),使用创建枚举。对于业务逻辑驱动且不断变化的值(如订单状态),使用TEXT(或INT)+ CHECK约束或查找表。
CREATE TYPE ... AS ENUM - 数组类型:、
TEXT[]等。用于需要查询元素的有序列表。使用GIN索引支持包含(INTEGER[]、@>)和重叠(<@)查询。访问方式:&&(从1开始索引)、arr[1](切片)。适用于标签、分类;避免用于关系型数据——改用关联表。字面量语法:arr[1:3]或'{val1,val2}'。ARRAY[val1,val2] - 范围类型:、
daterange、numrange用于表示区间。支持重叠(tstzrange)、包含(&&)等操作符。使用GiST索引。适用于调度、版本控制、数值范围。选择一种边界方案并保持一致;默认优先使用@>(左闭右开)。[) - 网络类型:用于IP地址,
INET用于网络范围,CIDR用于MAC地址。支持网络操作符(MACADDR、<<、>>)。&& - 几何类型:、
POINT、LINE、POLYGON用于二维空间数据。使用GiST索引。如需高级空间功能,考虑使用PostGIS扩展。CIRCLE - 全文搜索:用于全文搜索文档,
TSVECTOR用于搜索查询。使用GIN索引TSQUERY字段。必须指定语言:tsvector和to_tsvector('english', col)。绝不要使用单参数版本。此规则适用于索引表达式和查询。to_tsquery('english', 'query') - 域类型:用于创建可复用的带验证规则的自定义类型,可跨表强制执行约束。
CREATE DOMAIN email AS TEXT CHECK (VALUE ~ '^[^@]+@[^@]+$') - 复合类型:用于在列中存储结构化数据。通过
CREATE TYPE address AS (street TEXT, city TEXT, zip TEXT)语法访问字段。(col).field - JSONB:优先于JSON使用;使用GIN索引。仅用于可选/半结构化属性。仅当必须保留内容原始顺序时才使用JSON。
- 向量类型:由扩展提供的
pgvector类型,用于嵌入向量的相似度搜索。vector
Do not use the following data types
以下数据类型请勿使用
- DO NOT use (without time zone); DO use
timestampinstead.timestamptz - DO NOT use or
char(n); DO usevarchar(n)instead.text - DO NOT use type; DO use
moneyinstead.numeric - DO NOT use type; DO use
timetzinstead.timestamptz - DO NOT use or any other precision specification; DO use
timestamptz(0)insteadtimestamptz - DO NOT use type; DO use
serialinstead.generated always as identity
- 请勿使用(无时区);请改用
timestamp。timestamptz - 请勿使用或
char(n);请改用varchar(n)。text - 请勿使用类型;请改用
money。numeric - 请勿使用类型;请改用
timetz。timestamptz - 请勿使用或任何其他精度指定;请直接使用
timestamptz(0)。timestamptz - 请勿使用类型;请改用
serial。generated always as identity
Table Types
表类型
- Regular: default; fully durable, logged.
- TEMPORARY: session-scoped, auto-dropped, not logged. Faster for scratch work.
- UNLOGGED: persistent but not crash-safe. Faster writes; good for caches/staging.
- 常规表:默认类型;完全持久化、已记录日志。
- 临时表:会话级作用域,自动删除,不记录日志。适用于临时数据处理,速度更快。
- 无日志表(UNLOGGED):持久化但不保证崩溃安全。写入速度更快;适用于缓存/临时 staging 数据。
Row-Level Security
行级安全
Enable with . Create policies: . Built-in user-based access control at the row level.
ALTER TABLE tbl ENABLE ROW LEVEL SECURITYCREATE POLICY user_access ON orders FOR SELECT TO app_users USING (user_id = current_user_id())通过启用。创建策略:。实现基于用户的内置行级访问控制。
ALTER TABLE tbl ENABLE ROW LEVEL SECURITYCREATE POLICY user_access ON orders FOR SELECT TO app_users USING (user_id = current_user_id())Constraints
约束
- PK: implicit UNIQUE + NOT NULL; creates a B-tree index.
- FK: specify action (
ON DELETE/UPDATE,CASCADE,RESTRICT,SET NULL). Add explicit index on referencing column—speeds up joins and prevents locking issues on parent deletes/updates. UseSET DEFAULTfor circular FK dependencies checked at transaction end.DEFERRABLE INITIALLY DEFERRED - UNIQUE: creates a B-tree index; allows multiple NULLs unless (PG15+). Standard behavior:
NULLS NOT DISTINCTand(1, NULL)are allowed. With(1, NULL): only oneNULLS NOT DISTINCTallowed. Prefer(1, NULL)unless you specifically need duplicate NULLs.NULLS NOT DISTINCT - CHECK: row-local constraints; NULL values pass the check (three-valued logic). Example: allows NULL prices. Combine with
CHECK (price > 0)to enforce:NOT NULL.price NUMERIC NOT NULL CHECK (price > 0) - EXCLUDE: prevents overlapping values using operators. prevents double-booking rooms. Requires appropriate index type (often GiST).
EXCLUDE USING gist (room_id WITH =, booking_period WITH &&)
- 主键(PK):隐式包含UNIQUE + NOT NULL约束;会创建B树索引。
- 外键(FK):指定动作(
ON DELETE/UPDATE、CASCADE、RESTRICT、SET NULL)。为引用列添加显式索引——可加速连接并避免父表删除/更新时的锁定问题。对于循环外键依赖,使用SET DEFAULT在事务结束时检查约束。DEFERRABLE INITIALLY DEFERRED - 唯一约束(UNIQUE):创建B树索引;允许多个NULL值,除非使用(PostgreSQL 15+)。标准行为:
NULLS NOT DISTINCT和(1, NULL)是允许的。使用(1, NULL)时:仅允许一个NULLS NOT DISTINCT。除非明确需要重复NULL值,否则优先使用(1, NULL)。NULLS NOT DISTINCT - 检查约束(CHECK):行级本地约束;NULL值会通过检查(三值逻辑)。示例:允许price为NULL。如需强制非空,需结合
CHECK (price > 0):NOT NULL。price NUMERIC NOT NULL CHECK (price > 0) - 排除约束(EXCLUDE):使用操作符防止值重叠。可防止房间重复预订。需要合适的索引类型(通常是GiST)。
EXCLUDE USING gist (room_id WITH =, booking_period WITH &&)
Indexing
索引
- B-tree: default for equality/range queries (,
=,<,>,BETWEEN)ORDER BY - Composite: order matters—index used if equality on leftmost prefix (uses index on
WHERE a = ? AND b > ?, but(a,b)does not). Put most selective/frequently filtered columns first.WHERE b = ? - Covering: - includes non-key columns for index-only scans without visiting table.
CREATE INDEX ON tbl (id) INCLUDE (name, email) - Partial: for hot subsets (→
WHERE status = 'active'). Any query withCREATE INDEX ON tbl (user_id) WHERE status = 'active'can use this index.status = 'active' - Expression: for computed search keys (). Expression must match exactly in WHERE clause:
CREATE INDEX ON tbl (LOWER(email)).WHERE LOWER(email) = 'user@example.com' - GIN: JSONB containment/existence, arrays (,
@>), full-text search (?)@@ - GiST: ranges, geometry, exclusion constraints
- BRIN: very large, naturally ordered data (time-series)—minimal storage overhead. Effective when row order on disk correlates with indexed column (insertion order or after ).
CLUSTER
- B树索引:默认用于等值/范围查询(、
=、<、>、BETWEEN)ORDER BY - 复合索引:列的顺序很重要——如果查询对最左侧前缀列使用等值条件(如),则会使用
WHERE a = ? AND b > ?索引,但(a,b)不会使用该索引。将选择性最高/过滤最频繁的列放在前面。WHERE b = ? - 覆盖索引:- 包含非键列,支持仅索引扫描,无需访问表数据。
CREATE INDEX ON tbl (id) INCLUDE (name, email) - 部分索引:针对热点子集(如→
WHERE status = 'active')。任何包含CREATE INDEX ON tbl (user_id) WHERE status = 'active'的查询都可使用该索引。status = 'active' - 表达式索引:用于计算后的搜索键(如)。WHERE子句中的表达式必须与索引完全匹配:
CREATE INDEX ON tbl (LOWER(email))。WHERE LOWER(email) = 'user@example.com' - GIN索引:支持JSONB包含/存在查询、数组查询(、
@>)、全文搜索(?)@@ - GiST索引:支持范围类型、几何类型、排除约束
- BRIN索引:适用于超大规模、自然有序的数据(如时间序列)——存储开销极小。当磁盘上的行顺序与索引列相关时(插入顺序或执行后)效果最佳。
CLUSTER
Partitioning
分区
- Use for very large tables (>100M rows) where queries consistently filter on partition key (often time/date).
- Alternate use: use for tables where data maintenance tasks dictates e.g. data pruned or bulk replaced periodically
- RANGE: common for time-series (). Create partitions:
PARTITION BY RANGE (created_at). TimescaleDB automates time-based or ID-based partitioning with retention policies and compression.CREATE TABLE logs_2024_01 PARTITION OF logs FOR VALUES FROM ('2024-01-01') TO ('2024-02-01') - LIST: for discrete values (). Example:
PARTITION BY LIST (region).FOR VALUES IN ('us-east', 'us-west') - HASH: for even distribution when no natural key (). Creates N partitions with modulus.
PARTITION BY HASH (user_id) - Constraint exclusion: requires constraints on partitions for query planner to prune. Auto-created for declarative partitioning (PG10+).
CHECK - Prefer declarative partitioning or hypertables. Do NOT use table inheritance.
- Limitations: no global UNIQUE constraints—include partition key in PK/UNIQUE. FKs from partitioned tables not supported; use triggers.
- 适用于超大型表(超过1亿行),且查询始终按分区键过滤(通常是时间/日期)。
- 另一种场景:数据维护需求决定,例如定期清理或批量替换数据
- 范围分区:常用于时间序列数据()。创建分区:
PARTITION BY RANGE (created_at)。TimescaleDB可自动完成基于时间或ID的分区,并提供保留策略和压缩功能。CREATE TABLE logs_2024_01 PARTITION OF logs FOR VALUES FROM ('2024-01-01') TO ('2024-02-01') - 列表分区:用于离散值()。示例:
PARTITION BY LIST (region)。FOR VALUES IN ('us-east', 'us-west') - 哈希分区:无自然键时用于均匀分布数据()。通过取模创建N个分区。
PARTITION BY HASH (user_id) - 约束排除:需要为分区添加约束,以便查询规划器进行分区裁剪。声明式分区(PostgreSQL 10+)会自动创建该约束。
CHECK - 优先使用声明式分区或 hypertables。请勿使用表继承。
- 局限性:无全局唯一约束——需将分区键包含在主键/唯一约束中。分区表的外键不被支持;需使用触发器实现。
Special Considerations
特殊场景考量
Update-Heavy Tables
更新频繁的表
- Separate hot/cold columns—put frequently updated columns in separate table to minimize bloat.
- Use to leave space for HOT updates that avoid index maintenance.
fillfactor=90 - Avoid updating indexed columns—prevents beneficial HOT updates.
- Partition by update patterns—separate frequently updated rows in a different partition from stable data.
- 分离冷热列——将频繁更新的列放在单独的表中,以减少数据膨胀。
- 使用——为HOT更新预留空间,避免索引维护。
fillfactor=90 - 避免更新带索引的列——会阻止有效的HOT更新。
- 按更新模式分区——将频繁更新的行与稳定数据放在不同分区。
Insert-Heavy Workloads
插入密集型工作负载
- Minimize indexes—only create what you query; every index slows inserts.
- Use or multi-row
COPYinstead of single-row inserts.INSERT - UNLOGGED tables for rebuildable staging data—much faster writes.
- Defer index creation for bulk loads—>drop index, load data, recreate indexes.
- Partition by time/hash to distribute load. TimescaleDB automates partitioning and compression of insert-heavy data.
- Use a natural key for primary key such as a (timestamp, device_id) if enforcing global uniqueness is important many insert-heavy tables don't need a primary key at all.
- If you do need a surrogate key, Prefer over
BIGINT GENERATED ALWAYS AS IDENTITY.UUID
- 最小化索引数量——仅创建查询所需的索引;每个索引都会减慢插入速度。
- **使用或多行
COPY**替代单行插入。INSERT - **无日志表(UNLOGGED)**用于可重建的staging数据——写入速度更快。
- 批量加载时延迟创建索引——先删除索引,加载数据,再重新创建索引。
- 按时间/哈希分区以分散负载。TimescaleDB可自动完成插入密集型数据的分区和压缩。
- 使用自然键作为主键,如(timestamp, device_id),如果需要强制全局唯一性;许多插入密集型表根本不需要主键。
- 如果确实需要代理键,优先选择而非
BIGINT GENERATED ALWAYS AS IDENTITY。UUID
Upsert-Friendly Design
友好的Upsert设计
- Requires UNIQUE index on conflict target columns—needs exact matching unique index (partial indexes don't work).
ON CONFLICT (col1, col2) - Use to reference would-be-inserted values; only update columns that actually changed to reduce write overhead.
EXCLUDED.column - faster than
DO NOTHINGwhen no actual update needed.DO UPDATE
- 需要唯一索引在冲突目标列上——需要完全匹配的唯一索引(部分索引无效)。
ON CONFLICT (col1, col2) - **使用**引用待插入的值;仅更新实际变化的列,以减少写入开销。
EXCLUDED.column - 比
DO NOTHING更快当无需实际更新时。DO UPDATE
Safe Schema Evolution
安全的架构演进
- Transactional DDL: most DDL operations can run in transactions and be rolled back—for safe testing.
BEGIN; ALTER TABLE...; ROLLBACK; - Concurrent index creation: avoids blocking writes but can't run in transactions.
CREATE INDEX CONCURRENTLY - Volatile defaults cause rewrites: adding columns with volatile defaults (e.g.,
NOT NULL,now()) rewrites entire table. Non-volatile defaults are fast.gen_random_uuid() - Drop constraints before columns: then
ALTER TABLE DROP CONSTRAINTto avoid dependency issues.DROP COLUMN - Function signature changes: with different arguments creates overloads, not replacements. DROP old version if no overload desired.
CREATE OR REPLACE
- 事务性DDL:大多数DDL操作可在事务中执行并回滚——可用于安全测试。
BEGIN; ALTER TABLE...; ROLLBACK; - 并发创建索引:不会阻塞写入,但不能在事务中执行。
CREATE INDEX CONCURRENTLY - 易变默认值会导致重写:添加带易变默认值(如、
now())的gen_random_uuid()列会重写整个表。非易变默认值的操作速度更快。NOT NULL - 先删除约束再删除列:然后
ALTER TABLE DROP CONSTRAINT以避免依赖问题。DROP COLUMN - 函数签名变更:使用不同参数会创建重载,而非替换。如果不需要重载,请删除旧版本。
CREATE OR REPLACE
Generated Columns
生成列
- for computed, indexable fields. PG18+ adds
... GENERATED ALWAYS AS (<expr>) STOREDcolumns (computed on read, not stored).VIRTUAL
- 用于可索引的计算字段。PostgreSQL 18+新增
... GENERATED ALWAYS AS (<expr>) STORED列(读取时计算,不存储)。VIRTUAL
Extensions
扩展
- :
pgcryptofor password hashing.crypt() - : alternative UUID functions; prefer
uuid-osspfor new projects.pgcrypto - : fuzzy text search with
pg_trgmoperator,%function. Index with GIN forsimilarity()acceleration.LIKE '%pattern%' - : case-insensitive text type. Prefer expression indexes on
citextunless you need case-insensitive constraints.LOWER(col) - /
btree_gin: enable mixed-type indexes (e.g., GIN index on both JSONB and text columns).btree_gist - : key-value pairs; mostly superseded by JSONB but useful for simple string mappings.
hstore - : essential for time-series—automated partitioning, retention, compression, continuous aggregates.
timescaledb - : comprehensive geospatial support beyond basic geometric types—essential for location-based applications.
postgis - : vector similarity search for embeddings.
pgvector - : audit logging for all database activity.
pgaudit
- :
pgcrypto用于密码哈希。crypt() - :替代UUID函数;新项目优先使用
uuid-ossp。pgcrypto - :支持模糊文本搜索,使用
pg_trgm操作符、%函数。使用GIN索引加速similarity()查询。LIKE '%pattern%' - :大小写不敏感的文本类型。除非需要大小写不敏感的约束,否则优先使用基于
citext的表达式索引。LOWER(col) - /
btree_gin:支持混合类型索引(如同时包含JSONB和文本列的GIN索引)。btree_gist - :键值对;大部分场景已被JSONB取代,但在简单字符串映射时仍有用。
hstore - :时间序列数据必备——自动分区、数据保留、压缩、连续聚合。
timescaledb - :提供超越基础几何类型的全面地理空间支持——位置应用必备。
postgis - :用于嵌入向量的相似度搜索。
pgvector - :记录所有数据库活动的审计日志。
pgaudit
JSONB Guidance
JSONB 指南
- Prefer with GIN index.
JSONB - Default: → accelerates:
CREATE INDEX ON tbl USING GIN (jsonb_col);- Containment
jsonb_col @> '{"k":"v"}' - Key existence , any/all keys
jsonb_col ? 'k',?\|?& - Path containment on nested docs
- Disjunction
jsonb_col @> ANY(ARRAY['{"status":"active"}', '{"status":"pending"}'])
- Containment
- Heavy workloads: consider opclass
@>for smaller/faster containment-only indexes:jsonb_path_opsCREATE INDEX ON tbl USING GIN (jsonb_col jsonb_path_ops);- Trade-off: loses support for key existence (,
?,?|) queries—only supports containment (?&)@>
- Equality/range on a specific scalar field: extract and index with B-tree (generated column or expression):
ALTER TABLE tbl ADD COLUMN price INT GENERATED ALWAYS AS ((jsonb_col->>'price')::INT) STORED;CREATE INDEX ON tbl (price);- Prefer queries like (uses B-tree) over
WHERE price BETWEEN 100 AND 500without index.WHERE (jsonb_col->>'price')::INT BETWEEN 100 AND 500
- Arrays inside JSONB: use GIN + for containment (e.g., tags). Consider
@>if only doing containment.jsonb_path_ops - Keep core relations in tables; use JSONB for optional/variable attributes.
- Use constraints to limit allowed JSONB values in a column e.g.
config JSONB NOT NULL CHECK(jsonb_typeof(config) = 'object')
- 优先使用并搭配GIN索引。
JSONB - 默认索引:→ 加速以下操作:
CREATE INDEX ON tbl USING GIN (jsonb_col);- 包含查询
jsonb_col @> '{"k":"v"}' - 键存在查询 、任意/所有键存在
jsonb_col ? 'k'、?|?& - 嵌套文档的路径包含查询
- 析取查询
jsonb_col @> ANY(ARRAY['{"status":"active"}', '{"status":"pending"}'])
- 包含查询
- 密集工作负载:考虑使用
@>操作符类创建更小/更快的仅包含查询索引:jsonb_path_opsCREATE INDEX ON tbl USING GIN (jsonb_col jsonb_path_ops);- 权衡:失去键存在(、
?、?|)查询支持——仅支持包含查询(?&)@>
- 特定标量字段的等值/范围查询:提取字段并使用B树索引(生成列或表达式):
ALTER TABLE tbl ADD COLUMN price INT GENERATED ALWAYS AS ((jsonb_col->>'price')::INT) STORED;CREATE INDEX ON tbl (price);- 优先使用(使用B树索引)而非无索引的
WHERE price BETWEEN 100 AND 500。WHERE (jsonb_col->>'price')::INT BETWEEN 100 AND 500
- JSONB内的数组:使用GIN + 进行包含查询(如标签)。如果仅进行包含查询,考虑
@>。jsonb_path_ops - 核心关系数据放在表中;JSONB用于可选/可变属性。
- 使用约束限制列中允许的JSONB值,例如
config JSONB NOT NULL CHECK(jsonb_typeof(config) = 'object')
Examples
示例
Users
用户表
sql
CREATE TABLE users (
user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
email TEXT NOT NULL UNIQUE,
name TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE UNIQUE INDEX ON users (LOWER(email));
CREATE INDEX ON users (created_at);sql
CREATE TABLE users (
user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
email TEXT NOT NULL UNIQUE,
name TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE UNIQUE INDEX ON users (LOWER(email));
CREATE INDEX ON users (created_at);Orders
订单表
sql
CREATE TABLE orders (
order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES users(user_id),
status TEXT NOT NULL DEFAULT 'PENDING' CHECK (status IN ('PENDING','PAID','CANCELED')),
total NUMERIC(10,2) NOT NULL CHECK (total > 0),
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX ON orders (user_id);
CREATE INDEX ON orders (created_at);sql
CREATE TABLE orders (
order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES users(user_id),
status TEXT NOT NULL DEFAULT 'PENDING' CHECK (status IN ('PENDING','PAID','CANCELED')),
total NUMERIC(10,2) NOT NULL CHECK (total > 0),
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX ON orders (user_id);
CREATE INDEX ON orders (created_at);JSONB
JSONB示例
sql
CREATE TABLE profiles (
user_id BIGINT PRIMARY KEY REFERENCES users(user_id),
attrs JSONB NOT NULL DEFAULT '{}',
theme TEXT GENERATED ALWAYS AS (attrs->>'theme') STORED
);
CREATE INDEX profiles_attrs_gin ON profiles USING GIN (attrs);sql
CREATE TABLE profiles (
user_id BIGINT PRIMARY KEY REFERENCES users(user_id),
attrs JSONB NOT NULL DEFAULT '{}',
theme TEXT GENERATED ALWAYS AS (attrs->>'theme') STORED
);
CREATE INDEX profiles_attrs_gin ON profiles USING GIN (attrs);