storage-format

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Storage Format Guide

存储格式指南

Database File Structure

数据库文件结构

┌─────────────────────────────┐
│ Page 1: Header + Schema     │  ← First 100 bytes = DB header
├─────────────────────────────┤
│ Page 2..N: B-tree pages     │  ← Tables and indexes
│            Overflow pages   │
│            Freelist pages   │
└─────────────────────────────┘
Page size: power of 2, 512-65536 bytes. Default 4096.
┌─────────────────────────────┐
│ Page 1: Header + Schema     │  ← First 100 bytes = DB header
├─────────────────────────────┤
│ Page 2..N: B-tree pages     │  ← Tables and indexes
│            Overflow pages   │
│            Freelist pages   │
└─────────────────────────────┘
页面大小:2的幂,范围512-65536字节,默认值为4096。

Database Header (First 100 Bytes)

数据库头部(前100字节)

OffsetSizeField
016Magic:
"SQLite format 3\0"
162Page size (big-endian)
181Write format version (1=rollback, 2=WAL)
191Read format version
244Change counter
284Database size in pages
324First freelist trunk page
364Total freelist pages
404Schema cookie
564Text encoding (1=UTF8, 2=UTF16LE, 3=UTF16BE)
All multi-byte integers: big-endian.
OffsetSizeField
016Magic:
"SQLite format 3\0"
162页面大小(大端字节序)
181写入格式版本(1=回滚,2=WAL)
191读取格式版本
244变更计数器
284数据库总页面数
324首个空闲列表主干页面
364空闲列表总页面数
404模式标识
564文本编码(1=UTF8,2=UTF16LE,3=UTF16BE)
所有多字节整数均采用大端字节序

Page Types

页面类型

FlagTypePurpose
0x02Interior indexIndex B-tree internal node
0x05Interior tableTable B-tree internal node
0x0aLeaf indexIndex B-tree leaf
0x0dLeaf tableTable B-tree leaf
-OverflowPayload exceeding cell capacity
-FreelistUnused pages (trunk or leaf)
FlagTypePurpose
0x02索引内部节点索引B树内部节点
0x05表内部节点表B树内部节点
0x0a索引叶子节点索引B树叶子节点
0x0d表叶子节点表B树叶子节点
-溢出页负载超出单元格容量时使用
-空闲页未使用的页面(主干或叶子)

B-tree Structure

B树结构

Two B-tree types:
  • Table B-tree: 64-bit rowid keys, stores row data
  • Index B-tree: Arbitrary keys (index columns + rowid)
Interior page:  [ptr0] key1 [ptr1] key2 [ptr2] ...
                   │         │         │
                   ▼         ▼         ▼
               child     child     child
               pages     pages     pages

Leaf page:     key1:data  key2:data  key3:data ...
Page 1 always root of
sqlite_schema
table.
两种B树类型:
  • 表B树:使用64位rowid作为键,存储行数据
  • 索引B树:使用任意键(索引列+rowid)
Interior page:  [ptr0] key1 [ptr1] key2 [ptr2] ...
                   │         │         │
                   ▼         ▼         ▼
               child     child     child
               pages     pages     pages

Leaf page:     key1:data  key2:data  key3:data ...
页面1始终是
sqlite_schema
表的根节点。

Cell Format

单元格格式

Table Leaf Cell

表叶子单元格

[payload_size: varint] [rowid: varint] [payload] [overflow_ptr: u32?]
[payload_size: varint] [rowid: varint] [payload] [overflow_ptr: u32?]

Table Interior Cell

表内部单元格

[left_child_page: u32] [rowid: varint]
[left_child_page: u32] [rowid: varint]

Index Cells

索引单元格

Similar but key is arbitrary (columns + rowid), not just rowid.
与表单元格类似,但键为任意值(列+rowid),而非仅rowid。

Record Format (Payload)

记录格式(负载)

[header_size: varint] [type1: varint] [type2: varint] ... [data1] [data2] ...
Serial types:
TypeMeaning
0NULL
1-41/2/3/4 byte signed int
56 byte signed int
68 byte signed int
7IEEE 754 float
8Integer 0
9Integer 1
≥12 evenBLOB, length=(N-12)/2
≥13 oddText, length=(N-13)/2
[header_size: varint] [type1: varint] [type2: varint] ... [data1] [data2] ...
序列化类型:
TypeMeaning
0NULL
1-41/2/3/4字节有符号整数
56字节有符号整数
68字节有符号整数
7IEEE 754浮点数
8整数0
9整数1
≥12 evenBLOB,长度=(N-12)/2
≥13 odd文本,长度=(N-13)/2

Overflow Pages

溢出页

When payload exceeds threshold, excess stored in overflow chain:
[next_page: u32] [data...]
Last page has next_page=0.
当负载超过阈值时,超出部分存储在溢出链中:
[next_page: u32] [data...]
最后一个页面的next_page值为0。

Freelist

空闲列表

Linked list of trunk pages, each containing leaf page numbers:
Trunk: [next_trunk: u32] [leaf_count: u32] [leaf_pages: u32...]
由主干页面组成的链表,每个主干页面包含多个叶子页面编号:
Trunk: [next_trunk: u32] [leaf_count: u32] [leaf_pages: u32...]

Turso Implementation

Turso实现

Key files:
  • core/storage/sqlite3_ondisk.rs
    - On-disk format,
    PageType
    enum
  • core/storage/btree.rs
    - B-tree operations (large file)
  • core/storage/pager.rs
    - Page management
  • core/storage/buffer_pool.rs
    - Page caching
核心文件:
  • core/storage/sqlite3_ondisk.rs
    - 磁盘格式定义,包含
    PageType
    枚举
  • core/storage/btree.rs
    - B树操作(大文件处理)
  • core/storage/pager.rs
    - 页面管理
  • core/storage/buffer_pool.rs
    - 页面缓存

Debugging Storage

存储调试

bash
undefined
bash
undefined

Integrity check

Integrity check

cargo run --bin tursodb test.db "PRAGMA integrity_check;"
cargo run --bin tursodb test.db "PRAGMA integrity_check;"

Page count

Page count

cargo run --bin tursodb test.db "PRAGMA page_count;"
cargo run --bin tursodb test.db "PRAGMA page_count;"

Freelist info

Freelist info

cargo run --bin tursodb test.db "PRAGMA freelist_count;"
undefined
cargo run --bin tursodb test.db "PRAGMA freelist_count;"
undefined

References

参考资料