vector-databases

Vector Databases

When to Use This Skill

Use this skill when:

Choosing between vector database options
Designing semantic/similarity search systems
Optimizing vector search performance
Understanding ANN algorithm trade-offs
Scaling vector search infrastructure
Implementing hybrid search (vectors + filters)

Keywords: vector database, embeddings, vector search, similarity search, ANN, approximate nearest neighbor, HNSW, IVF, FAISS, Pinecone, Weaviate, Milvus, Qdrant, Chroma, pgvector, cosine similarity, semantic search

Vector Database Comparison

Managed Services

Database	Strengths	Limitations	Best For
Pinecone	Fully managed, easy scaling, enterprise-grade	Vendor lock-in, cost at scale	Enterprise production
Weaviate Cloud	GraphQL, hybrid search, modules	Complexity	Knowledge graphs
Zilliz Cloud	Milvus-based, high performance	Learning curve	High-scale production
MongoDB Atlas Vector	Existing MongoDB users	Newer feature	MongoDB shops
Elastic Vector	Existing Elastic stack	Resource heavy	Search platforms

Self-Hosted Options

Database	Strengths	Limitations	Best For
Milvus	Feature-rich, scalable, GPU support	Operational complexity	Large-scale production
Qdrant	Written in Rust (fast), strong filtering, easy to use	Smaller ecosystem	Performance-focused
Weaviate	Modules, semantic and hybrid search	Memory usage	Knowledge applications
Chroma	Simple, Python-native	Limited scale	Development, prototyping
pgvector	PostgreSQL extension	Performance limits	Postgres shops
FAISS	Library rather than a database, very fast	No managed persistence, no metadata filtering	Research, embedded

Selection Decision Tree

Need managed, don't want operations?
├── Yes → Pinecone (simplest) or Weaviate Cloud
└── No (self-hosted)
    └── Already using PostgreSQL?
        ├── Yes, <1M vectors → pgvector
        └── No, or ≥1M vectors
            └── Need maximum performance at scale?
                ├── Yes → Milvus or Qdrant
                └── No
                    └── Prototyping/development?
                        ├── Yes → Chroma
                        └── No → Qdrant (balanced choice)

ANN Algorithms

Algorithm Overview

Exact KNN:
• Search ALL vectors
• O(n) time complexity
• Perfect accuracy
• Impractical at scale

Approximate NN (ANN):
• Search a SUBSET of vectors
• Sublinear query time (roughly O(log n) for graph indexes such as HNSW)
• High, tunable recall (traded against latency and memory)
• Practical at large scale
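The exact side of this comparison can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation; the function name `exact_knn` and the sample vectors are assumptions made up for the example.

```python
import heapq
import math

def exact_knn(query, vectors, k):
    """Brute-force KNN: score every vector, keep the k smallest L2 distances."""
    scored = ((math.dist(query, vec), vid) for vid, vec in enumerate(vectors))
    return heapq.nsmallest(k, scored)  # list of (distance, id), closest first

# Hypothetical toy data.
vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
nearest_ids = [vid for _, vid in exact_knn((0.9, 0.1), vectors, k=2)]
print(nearest_ids)
```

Every query touches all n vectors, which is where the O(n) cost comes from. ANN indexes avoid that full scan by examining only a subset of candidates.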

HNSW (Hierarchical Navigable Small World)

Layer 3: ○───────────────────────○  (sparse, long connections)
          │                       │
Layer 2: ○───○───────○───────○───○  (medium density)
          │   │       │       │   │
Layer 1: ○─○─○─○─○─○─○─○─○─○─○─○─○  (denser)
          │││││││││││││││││││││││
Layer 0: ○○○○○○○○○○○○○○○○○○○○○○○○○  (all vectors)

Search: start at the top layer, greedily move toward the query, then descend
• Fast: roughly O(log n) search time
• High recall: typically >95% with well-tuned parameters
• Memory: extra storage for the graph links

HNSW Parameters:

Parameter	Description	Trade-off
`M`	Connections per node	Memory vs. recall
`ef_construction`	Build-time search width	Build time vs. recall
`ef_search`	Query-time search width	Latency vs. recall
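To show what the query-time search width does, here is a simplified sketch of the best-first search that runs within a single graph layer. It is an illustration under stated assumptions, not a full HNSW: there is only one layer, the graph is hand-built, and the names `layer_search`, `neighbors`, and `ef` are hypothetical. The length of each neighbor list plays the role of `M`, and `ef` plays the role of `ef_search`.

```python
import heapq
import math

def layer_search(query, vectors, neighbors, entry, ef, k):
    """Best-first search over one graph layer, keeping the ef closest nodes seen."""
    d0 = math.dist(query, vectors[entry])
    visited = {entry}
    candidates = [(d0, entry)]   # min-heap: closest unexpanded node first
    best = [(-d0, entry)]        # max-heap via negated distance: worst kept node on top
    while candidates:
        dist, node = heapq.heappop(candidates)
        if len(best) >= ef and dist > -best[0][0]:
            break  # the nearest candidate is worse than everything we keep
        for nb in neighbors[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d = math.dist(query, vectors[nb])
            if len(best) < ef or d < -best[0][0]:
                heapq.heappush(candidates, (d, nb))
                heapq.heappush(best, (-d, nb))
                if len(best) > ef:
                    heapq.heappop(best)  # drop the current worst
    return [vid for _, vid in sorted((-d, vid) for d, vid in best)][:k]

# Hypothetical toy graph: five points on a line, each linked to its neighbors.
vectors = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
result = layer_search((3.2,), vectors, neighbors, entry=0, ef=2, k=2)
print(result)
```

A larger `ef` keeps more candidates alive, so the search explores more of the graph before stopping. That is the latency versus recall trade-off listed in the table.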

IVF (Inverted File Index)

Clustering Phase:
┌─────────────────────────────────────────┐
│     Cluster vectors into K centroids    │
│                                         │
│    ●         ●         ●         ●     │
│   /│\       /│\       /│\       /│\    │
│  ○○○○○     ○○○○○     ○○○○○     ○○○○○   │
│ Cluster 1  Cluster 2 Cluster 3 Cluster 4│
└─────────────────────────────────────────┘

Search Phase:
1. Find nprobe nearest centroids
2. Search only those clusters
3. Much faster than exhaustive

IVF Parameters:

Parameter	Description	Trade-off
`nlist`	Number of clusters	Build time vs. search quality
`nprobe`	Clusters to search	Latency vs. recall
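The clustering and search phases can be sketched with a toy k-means in plain Python. This is a rough illustration only: the helper names (`build_ivf`, `ivf_search`), the naive centroid initialization, and the sample data are assumptions for the example, and real libraries train and store their lists far more efficiently.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to point (L2 distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))

def assign(vectors, centroids):
    """Inverted lists: for each centroid, the ids of the vectors assigned to it."""
    lists = [[] for _ in centroids]
    for vid, vec in enumerate(vectors):
        lists[nearest(vec, centroids)].append(vid)
    return lists

def build_ivf(vectors, nlist, iterations=5):
    """Toy k-means; the first nlist vectors seed the centroids (naive init)."""
    centroids = list(vectors[:nlist])
    dim = len(vectors[0])
    for _ in range(iterations):
        for c, ids in enumerate(assign(vectors, centroids)):
            if ids:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(vectors[i][d] for i in ids) / len(ids) for d in range(dim)
                )
    return centroids, assign(vectors, centroids)

def ivf_search(query, vectors, centroids, lists, k, nprobe):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    by_distance = sorted(range(len(centroids)),
                         key=lambda c: math.dist(query, centroids[c]))
    candidates = [vid for c in by_distance[:nprobe] for vid in lists[c]]
    return sorted(candidates, key=lambda vid: math.dist(query, vectors[vid]))[:k]

# Hypothetical data: two well-separated groups of points.
vectors = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0),
           (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]
centroids, lists = build_ivf(vectors, nlist=2)
hits = ivf_search((0.2, 0.1), vectors, centroids, lists, k=2, nprobe=1)
print(hits)
```

With `nprobe` equal to `nlist` the search degenerates to an exhaustive scan. Lowering `nprobe` skips clusters, which is faster but can miss neighbors that sit near a cluster boundary.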

IVF-PQ (Product Quantization)

Original Vector (128 dim):
[0.1, 0.2, ..., 0.9]  (128 × 4 bytes = 512 bytes)

PQ Compressed (8 subvectors, 8-bit codes):
[23, 45, 12, 89, 56, 34, 78, 90]  (8 bytes)

Memory reduction: 512 / 8 = 64x
Accuracy trade-off: ~5% recall drop (varies by dataset and code size)
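The encoding step can be sketched as follows, assuming the codebooks have already been learned. In practice they are usually trained with k-means and hold 256 codewords per subvector for 8-bit codes. The tiny two-codeword codebooks and the name `pq_encode` are assumptions for illustration.

```python
import math

def pq_encode(vector, codebooks):
    """Replace each subvector with the index of its nearest codeword."""
    sub_dim = len(vector) // len(codebooks)
    codes = []
    for j, codebook in enumerate(codebooks):
        sub = tuple(vector[j * sub_dim:(j + 1) * sub_dim])
        codes.append(min(range(len(codebook)),
                         key=lambda c: math.dist(sub, codebook[c])))
    return codes

# Hypothetical codebooks: 2 subvectors, 2 codewords each.
codebooks = [[(0.0, 0.0), (1.0, 1.0)], [(0.0, 0.0), (1.0, 1.0)]]
codes = pq_encode((0.1, 0.2, 0.9, 0.8), codebooks)

# Size arithmetic for the 128-dim example: float32 input, 8 one-byte codes.
raw_bytes = 128 * 4
code_bytes = 8 * 8 // 8
ratio = raw_bytes // code_bytes
print(codes, raw_bytes, code_bytes, ratio)
```

The stored codes are lossy: distances are later estimated from the codewords rather than the original values, which is the source of the recall drop.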

Algorithm Comparison

Algorithm	Search Speed	Memory	Build Time	Recall
Flat/Brute	Slow (O(n))	Low	None	100%
IVF	Fast	Low	Medium	90-95%
IVF-PQ	Very fast	Very low	Medium	85-92%
HNSW	Very fast	High	Slow	95-99%
HNSW+PQ	Very fast	Medium	Slow	90-95%

When to Use Which

< 100K vectors:
└── Flat index (exact search is fast enough)

100K - 1M vectors:
└── HNSW (best recall/speed trade-off)

1M - 100M vectors:
├── Memory available → HNSW
└── Memory constrained → IVF-PQ or HNSW+PQ

> 100M vectors:
└── Sharded IVF-PQ or distributed HNSW
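The thresholds above can be written as a small helper, which is handy in capacity-planning scripts. The function name `choose_index` and the exact handling of the boundary values are assumptions; the size bands come straight from the guide above.

```python
def choose_index(num_vectors, memory_constrained=False):
    """Map a collection size to the index family suggested by the guide above."""
    if num_vectors < 100_000:
        return "Flat"
    if num_vectors < 1_000_000:
        return "HNSW"
    if num_vectors <= 100_000_000:
        return "IVF-PQ or HNSW+PQ" if memory_constrained else "HNSW"
    return "Sharded IVF-PQ or distributed HNSW"

print(choose_index(20_000_000, memory_constrained=True))
```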

Distance Metrics

Common Metrics

Metric	Formula	Range	Best For
Cosine Similarity	`A·B / (||A|| ||B||)`	[-1, 1]	Normalized embeddings
Dot Product	`A·B`	(-∞, ∞)	When magnitude matters
Euclidean (L2)	`√Σ(A-B)²`	[0, ∞)	Absolute distances
Manhattan (L1)	`Σ|A-B|`	[0, ∞)	High-dimensional sparse vectors
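For reference, the four formulas in the table translate directly into plain Python. The function names and sample vectors are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

a, b = (3.0, 4.0), (4.0, 3.0)
print(dot(a, b), cosine_similarity(a, b), euclidean(a, b), manhattan(a, b))
```

Note that cosine similarity and dot product are similarities (higher is closer), while L2 and L1 are distances (lower is closer).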

Metric Selection

Embeddings pre-normalized (unit vectors)?
├── Yes → Cosine = Dot Product (use Dot, faster)
└── No
    └── Magnitude meaningful?
        ├── Yes → Dot Product
        └── No → Cosine Similarity

Note: Many embedding models output normalized vectors (check the model's docs)
      → Dot product is then usually the best choice
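One way to apply this decision is to test a sample of embeddings for unit length before choosing the metric. The sketch below is an assumption-laden illustration: `is_unit`, `pick_metric`, and the tolerance value are hypothetical names and defaults, not part of any library.

```python
import math

def is_unit(vec, tol=1e-6):
    """True if the vector's L2 norm is 1 within a small tolerance."""
    return abs(math.sqrt(sum(x * x for x in vec)) - 1.0) <= tol

def pick_metric(sample_embeddings, magnitude_meaningful=False):
    """Follow the decision tree: unit vectors -> dot; otherwise ask about magnitude."""
    if all(is_unit(v) for v in sample_embeddings):
        return "dot"  # identical ranking to cosine, cheaper to compute
    return "dot" if magnitude_meaningful else "cosine"

print(pick_metric([(0.6, 0.8), (1.0, 0.0)]))
print(pick_metric([(3.0, 4.0)]))
```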

Filtering and Hybrid Search

Pre-filtering vs Post-filtering

Pre-filtering (Filter → Search):
┌─────────────────────────────────────────┐
│ 1. Apply metadata filter               │
│    (category = "electronics")           │
│    Result: 10K of 1M vectors           │
│                                         │
│ 2. Vector search on 10K vectors        │
│    Much faster, guaranteed filter match │
└─────────────────────────────────────────┘

Post-filtering (Search → Filter):
┌─────────────────────────────────────────┐
│ 1. Vector search on 1M vectors         │
│    Return top-1000                      │
│                                         │
│ 2. Apply metadata filter               │
│    May return < K results!             │
└─────────────────────────────────────────┘
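The difference between the two orders is easy to reproduce with a brute-force search standing in for the vector index. The item layout, the function names, and the `fetch` parameter below are assumptions made for this sketch.

```python
import math

# Hypothetical catalog: the two closest items to the query are books.
items = [
    {"id": 0, "vec": (0.0, 0.0), "category": "books"},
    {"id": 1, "vec": (0.1, 0.0), "category": "books"},
    {"id": 2, "vec": (0.2, 0.0), "category": "electronics"},
    {"id": 3, "vec": (5.0, 5.0), "category": "electronics"},
]

def top_k(query, candidates, k):
    return sorted(candidates, key=lambda it: math.dist(query, it["vec"]))[:k]

def pre_filter_search(query, items, k, category):
    """Filter first, then search only the matching items."""
    matching = [it for it in items if it["category"] == category]
    return top_k(query, matching, k)

def post_filter_search(query, items, k, category, fetch):
    """Search everything for `fetch` hits, then drop the non-matching ones."""
    fetched = top_k(query, items, fetch)
    return [it for it in fetched if it["category"] == category][:k]

query = (0.0, 0.0)
pre = [it["id"] for it in pre_filter_search(query, items, 2, "electronics")]
post = [it["id"] for it in post_filter_search(query, items, 2, "electronics", fetch=2)]
print(pre, post)
```

Post-filtering here returns nothing because both fetched hits are books. Over-fetching (a larger `fetch`) reduces that risk at the cost of extra work, but cannot guarantee K results.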

Hybrid Search Architecture

Query: "wireless headphones under $100"
           │
     ┌─────┴─────┐
     ▼           ▼
 ┌───────┐  ┌───────┐
 │Vector │  │Filter │
 │Search │  │ Build │
 │"wire- │  │price  │
 │less   │  │< 100  │
 │head-  │  │       │
 │phones"│  │       │
 └───────┘  └───────┘
     │           │
     └─────┬─────┘
           ▼
    ┌───────────┐
    │  Combine  │
    │  Results  │
    └───────────┘

Metadata Index Design

Metadata Type	Index Strategy	Query Example
Categorical	Bitmap/hash index	category = "books"
Numeric range	B-tree	price BETWEEN 10 AND 50
Keyword search	Inverted index	tags CONTAINS "sale"
Geospatial	R-tree/geohash	location NEAR (lat, lng)

Scaling Strategies

Sharding Approaches

Naive Sharding (by ID):
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Shard 1 │ │ Shard 2 │ │ Shard 3 │
│ IDs 0-N │ │IDs N-2N │ │IDs 2N-3N│
└─────────┘ └─────────┘ └─────────┘
Query → Search ALL shards → Merge results

Semantic Sharding (by cluster):
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Shard 1 │ │ Shard 2 │ │ Shard 3 │
│ Tech    │ │ Health  │ │ Finance │
│ docs    │ │ docs    │ │ docs    │
└─────────┘ └─────────┘ └─────────┘
Query → Route to relevant shard(s) → Faster!
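The "search ALL shards, then merge" path is a scatter-gather: each shard returns its local top-k and the coordinator merges them. A minimal sketch, assuming each shard is an in-memory dict of id to vector and using brute-force search per shard (the function names are hypothetical):

```python
import heapq
import math

def search_shard(query, shard, k):
    """Local top-k for one shard as (distance, id) pairs."""
    return heapq.nsmallest(
        k, ((math.dist(query, vec), vid) for vid, vec in shard.items())
    )

def search_all_shards(query, shards, k):
    """Scatter the query to every shard, then merge the partial results."""
    partials = [search_shard(query, shard, k) for shard in shards]
    return heapq.nsmallest(k, (hit for hits in partials for hit in hits))

# Hypothetical shards split by id range.
shards = [
    {0: (0.0, 0.0), 1: (5.0, 5.0)},
    {2: (1.0, 0.0), 3: (9.0, 9.0)},
    {4: (0.0, 1.5)},
]
merged_ids = [vid for _, vid in search_all_shards((0.0, 0.0), shards, k=2)]
print(merged_ids)
```

Asking each shard for k results is enough for a correct global top-k when the per-shard search is exact. With semantic sharding, the coordinator would pass only the relevant subset of `shards`.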

Replication

┌─────────────────────────────────────────┐
│              Load Balancer              │
└─────────────────────────────────────────┘
         │           │           │
         ▼           ▼           ▼
    ┌─────────┐ ┌─────────┐ ┌─────────┐
    │Replica 1│ │Replica 2│ │Replica 3│
    │  (Read) │ │  (Read) │ │  (Read) │
    └─────────┘ └─────────┘ └─────────┘
         │           │           │
         └───────────┼───────────┘
                     │
                ┌─────────┐
                │ Primary │
                │ (Write) │
                └─────────┘

Scaling Decision Matrix

Scale (vectors)	Architecture	Replication
< 1M	Single node	Optional
1-10M	Single node, more RAM	For HA
10-100M	Sharded, few nodes	Required
100M-1B	Sharded, many nodes	Required
> 1B	Sharded + tiered	Required

Performance Optimization

Index Build Optimization

Optimization	Description	Typical Impact
Batch insertion	Insert in batches of 1K-10K	Up to ~10x faster than single inserts
Parallel build	Multi-threaded index construction	2-4x faster
Incremental index	Add to existing index	Avoids full rebuild
GPU acceleration	Use GPU for IVF training (k-means)	10-100x faster training

Query Optimization

Optimization	Description	Impact
Warm cache	Keep index in memory	10x latency reduction
Query batching	Batch similar queries	Higher throughput
Reduce dimensions	PCA, random projection	2-4x faster
Early termination	Stop when "good enough"	Lower latency

Memory Optimization

Memory per vector:
┌────────────────────────────────────────┐
│ 1536 dims × 4 bytes = 6KB per vector   │
│                                        │
│ 1M vectors:                            │
│   Raw: 6GB                             │
│   + HNSW graph: ~0.1-0.5GB (M=16-64)   │
│   = ~6-7GB total                       │
│                                        │
│ With PQ (64 subquantizers):            │
│   1M vectors: ~64MB                    │
│   = 100x reduction                     │
└────────────────────────────────────────┘
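The raw and PQ figures in the box can be reproduced with a short calculation. This sketch assumes float32 vectors, decimal gigabytes, and one byte per PQ subquantizer (8-bit codes); it ignores graph links, codebooks, and metadata. The function name `index_memory_gb` is hypothetical.

```python
def index_memory_gb(num_vectors, dim, bytes_per_float=4, pq_code_bytes=None):
    """Approximate vector storage in GB; pass pq_code_bytes for PQ-compressed vectors."""
    per_vector = pq_code_bytes if pq_code_bytes is not None else dim * bytes_per_float
    return num_vectors * per_vector / 1e9

raw_gb = index_memory_gb(1_000_000, 1536)
pq_gb = index_memory_gb(1_000_000, 1536, pq_code_bytes=64)
print(raw_gb, pq_gb, raw_gb / pq_gb)
```

The exact ratio is 96x, which the box rounds to about 100x.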

Operational Considerations

Backup and Recovery

Strategy	Description	RPO/RTO
Snapshots	Periodic full backup	Hours
WAL replication	Write-ahead log streaming	Minutes
Real-time sync	Synchronous replication	Seconds

Monitoring Metrics

Metric	Description	Alert Threshold
Query latency p99	99th percentile latency	> 100ms
Recall	Search accuracy	< 90%
QPS	Queries per second	Capacity dependent
Memory usage	Index memory	> 80%
Index freshness	Time since last update	Domain dependent
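Recall has to be measured against exact results, typically by replaying a sample of queries through a brute-force search. The sketch below shows one possible way to compute recall@k and a nearest-rank p99, then check them against the thresholds in the table. All function names are assumptions for illustration.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k that the approximate search returned."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

def p99(latencies_ms):
    """Nearest-rank 99th percentile."""
    ordered = sorted(latencies_ms)
    rank = (99 * len(ordered) + 99) // 100  # ceil(0.99 * n) in integer arithmetic
    return ordered[rank - 1]

def alerts(latencies_ms, recall, max_p99_ms=100, min_recall=0.90):
    """Names of the thresholds from the table that are currently violated."""
    fired = []
    if p99(latencies_ms) > max_p99_ms:
        fired.append("latency_p99")
    if recall < min_recall:
        fired.append("recall")
    return fired

recall = recall_at_k([1, 2, 3, 9], [1, 2, 3, 4], k=4)
print(recall, alerts([10] * 98 + [500, 500], recall))
```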

Index Maintenance

┌─────────────────────────────────────────┐
│        Index Maintenance Tasks          │
├─────────────────────────────────────────┤
│ • Compaction: Merge small segments      │
│ • Reindex: Rebuild degraded index       │
│ • Vacuum: Remove deleted vectors        │
│ • Optimize: Tune parameters             │
│                                         │
│ Schedule during low-traffic periods     │
└─────────────────────────────────────────┘

Common Patterns

Multi-Tenant Vector Search

Option 1: Namespace/Collection per tenant
┌─────────────────────────────────────────┐
│ tenant_1_collection                     │
│ tenant_2_collection                     │
│ tenant_3_collection                     │
└─────────────────────────────────────────┘
Pro: Complete isolation
Con: Many indexes, operational overhead

Option 2: Single collection + tenant filter
┌─────────────────────────────────────────┐
│ shared_collection                       │
│   metadata: { tenant_id: "..." }        │
│   Pre-filter by tenant_id               │
└─────────────────────────────────────────┘
Pro: Simpler operations
Con: Requires efficient filtering

Real-Time Updates

Write Path:
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Write     │    │   Buffer    │    │   Merge     │
│   Request   │───▶│   (Memory)  │───▶│   to Index  │
└─────────────┘    └─────────────┘    └─────────────┘

Strategy:
1. Buffer writes in memory
2. Periodically merge to main index
3. Search: main index + buffer
4. Compact periodically
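The write path above can be sketched as a small class that buffers writes, merges them once a threshold is reached, and searches both structures. This is a toy model under stated assumptions: both the main index and the buffer are plain dicts searched by brute force, and the class name `BufferedIndex` and its `merge_threshold` parameter are hypothetical.

```python
import math

class BufferedIndex:
    """Main index plus an in-memory write buffer, both searched at query time."""

    def __init__(self, merge_threshold=3):
        self.main = {}     # stands in for the built ANN index
        self.buffer = {}   # recent writes not yet merged
        self.merge_threshold = merge_threshold

    def insert(self, vid, vec):
        self.buffer[vid] = vec
        if len(self.buffer) >= self.merge_threshold:
            self.merge()

    def merge(self):
        """Fold buffered writes into the main index and clear the buffer."""
        self.main.update(self.buffer)
        self.buffer.clear()

    def search(self, query, k):
        # Buffer entries override main entries with the same id (newest wins).
        candidates = {**self.main, **self.buffer}
        return sorted(candidates, key=lambda vid: math.dist(query, candidates[vid]))[:k]

index = BufferedIndex(merge_threshold=3)
index.insert("a", (0.0, 0.0))
index.insert("b", (1.0, 1.0))
fresh_hits = index.search((0.9, 0.9), k=1)  # served from the buffer
index.insert("c", (5.0, 5.0))               # third write triggers a merge
print(fresh_hits, len(index.main), len(index.buffer))
```

Searching the buffer alongside the main index is what makes new vectors visible immediately, before the next merge.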

Embedding Versioning

Version 1 embeddings ──┐
                       │
Version 2 embeddings ──┼──▶ Parallel indexes during migration
                       │
                       │    ┌─────────────────────┐
                       └───▶│ Gradual reindexing  │
                            │ Blue-green switch   │
                            └─────────────────────┘

Cost Estimation

Storage Costs

Cost = (vectors × dimensions × bytes per dimension × replicas) / 10^9 × $/GB/month

Example:
10M vectors × 1536 dims × 4 bytes × 3 replicas ≈ 184 GB
At $0.10/GB/month ≈ $18.40/month storage

Note: Memory (for serving) costs more than storage
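The storage formula is simple enough to script. A minimal sketch, assuming float32 vectors and decimal gigabytes (10^9 bytes); the function name and parameter names are illustrative, and the price is the example rate used above rather than any provider's actual pricing.

```python
def monthly_storage_cost(num_vectors, dim, replicas, price_per_gb_month,
                         bytes_per_dim=4):
    """Return (size in GB, monthly cost in dollars) for raw vector storage."""
    size_gb = num_vectors * dim * bytes_per_dim * replicas / 1e9
    return size_gb, size_gb * price_per_gb_month

size_gb, cost = monthly_storage_cost(10_000_000, 1536, replicas=3,
                                     price_per_gb_month=0.10)
print(round(size_gb, 2), round(cost, 2))
```

The unrounded result is 184.32 GB, or about $18.43 per month; the worked example rounds the size to 184 GB first.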

Compute Costs

Factors:
• QPS (queries per second)
• Latency requirements
• Index type (HNSW needs more RAM)
• Filtering complexity

Rule of thumb:
• 1M vectors, HNSW, <50ms latency: 16GB RAM
• 10M vectors, HNSW, <50ms latency: 64-128GB RAM
• 100M vectors: Distributed system required

Related Skills

Version History

v1.0.0 (2025-12-26): Initial release - Vector database patterns for systems design

Last Updated

Date: 2025-12-26
