qdrant-vector-search
Qdrant - Vector Similarity Search Engine
High-performance vector database written in Rust for production RAG and semantic search.
When to use Qdrant
Use Qdrant when you:
- Are building production RAG systems that require low latency
- Need hybrid search (vectors + metadata filtering)
- Require horizontal scaling with sharding/replication
- Want on-premise deployment with full data control
- Need multi-vector storage per record (dense + sparse)
- Are building real-time recommendation systems
Key features:
- Rust-powered: Memory-safe, high performance
- Rich filtering: Filter by any payload field during search
- Multiple vectors: Dense, sparse, multi-dense per point
- Quantization: Scalar, product, binary for memory efficiency
- Distributed: Raft consensus, sharding, replication
- REST + gRPC: Both APIs with full feature parity
Use alternatives instead:
- Chroma: Simpler setup, embedded use cases
- FAISS: Maximum raw speed, research/batch processing
- Pinecone: Fully managed, zero ops preferred
- Weaviate: GraphQL preference, built-in vectorizers
Quick start
Installation
```bash
# Python client
pip install qdrant-client

# Docker (recommended for development)
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant

# Docker with persistent storage
docker run -p 6333:6333 -p 6334:6334 \
    -v $(pwd)/qdrant_storage:/qdrant/storage \
    qdrant/qdrant
```
Basic usage
```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

# Connect to Qdrant
client = QdrantClient(host="localhost", port=6333)

# Create collection
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE)
)

# Insert vectors with payload
client.upsert(
    collection_name="documents",
    points=[
        PointStruct(
            id=1,
            vector=[0.1, 0.2, ...],  # 384-dim vector
            payload={"title": "Doc 1", "category": "tech"}
        ),
        PointStruct(
            id=2,
            vector=[0.3, 0.4, ...],
            payload={"title": "Doc 2", "category": "science"}
        )
    ]
)

# Search with filtering
results = client.search(
    collection_name="documents",
    query_vector=[0.15, 0.25, ...],
    query_filter={
        "must": [{"key": "category", "match": {"value": "tech"}}]
    },
    limit=10
)

for point in results:
    print(f"ID: {point.id}, Score: {point.score}, Payload: {point.payload}")
```
Core concepts
Points - Basic data unit
```python
from qdrant_client.models import PointStruct

# Point = ID + Vector(s) + Payload
point = PointStruct(
    id=123,  # Integer or UUID string
    vector=[0.1, 0.2, 0.3, ...],  # Dense vector
    payload={  # Arbitrary JSON metadata
        "title": "Document title",
        "category": "tech",
        "timestamp": 1699900000,
        "tags": ["python", "ml"]
    }
)

# Batch upsert (recommended)
# point1, point2, point3 are PointStruct objects built as above
client.upsert(
    collection_name="documents",
    points=[point1, point2, point3],
    wait=True  # Block until the operation has been applied
)
```
Collections - Vector containers
```python
from qdrant_client.models import VectorParams, Distance, HnswConfigDiff

# Create with HNSW configuration
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(
        size=384,  # Vector dimensions
        distance=Distance.COSINE  # COSINE, EUCLID, DOT, MANHATTAN
    ),
    hnsw_config=HnswConfigDiff(
        m=16,  # Connections per node (default 16)
        ef_construct=100,  # Build-time accuracy (default 100)
        full_scan_threshold=10000  # Switch to brute force below this
    ),
    on_disk_payload=True  # Store payload on disk
)

# Collection info
info = client.get_collection("documents")
print(f"Points: {info.points_count}, Vectors: {info.vectors_count}")
```
Distance metrics
| Metric | Use Case | Range |
|---|---|---|
| `COSINE` | Text embeddings, normalized vectors | 0 to 2 |
| `EUCLID` | Spatial data, image features | 0 to ∞ |
| `DOT` | Recommendations, unnormalized | -∞ to ∞ |
| `MANHATTAN` | Sparse features, discrete data | 0 to ∞ |
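To make the ranges in the table concrete, here is a small plain-Python sketch that computes each of the four measures for a pair of vectors. The function names are hypothetical and this is not how Qdrant computes them internally. Cosine is written as a distance (1 minus cosine similarity) so that it matches the 0 to 2 range listed above.

```python
import math


def dot(a, b):
    """Dot product: unbounded, grows with vector magnitude."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity: 0 for same direction, 2 for opposite."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    return 1.0 - dot(a, b) / (norm_a * norm_b)


def euclidean(a, b):
    """Straight-line (L2) distance, always >= 0."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences (L1), always >= 0."""
    return sum(abs(x - y) for x, y in zip(a, b))


a = [1.0, 0.0]
b = [0.0, 1.0]
print("cosine distance:", cosine_distance(a, b))
print("euclidean:", euclidean(a, b))
print("dot:", dot(a, b))
print("manhattan:", manhattan(a, b))
```

For normalized vectors, cosine and dot product rank results identically, which is why cosine is the usual choice for text embeddings.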
Search operations
Basic search
```python
# Simple nearest neighbor search
results = client.search(
    collection_name="documents",
    query_vector=[0.1, 0.2, ...],
    limit=10,
    with_payload=True,
    with_vectors=False  # Don't return vectors (faster)
)
```
Filtered search
```python
from qdrant_client.models import Filter, FieldCondition, MatchValue, Range

# Complex filtering
results = client.search(
    collection_name="documents",
    query_vector=query_embedding,
    query_filter=Filter(
        must=[
            FieldCondition(key="category", match=MatchValue(value="tech")),
            FieldCondition(key="timestamp", range=Range(gte=1699000000))
        ],
        must_not=[
            FieldCondition(key="status", match=MatchValue(value="archived"))
        ]
    ),
    limit=10
)

# Shorthand filter syntax
results = client.search(
    collection_name="documents",
    query_vector=query_embedding,
    query_filter={
        "must": [
            {"key": "category", "match": {"value": "tech"}},
            {"key": "price", "range": {"gte": 10, "lte": 100}}
        ]
    },
    limit=10
)
```
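The `must` / `must_not` semantics used above can be illustrated with a small pure-Python model: every `must` condition has to hold, and no `must_not` condition may hold. This is a simplified sketch for intuition only, not Qdrant's implementation. The helper names `condition_holds` and `passes_filter` are hypothetical, and the sketch ignores array-valued fields, nested keys, and `should` clauses.

```python
def condition_holds(payload: dict, condition: dict) -> bool:
    """Evaluate one shorthand condition against a payload dict."""
    value = payload.get(condition["key"])
    if "match" in condition:
        return value == condition["match"]["value"]
    if "range" in condition:
        if not isinstance(value, (int, float)):
            return False  # missing or non-numeric field never satisfies a range
        checks = {
            "gt": lambda v, b: v > b,
            "gte": lambda v, b: v >= b,
            "lt": lambda v, b: v < b,
            "lte": lambda v, b: v <= b,
        }
        return all(checks[name](value, bound)
                   for name, bound in condition["range"].items())
    raise ValueError(f"unsupported condition: {condition}")


def passes_filter(payload: dict, query_filter: dict) -> bool:
    """All `must` conditions hold and no `must_not` condition holds."""
    must = query_filter.get("must", [])
    must_not = query_filter.get("must_not", [])
    return (all(condition_holds(payload, c) for c in must)
            and not any(condition_holds(payload, c) for c in must_not))


query_filter = {
    "must": [
        {"key": "category", "match": {"value": "tech"}},
        {"key": "price", "range": {"gte": 10, "lte": 100}},
    ],
    "must_not": [{"key": "status", "match": {"value": "archived"}}],
}
print(passes_filter({"category": "tech", "price": 42}, query_filter))
```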
Batch search
```python
from qdrant_client.models import SearchRequest

# Multiple queries in one request
results = client.search_batch(
    collection_name="documents",
    requests=[
        SearchRequest(vector=[0.1, ...], limit=5),
        SearchRequest(vector=[0.2, ...], limit=5, filter={"must": [...]}),
        SearchRequest(vector=[0.3, ...], limit=10)
    ]
)
```
RAG integration
With sentence-transformers
```python
from sentence_transformers import SentenceTransformer
from qdrant_client import QdrantClient
from qdrant_client.models import VectorParams, Distance, PointStruct

# Initialize
encoder = SentenceTransformer("all-MiniLM-L6-v2")
client = QdrantClient(host="localhost", port=6333)

# Create collection
client.create_collection(
    collection_name="knowledge_base",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE)
)

# Index documents
documents = [
    {"id": 1, "text": "Python is a programming language", "source": "wiki"},
    {"id": 2, "text": "Machine learning uses algorithms", "source": "textbook"},
]

points = [
    PointStruct(
        id=doc["id"],
        vector=encoder.encode(doc["text"]).tolist(),
        payload={"text": doc["text"], "source": doc["source"]}
    )
    for doc in documents
]
client.upsert(collection_name="knowledge_base", points=points)

# RAG retrieval
def retrieve(query: str, top_k: int = 5) -> list[dict]:
    query_vector = encoder.encode(query).tolist()
    results = client.search(
        collection_name="knowledge_base",
        query_vector=query_vector,
        limit=top_k
    )
    return [{"text": r.payload["text"], "score": r.score} for r in results]

# Use in RAG pipeline
context = retrieve("What is Python?")
prompt = f"Context: {context}\n\nQuestion: What is Python?"
```
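The retrieval step returns a list of `{"text", "score"}` dicts, and interpolating that list directly into an f-string puts its Python repr into the prompt. One possible refinement, sketched below, is to format the hits as numbered passages and drop low-scoring ones. The helper name `build_prompt` and the `min_score` parameter are assumptions for illustration, not part of any library.

```python
def build_prompt(question: str, hits: list[dict], min_score: float = 0.0) -> str:
    """Format retrieved hits as numbered passages followed by the question.

    `hits` is expected to look like the output of a retrieval function:
    a list of {"text": str, "score": float} dicts.
    """
    kept = [h for h in hits if h["score"] >= min_score]
    context = "\n".join(
        f"[{i}] {h['text']}" for i, h in enumerate(kept, start=1)
    )
    return f"Context:\n{context}\n\nQuestion: {question}"


# Example with hand-written hits (no Qdrant instance needed)
example_hits = [
    {"text": "Python is a programming language", "score": 0.82},
    {"text": "Machine learning uses algorithms", "score": 0.31},
]
print(build_prompt("What is Python?", example_hits, min_score=0.5))
```

Numbering the passages also makes it possible to ask the model to cite which passage supports its answer.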
With LangChain
```python
from langchain_community.vectorstores import Qdrant
from langchain_community.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectorstore = Qdrant.from_documents(documents, embeddings, url="http://localhost:6333", collection_name="docs")
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
```
With LlamaIndex
```python
from llama_index.vector_stores.qdrant import QdrantVectorStore
from llama_index.core import VectorStoreIndex, StorageContext

vector_store = QdrantVectorStore(client=client, collection_name="llama_docs")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
query_engine = index.as_query_engine()
```
Multi-vector support
Named vectors (different embedding models)
```python
from qdrant_client.models import VectorParams, Distance, PointStruct

# Collection with multiple vector types
client.create_collection(
    collection_name="hybrid_search",
    vectors_config={
        "dense": VectorParams(size=384, distance=Distance.COSINE),
        "sparse": VectorParams(size=30000, distance=Distance.DOT)
    }
)

# Insert with named vectors
client.upsert(
    collection_name="hybrid_search",
    points=[
        PointStruct(
            id=1,
            vector={
                "dense": dense_embedding,
                "sparse": sparse_embedding
            },
            payload={"text": "document text"}
        )
    ]
)

# Search specific vector
results = client.search(
    collection_name="hybrid_search",
    query_vector=("dense", query_dense),  # Specify which vector
    limit=10
)
```
Sparse vectors (BM25, SPLADE)
```python
from qdrant_client.models import (
    PointStruct, SparseVectorParams, SparseIndexParams, SparseVector
)

# Collection with sparse vectors
client.create_collection(
    collection_name="sparse_search",
    vectors_config={},
    sparse_vectors_config={"text": SparseVectorParams(index=SparseIndexParams(on_disk=False))}
)

# Insert sparse vector
client.upsert(
    collection_name="sparse_search",
    points=[
        PointStruct(
            id=1,
            vector={"text": SparseVector(indices=[1, 5, 100], values=[0.5, 0.8, 0.2])},
            payload={"text": "document"}
        )
    ]
)
```
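A sparse vector is a pair of parallel lists: the `indices` of the non-zero dimensions and their `values`. The sketch below shows one way such a pair could be built from text, using plain term counts over a fixed vocabulary. This is deliberately simpler than BM25 or SPLADE weighting; the `to_sparse` helper and the toy `vocab` mapping are hypothetical and only illustrate the data shape.

```python
from collections import Counter


def to_sparse(text: str, vocab: dict[str, int]) -> tuple[list[int], list[float]]:
    """Map known tokens to (indices, values) using raw term counts.

    Tokens missing from `vocab` are ignored. Indices are returned in
    ascending order and each index appears at most once.
    """
    counts = Counter(tok for tok in text.lower().split() if tok in vocab)
    pairs = sorted((vocab[tok], float(n)) for tok, n in counts.items())
    indices = [i for i, _ in pairs]
    values = [v for _, v in pairs]
    return indices, values


# Toy vocabulary: token -> dimension index (an assumption for this example)
vocab = {"vector": 1, "search": 5, "engine": 100}
indices, values = to_sparse("Vector search engine for vector data", vocab)
print(indices, values)
```

The resulting lists can be passed as `SparseVector(indices=indices, values=values)` in an upsert like the one above.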
Quantization (memory optimization)
```python
from qdrant_client.models import (
    Distance, VectorParams,
    ScalarQuantization, ScalarQuantizationConfig, ScalarType,
    SearchParams, QuantizationSearchParams
)

# Scalar quantization (4x memory reduction)
client.create_collection(
    collection_name="quantized",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
    quantization_config=ScalarQuantization(
        scalar=ScalarQuantizationConfig(
            type=ScalarType.INT8,
            quantile=0.99,  # Clip outliers
            always_ram=True  # Keep quantized in RAM
        )
    )
)

# Search with rescoring
results = client.search(
    collection_name="quantized",
    query_vector=query,
    search_params=SearchParams(
        quantization=QuantizationSearchParams(rescore=True)  # Rescore top results
    ),
    limit=10
)
```
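The "4x memory reduction" comes from storing each component in one byte instead of a four-byte float. The following sketch shows the idea with a simple min/max mapping onto 256 levels. It is an illustration under simplifying assumptions, not Qdrant's actual encoding: it skips the `quantile` clipping step, and the function names are hypothetical.

```python
from array import array


def quantize_8bit(vector):
    """Map floats onto integer codes 0..255 using the vector's min/max."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0  # avoid zero scale for constant vectors
    codes = [round((x - lo) / scale) for x in vector]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Approximate reconstruction of the original floats."""
    return [lo + c * scale for c in codes]


vector = [0.12, -0.40, 0.88, 0.05]
codes, lo, scale = quantize_8bit(vector)

float_bytes = len(array("f", vector).tobytes())  # 4 bytes per component
quant_bytes = len(bytes(codes))                  # 1 byte per component
print(f"{float_bytes} bytes -> {quant_bytes} bytes")
print("max error:", max(abs(a - b) for a, b in zip(vector, dequantize(codes, lo, scale))))
```

The reconstruction error is bounded by half a quantization step, which is why rescoring the top candidates against the original vectors (as in the search call above) recovers most of the lost precision.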
Payload indexing
```python
from qdrant_client.models import PayloadSchemaType

# Create payload index for faster filtering
client.create_payload_index(
    collection_name="documents",
    field_name="category",
    field_schema=PayloadSchemaType.KEYWORD
)

client.create_payload_index(
    collection_name="documents",
    field_name="timestamp",
    field_schema=PayloadSchemaType.INTEGER
)

# Index types: KEYWORD, INTEGER, FLOAT, GEO, TEXT (full-text), BOOL
```
Production deployment
Qdrant Cloud
```python
from qdrant_client import QdrantClient

# Connect to Qdrant Cloud
client = QdrantClient(
    url="https://your-cluster.cloud.qdrant.io",
    api_key="your-api-key"
)
```
Performance tuning
```python
from qdrant_client.models import HnswConfigDiff, OptimizersConfigDiff

# Optimize for recall (denser HNSW graph; indexing is slower and uses more memory)
client.update_collection(
    collection_name="documents",
    hnsw_config=HnswConfigDiff(ef_construct=200, m=32)
)

# Optimize for indexing speed (bulk loads)
client.update_collection(
    collection_name="documents",
    optimizers_config=OptimizersConfigDiff(indexing_threshold=20000)
)
```
Best practices
- Batch operations - Use batch upsert/search for efficiency
- Payload indexing - Index fields used in filters
- Quantization - Enable for large collections (>1M vectors)
- Sharding - Use for collections >10M vectors
- On-disk storage - Enable `on_disk_payload` for large payloads
- Connection pooling - Reuse client instances
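As an illustration of the batch-operations advice, the sketch below splits any iterable of points into fixed-size chunks so that each upsert request stays bounded. The `batched` helper and the batch size of 256 are assumptions chosen for the example, not recommended values from the Qdrant documentation.

```python
from itertools import islice
from typing import Iterable, Iterator


def batched(items: Iterable, batch_size: int) -> Iterator[list]:
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Stand-alone demonstration with integers in place of points
for chunk in batched(range(5), 2):
    print(chunk)

# With a running Qdrant instance and a list of PointStruct objects:
# for chunk in batched(points, 256):
#     client.upsert(collection_name="documents", points=chunk)
```

Because the helper consumes a plain iterator, it also works with generators, so embeddings can be computed lazily instead of being held in memory all at once.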
Common issues
**Slow search with filters:**
```python
# Create payload index for filtered fields
client.create_payload_index(
    collection_name="docs",
    field_name="category",
    field_schema=PayloadSchemaType.KEYWORD
)
```

**Out of memory:**
```python
# Enable quantization and on-disk storage
client.create_collection(
    collection_name="large_collection",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
    quantization_config=ScalarQuantization(...),
    on_disk_payload=True
)
```

**Connection issues:**
```python
# Set a request timeout (seconds) and prefer gRPC; retries are left to the caller
client = QdrantClient(
    host="localhost",
    port=6333,
    timeout=30,
    prefer_grpc=True  # gRPC for better performance
)
```
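The client configuration above sets a timeout but does not retry failed calls. A minimal retry wrapper with exponential backoff might look like the sketch below. The `with_retry` helper is hypothetical, and the default `retry_on` tuple is an assumption: which exception types the client actually raises depends on the transport (REST or gRPC) and should be checked for your setup.

```python
import time


def with_retry(operation, attempts=3, base_delay=0.5,
               retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call `operation()` and retry on the given exceptions.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts
    and re-raises the last error once `attempts` is exhausted.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# With a configured client, a search could be wrapped like this:
# results = with_retry(lambda: client.search(
#     collection_name="documents", query_vector=query_vector, limit=10))
```

Passing `sleep` as a parameter keeps the helper easy to test, since a stub can record the delays instead of actually waiting.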
References
- Advanced Usage - Distributed mode, hybrid search, recommendations
- Troubleshooting - Common issues, debugging, performance tuning
Resources
- GitHub: https://github.com/qdrant/qdrant (22k+ stars)
- Docs: https://qdrant.tech/documentation/
- Python Client: https://github.com/qdrant/qdrant-client
- Cloud: https://cloud.qdrant.io
- Version: 1.12.0+
- License: Apache 2.0