Qdrant - Vector Similarity Search Engine

High-performance vector database written in Rust for production RAG and semantic search.

When to use Qdrant

Use Qdrant when:

Building production RAG systems that require low latency
Combining vector search with metadata (payload) filtering
Scaling horizontally with sharding and replication
Deploying on-premise with full control over your data
Storing multiple vectors per record (dense + sparse)
Building real-time recommendation systems

Key features:

Rust-powered: Memory-safe, high performance
Rich filtering: Filter by any payload field during search
Multiple vectors: Dense, sparse, multi-dense per point
Quantization: Scalar, product, binary for memory efficiency
Distributed: Raft consensus, sharding, replication
REST + gRPC: Both APIs with full feature parity

Use alternatives instead:

Chroma: Simpler setup, embedded use cases
FAISS: Maximum raw speed, research/batch processing
Pinecone: Fully managed, zero ops preferred
Weaviate: GraphQL preference, built-in vectorizers

Quick start

Installation

    # Python client
    pip install qdrant-client

    # Docker (recommended for development)
    docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant

    # Docker with persistent storage
    docker run -p 6333:6333 -p 6334:6334 \
        -v $(pwd)/qdrant_storage:/qdrant/storage \
        qdrant/qdrant

Basic usage

    from qdrant_client import QdrantClient
    from qdrant_client.models import Distance, VectorParams, PointStruct

    # Connect to Qdrant
    client = QdrantClient(host="localhost", port=6333)

    # Create collection
    client.create_collection(
        collection_name="documents",
        vectors_config=VectorParams(size=384, distance=Distance.COSINE),
    )

    # Insert vectors with payload
    client.upsert(
        collection_name="documents",
        points=[
            PointStruct(
                id=1,
                vector=[0.1, 0.2, ...],  # 384-dim vector
                payload={"title": "Doc 1", "category": "tech"},
            ),
            PointStruct(
                id=2,
                vector=[0.3, 0.4, ...],
                payload={"title": "Doc 2", "category": "science"},
            ),
        ],
    )

    # Search with filtering
    results = client.search(
        collection_name="documents",
        query_vector=[0.15, 0.25, ...],
        query_filter={
            "must": [{"key": "category", "match": {"value": "tech"}}]
        },
        limit=10,
    )

    for point in results:
        print(f"ID: {point.id}, Score: {point.score}, Payload: {point.payload}")

Core concepts

核心概念

Points - Basic data unit

    from qdrant_client.models import PointStruct

    # Point = ID + Vector(s) + Payload
    point = PointStruct(
        id=123,  # Unsigned integer or UUID string
        vector=[0.1, 0.2, 0.3, ...],  # Dense vector
        payload={  # Arbitrary JSON metadata
            "title": "Document title",
            "category": "tech",
            "timestamp": 1699900000,
            "tags": ["python", "ml"],
        },
    )

    # Batch upsert (recommended)
    client.upsert(
        collection_name="documents",
        points=[point1, point2, point3],
        wait=True,  # Block until the write has been applied
    )
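
Because point IDs are limited to unsigned integers and UUIDs, documents keyed by arbitrary strings (file paths, URLs) need a mapping step. The following is a minimal sketch of one way to do that with the standard library. The helper name `to_point_id` and the namespace URL are hypothetical, chosen only for illustration. `uuid.uuid5` is deterministic, so upserting the same document again targets the same point instead of creating a duplicate.

```python
import uuid

# Hypothetical namespace for this sketch; any fixed UUID works as long as
# it never changes between runs.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/my-docs")


def to_point_id(external_id: str) -> str:
    """Map an arbitrary string key to a stable UUID string usable as a point ID."""
    return str(uuid.uuid5(NAMESPACE, external_id))


doc_id = to_point_id("docs/intro.md")
print(doc_id)
```

The resulting string can be passed as `id=` when building a `PointStruct`.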

Collections - Vector containers

    from qdrant_client.models import VectorParams, Distance, HnswConfigDiff

    # Create with HNSW configuration
    client.create_collection(
        collection_name="documents",
        vectors_config=VectorParams(
            size=384,  # Vector dimensions
            distance=Distance.COSINE,  # COSINE, EUCLID, DOT, MANHATTAN
        ),
        hnsw_config=HnswConfigDiff(
            m=16,  # Connections per node (default 16)
            ef_construct=100,  # Build-time accuracy (default 100)
            full_scan_threshold=10000,  # Switch to brute force below this
        ),
        on_disk_payload=True,  # Store payload on disk
    )

    # Collection info
    info = client.get_collection("documents")
    print(f"Points: {info.points_count}, Vectors: {info.vectors_count}")

Distance metrics

Metric	Use Case	Score range
`COSINE`	Text embeddings, normalized vectors	-1 to 1 (similarity, higher is closer)
`EUCLID`	Spatial data, image features	0 to ∞ (distance, lower is closer)
`DOT`	Recommendations, unnormalized vectors	-∞ to ∞ (higher is closer)
`MANHATTAN`	Sparse features, discrete data	0 to ∞ (distance, lower is closer)
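
To make the four metrics concrete, here is a small pure-Python sketch that computes each of them for one pair of vectors. These are the textbook definitions, not Qdrant's internal implementation, and the function names are chosen for illustration only. Cosine is computed as a similarity; the corresponding cosine distance is 1 minus that value.

```python
import math


def dot(a, b):
    """Dot product: sum of component-wise products."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product of the two vectors after scaling each to unit length."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean(a, b):
    """Straight-line (L2) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute component differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


a = [1.0, 0.0]
b = [0.0, 2.0]

print(dot(a, b))                # 0.0
print(cosine_similarity(a, b))  # 0.0 (orthogonal vectors)
print(euclidean(a, b))          # sqrt(5), about 2.236
print(manhattan(a, b))          # 3.0
```

Note that cosine ignores vector length while dot product does not, which is why dot product is listed for unnormalized vectors.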

Search operations

Basic search

    # Simple nearest neighbor search
    results = client.search(
        collection_name="documents",
        query_vector=[0.1, 0.2, ...],
        limit=10,
        with_payload=True,
        with_vectors=False,  # Don't return vectors (faster)
    )

Filtered search

    from qdrant_client.models import Filter, FieldCondition, MatchValue, Range

    # Complex filtering
    results = client.search(
        collection_name="documents",
        query_vector=query_embedding,
        query_filter=Filter(
            must=[
                FieldCondition(key="category", match=MatchValue(value="tech")),
                FieldCondition(key="timestamp", range=Range(gte=1699000000)),
            ],
            must_not=[
                FieldCondition(key="status", match=MatchValue(value="archived")),
            ],
        ),
        limit=10,
    )

    # Shorthand filter syntax
    results = client.search(
        collection_name="documents",
        query_vector=query_embedding,
        query_filter={
            "must": [
                {"key": "category", "match": {"value": "tech"}},
                {"key": "price", "range": {"gte": 10, "lte": 100}},
            ]
        },
        limit=10,
    )
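
When filters are assembled from user input, it can help to build the shorthand dictionary programmatically rather than by hand. The sketch below is one possible approach. `build_filter` and its `equals` / `ranges` parameters are hypothetical names, not part of the Qdrant client; the output simply mirrors the shorthand `must` / `match` / `range` structure used above.

```python
def build_filter(equals=None, ranges=None):
    """Build a shorthand filter dict.

    equals: {field: value} for exact matches
    ranges: {field: (gte, lte)}; use None for an open bound
    """
    must = []
    for key, value in (equals or {}).items():
        must.append({"key": key, "match": {"value": value}})
    for key, (gte, lte) in (ranges or {}).items():
        bounds = {}
        if gte is not None:
            bounds["gte"] = gte
        if lte is not None:
            bounds["lte"] = lte
        must.append({"key": key, "range": bounds})
    return {"must": must}


query_filter = build_filter(
    equals={"category": "tech"},
    ranges={"price": (10, 100)},
)
print(query_filter)
```

The returned dictionary has the same shape as the shorthand filter and could be passed as `query_filter`.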

Batch search

    from qdrant_client.models import SearchRequest

    # Multiple queries in one request
    results = client.search_batch(
        collection_name="documents",
        requests=[
            SearchRequest(vector=[0.1, ...], limit=5),
            SearchRequest(vector=[0.2, ...], limit=5, filter={"must": [...]}),
            SearchRequest(vector=[0.3, ...], limit=10),
        ],
    )

RAG integration

With sentence-transformers

    from sentence_transformers import SentenceTransformer
    from qdrant_client import QdrantClient
    from qdrant_client.models import VectorParams, Distance, PointStruct

    # Initialize
    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    client = QdrantClient(host="localhost", port=6333)

    # Create collection (all-MiniLM-L6-v2 produces 384-dim embeddings)
    client.create_collection(
        collection_name="knowledge_base",
        vectors_config=VectorParams(size=384, distance=Distance.COSINE),
    )

    # Index documents
    documents = [
        {"id": 1, "text": "Python is a programming language", "source": "wiki"},
        {"id": 2, "text": "Machine learning uses algorithms", "source": "textbook"},
    ]

    points = [
        PointStruct(
            id=doc["id"],
            vector=encoder.encode(doc["text"]).tolist(),
            payload={"text": doc["text"], "source": doc["source"]},
        )
        for doc in documents
    ]
    client.upsert(collection_name="knowledge_base", points=points)

    # RAG retrieval
    def retrieve(query: str, top_k: int = 5) -> list[dict]:
        query_vector = encoder.encode(query).tolist()
        results = client.search(
            collection_name="knowledge_base",
            query_vector=query_vector,
            limit=top_k,
        )
        return [{"text": r.payload["text"], "score": r.score} for r in results]

    # Use in RAG pipeline
    context = retrieve("What is Python?")
    prompt = f"Context: {context}\n\nQuestion: What is Python?"
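
Interpolating the raw list of hits into the prompt puts Python's `repr` of a list of dicts in front of the model. A small formatting step usually reads better. The sketch below assumes hits shaped like the output of `retrieve` (dicts with `text` and `score`); `build_prompt` and its `max_chars` budget are hypothetical and shown only as one possible design.

```python
def build_prompt(question: str, hits: list[dict], max_chars: int = 2000) -> str:
    """Format retrieved chunks into a numbered context block, best score first."""
    ranked = sorted(hits, key=lambda h: h["score"], reverse=True)
    lines = []
    used = 0
    for i, hit in enumerate(ranked, start=1):
        entry = f"[{i}] {hit['text']}"
        if used + len(entry) > max_chars:
            break  # Stop before exceeding the context budget
        lines.append(entry)
        used += len(entry)
    context = "\n".join(lines)
    return f"Context:\n{context}\n\nQuestion: {question}"


hits = [
    {"text": "Machine learning uses algorithms", "score": 0.31},
    {"text": "Python is a programming language", "score": 0.87},
]
print(build_prompt("What is Python?", hits))
```

Numbering the chunks also gives the model something to cite, and the character budget is a crude stand-in for a real token count.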

With LangChain

    from langchain_community.vectorstores import Qdrant
    from langchain_community.embeddings import HuggingFaceEmbeddings

    embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
    vectorstore = Qdrant.from_documents(
        documents,
        embeddings,
        url="http://localhost:6333",
        collection_name="docs",
    )
    retriever = vectorstore.as_retriever(search_kwargs={"k": 5})

With LlamaIndex

    from llama_index.vector_stores.qdrant import QdrantVectorStore
    from llama_index.core import VectorStoreIndex, StorageContext

    vector_store = QdrantVectorStore(client=client, collection_name="llama_docs")
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)
    query_engine = index.as_query_engine()

Multi-vector support

Named vectors (different embedding models)

    from qdrant_client.models import VectorParams, Distance, PointStruct

    # Collection with multiple vector types
    client.create_collection(
        collection_name="hybrid_search",
        vectors_config={
            "dense": VectorParams(size=384, distance=Distance.COSINE),
            "sparse": VectorParams(size=30000, distance=Distance.DOT),
        },
    )

    # Insert with named vectors
    client.upsert(
        collection_name="hybrid_search",
        points=[
            PointStruct(
                id=1,
                vector={
                    "dense": dense_embedding,
                    "sparse": sparse_embedding,
                },
                payload={"text": "document text"},
            )
        ],
    )

    # Search specific vector
    results = client.search(
        collection_name="hybrid_search",
        query_vector=("dense", query_dense),  # Specify which vector
        limit=10,
    )

Sparse vectors (BM25, SPLADE)

    from qdrant_client.models import (
        PointStruct,
        SparseIndexParams,
        SparseVector,
        SparseVectorParams,
    )

    # Collection with sparse vectors
    client.create_collection(
        collection_name="sparse_search",
        vectors_config={},
        sparse_vectors_config={
            "text": SparseVectorParams(index=SparseIndexParams(on_disk=False))
        },
    )

    # Insert sparse vector
    client.upsert(
        collection_name="sparse_search",
        points=[
            PointStruct(
                id=1,
                vector={
                    "text": SparseVector(indices=[1, 5, 100], values=[0.5, 0.8, 0.2])
                },
                payload={"text": "document"},
            )
        ],
    )
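
A sparse vector is just two parallel lists: the positions of the non-zero dimensions and their weights. The sketch below shows one way to produce that pair from tokenized text. It uses raw term counts as a stand-in for real BM25 or SPLADE weights, and both `to_sparse` and the tiny `vocab` mapping are hypothetical.

```python
from collections import Counter


def to_sparse(tokens: list[str], vocab: dict[str, int]) -> tuple[list[int], list[float]]:
    """Convert tokens to parallel (indices, values) lists using raw term counts.

    Tokens missing from `vocab` are skipped. Indices are sorted and unique.
    """
    counts = Counter(vocab[t] for t in tokens if t in vocab)
    indices = sorted(counts)
    values = [float(counts[i]) for i in indices]
    return indices, values


vocab = {"python": 1, "vector": 5, "search": 100}  # Hypothetical vocabulary
indices, values = to_sparse(["vector", "search", "vector", "unknown"], vocab)
print(indices, values)  # [5, 100] [2.0, 1.0]
```

The two lists map directly onto the `indices` and `values` arguments of `SparseVector`.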

Quantization (memory optimization)

    from qdrant_client.models import (
        Distance,
        ScalarQuantization,
        ScalarQuantizationConfig,
        ScalarType,
        VectorParams,
    )

    # Scalar quantization (4x memory reduction)
    client.create_collection(
        collection_name="quantized",
        vectors_config=VectorParams(size=384, distance=Distance.COSINE),
        quantization_config=ScalarQuantization(
            scalar=ScalarQuantizationConfig(
                type=ScalarType.INT8,
                quantile=0.99,  # Clip outliers
                always_ram=True,  # Keep quantized vectors in RAM
            )
        ),
    )

    # Search with rescoring
    results = client.search(
        collection_name="quantized",
        query_vector=query,
        search_params={"quantization": {"rescore": True}},  # Rescore top results
        limit=10,
    )
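
The 4x figure for scalar quantization follows from storing each component as a 1-byte int8 instead of a 4-byte float32. The arithmetic below is a back-of-the-envelope sketch for a collection of one million 384-dimensional vectors. It counts raw vector storage only, and `vector_memory_bytes` is a hypothetical helper.

```python
def vector_memory_bytes(num_vectors: int, dim: int, bytes_per_component: int) -> int:
    """Raw vector storage only; excludes the HNSW graph, payloads, and overhead."""
    return num_vectors * dim * bytes_per_component


n, dim = 1_000_000, 384
float32_bytes = vector_memory_bytes(n, dim, 4)  # float32: 4 bytes per component
int8_bytes = vector_memory_bytes(n, dim, 1)     # int8: 1 byte per component

print(f"float32: {float32_bytes / 1e9:.2f} GB")  # float32: 1.54 GB
print(f"int8:    {int8_bytes / 1e9:.2f} GB")     # int8:    0.38 GB
print(f"ratio:   {float32_bytes // int8_bytes}x")  # ratio:   4x
```

Actual memory use will be higher once the index and payloads are included, so treat these numbers as a lower bound.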

Payload indexing

    from qdrant_client.models import PayloadSchemaType

    # Create payload index for faster filtering
    client.create_payload_index(
        collection_name="documents",
        field_name="category",
        field_schema=PayloadSchemaType.KEYWORD,
    )

    client.create_payload_index(
        collection_name="documents",
        field_name="timestamp",
        field_schema=PayloadSchemaType.INTEGER,
    )

    # Index types: KEYWORD, INTEGER, FLOAT, GEO, TEXT (full-text), BOOL

Production deployment

Qdrant Cloud

    from qdrant_client import QdrantClient

    # Connect to Qdrant Cloud
    client = QdrantClient(
        url="https://your-cluster.cloud.qdrant.io",
        api_key="your-api-key",
    )

Performance tuning

    from qdrant_client.models import HnswConfigDiff, OptimizersConfigDiff

    # Optimize for search quality (higher recall)
    client.update_collection(
        collection_name="documents",
        hnsw_config=HnswConfigDiff(ef_construct=200, m=32),
    )

    # Optimize for indexing speed (bulk loads)
    client.update_collection(
        collection_name="documents",
        optimizers_config=OptimizersConfigDiff(indexing_threshold=20000),
    )

Best practices

最佳实践

Batch operations - Use batch upsert/search for efficiency
Payload indexing - Index fields used in filters
Quantization - Enable for large collections (>1M vectors)
Sharding - Use for collections >10M vectors
On-disk storage - Enable `on_disk_payload` for large payloads
Connection pooling - Reuse client instances
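
The batch-operations advice above can be sketched with a small chunking helper, so that a large list of points is sent in several moderately sized upserts rather than one huge request or many single-point calls. `batched` and the batch size of 4 are illustrative choices, not recommendations from the Qdrant documentation, and the `upsert` call is left as a comment because it needs a running server.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[list[T]]:
    """Yield successive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while batch := list(islice(iterator, batch_size)):
        yield batch


# Usage sketch: `points` stands in for a list of PointStruct objects.
points = list(range(10))
for batch in batched(points, batch_size=4):
    print(len(batch))  # 4, 4, 2
    # client.upsert(collection_name="documents", points=batch)
```

Because the helper works on any iterable, points can also be generated lazily instead of being held in memory all at once.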

Common issues

Slow search with filters:

    # Create payload index for filtered fields
    client.create_payload_index(
        collection_name="docs",
        field_name="category",
        field_schema=PayloadSchemaType.KEYWORD,
    )

Out of memory:

    # Enable quantization and on-disk storage
    client.create_collection(
        collection_name="large_collection",
        vectors_config=VectorParams(size=384, distance=Distance.COSINE),
        quantization_config=ScalarQuantization(...),
        on_disk_payload=True,
    )

Connection issues:

    # Set a request timeout (seconds); retries are left to the caller
    client = QdrantClient(
        host="localhost",
        port=6333,
        timeout=30,
        prefer_grpc=True,  # gRPC for better performance
    )
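
The timeout shown for connection issues does not retry failed calls. A minimal sketch of one way to add retries with exponential backoff around any client call follows. `with_retry` and its parameters are hypothetical and not part of `qdrant-client`. In practice you would likely narrow `retry_on` to the connection-related exceptions you actually observe.

```python
import time


def with_retry(operation, attempts=3, base_delay=0.5, retry_on=(Exception,)):
    """Call `operation()` and retry with exponential backoff on failure.

    Re-raises the last exception once `attempts` calls have failed.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))


# Usage sketch (assumes a reachable server and an existing `client`):
# info = with_retry(lambda: client.get_collection("documents"))
```

With the defaults, the delays between attempts are 0.5 s and then 1 s before the final attempt.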

References

Advanced Usage - Distributed mode, hybrid search, recommendations
Troubleshooting - Common issues, debugging, performance tuning

Resources

GitHub: https://github.com/qdrant/qdrant (22k+ stars)
Docs: https://qdrant.tech/documentation/
Python Client: https://github.com/qdrant/qdrant-client
Cloud: https://cloud.qdrant.io
Version: 1.12.0+
License: Apache 2.0
