documentdb-vector-search

Vector Search: Azure DocumentDB (cosmosSearch)

Azure DocumentDB's native vector index type is cosmosSearch. Pick the sub-type by scale:

Index sub-type                  Scale sweet spot       Tier
vector-diskann (recommended)    Up to 500k+ vectors    M30+
vector-hnsw                     Up to ~50k vectors     M30+
vector-ivf                      Under ~10k vectors     M10+

Similarity options: COS (cosine), L2 (Euclidean), IP (inner product).
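
The sizing guidance above can be written as a small helper. This is only a sketch: the function name `choose_index_kind` and the `production` flag are hypothetical, and the cut-offs are the approximate figures from the table, not hard limits enforced by the service.

```python
from typing import Tuple


def choose_index_kind(num_vectors: int, production: bool = False) -> Tuple[str, str]:
    """Return (index sub-type, minimum tier) following the table above.

    The thresholds (~10k and ~50k) are the approximate sweet spots quoted
    in the table; treat them as starting points, not exact boundaries.
    """
    if num_vectors < 0:
        raise ValueError("num_vectors must be non-negative")
    # DiskANN is the recommended sub-type, so prefer it for production use.
    if production:
        return ("vector-diskann", "M30+")
    if num_vectors < 10_000:
        return ("vector-ivf", "M10+")
    if num_vectors <= 50_000:
        return ("vector-hnsw", "M30+")
    return ("vector-diskann", "M30+")


if __name__ == "__main__":
    for n in (5_000, 40_000, 750_000):
        print(n, choose_index_kind(n))
```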

Rules

  • vector-choose-index-type: Prefer DiskANN for production; use HNSW up to 50k, IVF under 10k.
  • vector-create-diskann-index: Create a vector-diskann index with correct dimensions, similarity, maxDegree, and lBuild.
  • vector-knn-query: Query with $search + cosmosSearch; tune lSearch and k; combine with pre-filters.
  • vector-product-quantization: Shrink high-dimensional vectors (up to 16,000 dims) while preserving recall.
  • vector-half-precision: Halve vector memory with fp16 indexing and minimal recall loss.
  • vector-normalize-embeddings: Normalize embeddings when using cosine similarity; store model + dimensions alongside vectors.
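
For the vector-create-diskann-index and vector-knn-query rules, the sketch below builds the index command and the query pipeline as plain Python dictionaries. The helper names, the `contentVector` field, the index name, and the default values for maxDegree, lBuild, and lSearch are assumptions for illustration; the document shapes follow the cosmosSearch options named in the rules, and the values should be checked against the current Azure documentation before use.

```python
from typing import Any, Dict, List, Optional, Sequence


def diskann_index_command(collection: str, path: str, dimensions: int,
                          similarity: str = "COS",
                          max_degree: int = 32, l_build: int = 50) -> Dict[str, Any]:
    """Build a createIndexes command for a vector-diskann index."""
    if similarity not in {"COS", "L2", "IP"}:
        raise ValueError("similarity must be COS, L2, or IP")
    return {
        "createIndexes": collection,  # must be the first key of the command
        "indexes": [{
            "name": f"{path}_diskann",  # hypothetical naming scheme
            "key": {path: "cosmosSearch"},
            "cosmosSearchOptions": {
                "kind": "vector-diskann",
                "dimensions": dimensions,  # must match the embedding model
                "similarity": similarity,
                "maxDegree": max_degree,
                "lBuild": l_build,
            },
        }],
    }


def knn_pipeline(query_vector: Sequence[float], path: str, k: int = 5,
                 l_search: int = 40,
                 pre_filter: Optional[Dict[str, Any]] = None) -> List[Dict[str, Any]]:
    """Build a $search + cosmosSearch aggregation pipeline."""
    search: Dict[str, Any] = {
        "vector": list(query_vector),
        "path": path,
        "k": k,
        "lSearch": l_search,
    }
    if pre_filter is not None:
        search["filter"] = pre_filter  # pre-filter applied with the search
    return [
        {"$search": {"cosmosSearch": search, "returnStoredSource": True}},
        {"$project": {"score": {"$meta": "searchScore"}, "document": "$$ROOT"}},
    ]


if __name__ == "__main__":
    print(diskann_index_command("docs", "contentVector", 1536))
    print(knn_pipeline([0.1, 0.2, 0.3], "contentVector", k=3,
                       pre_filter={"lang": {"$eq": "en"}}))
```

With PyMongo, the command dictionary would typically be passed to `db.command(...)` and the pipeline to `collection.aggregate(...)`; neither call is made here, so the sketch runs without a server.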
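
For the vector-normalize-embeddings rule, a minimal sketch of L2 normalization and of storing the model and dimensions next to the vector follows. The field names (`contentVector`, `embeddingModel`, `embeddingDimensions`) and the model name in the usage line are hypothetical, not prescribed by the service.

```python
import math
from typing import Any, Dict, List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so cosine similarity behaves consistently."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def embedding_document(text: str, vec: Sequence[float], model: str) -> Dict[str, Any]:
    """Build a document that keeps the embedding's provenance with the vector."""
    unit = l2_normalize(vec)
    return {
        "text": text,
        "contentVector": unit,             # hypothetical vector field name
        "embeddingModel": model,           # which model produced the vector
        "embeddingDimensions": len(unit),  # should match the index's dimensions
    }


if __name__ == "__main__":
    doc = embedding_document("hello", [3.0, 4.0], "example-embedding-model")
    print(doc)
```

Recording the model and dimension count makes it possible to detect vectors from a different model before they are mixed into the same index.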