storing-and-querying-vectors

Store and Query Vectors with Amazon S3 Vectors

Overview

Amazon S3 Vectors is a cost-effective AWS service for storing and querying vector embeddings at scale. It is optimized for long-term storage, with subsecond latency for cold queries and latency as low as 100 ms for warm queries.

Decision Guide

  • Hundreds/thousands of sustained queries per second (QPS): Wrong tool. Recommend OpenSearch.
  • Hybrid search, aggregations, faceted search: Recommend OpenSearch with S3 Vectors as storage engine. For OpenSearch integration, search AWS docs for "Using S3 Vectors with OpenSearch Service".
  • Tiered (bulk + hot): S3 Vectors for storage + OpenSearch Serverless for real-time. See references/limits-and-patterns.md.
  • Cost-effective storage, infrequent queries, RAG: S3 Vectors is the right fit. Proceed.
For latest guidance, search AWS docs for "S3 Vectors best practices".

Common Tasks

Classify the request before starting:
  • Simple query: An index already exists; skip to Step 6.
  • Standard: You MUST list existing indexes first and suggest reusing one if relevant. Otherwise, create a new index and store vectors by following Steps 2-6.
  • Migration or multi-tenant: Read references/limits-and-patterns.md first, then follow Steps 2-6.
You MUST execute commands using AWS MCP server tools when connected. Fall back to AWS CLI only if AWS MCP is unavailable. You MUST explain each step to the user before executing.

1. Verify Dependencies

Constraints:
  • You MUST check whether AWS MCP tools or AWS CLI is available and inform user if missing
  • You MUST confirm target AWS region

2. Create a Vector Bucket

You MUST confirm bucket name with user. Names: 3-63 chars, lowercase letters, numbers, hyphens only. Encryption (SSE-S3 default or SSE-KMS for compliance) is immutable after creation.
bash
aws s3vectors create-vector-bucket \
  --vector-bucket-name <BUCKET_NAME>
Constraints:
  • You MUST explain encryption cannot be changed after creation
  • For SSE-KMS, the KMS key policy MUST grant kms:GenerateDataKey and kms:Decrypt to the S3 Vectors service principal indexing.s3vectors.amazonaws.com. You MUST use the full KMS key ARN (not an alias). See references/limits-and-patterns.md for a command example.
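Because the bucket name has to be confirmed before creation, it can help to check a proposed name locally first. The following is a minimal sketch that encodes only the naming rules listed in this step (3-63 characters; lowercase letters, numbers, hyphens). The function name `is_valid_bucket_name` is hypothetical, and the service may enforce additional rules that are not captured here.

```python
import re

# Rules as stated in this step: 3-63 characters, lowercase letters, numbers, hyphens.
# Assumption: the service may apply further restrictions not encoded here.
BUCKET_NAME_RE = re.compile(r"[a-z0-9-]{3,63}")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` satisfies the naming rules listed in this step."""
    return BUCKET_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["my-vectors", "My_Vectors", "ab"]:
        print(candidate, is_valid_bucket_name(candidate))
```

A failed check is a cue to go back to the user for a different name rather than letting the create call fail.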

3. Create a Vector Index

Every parameter is immutable after creation.
Pre-flight checklist (confirm ALL with user):
  1. Dimension (required, integer 1-4096) -- MUST match embedding model output
  2. Distance metric (required) -- cosine or euclidean. Use the embedding model's recommended metric.
  3. Non-filterable metadata keys (optional, max 10, 1-63 chars) -- Declare at creation; keys cannot be added later. For Bedrock Knowledge Bases integration, search AWS docs for "S3 Vectors Bedrock Knowledge Bases prerequisites" to get the required key names.
  4. Encryption (optional) -- Inherits from bucket. Override per-index if needed.
bash
aws s3vectors create-index \
  --vector-bucket-name <BUCKET_NAME> \
  --index-name <INDEX_NAME> \
  --dimension <DIM> \
  --distance-metric <cosine|euclidean> \
  --data-type float32 \
  --metadata-configuration '{"nonFilterableMetadataKeys":["<KEY1>","<KEY2>"]}'
Omit --metadata-configuration if no non-filterable keys are needed.
Index names: 3-63 chars, lowercase letters, numbers, hyphens, dots. Unique within bucket. Filterable metadata: 2 KB limit. Total metadata (filterable + non-filterable combined): 40 KB. See references/metadata-filtering.md.
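Since every index parameter is immutable, the pre-flight checklist can be mirrored in a small local check before calling create-index. This is a sketch under the limits stated in this step only; `preflight_index_errors` is a hypothetical helper name, and it does not verify that the dimension actually matches the embedding model, which still has to be confirmed with the user.

```python
import re

# Limits taken from this step; the service may enforce more than is listed here.
INDEX_NAME_RE = re.compile(r"[a-z0-9.-]{3,63}")
ALLOWED_METRICS = {"cosine", "euclidean"}
MAX_NON_FILTERABLE_KEYS = 10


def preflight_index_errors(name, dimension, metric, non_filterable_keys=()):
    """Return a list of problems found; an empty list means the checks passed."""
    errors = []
    if not INDEX_NAME_RE.fullmatch(name):
        errors.append("index name must be 3-63 chars: lowercase letters, numbers, hyphens, dots")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(dimension, bool) or not isinstance(dimension, int) or not 1 <= dimension <= 4096:
        errors.append("dimension must be an integer from 1 to 4096")
    if metric not in ALLOWED_METRICS:
        errors.append("distance metric must be 'cosine' or 'euclidean'")
    keys = list(non_filterable_keys)
    if len(keys) > MAX_NON_FILTERABLE_KEYS:
        errors.append("at most 10 non-filterable metadata keys are allowed")
    for key in keys:
        if not 1 <= len(key) <= 63:
            errors.append(f"non-filterable key {key!r} must be 1-63 chars")
    return errors


if __name__ == "__main__":
    print(preflight_index_errors("docs-index", 1024, "cosine", ["source_text"]))
    print(preflight_index_errors("X", 0, "dot"))
```

Returning a list instead of raising on the first problem lets all issues be shown to the user in one pass.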

4. Generate Embeddings (if needed)

Skip to Step 5 (store) or Step 6 (query) if user already has embeddings.
Constraints:
  • You MUST ask which embedding model to use if not specified
  • You MUST NOT assume a default model
  • Dimension MUST match Step 3
  • You MUST use the same model for both storing and querying
Generate embeddings with Bedrock invoke-model:
bash
aws bedrock-runtime invoke-model \
  --model-id <MODEL_ID> \
  --content-type application/json \
  --cli-binary-format raw-in-base64-out \
  --body '{"inputText": "your text"}' \
  invoke-model-output.json
You MUST use --cli-binary-format raw-in-base64-out for CLI v2. The CLI requires an output file argument. The response key is model-dependent (e.g., embedding for Titan, embeddings for Cohere). For Titan, parse with json.load(open('invoke-model-output.json'))['embedding']. Use the embedding array as float32 in put-vectors or query-vectors. For batch embedding generation, use AWS SDK or CLI.
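Because the response key depends on the model, the parsing step can be wrapped in a small helper. The sketch below handles only the two keys named above. Treating `embeddings` as a list with one vector per input text is an assumption that should be checked against the chosen model's response format, and the names `extract_embedding` and `load_embedding` are hypothetical.

```python
import json


def extract_embedding(response: dict) -> list:
    """Pull a single embedding vector out of a parsed invoke-model response body."""
    if "embedding" in response:
        # Key named in the text above for Titan.
        vector = response["embedding"]
    elif "embeddings" in response:
        # Key named in the text above for Cohere.
        # Assumption: one vector per input text, so take the first entry.
        vector = response["embeddings"][0]
    else:
        raise KeyError("no 'embedding' or 'embeddings' key in response")
    return [float(x) for x in vector]


def load_embedding(path: str = "invoke-model-output.json") -> list:
    """Read the CLI output file written by invoke-model and return the vector."""
    with open(path) as f:
        return extract_embedding(json.load(f))


if __name__ == "__main__":
    print(extract_embedding({"embedding": [0.1, 0.2, 0.3]}))
```

The length of the returned list is worth comparing with the index dimension from Step 3 before storing or querying.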

5. Put Vectors

bash
aws s3vectors put-vectors \
  --vector-bucket-name <BUCKET_NAME> \
  --index-name <INDEX_NAME> \
  --vectors '[{"key":"<ID>","data":{"float32":[<EMBEDDING>]},"metadata":{"topic":"science"}}]'
Constraints:
  • You MUST NOT exceed 500 vectors per call
  • You SHOULD batch vectors for cost optimization
  • For bulk operations, you SHOULD use an SDK instead of CLI -- vector payloads may be too large for shell arguments
  • You MUST implement retry with backoff on 429 TooManyRequestsException
  • See references/limits-and-patterns.md for batch patterns
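The batching and retry constraints above can be sketched with an SDK-style client. This is an illustration, not the pattern from references/limits-and-patterns.md. It assumes a client object exposing `put_vectors(vectorBucketName=..., indexName=..., vectors=...)` (for example one created with `boto3.client("s3vectors")`) and assumes throttling surfaces as an exception whose `response["Error"]["Code"]` is `TooManyRequestsException`. The helper names and the retry defaults are hypothetical.

```python
import random
import time

MAX_VECTORS_PER_CALL = 500  # per-call limit stated in the constraints above


def chunked(items, size=MAX_VECTORS_PER_CALL):
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def is_throttled(exc) -> bool:
    """Assumption: throttling is reported botocore-style via exc.response."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") == "TooManyRequestsException"


def put_in_batches(client, bucket, index, vectors,
                   max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Send vectors in batches, retrying throttled calls with exponential backoff."""
    for batch in chunked(vectors):
        for attempt in range(max_attempts):
            try:
                client.put_vectors(vectorBucketName=bucket, indexName=index, vectors=batch)
                break
            except Exception as exc:
                # Re-raise anything that is not throttling, or the final failed attempt.
                if not is_throttled(exc) or attempt == max_attempts - 1:
                    raise
                # Exponential backoff plus jitter so parallel writers do not retry in lockstep.
                sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

Each element of `vectors` is expected to have the same shape as the CLI example, such as `{"key": "<ID>", "data": {"float32": [...]}, "metadata": {...}}`. The `sleep` parameter exists only so the delay can be replaced when testing.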

6. Query Vectors

Generate embedding if needed (Step 4), then query:
bash
aws s3vectors query-vectors \
  --vector-bucket-name <BUCKET_NAME> \
  --index-name <INDEX_NAME> \
  --query-vector '{"float32":[<EMBEDDING>]}' \
  --top-k 10 \
  --return-distance
Optional: add --return-metadata and/or --filter '{"topic":{"$eq":"science"}}' (both require GetVectors permission). See references/metadata-filtering.md.
Example response body:
{"vectors": [{"key": "id1", "distance": 0.45, "metadata": {"topic": "science"}}, ...], "distanceMetric": "cosine"}
Constraints:
  • Using --filter or --return-metadata requires both s3vectors:QueryVectors AND s3vectors:GetVectors IAM permissions. Without GetVectors, these options return 403.
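When querying through an SDK rather than the CLI, the request can be assembled so that the options needing the extra permission are easy to spot. This sketch assumes the SDK parameters are camelCase counterparts of the CLI flags (`vectorBucketName`, `indexName`, `queryVector`, `topK`, `returnDistance`, `returnMetadata`, `filter`). The function names are hypothetical, and the result would be passed to a client as `client.query_vectors(**args)`.

```python
def build_query_args(bucket, index, embedding, top_k=10,
                     metadata_filter=None, return_metadata=False):
    """Assemble keyword arguments for a query-vectors call."""
    args = {
        "vectorBucketName": bucket,
        "indexName": index,
        "queryVector": {"float32": [float(x) for x in embedding]},
        "topK": top_k,
        "returnDistance": True,
    }
    # Both optional settings below require s3vectors:GetVectors in addition
    # to s3vectors:QueryVectors, per the constraint above.
    if return_metadata:
        args["returnMetadata"] = True
    if metadata_filter is not None:
        args["filter"] = metadata_filter
    return args


def needs_get_vectors_permission(args) -> bool:
    """True when the request uses an option that requires s3vectors:GetVectors."""
    return "filter" in args or bool(args.get("returnMetadata"))


if __name__ == "__main__":
    query = build_query_args("my-bucket", "docs-index", [0.1, 0.2],
                             metadata_filter={"topic": {"$eq": "science"}})
    print(needs_get_vectors_permission(query))
```

Checking `needs_get_vectors_permission` before sending makes it possible to warn the user about the IAM requirement instead of surfacing a 403.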

Troubleshooting

| Error | Cause | Fix |
|---|---|---|
| DimensionMismatch | Vector dimensions don't match the index | Use a matching model, or delete/recreate the index (confirm with user -- destroys all vectors). |
| 403 Forbidden with --filter or --return-metadata | Missing s3vectors:GetVectors | Add s3vectors:GetVectors to IAM policy. |
| Fewer results than --top-k | Few vectors match filter | Expected -- filtering is inline. Broaden filter. |
| 429 TooManyRequestsException | Exceeded per-index rate limits | Retry with backoff. Shard across indexes for sustained throughput. Search AWS docs for "S3 Vectors limitations and restrictions" for current limits. |
| AccessDeniedException | Missing s3vectors:* IAM actions | S3 Vectors uses s3vectors:* namespace, not s3:*. Update IAM policy. |
| RequestTimeoutException or service unavailable | Request timeout or region not supported | Retry request. For regional availability, search AWS docs for "S3 Vectors limitations and restrictions". |

Additional Resources

  • limits-and-patterns.md -- Multi-tenant patterns, batch ingestion, SSE-KMS, migration
  • metadata-filtering.md -- Filter operators, non-filterable metadata, Bedrock KB keys