Store and Query Vectors with Amazon S3 Vectors
Overview
Amazon S3 Vectors is a cost-effective AWS service for storing and querying vector embeddings at scale. It is optimized for long-term storage, with subsecond latency for cold queries and latency as low as 100 ms for warm queries.
Decision Guide
- Hundreds/thousands of sustained queries per second (QPS): Wrong tool. Recommend OpenSearch.
- Hybrid search, aggregations, faceted search: Recommend OpenSearch with S3 Vectors as storage engine. For OpenSearch integration, search AWS docs for "Using S3 Vectors with OpenSearch Service".
- Tiered (bulk + hot): S3 Vectors for storage + OpenSearch Serverless for real-time. See `references/limits-and-patterns.md`.
- Cost-effective storage, infrequent queries, RAG: S3 Vectors is the right fit. Proceed.
For latest guidance, search AWS docs for "S3 Vectors best practices".
Common Tasks
Classify the request before starting:
- Simple query: The index already exists; skip to Step 6.
- Standard: You MUST list existing indexes first and suggest reusing one if relevant. Otherwise, create a new index and store vectors by following Steps 2-6.
- Migration or multi-tenant: Read `references/limits-and-patterns.md` first, then follow Steps 2-6.
You MUST execute commands using AWS MCP server tools when connected. Fall back to AWS CLI only if AWS MCP is unavailable. You MUST explain each step to the user before executing.
1. Verify Dependencies
Constraints:
- You MUST check whether AWS MCP tools or AWS CLI is available and inform user if missing
- You MUST confirm target AWS region
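A minimal sketch of the CLI half of this check, assuming a Python environment. The helper name `check_dependencies` is hypothetical. It only reports whether an `aws` executable is on `PATH` and which region the `AWS_REGION` / `AWS_DEFAULT_REGION` environment variables suggest. It cannot detect AWS MCP tools, and the region it finds is only a candidate to confirm with the user.

```python
import os
import shutil


def check_dependencies(env=None):
    """Report AWS CLI availability and a candidate region (hypothetical helper)."""
    env = os.environ if env is None else env
    return {
        # Path to the `aws` executable, or None if it is not on PATH.
        "aws_cli": shutil.which("aws"),
        # Candidate region only; it still has to be confirmed with the user.
        "region": env.get("AWS_REGION") or env.get("AWS_DEFAULT_REGION"),
    }


if __name__ == "__main__":
    status = check_dependencies()
    if status["aws_cli"] is None:
        print("AWS CLI not found on PATH; inform the user.")
    print("Candidate region:", status["region"] or "none set, ask the user")
```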
2. Create a Vector Bucket
You MUST confirm bucket name with user. Names: 3-63 chars, lowercase letters, numbers, hyphens only. Encryption (SSE-S3 default or SSE-KMS for compliance) is immutable after creation.
```bash
aws s3vectors create-vector-bucket \
  --vector-bucket-name <BUCKET_NAME>
```
Constraints:
- You MUST explain encryption cannot be changed after creation
- For SSE-KMS, KMS key policy MUST grant `kms:GenerateDataKey` and `kms:Decrypt` to the S3 Vectors service principal `indexing.s3vectors.amazonaws.com`. You MUST use full KMS key ARN (not alias). See `references/limits-and-patterns.md` for command example.
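The naming rule above can be checked before calling the service. The sketch below encodes only the rules stated in this guide (3-63 characters; lowercase letters, numbers, hyphens). The function name is hypothetical, and the service may enforce further rules that are not listed here.

```python
import re

# Rules as stated in this guide: 3-63 characters, lowercase letters, digits, hyphens.
_BUCKET_NAME_RE = re.compile(r"[a-z0-9-]{3,63}")


def is_valid_vector_bucket_name(name: str) -> bool:
    """Return True if `name` satisfies the naming rules listed in this guide."""
    return _BUCKET_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("my-vectors-01", "My_Bucket", "ab"):
        print(candidate, is_valid_vector_bucket_name(candidate))
```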
3. Create a Vector Index
Every parameter is immutable after creation.
Pre-flight checklist (confirm ALL with user):
- Dimension (required, integer 1-4096) -- MUST match embedding model output
- Distance metric (required) -- `cosine` or `euclidean`. Use the embedding model's recommended metric.
- Non-filterable metadata keys (optional, max 10, 1-63 chars) -- Declare at creation or lose forever. For Bedrock Knowledge Bases integration, search AWS docs for "S3 Vectors Bedrock Knowledge Bases prerequisites" to get the required key names.
- Encryption (optional) -- Inherits from bucket. Override per-index if needed.
```bash
aws s3vectors create-index \
  --vector-bucket-name <BUCKET_NAME> \
  --index-name <INDEX_NAME> \
  --dimension <DIM> \
  --distance-metric <cosine|euclidean> \
  --data-type float32 \
  --metadata-configuration '{"nonFilterableMetadataKeys":["<KEY1>","<KEY2>"]}'
```
Omit `--metadata-configuration` if no non-filterable keys are needed.
Index names: 3-63 chars, lowercase, numbers, hyphens, dots. Unique within bucket. Filterable metadata: 2 KB limit. Total metadata (filterable + non-filterable combined): 40 KB. See `references/metadata-filtering.md`.
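Because index parameters cannot be changed later, it can help to validate the checklist values locally before running `create-index`. This is a sketch under the limits stated above; `validate_index_params` is a hypothetical helper, not part of any AWS SDK.

```python
ALLOWED_METRICS = {"cosine", "euclidean"}


def validate_index_params(dimension, distance_metric, non_filterable_keys=()):
    """Return a list of problems; an empty list means the checklist passes."""
    problems = []
    # bool is a subclass of int in Python, so reject it explicitly.
    if (not isinstance(dimension, int) or isinstance(dimension, bool)
            or not 1 <= dimension <= 4096):
        problems.append("dimension must be an integer from 1 to 4096")
    if distance_metric not in ALLOWED_METRICS:
        problems.append("distance metric must be 'cosine' or 'euclidean'")
    if len(non_filterable_keys) > 10:
        problems.append("at most 10 non-filterable metadata keys are allowed")
    for key in non_filterable_keys:
        if not 1 <= len(key) <= 63:
            problems.append(f"key {key!r} must be 1-63 characters long")
    return problems


if __name__ == "__main__":
    # Example values are placeholders; use the embedding model's real dimension.
    print(validate_index_params(1024, "cosine", ["source_text"]))
```

Whether the dimension actually matches the embedding model still has to be confirmed against the model's documentation; this check only covers the ranges.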
4. Generate Embeddings (if needed)
Skip to Step 5 (store) or Step 6 (query) if user already has embeddings.
Constraints:
- You MUST ask which embedding model to use if not specified
- You MUST NOT assume a default model
- Dimension MUST match Step 3
- You MUST use the same model for both storing and querying
Generate embeddings with Bedrock invoke-model:
```bash
aws bedrock-runtime invoke-model \
  --model-id <MODEL_ID> \
  --content-type application/json \
  --cli-binary-format raw-in-base64-out \
  --body '{"inputText": "your text"}' \
  invoke-model-output.json
```
You MUST use `--cli-binary-format raw-in-base64-out` for CLI v2. Output file is required for CLI. The response key is model-dependent (e.g., `embedding` for Titan, `embeddings` for Cohere). For Titan, parse with `json.load(open('invoke-model-output.json'))['embedding']`. Use the `embedding` array as `float32` in put-vectors or query-vectors. For batch embedding generation, use AWS SDK or CLI.
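Since the response key depends on the model, parsing can be wrapped in a small helper. The sketch below handles the two keys named above. The function names are hypothetical, and the shape assumed for `embeddings` (a list holding one vector per input text) is an assumption to verify against the chosen model's documentation.

```python
import json


def extract_embedding(response: dict) -> list:
    """Pull one embedding vector out of a parsed invoke-model response."""
    if "embedding" in response:        # e.g. Titan, per the text above
        vector = response["embedding"]
    elif "embeddings" in response:     # e.g. Cohere
        # Assumption: "embeddings" is a list with one vector per input text.
        vector = response["embeddings"][0]
    else:
        raise KeyError("no 'embedding' or 'embeddings' key in response")
    return [float(x) for x in vector]


def load_embedding(path: str = "invoke-model-output.json") -> list:
    """Read the CLI output file and return the embedding as a list of floats."""
    with open(path) as f:
        return extract_embedding(json.load(f))
```

The returned list can be used as the `float32` array in the put-vectors and query-vectors payloads.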
5. Put Vectors
```bash
aws s3vectors put-vectors \
  --vector-bucket-name <BUCKET_NAME> \
  --index-name <INDEX_NAME> \
  --vectors '[{"key":"<ID>","data":{"float32":[<EMBEDDING>]},"metadata":{"topic":"science"}}]'
```
Constraints:
- You MUST NOT exceed 500 vectors per call
- You SHOULD batch vectors for cost optimization
- For bulk operations, you SHOULD use an SDK instead of CLI -- vector payloads may be too large for shell arguments
- You MUST implement retry with backoff on `429 TooManyRequestsException`
- See `references/limits-and-patterns.md` for batch patterns
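The 500-vector limit can be respected by slicing the input before each call. The following sketch builds entries in the same shape as the `--vectors` argument above and yields batches of at most 500. The helper names `build_vector` and `batches` are hypothetical, and the actual upload call (SDK or CLI) is left out.

```python
import json

MAX_VECTORS_PER_CALL = 500  # limit stated in the constraints above


def build_vector(key, embedding, metadata=None):
    """Build one entry in the shape used by the --vectors argument."""
    entry = {"key": key, "data": {"float32": [float(x) for x in embedding]}}
    if metadata:
        entry["metadata"] = metadata
    return entry


def batches(vectors, size=MAX_VECTORS_PER_CALL):
    """Yield consecutive slices of at most `size` vectors."""
    if not 1 <= size <= MAX_VECTORS_PER_CALL:
        raise ValueError("size must be between 1 and 500")
    for start in range(0, len(vectors), size):
        yield vectors[start:start + size]


if __name__ == "__main__":
    # Placeholder data: 1200 two-dimensional vectors.
    vectors = [build_vector(f"doc-{i}", [0.1, 0.2], {"topic": "science"})
               for i in range(1200)]
    for batch in batches(vectors):
        payload = json.dumps(batch)  # hand this to put-vectors via SDK or CLI
        print(len(batch), "vectors,", len(payload), "bytes of JSON")
```

Each batch should still be sent through a retry-with-backoff wrapper, as the constraints require.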
6. Query Vectors
Generate embedding if needed (Step 4), then query:
```bash
aws s3vectors query-vectors \
  --vector-bucket-name <BUCKET_NAME> \
  --index-name <INDEX_NAME> \
  --query-vector '{"float32":[<EMBEDDING>]}' \
  --top-k 10 \
  --return-distance
```
Optional: add `--return-metadata` and/or `--filter '{"topic":{"$eq":"science"}}'` (both require GetVectors permission). See `references/metadata-filtering.md`.
Example response body:
```
{"vectors": [{"key": "id1", "distance": 0.45, "metadata": {"topic": "science"}}, ...], "distanceMetric": "cosine"}
```
Constraints:
- Using `--filter` or `--return-metadata` requires both `s3vectors:QueryVectors` AND `s3vectors:GetVectors` IAM permissions. Without GetVectors, these options return 403.
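Hand-writing the JSON for `--query-vector` and `--filter` inside shell quotes is error-prone. One option, sketched here, is to assemble the argument list in Python and let `json.dumps` produce the JSON. The helper name `build_query_command` is hypothetical; the flags are the ones shown in the command above.

```python
import json


def build_query_command(bucket, index, embedding, top_k=10,
                        metadata_filter=None, return_metadata=False):
    """Assemble argv for `aws s3vectors query-vectors` (for subprocess.run)."""
    argv = [
        "aws", "s3vectors", "query-vectors",
        "--vector-bucket-name", bucket,
        "--index-name", index,
        "--query-vector", json.dumps({"float32": [float(x) for x in embedding]}),
        "--top-k", str(top_k),
        "--return-distance",
    ]
    if return_metadata:
        argv.append("--return-metadata")  # needs s3vectors:GetVectors
    if metadata_filter is not None:
        # Also needs s3vectors:GetVectors.
        argv += ["--filter", json.dumps(metadata_filter)]
    return argv


if __name__ == "__main__":
    cmd = build_query_command("my-bucket", "my-index", [0.1, 0.2],
                              metadata_filter={"topic": {"$eq": "science"}})
    print(cmd)  # pass to subprocess.run(cmd, capture_output=True, text=True)
```

Passing the list to `subprocess.run` without `shell=True` avoids shell quoting entirely, so the filter JSON reaches the CLI unchanged.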
Troubleshooting
| Error | Cause | Fix |
|---|---|---|
| | Dimensions don't match index | Use matching model, or delete/recreate index (confirm with user -- destroys all vectors). |
| | Missing | Add |
| Fewer results than | Few vectors match filter | Expected -- filtering is inline. Broaden filter. |
| | Exceeded per-index rate limits | Retry with backoff. Shard across indexes for sustained throughput. Search AWS docs for |
| | Missing | S3 Vectors uses |
| | Request timeout or region not supported | Retry request. For regional availability, search AWS docs for |
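The "retry with backoff" fix for rate-limit errors can be sketched as exponential backoff with jitter. This is an illustrative pattern, not AWS's prescribed implementation: `ThrottledError` is a hypothetical stand-in for however the chosen SDK or CLI wrapper surfaces a `429 TooManyRequestsException`, and the attempt count and delays are arbitrary defaults.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a 429 TooManyRequestsException."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.5,
                      max_delay=20.0, sleep=time.sleep):
    """Call operation(); on ThrottledError, wait and retry with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            # Cap doubles each attempt: base_delay, 2x, 4x, ... up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))  # full jitter spreads out retries
```

The `sleep` parameter exists so the wait can be replaced in tests. In real use, `operation` would be a zero-argument callable that sends one put-vectors or query-vectors request and raises `ThrottledError` when throttled.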
Additional Resources
- limits-and-patterns.md -- Multi-tenant patterns, batch ingestion, SSE-KMS, migration
- metadata-filtering.md -- Filter operators, non-filterable metadata, Bedrock KB keys