storing-and-querying-vectors

Store and Query Vectors with Amazon S3 Vectors

Overview

Amazon S3 Vectors is a cost-effective AWS service for storing and querying vector embeddings at scale. It is optimized for long-term storage, with subsecond latency for cold queries and latency as low as 100 ms for warm queries.

Decision Guide

Hundreds or thousands of sustained queries per second (QPS): Wrong tool. Recommend OpenSearch.
Hybrid search, aggregations, faceted search: Recommend OpenSearch with S3 Vectors as the storage engine. For OpenSearch integration, search AWS docs for `"Using S3 Vectors with OpenSearch Service"`.
Tiered (bulk + hot): S3 Vectors for storage + OpenSearch Serverless for real-time. See `references/limits-and-patterns.md`.
Cost-effective storage, infrequent queries, RAG: S3 Vectors is the right fit. Proceed.
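
The guide above can be read as a small decision function. The following is only a sketch: the function name, its parameters, and the `high_qps_threshold` default of 100 are hypothetical and chosen for illustration, since the guide says "hundreds or thousands" without naming an exact cutoff.

```python
def recommend_store(sustained_qps: float,
                    needs_hybrid_search: bool = False,
                    needs_hot_tier: bool = False,
                    high_qps_threshold: float = 100.0) -> str:
    """Map workload traits to the recommendation in the decision guide.

    high_qps_threshold is an assumed cutoff, not a documented limit.
    """
    if sustained_qps >= high_qps_threshold:
        return "OpenSearch"
    if needs_hybrid_search:
        # Hybrid search, aggregations, or faceted search.
        return "OpenSearch with S3 Vectors as storage engine"
    if needs_hot_tier:
        # Bulk data in S3 Vectors, hot data served in real time.
        return "S3 Vectors + OpenSearch Serverless"
    return "S3 Vectors"
```

The checks run in the same order as the guide, so a high-QPS workload is redirected before any other trait is considered.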

For latest guidance, search AWS docs for `"S3 Vectors best practices"`.

Common Tasks

Classify the request before starting:

Simple query: Existing index. Skip to Step 6.
Standard: You MUST list existing indexes first and suggest reusing one if relevant. Otherwise, create a new index and store vectors, following Steps 2-6.
Migration or multi-tenant: Read `references/limits-and-patterns.md` first, then follow Steps 2-6.

You MUST execute commands using AWS MCP server tools when connected. Fall back to AWS CLI only if AWS MCP is unavailable. You MUST explain each step to the user before executing.

1. Verify Dependencies

Constraints:

You MUST check whether AWS MCP tools or AWS CLI is available and inform user if missing
You MUST confirm target AWS region

2. Create a Vector Bucket

You MUST confirm bucket name with user. Names: 3-63 chars, lowercase letters, numbers, hyphens only. Encryption (SSE-S3 default or SSE-KMS for compliance) is immutable after creation.

bash

aws s3vectors create-vector-bucket \
  --vector-bucket-name <BUCKET_NAME>

Constraints:

You MUST explain encryption cannot be changed after creation
For SSE-KMS, KMS key policy MUST grant `kms:GenerateDataKey` and `kms:Decrypt` to the S3 Vectors service principal `indexing.s3vectors.amazonaws.com`. You MUST use full KMS key ARN (not alias). See `references/limits-and-patterns.md` for command example.
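
Because the encryption choice is immutable, it can help to validate the inputs before calling `create-vector-bucket`. The sketch below checks only the rules stated in this step (name length and character set, full key ARN rather than alias). The helper names and the ARN pattern are illustrative assumptions, and AWS may enforce additional naming rules not covered here.

```python
import re

# Rules as stated in this step: 3-63 chars, lowercase letters, digits, hyphens.
_BUCKET_NAME = re.compile(r"[a-z0-9-]{3,63}")

# Assumed shape of a full KMS key ARN (an alias ARN ends in ":alias/<name>").
_KMS_KEY_ARN = re.compile(r"arn:aws[a-z-]*:kms:[a-z0-9-]+:\d{12}:key/.+")


def is_valid_vector_bucket_name(name: str) -> bool:
    """True if the name satisfies the length and character rules above."""
    return _BUCKET_NAME.fullmatch(name) is not None


def is_full_kms_key_arn(value: str) -> bool:
    """True for a key ARN; False for bare aliases and alias ARNs."""
    return _KMS_KEY_ARN.fullmatch(value) is not None
```

For example, `is_full_kms_key_arn("alias/my-key")` is false, which signals that the alias should be resolved to its key ARN before the bucket is created.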

3. Create a Vector Index

Every parameter is immutable after creation.

Pre-flight checklist (confirm ALL with user):

Dimension (required, integer 1-4096) -- MUST match embedding model output
Distance metric (required) -- `cosine` or `euclidean`. Use the embedding model's recommended metric.
Non-filterable metadata keys (optional, max 10, 1-63 chars) -- Declare at creation; they cannot be added later. For Bedrock Knowledge Bases integration, search AWS docs for `"S3 Vectors Bedrock Knowledge Bases prerequisites"` to get the required key names.
Encryption (optional) -- Inherits from bucket. Override per-index if needed.

bash

aws s3vectors create-index \
  --vector-bucket-name <BUCKET_NAME> \
  --index-name <INDEX_NAME> \
  --dimension <DIM> \
  --distance-metric <cosine|euclidean> \
  --data-type float32 \
  --metadata-configuration '{"nonFilterableMetadataKeys":["<KEY1>","<KEY2>"]}'

Omit `--metadata-configuration` if no non-filterable keys are needed.

Index names: 3-63 chars; lowercase letters, numbers, hyphens, dots. Unique within bucket. Filterable metadata: 2 KB limit. Total metadata (filterable + non-filterable combined): 40 KB. See `references/metadata-filtering.md`.
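
Since every index parameter is immutable, the pre-flight checklist can be turned into a local check that runs before `create-index`. This is a minimal sketch using only the limits quoted in this step. `check_index_params` and the constant names are hypothetical, and the limits should be re-verified against current AWS documentation.

```python
import re

# Limits as stated in this guide; verify against current AWS documentation.
INDEX_NAME_PATTERN = re.compile(r"[a-z0-9.-]{3,63}")
ALLOWED_METRICS = {"cosine", "euclidean"}
MAX_NON_FILTERABLE_KEYS = 10


def check_index_params(name, dimension, metric, non_filterable_keys=()):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not INDEX_NAME_PATTERN.fullmatch(name):
        problems.append("index name must be 3-63 chars of a-z, 0-9, '-' or '.'")
    # bool is a subclass of int, so reject it explicitly.
    if (isinstance(dimension, bool) or not isinstance(dimension, int)
            or not 1 <= dimension <= 4096):
        problems.append("dimension must be an integer from 1 to 4096")
    if metric not in ALLOWED_METRICS:
        problems.append("distance metric must be 'cosine' or 'euclidean'")
    if len(non_filterable_keys) > MAX_NON_FILTERABLE_KEYS:
        problems.append("at most 10 non-filterable metadata keys are allowed")
    for key in non_filterable_keys:
        if not 1 <= len(key) <= 63:
            problems.append(f"non-filterable key {key!r} must be 1-63 chars")
    return problems
```

Returning every problem at once, rather than raising on the first, lets all of them be confirmed with the user in a single round.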

4. Generate Embeddings (if needed)

Skip to Step 5 (store) or Step 6 (query) if user already has embeddings.

Constraints:

You MUST ask which embedding model to use if not specified
You MUST NOT assume a default model
Dimension MUST match Step 3
You MUST use the same model for both storing and querying

Generate embeddings with Bedrock invoke-model:

bash

aws bedrock-runtime invoke-model \
  --model-id <MODEL_ID> \
  --content-type application/json \
  --cli-binary-format raw-in-base64-out \
  --body '{"inputText": "your text"}' \
  invoke-model-output.json

You MUST use `--cli-binary-format raw-in-base64-out` for CLI v2. The CLI requires an output file. The response key is model-dependent (e.g., `embedding` for Titan, `embeddings` for Cohere). For Titan, parse with `json.load(open('invoke-model-output.json'))['embedding']`. Use the `embedding` array as `float32` in put-vectors or query-vectors. For batch embedding generation, use AWS SDK or CLI.
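
The parsing step can be made a little more defensive than the one-liner above. The sketch below handles both response keys mentioned in this step and checks the vector length against the index dimension from Step 3. The function names are hypothetical, and the assumption that a Cohere-style `embeddings` value is a list of vectors (one per input text) should be confirmed against the model's documentation.

```python
import json


def extract_embedding(response: dict, expected_dimension: int) -> list:
    """Pull one embedding vector out of a parsed invoke-model response."""
    if "embedding" in response:          # Titan-style: a flat list of floats
        vector = response["embedding"]
    elif "embeddings" in response:       # Cohere-style (assumed): list of vectors
        vector = response["embeddings"][0]
    else:
        raise KeyError("no 'embedding' or 'embeddings' key in response")
    if len(vector) != expected_dimension:
        raise ValueError(
            f"embedding has {len(vector)} dims, index expects {expected_dimension}"
        )
    return [float(x) for x in vector]


def load_embedding(path: str, expected_dimension: int) -> list:
    """Read the CLI output file written by invoke-model and extract the vector."""
    with open(path, encoding="utf-8") as handle:
        return extract_embedding(json.load(handle), expected_dimension)
```

Checking the length here surfaces a dimension mismatch before any put-vectors or query-vectors call is made.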

5. Put Vectors

bash

aws s3vectors put-vectors \
  --vector-bucket-name <BUCKET_NAME> \
  --index-name <INDEX_NAME> \
  --vectors '[{"key":"<ID>","data":{"float32":[<EMBEDDING>]},"metadata":{"topic":"science"}}]'

Constraints:

You MUST NOT exceed 500 vectors per call
You SHOULD batch vectors for cost optimization
For bulk operations, you SHOULD use an SDK instead of CLI -- vector payloads may be too large for shell arguments
You MUST implement retry with backoff on `429 TooManyRequestsException`
See `references/limits-and-patterns.md` for batch patterns
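
The batching and retry constraints above can be sketched without committing to a particular SDK. In the following, `put_batch` and `is_throttle` are caller-supplied callables (hypothetical names) standing in for the real put-vectors call and for whatever test identifies a `TooManyRequestsException` in your SDK. The retry policy shown, exponential backoff with full jitter, is one common choice and is not prescribed by this guide.

```python
import random
import time

MAX_VECTORS_PER_CALL = 500  # per-call limit stated in the constraints above


def batches(vectors, size=MAX_VECTORS_PER_CALL):
    """Yield consecutive slices of at most `size` vectors."""
    for start in range(0, len(vectors), size):
        yield vectors[start:start + size]


def put_with_backoff(put_batch, batch, is_throttle,
                     max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call put_batch(batch), retrying throttled calls with jittered backoff.

    Returns the number of attempts used. Errors that are not throttles,
    and the final throttle, are re-raised to the caller.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            put_batch(batch)
            return attempt + 1
        except Exception as exc:
            if not is_throttle(exc) or attempt == max_attempts - 1:
                raise
            # Full jitter: wait a random time up to base_delay * 2^attempt.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
```

A typical loop would be `for b in batches(all_vectors): put_with_backoff(put_batch, b, is_throttle)`. The `sleep` parameter exists so the delay can be stubbed out in tests.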

6. Query Vectors

Generate embedding if needed (Step 4), then query:

bash

aws s3vectors query-vectors \
  --vector-bucket-name <BUCKET_NAME> \
  --index-name <INDEX_NAME> \
  --query-vector '{"float32":[<EMBEDDING>]}' \
  --top-k 10 \
  --return-distance

Optional: add `--return-metadata` and/or `--filter '{"topic":{"$eq":"science"}}'` (both require GetVectors permission). See `references/metadata-filtering.md`.

Example response body:

{"vectors": [{"key": "id1", "distance": 0.45, "metadata": {"topic": "science"}}, ...], "distanceMetric": "cosine"}

Constraints:

Using `--filter` or `--return-metadata` requires both `s3vectors:QueryVectors` AND `s3vectors:GetVectors` IAM permissions. Without GetVectors, these options return 403.
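
The permission rule above, together with the example response body, can be expressed as two small helpers. This is only a sketch: the function names are hypothetical, and the response shape is taken from the example in this step rather than from an API reference.

```python
def required_query_actions(filter_expr=None, return_metadata=False):
    """List the IAM actions a query needs, per the constraint above."""
    actions = ["s3vectors:QueryVectors"]
    if filter_expr is not None or return_metadata:
        # Filtering or returning metadata additionally needs GetVectors.
        actions.append("s3vectors:GetVectors")
    return actions


def keys_within(response, max_distance):
    """Return keys of results whose distance is at most max_distance.

    Assumes the query was run with --return-distance, so every result
    carries a 'distance' field as in the example response body.
    """
    return [v["key"] for v in response["vectors"]
            if v["distance"] <= max_distance]
```

Checking `required_query_actions(...)` against the caller's IAM policy before querying avoids discovering the missing `s3vectors:GetVectors` permission through a 403.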

Troubleshooting

| Error | Cause | Fix |
| --- | --- | --- |
| `DimensionMismatch` | Dims don't match index | Use matching model, or delete/recreate index (confirm with user -- destroys all vectors). |
| `403 Forbidden` with `--filter` or `--return-metadata` | Missing `s3vectors:GetVectors` | Add `s3vectors:GetVectors` to IAM policy. |
| Fewer results than `--top-k` | Few vectors match filter | Expected -- filtering is inline. Broaden filter. |
| `429 TooManyRequestsException` | Exceeded per-index rate limits | Retry with backoff. Shard across indexes for sustained throughput. Search AWS docs for `"S3 Vectors limitations and restrictions"` for current limits. |
| `AccessDeniedException` | Missing `s3vectors:*` IAM actions | S3 Vectors uses `s3vectors:` namespace, not `s3:`. Update IAM policy. |
| `RequestTimeoutException` or service unavailable | Request timeout or region not supported | Retry request. For regional availability, search AWS docs for `"S3 Vectors limitations and restrictions"`. |

Additional Resources

limits-and-patterns.md -- Multi-tenant patterns, batch ingestion, SSE-KMS, migration
metadata-filtering.md -- Filter operators, non-filterable metadata, Bedrock KB keys
