aws-messaging-and-streaming
AWS Messaging & Streaming Services
When answering AWS messaging and streaming questions, verify specific numbers, versions, limits, and behavioral details from service-specific skills or official AWS documentation. When uncertain, search skills or docs rather than guessing. Fabricated configuration options or incorrect version numbers are worse than admitting uncertainty.
When a question asks about recommended configurations (CloudWatch alarm settings, thresholds, missing data treatment), search for the service-specific skills or documentation rather than relying on general best practices.
Overview
Domain expertise for choosing and using AWS services that move data between producers and consumers.
This skill covers two fundamental patterns — messaging and streaming — and the AWS services that implement each.
Use this skill to decide which pattern fits a workload, select the right service, and understand how services integrate with each other.
For specific guidance on individual AWS services, see reference files or service-specific Skills.
Streaming and Messaging
What Is Messaging?
Messaging enables decoupled, asynchronous communication between components. A producer sends a message; one or more consumers receive and process it. Once processed, the message is typically deleted. Messaging services handle delivery guarantees, retries, and dead-letter routing.
Key characteristics:
- Messages are consumed once (point-to-point) or fanned out (pub/sub), then removed
- No replay — once acknowledged, a message is gone
- Designed for command/request workloads, task distribution, and event notification
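A minimal sketch of the point-to-point flow with SQS and boto3 (queue name and payload are illustrative placeholders): the consumer deletes each message after processing, which is exactly why messaging has no replay.

```python
import boto3

# All names are illustrative placeholders; region/credentials come from the environment.
sqs = boto3.client("sqs")
queue_url = sqs.create_queue(QueueName="demo-tasks")["QueueUrl"]

# Producer: enqueue a message.
sqs.send_message(QueueUrl=queue_url, MessageBody='{"task": "resize-image-42"}')

# Consumer: receive, process, then delete. After the delete, the message is gone;
# competing consumers on the same queue split the work between them.
resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=20)
for msg in resp.get("Messages", []):
    print("processing:", msg["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```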
What Is Streaming?
Streaming enables ordered, durable, high-throughput continuous data flow. Producers append records to a log; consumers read from positions in that log. Records persist for a configurable retention period regardless of consumption.
Key characteristics:
- Records are retained and replayable within the retention window
- Strict ordering within a partition/shard
- Multiple independent consumers can read the same data at different positions
- Designed for event sourcing, real-time analytics, change data capture, and continuous processing
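As a sketch of the log model, here is a Kinesis Data Streams producer and a consumer reading from a position in a shard; the stream name is a placeholder and the stream is assumed to already exist. Re-running the consumer re-reads the same records, which is the replay property messaging lacks.

```python
import boto3

kinesis = boto3.client("kinesis")
stream = "demo-clickstream"  # placeholder; assumed to already exist

# Producer: append a record to the log. The partition key picks the shard,
# which is also the scope of strict ordering.
kinesis.put_record(StreamName=stream, Data=b'{"page": "/home"}', PartitionKey="user-123")

# Consumer: read from a position rather than "taking" messages. TRIM_HORIZON is the
# oldest retained record, so re-running this block replays the retention window.
shard_id = kinesis.describe_stream(StreamName=stream)["StreamDescription"]["Shards"][0]["ShardId"]
iterator = kinesis.get_shard_iterator(
    StreamName=stream, ShardId=shard_id, ShardIteratorType="TRIM_HORIZON"
)["ShardIterator"]
for record in kinesis.get_records(ShardIterator=iterator, Limit=100)["Records"]:
    print(record["SequenceNumber"], record["Data"])
```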
Key Differences
| Dimension | Messaging | Streaming |
|---|---|---|
| Data lifecycle | Deleted after consumption | Retained for replay (hours to indefinitely) |
| Ordering | Best-effort (Standard) or per-group (FIFO) | Strict per-partition/shard |
| Consumer model | Competing consumers (work distribution) | Independent readers (fan-out by position) |
| Throughput pattern | Bursty, variable | Sustained, high-volume |
| Replay | Not supported (except DLQ redrive) | Native — seek to any position in retention |
| Typical latency | Milliseconds (push or short-poll) | Milliseconds to low seconds |
| Scaling unit | Concurrency (consumers/pollers) | Partitions or shards |
Messaging Use Cases
- Decoupling microservices with request/response or command patterns
- Distributing work across a pool of competing consumers (task queues)
- Fan-out notifications where each subscriber acts independently
- Workloads that are bursty and benefit from queue buffering
- Migrating existing JMS/AMQP applications (Amazon MQ)
Streaming Use Cases
- Continuous, high-throughput data ingestion (logs, metrics, clickstreams, IoT telemetry)
- Event sourcing where consumers need to replay from any point in time
- Multiple independent consumers processing the same data differently
- Real-time analytics, windowed aggregations, or complex event processing
- Change data capture (CDC) pipelines
Messaging Services
These services are generally used for messaging workloads.
Streaming services (Kinesis Data Streams, Managed Streaming for Apache Kafka) are sometimes used for messaging workloads as well, depending on the exact use case and requirements.
| Service | Best For | Key Differentiator |
|---|---|---|
| Amazon SQS | Task queues, decoupling, buffering | Fully managed, unlimited throughput (Standard), exactly-once (FIFO), fair queues for multi-tenant workloads |
| Amazon SNS | Fan-out, pub/sub notifications | Push to multiple subscribers (SQS, Lambda, HTTP, email, SMS) |
| Amazon EventBridge | Event routing, cross-account/SaaS integration | Content-based filtering, schema registry, 200+ AWS source integrations |
| Amazon MQ | Lift-and-shift of existing JMS/AMQP/MQTT apps | Protocol compatibility (ActiveMQ, RabbitMQ) for legacy migration |
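For instance, SNS's fan-out differentiator looks like this in boto3. The queue ARN is a placeholder, and the queue's policy granting `sns.amazonaws.com` send rights is assumed to already be in place (see the gotchas section below).

```python
import boto3

sns = boto3.client("sns")
topic_arn = sns.create_topic(Name="order-events")["TopicArn"]  # placeholder name

# Fan-out: every subscriber receives its own copy of each published message.
# The queue ARN is a placeholder, and its queue policy must already allow
# sns.amazonaws.com to SendMessage, scoped to this topic.
sns.subscribe(
    TopicArn=topic_arn,
    Protocol="sqs",
    Endpoint="arn:aws:sqs:us-east-1:123456789012:orders-audit",
    Attributes={"RawMessageDelivery": "true"},  # deliver payloads without the SNS envelope
)

sns.publish(TopicArn=topic_arn, Message='{"orderId": "o-1001", "status": "PLACED"}')
```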
Streaming Services
These services are generally used for streaming workloads.
| Service | Best For | Key Differentiator |
|---|---|---|
| Amazon Kinesis Data Streams | Real-time ingestion with AWS-native consumers | On-demand Advantage mode (instant scaling, no shard management), 1–365 day retention |
| Amazon Data Firehose | Zero-admin delivery to storage/analytics | Auto-scales, buffers, batches, and delivers to destinations |
| Amazon Managed Service for Apache Flink | Complex stream processing (joins, windows, state) | Full Apache Flink runtime — SQL, Java, Python APIs for stateful computation |
| Amazon MSK | Kafka-native workloads, ecosystem compatibility | Apache Kafka API, Express brokers (3x throughput, 20x faster scaling compared to Standard brokers), broad connector ecosystem |
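As an illustration of the zero-admin delivery model, a Firehose producer is just `PutRecordBatch` calls; the delivery stream name is a placeholder, and the stream (destination, buffering, IAM role) is assumed to be configured already.

```python
import boto3

firehose = boto3.client("firehose")

# The delivery stream (destination, buffering hints, IAM role) is assumed to exist.
# Producers just batch records; newline-delimiting them yields NDJSON in S3.
records = [{"Data": f'{{"metric": "latency", "ms": {ms}}}\n'.encode()} for ms in (12, 48, 7)]
resp = firehose.put_record_batch(DeliveryStreamName="demo-delivery", Records=records)

# Per-record failures are reported in the response, not raised: check and retry.
if resp["FailedPutCount"]:
    print("retry needed for", resp["FailedPutCount"], "records")
```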
Common Integration Gotchas
- SQS system vs. user message attributes: Attributes like `AWSTraceHeader` (set by X-Ray / EventBridge / Pipes when sending to an SQS DLQ) and `SenderId`, `SentTimestamp` are SQS system attributes, NOT user message attributes. They are never returned by default from `ReceiveMessage` — request them explicitly via `AttributeNames=[...]` (or `MessageSystemAttributeNames`), separate from `MessageAttributeNames`, which fetches user attributes. This matters for DLQs, where the trace header rides on the system attribute and the user-attributes slot carries the service's failure metadata (e.g. EventBridge's `RULE_ARN`, `ERROR_CODE`). A receive sketch follows this list.
- SNS → Firehose → S3 record separator: For SNS subscriptions using the `firehose` protocol that land in S3, records are already newline-delimited by default (NDJSON). Do NOT turn on Firehose's `AppendDelimiterToRecord` — SNS emits the newline itself, and enabling the processor produces double newlines (subscription sketch after this list).
- EventBridge rule target DLQ + SNS subscription DLQ both need a DLQ queue policy. Attaching the DLQ alone is not enough — the DLQ silently drops messages until its queue policy allows the service principal. EventBridge: `PutTargets` with `DeadLetterConfig.Arn=<DLQ>`, plus an SQS policy allowing `sqs:SendMessage` for `Service: events.amazonaws.com` with `aws:SourceArn` = the rule ARN. SNS: `SetSubscriptionAttributes` with `RedrivePolicy={"deadLetterTargetArn":"<DLQ>"}`, plus an SQS policy allowing `Service: sns.amazonaws.com` scoped by the topic ARN. A wiring sketch for the EventBridge case follows this list.
- SQS production defaults: long polling + customer-managed encryption. New queues default to short-poll (`ReceiveMessageWaitTimeSeconds=0`) and SSE-SQS (AWS-owned key). For production, use `SetQueueAttributes` to set `ReceiveMessageWaitTimeSeconds=20` (long polling) and `KmsMasterKeyId=<customer-managed key id/ARN>` rather than leaving `alias/aws/sqs`. A one-call sketch follows this list.
- Broker and Kafka credentials belong in Secrets Manager, not connection strings. Do not hardcode usernames, passwords, or SASL/SCRAM credentials in application config, env vars, JAAS files, or IaC. For Amazon MQ (ActiveMQ/RabbitMQ), store broker users as secrets and fetch at startup; Lambda event source mappings for Amazon MQ require the broker credentials to be supplied as a Secrets Manager secret ARN (`BASIC_AUTH`), not inline. For MSK SASL/SCRAM the secret is not optional: it must be named with the `AmazonMSK_` prefix and encrypted with a customer-managed KMS key (secrets created with the default `aws/secretsmanager` key cannot be associated with a cluster), then attached via `BatchAssociateScramSecret`. Lambda event source mappings for MSK (SASL/SCRAM or mTLS) and self-managed Kafka also reference a Secrets Manager secret ARN rather than inline credentials. Enable rotation and scope IAM read access (`secretsmanager:GetSecretValue`) to the consuming role only (fetch-and-associate sketch after this list). See AWS Well-Architected SEC02-BP03 Store and use secrets securely.
- Service-principal resource policies need `aws:SourceArn`/`aws:SourceAccount` conditions. When a queue or topic policy grants a service principal like `events.amazonaws.com`, `sns.amazonaws.com`, or `s3.amazonaws.com` permission to `sqs:SendMessage` or `sns:Publish`, omitting source conditions opens a confused-deputy hole — any rule, topic, or bucket in any AWS account can drive writes. Scope every such statement with `aws:SourceArn` (the specific rule/topic/bucket/pipe ARN; use `ArnLike` with `*` when the ARN isn't fully known yet) and `aws:SourceAccount` (your account ID). For S3 event notifications both keys are required because S3 bucket ARNs don't carry the account ID, so `aws:SourceArn` alone doesn't constrain the account. The same pattern applies to role trust policies for IAM roles used by EventBridge rules and EventBridge Pipes (principal `events.amazonaws.com`/`pipes.amazonaws.com`, `aws:SourceArn` = the rule or pipe ARN) — not just the DLQ case called out above (a policy sketch for the S3 case follows this list). See the IAM User Guide on The confused deputy problem.
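To make the first gotcha concrete, a sketch of draining a DLQ while asking for both attribute families. The queue URL is a placeholder; newer SDKs also accept `MessageSystemAttributeNames` for the system-attribute request, with `AttributeNames` as the legacy spelling.

```python
import boto3

sqs = boto3.client("sqs")
dlq_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-dlq"  # placeholder

resp = sqs.receive_message(
    QueueUrl=dlq_url,
    # System attributes: never returned unless requested. Newer SDKs accept
    # MessageSystemAttributeNames here; AttributeNames is the legacy spelling.
    AttributeNames=["AWSTraceHeader", "SenderId", "SentTimestamp"],
    # User attributes: where EventBridge puts its failure metadata on DLQ'd events.
    MessageAttributeNames=["All"],
    WaitTimeSeconds=20,
)
for msg in resp.get("Messages", []):
    print("trace:", msg.get("Attributes", {}).get("AWSTraceHeader"))
    print("error:", msg.get("MessageAttributes", {}).get("ERROR_CODE"))
```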
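For the SNS → Firehose gotcha, the only SNS-side requirement is a delivery role on the subscription; no delimiter processor is involved. All ARNs below are placeholders.

```python
import boto3

sns = boto3.client("sns")

# SNS assumes SubscriptionRoleArn to call PutRecord on the delivery stream.
# Leave Firehose's AppendDelimiterToRecord OFF: SNS already appends the newline.
sns.subscribe(
    TopicArn="arn:aws:sns:us-east-1:123456789012:order-events",  # placeholder
    Protocol="firehose",
    Endpoint="arn:aws:firehose:us-east-1:123456789012:deliverystream/demo-delivery",  # placeholder
    Attributes={
        "SubscriptionRoleArn": "arn:aws:iam::123456789012:role/sns-firehose-delivery"  # placeholder
    },
)
```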
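The EventBridge half of the DLQ gotcha, sketched end to end with placeholder names and ARNs: the rule target gets `DeadLetterConfig`, and the DLQ gets the queue policy it needs to actually accept messages.

```python
import boto3
import json

events = boto3.client("events")
sqs = boto3.client("sqs")

dlq_arn = "arn:aws:sqs:us-east-1:123456789012:rule-dlq"              # placeholder
rule_arn = "arn:aws:events:us-east-1:123456789012:rule/orders-rule"  # placeholder

# 1) Attach the DLQ to the rule target.
events.put_targets(
    Rule="orders-rule",
    Targets=[{
        "Id": "primary",
        "Arn": "arn:aws:lambda:us-east-1:123456789012:function:handler",  # placeholder
        "DeadLetterConfig": {"Arn": dlq_arn},
    }],
)

# 2) Without this queue policy, EventBridge's deliveries to the DLQ are dropped silently.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "events.amazonaws.com"},
        "Action": "sqs:SendMessage",
        "Resource": dlq_arn,
        "Condition": {"ArnEquals": {"aws:SourceArn": rule_arn}},
    }],
}
sqs.set_queue_attributes(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/rule-dlq",  # placeholder
    Attributes={"Policy": json.dumps(policy)},
)
```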
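The production-defaults fix is a single `SetQueueAttributes` call; the queue URL and key alias are placeholders.

```python
import boto3

sqs = boto3.client("sqs")

sqs.set_queue_attributes(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/demo-tasks",  # placeholder
    Attributes={
        "ReceiveMessageWaitTimeSeconds": "20",   # long polling instead of the default 0
        "KmsMasterKeyId": "alias/prod-sqs-cmk",  # customer-managed key; placeholder alias
    },
)
```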
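A sketch of the Secrets Manager pattern for broker and MSK credentials, with placeholder secret names and ARNs: fetch at startup instead of hardcoding, and associate an `AmazonMSK_`-prefixed secret with the cluster.

```python
import boto3
import json

secrets = boto3.client("secretsmanager")
kafka = boto3.client("kafka")

# Application startup: fetch broker credentials instead of hardcoding them.
creds = json.loads(
    secrets.get_secret_value(SecretId="prod/activemq/broker-user")["SecretString"]  # placeholder
)
username, password = creds["username"], creds["password"]

# MSK SASL/SCRAM: the secret must be named AmazonMSK_* and encrypted with a
# customer-managed KMS key before it can be associated with the cluster.
kafka.batch_associate_scram_secret(
    ClusterArn="arn:aws:kafka:us-east-1:123456789012:cluster/demo/abc-123",  # placeholder
    SecretArnList=[
        "arn:aws:secretsmanager:us-east-1:123456789012:secret:AmazonMSK_prod-AbCdEf"  # placeholder
    ],
)
```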
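Finally, a confused-deputy-safe queue policy for S3 event notifications, where both condition keys are required; the bucket, queue, and account ID are placeholders.

```python
import boto3
import json

sqs = boto3.client("sqs")
queue_arn = "arn:aws:sqs:us-east-1:123456789012:s3-events"  # placeholder

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "s3.amazonaws.com"},
        "Action": "sqs:SendMessage",
        "Resource": queue_arn,
        "Condition": {
            # Bucket ARNs carry no account ID, so constrain both bucket and account.
            "ArnLike": {"aws:SourceArn": "arn:aws:s3:::my-upload-bucket"},  # placeholder
            "StringEquals": {"aws:SourceAccount": "123456789012"},          # placeholder
        },
    }],
}
sqs.set_queue_attributes(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/s3-events",  # placeholder
    Attributes={"Policy": json.dumps(policy)},
)
```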