aws-messaging-and-streaming
AWS Messaging & Streaming Services
When answering AWS messaging and streaming questions, verify specific numbers, versions, limits, and behavioral details from service-specific skills or official AWS documentation. When uncertain, search skills or docs rather than guessing. Fabricated configuration options or incorrect version numbers are worse than admitting uncertainty.
When a question asks about recommended configurations (CloudWatch alarm settings, thresholds, missing data treatment), search for the service-specific skills or documentation rather than relying on general best practices.
Overview
Domain expertise for choosing and using AWS services that move data between producers and consumers.
This skill covers two fundamental patterns — messaging and streaming — and the AWS services that implement each.
Use this skill to decide which pattern fits a workload, select the right service, and understand how services integrate with each other.
For specific guidance on individual AWS services, see reference files or service-specific Skills.
Streaming and Messaging
What Is Messaging?
Messaging enables decoupled, asynchronous communication between components. A producer sends a message; one or more consumers receive and process it. Once processed, the message is typically deleted. Messaging services handle delivery guarantees, retries, and dead-letter routing.
Key characteristics:
- Messages are consumed once (point-to-point) or fanned out (pub/sub), then removed
- No replay — once acknowledged, a message is gone
- Designed for command/request workloads, task distribution, and event notification
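A minimal sketch of the point-to-point flow with SQS and boto3 (queue name and payload are illustrative placeholders): the consumer deletes each message after processing, which is exactly why messaging has no replay.

```python
import boto3

# All names are illustrative placeholders; region/credentials come from the environment.
sqs = boto3.client("sqs")
queue_url = sqs.create_queue(QueueName="demo-tasks")["QueueUrl"]

# Producer: enqueue a message.
sqs.send_message(QueueUrl=queue_url, MessageBody='{"task": "resize-image-42"}')

# Consumer: receive, process, then delete. After the delete, the message is gone;
# competing consumers on the same queue split the work between them.
resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=20)
for msg in resp.get("Messages", []):
    print("processing:", msg["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```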
What Is Streaming?
Streaming enables ordered, durable, high-throughput continuous data flow. Producers append records to a log; consumers read from positions in that log. Records persist for a configurable retention period regardless of consumption.
Key characteristics:
- Records are retained and replayable within the retention window
- Strict ordering within a partition/shard
- Multiple independent consumers can read the same data at different positions
- Designed for event sourcing, real-time analytics, change data capture, and continuous processing
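As a sketch of the log model, here is a Kinesis Data Streams producer and a consumer reading from a position in a shard; the stream name is a placeholder and the stream is assumed to already exist. Re-running the consumer re-reads the same records, which is the replay property messaging lacks.

```python
import boto3

kinesis = boto3.client("kinesis")
stream = "demo-clickstream"  # placeholder; assumed to already exist

# Producer: append a record to the log. The partition key picks the shard,
# which is also the scope of strict ordering.
kinesis.put_record(StreamName=stream, Data=b'{"page": "/home"}', PartitionKey="user-123")

# Consumer: read from a position rather than "taking" messages. TRIM_HORIZON is the
# oldest retained record, so re-running this block replays the retention window.
shard_id = kinesis.describe_stream(StreamName=stream)["StreamDescription"]["Shards"][0]["ShardId"]
iterator = kinesis.get_shard_iterator(
    StreamName=stream, ShardId=shard_id, ShardIteratorType="TRIM_HORIZON"
)["ShardIterator"]
for record in kinesis.get_records(ShardIterator=iterator, Limit=100)["Records"]:
    print(record["SequenceNumber"], record["Data"])
```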
Key Differences
| Dimension | Messaging | Streaming |
|---|---|---|
| Data lifecycle | Deleted after consumption | Retained for replay (hours to indefinitely) |
| Ordering | Best-effort (Standard) or per-group (FIFO) | Strict per-partition/shard |
| Consumer model | Competing consumers (work distribution) | Independent readers (fan-out by position) |
| Throughput pattern | Bursty, variable | Sustained, high-volume |
| Replay | Not supported (except DLQ redrive) | Native — seek to any position in retention |
| Typical latency | Milliseconds (push or short-poll) | Milliseconds to low seconds |
| Scaling unit | Concurrency (consumers/pollers) | Partitions or shards |
Messaging Use Cases
- Decoupling microservices with request/response or command patterns
- Distributing work across a pool of competing consumers (task queues)
- Fan-out notifications where each subscriber acts independently
- Workloads that are bursty and benefit from queue buffering
- Migrating existing JMS/AMQP applications (Amazon MQ)
Streaming Use Cases
- Continuous, high-throughput data ingestion (logs, metrics, clickstreams, IoT telemetry)
- Event sourcing where consumers need to replay from any point in time
- Multiple independent consumers processing the same data differently
- Real-time analytics, windowed aggregations, or complex event processing
- Change data capture (CDC) pipelines
Messaging Services
These services are generally used for messaging workloads.
Streaming services (Kinesis Data Streams, Managed Streaming for Apache Kafka) are sometimes used for messaging workloads as well, depending on the exact use case and requirements.
| Service | Best For | Key Differentiator |
|---|---|---|
| Amazon SQS | Task queues, decoupling, buffering | Fully managed, unlimited throughput (Standard), exactly-once (FIFO), fair queues for multi-tenant workloads |
| Amazon SNS | Fan-out, pub/sub notifications | Push to multiple subscribers (SQS, Lambda, HTTP, email, SMS) |
| Amazon EventBridge | Event routing, cross-account/SaaS integration | Content-based filtering, schema registry, 200+ AWS source integrations |
| Amazon MQ | Lift-and-shift of existing JMS/AMQP/MQTT apps | Protocol compatibility (ActiveMQ, RabbitMQ) for legacy migration |
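For instance, SNS's fan-out differentiator looks like this in boto3. The queue ARN is a placeholder, and the queue's policy granting `sns.amazonaws.com` send rights is assumed to already be in place (see the gotchas section below).

```python
import boto3

sns = boto3.client("sns")
topic_arn = sns.create_topic(Name="order-events")["TopicArn"]  # placeholder name

# Fan-out: every subscriber receives its own copy of each published message.
# The queue ARN is a placeholder, and its queue policy must already allow
# sns.amazonaws.com to SendMessage, scoped to this topic.
sns.subscribe(
    TopicArn=topic_arn,
    Protocol="sqs",
    Endpoint="arn:aws:sqs:us-east-1:123456789012:orders-audit",
    Attributes={"RawMessageDelivery": "true"},  # deliver payloads without the SNS envelope
)

sns.publish(TopicArn=topic_arn, Message='{"orderId": "o-1001", "status": "PLACED"}')
```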
Streaming Services
These services are generally used for streaming workloads.
| Service | Best For | Key Differentiator |
|---|---|---|
| Amazon Kinesis Data Streams | Real-time ingestion with AWS-native consumers | On-demand Advantage mode (instant scaling, no shard management), 1–365 day retention |
| Amazon Data Firehose | Zero-admin delivery to storage/analytics | Auto-scales, buffers, batches, and delivers to destinations |
| Amazon Managed Service for Apache Flink | Complex stream processing (joins, windows, state) | Full Apache Flink runtime — SQL, Java, Python APIs for stateful computation |
| Amazon MSK | Kafka-native workloads, ecosystem compatibility | Apache Kafka API, Express brokers (3x throughput, 20x faster scaling compared to Standard brokers), broad connector ecosystem |
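As an illustration of the zero-admin delivery model, a Firehose producer is just `PutRecordBatch` calls; the delivery stream name is a placeholder, and the stream (destination, buffering, IAM role) is assumed to be configured already.

```python
import boto3

firehose = boto3.client("firehose")

# The delivery stream (destination, buffering hints, IAM role) is assumed to exist.
# Producers just batch records; newline-delimiting them yields NDJSON in S3.
records = [{"Data": f'{{"metric": "latency", "ms": {ms}}}\n'.encode()} for ms in (12, 48, 7)]
resp = firehose.put_record_batch(DeliveryStreamName="demo-delivery", Records=records)

# Per-record failures are reported in the response, not raised: check and retry.
if resp["FailedPutCount"]:
    print("retry needed for", resp["FailedPutCount"], "records")
```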
Common Integration Gotchas
- SQS system vs. user message attributes: Attributes like `AWSTraceHeader` (set by X-Ray / EventBridge / Pipes when sending to an SQS DLQ) and `SenderId`, `SentTimestamp` are SQS system attributes, NOT user message attributes. They are never returned by default from `ReceiveMessage` — request them explicitly via `AttributeNames=[...]` (or `MessageSystemAttributeNames`), separate from `MessageAttributeNames`, which fetches user attributes. This matters for DLQs, where the trace header rides on the system attribute and the user-attributes slot carries the service's failure metadata (e.g. EventBridge's `RULE_ARN`, `ERROR_CODE`). A receive sketch follows this list.
- SNS → Firehose → S3 record separator: For SNS subscriptions using the `firehose` protocol that land in S3, records are already newline-delimited by default (NDJSON). Do NOT turn on Firehose's `AppendDelimiterToRecord` — SNS emits the newline itself, and enabling the processor produces double newlines (subscription sketch after this list).
- EventBridge rule target DLQ + SNS subscription DLQ both need a DLQ queue policy. Attaching the DLQ alone is not enough — the DLQ silently drops messages until its queue policy allows the service principal. EventBridge: `PutTargets` with `DeadLetterConfig.Arn=<DLQ>`, plus an SQS policy allowing `sqs:SendMessage` for `Service: events.amazonaws.com` with `aws:SourceArn` = the rule ARN. SNS: `SetSubscriptionAttributes` with `RedrivePolicy={"deadLetterTargetArn":"<DLQ>"}`, plus an SQS policy allowing `Service: sns.amazonaws.com` scoped by the topic ARN. A wiring sketch for the EventBridge case follows this list.
- SQS production defaults: long polling + customer-managed encryption. New queues default to short-poll (`ReceiveMessageWaitTimeSeconds=0`) and SSE-SQS (AWS-owned key). For production, use `SetQueueAttributes` to set `ReceiveMessageWaitTimeSeconds=20` (long polling) and `KmsMasterKeyId=<customer-managed key id/ARN>` rather than leaving `alias/aws/sqs`. A one-call sketch follows this list.
- Broker and Kafka credentials belong in Secrets Manager, not connection strings. Do not hardcode usernames, passwords, or SASL/SCRAM credentials in application config, env vars, JAAS files, or IaC. For Amazon MQ (ActiveMQ/RabbitMQ), store broker users as secrets and fetch at startup; Lambda event source mappings for Amazon MQ require the broker credentials to be supplied as a Secrets Manager secret ARN (`BASIC_AUTH`), not inline. For MSK SASL/SCRAM the secret is not optional: it must be named with the `AmazonMSK_` prefix and encrypted with a customer-managed KMS key (secrets created with the default `aws/secretsmanager` key cannot be associated with a cluster), then attached via `BatchAssociateScramSecret`. Lambda event source mappings for MSK (SASL/SCRAM or mTLS) and self-managed Kafka also reference a Secrets Manager secret ARN rather than inline credentials. Enable rotation and scope IAM read access (`secretsmanager:GetSecretValue`) to the consuming role only (fetch-and-associate sketch after this list). See AWS Well-Architected SEC02-BP03 Store and use secrets securely.
- Service-principal resource policies need `aws:SourceArn`/`aws:SourceAccount` conditions. When a queue or topic policy grants a service principal like `events.amazonaws.com`, `sns.amazonaws.com`, or `s3.amazonaws.com` permission to `sqs:SendMessage` or `sns:Publish`, omitting source conditions opens a confused-deputy hole — any rule, topic, or bucket in any AWS account can drive writes. Scope every such statement with `aws:SourceArn` (the specific rule/topic/bucket/pipe ARN; use `ArnLike` with `*` when the ARN isn't fully known yet) and `aws:SourceAccount` (your account ID). For S3 event notifications both keys are required because S3 bucket ARNs don't carry the account ID, so `aws:SourceArn` alone doesn't constrain the account. The same pattern applies to role trust policies for IAM roles used by EventBridge rules and EventBridge Pipes (principal `events.amazonaws.com`/`pipes.amazonaws.com`, `aws:SourceArn` = the rule or pipe ARN) — not just the DLQ case called out above (a policy sketch for the S3 case follows this list). See the IAM User Guide on The confused deputy problem.
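To make the first gotcha concrete, a sketch of draining a DLQ while asking for both attribute families. The queue URL is a placeholder; newer SDKs also accept `MessageSystemAttributeNames` for the system-attribute request, with `AttributeNames` as the legacy spelling.

```python
import boto3

sqs = boto3.client("sqs")
dlq_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-dlq"  # placeholder

resp = sqs.receive_message(
    QueueUrl=dlq_url,
    # System attributes: never returned unless requested. Newer SDKs accept
    # MessageSystemAttributeNames here; AttributeNames is the legacy spelling.
    AttributeNames=["AWSTraceHeader", "SenderId", "SentTimestamp"],
    # User attributes: where EventBridge puts its failure metadata on DLQ'd events.
    MessageAttributeNames=["All"],
    WaitTimeSeconds=20,
)
for msg in resp.get("Messages", []):
    print("trace:", msg.get("Attributes", {}).get("AWSTraceHeader"))
    print("error:", msg.get("MessageAttributes", {}).get("ERROR_CODE"))
```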
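For the SNS → Firehose gotcha, the only SNS-side requirement is a delivery role on the subscription; no delimiter processor is involved. All ARNs below are placeholders.

```python
import boto3

sns = boto3.client("sns")

# SNS assumes SubscriptionRoleArn to call PutRecord on the delivery stream.
# Leave Firehose's AppendDelimiterToRecord OFF: SNS already appends the newline.
sns.subscribe(
    TopicArn="arn:aws:sns:us-east-1:123456789012:order-events",  # placeholder
    Protocol="firehose",
    Endpoint="arn:aws:firehose:us-east-1:123456789012:deliverystream/demo-delivery",  # placeholder
    Attributes={
        "SubscriptionRoleArn": "arn:aws:iam::123456789012:role/sns-firehose-delivery"  # placeholder
    },
)
```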
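The EventBridge half of the DLQ gotcha, sketched end to end with placeholder names and ARNs: the rule target gets `DeadLetterConfig`, and the DLQ gets the queue policy it needs to actually accept messages.

```python
import boto3
import json

events = boto3.client("events")
sqs = boto3.client("sqs")

dlq_arn = "arn:aws:sqs:us-east-1:123456789012:rule-dlq"              # placeholder
rule_arn = "arn:aws:events:us-east-1:123456789012:rule/orders-rule"  # placeholder

# 1) Attach the DLQ to the rule target.
events.put_targets(
    Rule="orders-rule",
    Targets=[{
        "Id": "primary",
        "Arn": "arn:aws:lambda:us-east-1:123456789012:function:handler",  # placeholder
        "DeadLetterConfig": {"Arn": dlq_arn},
    }],
)

# 2) Without this queue policy, EventBridge's deliveries to the DLQ are dropped silently.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "events.amazonaws.com"},
        "Action": "sqs:SendMessage",
        "Resource": dlq_arn,
        "Condition": {"ArnEquals": {"aws:SourceArn": rule_arn}},
    }],
}
sqs.set_queue_attributes(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/rule-dlq",  # placeholder
    Attributes={"Policy": json.dumps(policy)},
)
```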
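The production-defaults fix is a single `SetQueueAttributes` call; the queue URL and key alias are placeholders.

```python
import boto3

sqs = boto3.client("sqs")

sqs.set_queue_attributes(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/demo-tasks",  # placeholder
    Attributes={
        "ReceiveMessageWaitTimeSeconds": "20",   # long polling instead of the default 0
        "KmsMasterKeyId": "alias/prod-sqs-cmk",  # customer-managed key; placeholder alias
    },
)
```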
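A sketch of the Secrets Manager pattern for broker and MSK credentials, with placeholder secret names and ARNs: fetch at startup instead of hardcoding, and associate an `AmazonMSK_`-prefixed secret with the cluster.

```python
import boto3
import json

secrets = boto3.client("secretsmanager")
kafka = boto3.client("kafka")

# Application startup: fetch broker credentials instead of hardcoding them.
creds = json.loads(
    secrets.get_secret_value(SecretId="prod/activemq/broker-user")["SecretString"]  # placeholder
)
username, password = creds["username"], creds["password"]

# MSK SASL/SCRAM: the secret must be named AmazonMSK_* and encrypted with a
# customer-managed KMS key before it can be associated with the cluster.
kafka.batch_associate_scram_secret(
    ClusterArn="arn:aws:kafka:us-east-1:123456789012:cluster/demo/abc-123",  # placeholder
    SecretArnList=[
        "arn:aws:secretsmanager:us-east-1:123456789012:secret:AmazonMSK_prod-AbCdEf"  # placeholder
    ],
)
```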
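Finally, a confused-deputy-safe queue policy for S3 event notifications, where both condition keys are required; the bucket, queue, and account ID are placeholders.

```python
import boto3
import json

sqs = boto3.client("sqs")
queue_arn = "arn:aws:sqs:us-east-1:123456789012:s3-events"  # placeholder

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "s3.amazonaws.com"},
        "Action": "sqs:SendMessage",
        "Resource": queue_arn,
        "Condition": {
            # Bucket ARNs carry no account ID, so constrain both bucket and account.
            "ArnLike": {"aws:SourceArn": "arn:aws:s3:::my-upload-bucket"},  # placeholder
            "StringEquals": {"aws:SourceAccount": "123456789012"},          # placeholder
        },
    }],
}
sqs.set_queue_attributes(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/s3-events",  # placeholder
    Attributes={"Policy": json.dumps(policy)},
)
```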