iii-dead-letter-queues
Dead Letter Queues
Comparable to: SQS DLQ, RabbitMQ dead-letter exchanges
Key Concepts
Use the concepts below when they fit the task. Not every queue failure needs manual DLQ intervention.
- Jobs move to a DLQ after exhausting `max_retries` with exponential backoff (`backoff_ms * 2^attempt`)
- Each DLQ entry preserves the original payload, last error, timestamp, and job metadata
- Redrive via the built-in `iii::queue::redrive` function or the `iii trigger` CLI command
- Redriving resets attempt counters to zero, giving jobs a fresh retry cycle
- Always investigate and deploy fixes before redriving — blindly redriving repeats failures
- DLQ support available on Builtin and RabbitMQ adapters
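
The backoff formula in the list above can be made concrete with a small calculation. This is a minimal sketch, not engine code: the helper name `retryDelays` is hypothetical, and it assumes attempt numbering starts at 0, which the text does not state.

```javascript
// Sketch of the delay formula from the text: backoff_ms * 2^attempt.
// Assumption: the first retry uses attempt = 0.
function retryDelays(backoffMs, maxRetries) {
  const delays = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(backoffMs * 2 ** attempt);
  }
  return delays;
}

// With backoff_ms = 1000 and max_retries = 3, list each delay and the
// total time spent waiting before the job would land in the DLQ.
const delays = retryDelays(1000, 3);
const total = delays.reduce((sum, d) => sum + d, 0);
console.log(delays, total);
```

Because the delay doubles on every attempt, raising `max_retries` by one roughly doubles the total waiting time before a job reaches the DLQ.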
Architecture
A queue consumer fails processing a job. The engine retries with exponential backoff up to `max_retries`. Once retries are exhausted, the message moves to the DLQ. An operator inspects the failure, deploys a fix, then redrives the DLQ to replay all failed jobs.
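
The flow described above (fail, retry, dead-letter, fix, redrive) can be modeled in memory to make the sequence easier to follow. This is an illustrative sketch only, not the iii engine: the `TinyQueue` class and its fields are hypothetical, delays are computed but never slept on, and it assumes one initial attempt followed by up to `maxRetries` retries.

```javascript
// In-memory model of the retry -> DLQ -> redrive flow. Not the iii engine.
class TinyQueue {
  constructor({ maxRetries, backoffMs }) {
    this.maxRetries = maxRetries;
    this.backoffMs = backoffMs;
    this.dlq = [];
  }

  // Run the handler; on failure, retry until maxRetries is exhausted,
  // then store the payload and last error in the DLQ.
  process(payload, handler) {
    let lastError = null;
    const delays = [];
    for (let attempt = 0; attempt <= this.maxRetries; attempt++) {
      try {
        return { ok: true, result: handler(payload), delays };
      } catch (err) {
        lastError = err;
        if (attempt < this.maxRetries) {
          delays.push(this.backoffMs * 2 ** attempt);
        }
      }
    }
    this.dlq.push({ payload, lastError: lastError.message, failedAt: Date.now() });
    return { ok: false, delays };
  }

  // Replay every DLQ entry; each replay starts a fresh retry cycle.
  redrive(handler) {
    const entries = this.dlq;
    this.dlq = [];
    for (const entry of entries) {
      this.process(entry.payload, handler);
    }
    return { redriven: entries.length };
  }
}

const queue = new TinyQueue({ maxRetries: 2, backoffMs: 100 });
const broken = () => { throw new Error('card processor down'); };
const fixed = (job) => `charged ${job.orderId}`;

const first = queue.process({ orderId: 'A1' }, broken); // fails, lands in DLQ
const dlqDepthAfterFailure = queue.dlq.length;
const summary = queue.redrive(fixed);                   // replay after the "fix"
const dlqDepthAfterRedrive = queue.dlq.length;
console.log(first.delays, dlqDepthAfterFailure, summary, dlqDepthAfterRedrive);
```

Redriving with the still-broken handler would simply push the job back into the DLQ after another full retry cycle, which is why the fix has to be deployed first.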
iii Primitives Used
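| Primitive | Purpose |
|---|---|
| `iii::queue::redrive` | Redrive all DLQ jobs for a named queue |
| `iii::queue::status` | Check queue and DLQ status |
| `iii trigger --function-id='iii::queue::redrive'` | CLI redrive command (part of the engine binary) |
| `--timeout-ms` | CLI flag to set trigger timeout (default 30s) |
| `queue_configs` in iii-config.yaml | Configure `max_retries` and `backoff_ms` per queue |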
Reference Implementation
See ../references/dead-letter-queues.js for the full working example: inspecting DLQ status, redriving failed jobs via SDK and CLI, and configuring retry behavior.
Also available in Python: ../references/dead-letter-queues.py
Also available in Rust: ../references/dead-letter-queues.rs
Common Patterns
Code using this pattern commonly includes, when relevant:
- `await iii.trigger({ function_id: 'iii::queue::redrive', payload: { queue: 'payment' } })`: redrive via SDK
- `iii trigger --function-id='iii::queue::redrive' --payload='{"queue": "payment"}'`: redrive via CLI
- `iii trigger --function-id='iii::queue::redrive' --payload='{"queue": "payment"}' --timeout-ms=60000`: with custom timeout
- Redrive returns `{ queue: 'payment', redriven: 12 }` indicating count of replayed jobs
- Inspect in RabbitMQ UI at `http://localhost:15672`, find `iii.__fn_queue::{name}::dlq.queue`
- Best practice: investigate failures, deploy fix, then redrive
- Monitor DLQ depth as an operational alert signal
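
To show how the SDK redrive call and its return value fit together, here is a small runnable sketch. The `stubIii` object is a hypothetical stand-in for the real SDK client so the example runs on its own; it only mimics the response shape quoted in the list. The wrapper name `redriveQueue` is also an assumption.

```javascript
// Hypothetical stub replacing the real iii SDK client for this example.
const stubIii = {
  async trigger({ function_id, payload }) {
    if (function_id !== 'iii::queue::redrive') {
      throw new Error(`unexpected function_id: ${function_id}`);
    }
    // Mimics the response shape quoted in the text.
    return { queue: payload.queue, redriven: 12 };
  },
};

// Redrive one queue and return a short summary line for operators.
async function redriveQueue(iii, queue) {
  const result = await iii.trigger({
    function_id: 'iii::queue::redrive',
    payload: { queue },
  });
  return `redrove ${result.redriven} job(s) on "${result.queue}"`;
}

const redriveSummary = redriveQueue(stubIii, 'payment');
redriveSummary.then((line) => console.log(line));
```

Passing the client in as a parameter keeps the wrapper testable with a stub like this one; in real use the actual SDK client would take its place.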
Adapting This Pattern
Use the adaptations below when they apply to the task.
- Set `max_retries` and `backoff_ms` in `queue_configs` based on your failure tolerance
- Build an admin endpoint that calls `iii::queue::redrive` for operational control
- Use `iii::queue::status` to check DLQ depth before and after redriving
- For dev/test, use lower retry counts to surface failures faster
- In production with RabbitMQ, use the management UI for detailed message inspection
- Consider building an alerting function that triggers on DLQ depth thresholds
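
Several of the adaptations above (checking depth before and after a redrive, alerting on a threshold) can be combined in one helper. The following is a sketch under stated assumptions: the text does not show the response shape of `iii::queue::status`, so the `dlq_depth` field is hypothetical, as are the `makeStubClient` and `checkedRedrive` names. The stub stands in for the real SDK client.

```javascript
// Hypothetical stub client. The dlq_depth field is an assumption; the real
// iii::queue::status response shape is not shown in the text.
function makeStubClient(initialDepth) {
  let depth = initialDepth;
  return {
    async trigger({ function_id, payload }) {
      if (function_id === 'iii::queue::status') {
        return { queue: payload.queue, dlq_depth: depth };
      }
      if (function_id === 'iii::queue::redrive') {
        const redriven = depth;
        depth = 0; // the stub pretends every replayed job succeeds
        return { queue: payload.queue, redriven };
      }
      throw new Error(`unexpected function_id: ${function_id}`);
    },
  };
}

// Check DLQ depth, redrive, then check again; flag when the depth seen
// before the redrive met or exceeded the alert threshold.
async function checkedRedrive(iii, queue, alertThreshold) {
  const status = () =>
    iii.trigger({ function_id: 'iii::queue::status', payload: { queue } });
  const before = await status();
  const result = await iii.trigger({
    function_id: 'iii::queue::redrive',
    payload: { queue },
  });
  const after = await status();
  return {
    before: before.dlq_depth,
    redriven: result.redriven,
    after: after.dlq_depth,
    shouldAlert: before.dlq_depth >= alertThreshold,
  };
}

const report = checkedRedrive(makeStubClient(5), 'payment', 3);
report.then((r) => console.log(r));
```

A non-zero `after` value in a real deployment would suggest the fix did not cover every failure, which is a cue to investigate again rather than redrive a second time.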
Engine Configuration
Queue `max_retries` and `backoff_ms` are set per queue in iii-config.yaml under `queue_configs`. See ../references/iii-config.yaml for the full annotated config reference.
Pattern Boundaries
- For queue processing patterns (enqueue, concurrency, FIFO), prefer `iii-queue-processing`.
- For queue configuration (retries, backoff, adapters), prefer `iii-engine-config`.
- For function registration and triggers, prefer `iii-functions-and-triggers`.
- Stay with `iii-dead-letter-queues` when the primary problem is inspecting or redriving failed jobs.
When to Use
- Use this skill when the task is primarily about `iii-dead-letter-queues` in the iii engine.
- Triggers when the request directly asks for this pattern or an equivalent implementation.
Boundaries
- Never use this skill as a generic fallback for unrelated tasks.
- You must not apply this skill when a more specific iii skill is a better fit.
- Always verify environment and safety constraints before applying examples from this skill.