iii-dead-letter-queues

Dead Letter Queues

Comparable to: SQS DLQ, RabbitMQ dead-letter exchanges

Key Concepts

Use the concepts below when they fit the task. Not every queue failure needs manual DLQ intervention.
  • Jobs move to a DLQ after exhausting max_retries with exponential backoff (backoff_ms * 2^attempt)
  • Each DLQ entry preserves the original payload, last error, timestamp, and job metadata
  • Redrive via the built-in iii::queue::redrive function or the iii trigger CLI command
  • Redriving resets attempt counters to zero, giving jobs a fresh retry cycle
  • Always investigate and deploy fixes before redriving; blindly redriving repeats failures
  • DLQ support is available on the Builtin and RabbitMQ adapters
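
To make the backoff formula concrete, here is a minimal sketch that lists the delay before each retry. It assumes attempt is zero-based, so the first retry waits exactly backoff_ms; the function name backoffSchedule is hypothetical and not part of iii.

```javascript
// Delay before each retry, following the backoff_ms * 2^attempt formula.
// Assumption: `attempt` is zero-based (the first retry waits backoff_ms).
function backoffSchedule(maxRetries, backoffMs) {
  const delays = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(backoffMs * 2 ** attempt);
  }
  return delays;
}

const delays = backoffSchedule(3, 1000);
console.log(delays); // [ 1000, 2000, 4000 ]

// Total time spent waiting before the job would land in the DLQ.
const totalMs = delays.reduce((sum, d) => sum + d, 0);
console.log(totalMs); // 7000
```

Summing the schedule gives a rough lower bound on how long a failing job stays in the retry cycle before it reaches the DLQ, which can help when choosing max_retries and backoff_ms.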

Architecture

When a queue consumer fails to process a job, the engine retries with exponential backoff up to max_retries times. Once retries are exhausted, the message moves to the DLQ. An operator inspects the failure, deploys a fix, then redrives the DLQ to replay all failed jobs.
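
The lifecycle described above can be sketched with a toy in-memory model. This is not the iii engine: the ToyQueue class and its fields are hypothetical, backoff waits are omitted to keep it synchronous, and it assumes one initial attempt followed by up to max_retries retries.

```javascript
// Toy in-memory model of the retry -> DLQ -> redrive lifecycle.
// Not the iii engine: class and field names here are hypothetical.
class ToyQueue {
  constructor(maxRetries, handler) {
    this.maxRetries = maxRetries;
    this.handler = handler;
    this.dlq = [];
    this.done = [];
  }

  // Run one job: one initial attempt plus up to maxRetries retries (assumed).
  // Backoff waits are omitted to keep the sketch synchronous.
  process(payload) {
    let lastError = null;
    for (let attempt = 0; attempt <= this.maxRetries; attempt++) {
      try {
        this.done.push(this.handler(payload));
        return;
      } catch (err) {
        lastError = err.message;
      }
    }
    // Retries exhausted: keep the payload and the last error for inspection.
    this.dlq.push({ payload, lastError, failedAt: Date.now() });
  }

  // Replay every DLQ entry with a fresh attempt budget.
  redrive() {
    const entries = this.dlq;
    this.dlq = [];
    for (const entry of entries) this.process(entry.payload);
    return { redriven: entries.length };
  }
}

let fixed = false;
const queue = new ToyQueue(2, (job) => {
  if (!fixed) throw new Error('gateway unavailable');
  return `charged ${job.id}`;
});

queue.process({ id: 1 });
queue.process({ id: 2 });
console.log(queue.dlq.length); // 2

fixed = true; // "deploy the fix" before redriving
console.log(queue.redrive()); // { redriven: 2 }
console.log(queue.done); // [ 'charged 1', 'charged 2' ]
```

If fixed were left false, redrive() would push both jobs straight back into the DLQ, which is the failure mode that redriving without a fix produces.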

iii Primitives Used

Primitive | Purpose
trigger({ function_id: 'iii::queue::redrive', payload: { queue } }) | Redrive all DLQ jobs for a named queue
trigger({ function_id: 'iii::queue::status', payload: { queue } }) | Check queue and DLQ status
iii trigger --function-id='iii::queue::redrive' --payload='{"queue":"name"}' | CLI redrive command (part of the engine binary)
--timeout-ms | CLI flag to set trigger timeout (default 30s)
queue_configs in iii-config.yaml | Configure max_retries and backoff_ms

Reference Implementation

See ../references/dead-letter-queues.js for the full working example: inspecting DLQ status, redriving failed jobs via SDK and CLI, and configuring retry behavior.
Also available in Python: ../references/dead-letter-queues.py
Also available in Rust: ../references/dead-letter-queues.rs

Common Patterns

Code using this pattern commonly includes, when relevant:
  • Redrive via SDK: await iii.trigger({ function_id: 'iii::queue::redrive', payload: { queue: 'payment' } })
  • Redrive via CLI: iii trigger --function-id='iii::queue::redrive' --payload='{"queue": "payment"}'
  • Redrive via CLI with a custom timeout: iii trigger --function-id='iii::queue::redrive' --payload='{"queue": "payment"}' --timeout-ms=60000
  • Redrive returns { queue: 'payment', redriven: 12 }, where redriven is the count of replayed jobs
  • With RabbitMQ, inspect messages in the management UI at http://localhost:15672 and look for the queue iii.__fn_queue::{name}::dlq.queue
  • Best practice: investigate failures, deploy the fix, then redrive
  • Monitor DLQ depth as an operational alert signal
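
When the CLI redrive is scripted, it can help to build the arguments in one place. The sketch below uses only the flags and the queue-name template shown in this document; the helper names redriveCliArgs and rabbitDlqName are hypothetical.

```javascript
// Build the argument list for `iii trigger`, mirroring the CLI examples above.
// Helper names are hypothetical; only flags shown in this document are used.
function redriveCliArgs(queueName, timeoutMs) {
  const args = [
    'trigger',
    '--function-id=iii::queue::redrive',
    `--payload=${JSON.stringify({ queue: queueName })}`,
  ];
  if (timeoutMs !== undefined) args.push(`--timeout-ms=${timeoutMs}`);
  return args;
}

// Name of the RabbitMQ queue to look for in the management UI.
function rabbitDlqName(queueName) {
  return `iii.__fn_queue::${queueName}::dlq.queue`;
}

console.log(redriveCliArgs('payment', 60000).join(' '));
// trigger --function-id=iii::queue::redrive --payload={"queue":"payment"} --timeout-ms=60000
console.log(rabbitDlqName('payment'));
// iii.__fn_queue::payment::dlq.queue
```

Passing the array to Node's child_process.execFile('iii', args, callback) avoids shell quoting around the JSON payload, which is why the single quotes from the shell examples do not appear here.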

Adapting This Pattern

Use the adaptations below when they apply to the task.
  • Set max_retries and backoff_ms in queue_configs based on your failure tolerance
  • Build an admin endpoint that calls iii::queue::redrive for operational control
  • Use iii::queue::status to check DLQ depth before and after redriving
  • For dev/test, use lower retry counts to surface failures faster
  • In production with RabbitMQ, use the management UI for detailed message inspection
  • Consider building an alerting function that triggers on DLQ depth thresholds
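
The status-before-and-after adaptation might look like the sketch below. It assumes a client object exposing trigger({ function_id, payload }) as in the SDK calls shown in this document. The shape of the iii::queue::status response is not specified here, so the sketch returns it untouched rather than reading any field from it. The fakeClient and its responses are made up for local experimentation.

```javascript
// Redrive a queue and capture status before and after.
// `client` is any object exposing trigger({ function_id, payload }).
// The status response shape is not assumed; it is returned as-is.
async function redriveWithStatus(client, queue) {
  const status = () =>
    client.trigger({ function_id: 'iii::queue::status', payload: { queue } });

  const before = await status();
  const result = await client.trigger({
    function_id: 'iii::queue::redrive',
    payload: { queue },
  });
  const after = await status();
  return { before, result, after };
}

// Stand-in client for local experimentation; its responses are made up.
const fakeClient = {
  calls: [],
  async trigger({ function_id, payload }) {
    this.calls.push(function_id);
    if (function_id === 'iii::queue::redrive') {
      return { queue: payload.queue, redriven: 12 };
    }
    return { note: 'placeholder status' };
  },
};

redriveWithStatus(fakeClient, 'payment').then((report) => {
  console.log(report.result); // { queue: 'payment', redriven: 12 }
  console.log(fakeClient.calls.length); // 3
});
```

Injecting the client keeps the function usable from an admin endpoint or a script, and lets it be exercised without a running engine.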

Engine Configuration

Queue max_retries and backoff_ms are set per queue in iii-config.yaml under queue_configs. See ../references/iii-config.yaml for the full annotated config reference.

Pattern Boundaries

  • For queue processing patterns (enqueue, concurrency, FIFO), prefer iii-queue-processing.
  • For queue configuration (retries, backoff, adapters), prefer iii-engine-config.
  • For function registration and triggers, prefer iii-functions-and-triggers.
  • Stay with iii-dead-letter-queues when the primary problem is inspecting or redriving failed jobs.

When to Use

  • Use this skill when the task is primarily about iii-dead-letter-queues in the iii engine.
  • Triggers when the request directly asks for this pattern or an equivalent implementation.

Boundaries

  • Never use this skill as a generic fallback for unrelated tasks.
  • You must not apply this skill when a more specific iii skill is a better fit.
  • Always verify environment and safety constraints before applying examples from this skill.