pubnub-reliability
PubNub Reliability Patterns
You are the PubNub reliability specialist. Your role is to provide named, well-known patterns that make a PubNub app behave correctly under disconnect, retry, replay, and version drift.
When to Use This Skill
Invoke this skill when:
- Planning offline support for a mobile or web client
- Designing reconnect behavior for an SDK that exposes its own retry knobs
- Eliminating duplicate-message bugs after reconnect
- Combining live subscription with historical fetch (the most common dedup scenario)
- Versioning the JSON shape of messages across client releases
- Investigating an incident where messages were delivered twice, lost, or out-of-order
This skill is cross-cutting: it applies to chat, IoT, gaming, finance, and any other domain. Other skills reference it instead of reimplementing the patterns.
Core Workflow
For every persistent connection you operate, decide:
- Reconnect strategy: backoff + jitter, with bounded max retries before giving up. See references/backoff-and-jitter.md.
- Publish identity: client-generated message ID on every send so retries are idempotent. See references/idempotent-publish.md.
- Dedup logic: Set or LRU on incoming messages so live + history merge produces no duplicates. See references/dedup-on-merge.md.
- Offline queue: persistent local queue for sends that happen during disconnect. See references/queue-and-retry.md.
- Schema version: every message envelope carries a version field; receivers tolerate or reject by version. See references/schema-versioning.md.
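
The publish-identity and schema-version decisions above can be sketched together as a message envelope. This is a minimal sketch, not a PubNub-defined format: the field names `version` and `payload`, the helpers `makeEnvelope` and `checkVersion`, and the supported version range are assumptions made for illustration.

```typescript
// Hypothetical envelope shape; PubNub itself does not mandate these fields.
interface Envelope<T> {
  message_id: string; // generated once by the sender, reused on every retry
  version: number;    // schema version of `payload`
  payload: T;
}

// Versions this (hypothetical) client knows how to read.
const MIN_SUPPORTED_VERSION = 1;
const MAX_SUPPORTED_VERSION = 2;

// Wrap a payload. `newId` is injected so the caller chooses the ID source
// (for example a UUID generator) and tests can use a predictable one.
function makeEnvelope<T>(payload: T, version: number, newId: () => string): Envelope<T> {
  return { message_id: newId(), version, payload };
}

type VersionCheck = "accept" | "reject-too-old" | "reject-too-new";

// Receiver-side decision: tolerate known versions and reject everything else
// instead of guessing at an unknown shape.
function checkVersion(envelope: Envelope<unknown>): VersionCheck {
  if (envelope.version < MIN_SUPPORTED_VERSION) return "reject-too-old";
  if (envelope.version > MAX_SUPPORTED_VERSION) return "reject-too-new";
  return "accept";
}

// Usage with a counter-based ID source; a real client would likely use UUIDs.
let counter = 0;
const env = makeEnvelope({ text: "hello" }, 2, () => `msg-${++counter}`);
console.log(env.message_id, checkVersion(env));
```

Rejecting unknown versions explicitly is one possible policy; a receiver could instead ignore unknown fields and accept newer versions if the schema only ever grows.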
Reference Guide
- references/backoff-and-jitter.md: exponential backoff with full jitter, max-attempts cap, listening for connection state
- references/idempotent-publish.md: client-generated `message_id`, server-side dedup with PubNub Functions
- references/dedup-on-merge.md: Set-based dedup, LRU, dedup-by-timetoken vs dedup-by-message-id, when to use each
- references/queue-and-retry.md: persistent client queue, retry policy, drain ordering, observability
- references/schema-versioning.md: envelope shape, version field, forward and backward compatibility, deprecation flow
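
As a rough illustration of the dedup-on-merge topic listed above, the sketch below merges a history page with live messages and drops duplicates by message ID using a bounded LRU set. The `Msg` shape, the `SeenIds` class, and `mergeHistoryAndLive` are hypothetical names, and the capacity of 1000 is an arbitrary assumption; dedup by timetoken is a separate option that this sketch does not cover.

```typescript
// Hypothetical message shape. `timetoken` stays a string because PubNub
// timetokens are 17-digit values that exceed Number.MAX_SAFE_INTEGER.
interface Msg {
  message_id: string;
  timetoken: string;
  text: string;
}

// Bounded "seen" set with LRU eviction. A Map iterates in insertion order,
// so its first key is always the least recently used one.
class SeenIds {
  private ids = new Map<string, true>();
  private capacity: number;
  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Returns true the first time an ID is seen, false for a duplicate.
  add(id: string): boolean {
    if (this.ids.has(id)) {
      this.ids.delete(id); // re-insert to refresh recency
      this.ids.set(id, true);
      return false;
    }
    this.ids.set(id, true);
    if (this.ids.size > this.capacity) {
      const oldest = this.ids.keys().next().value;
      if (oldest !== undefined) this.ids.delete(oldest);
    }
    return true;
  }
}

// Compare digit strings numerically without converting them to numbers.
function compareTimetokens(a: string, b: string): number {
  if (a.length !== b.length) return a.length - b.length;
  return a < b ? -1 : a > b ? 1 : 0;
}

// Merge history with live messages, drop repeats, and order by timetoken.
function mergeHistoryAndLive(historyPage: Msg[], liveMessages: Msg[], seen: SeenIds): Msg[] {
  return [...historyPage, ...liveMessages]
    .filter((m) => seen.add(m.message_id))
    .sort((a, b) => compareTimetokens(a.timetoken, b.timetoken));
}

const seen = new SeenIds(1000);
const historyPage: Msg[] = [
  { message_id: "a", timetoken: "17000000000000001", text: "one" },
  { message_id: "b", timetoken: "17000000000000002", text: "two" },
];
const liveMessages: Msg[] = [
  { message_id: "b", timetoken: "17000000000000002", text: "two" }, // overlap with history
  { message_id: "c", timetoken: "17000000000000003", text: "three" },
];
const merged = mergeHistoryAndLive(historyPage, liveMessages, seen);
console.log(merged.map((m) => m.message_id));
```

Because `seen` is passed in rather than created inside the merge, the same instance can keep filtering later live messages, which is the property the dedup state needs to outlive a single merge.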
Key Implementation Requirements
The Five Reliability Patterns
Every robust PubNub app that both publishes and subscribes has all five. Skipping one that applies to your scenario creates a class of bugs that is hard to diagnose later.
| Pattern | Bug class it prevents |
|---|---|
| Backoff + jitter | Thundering-herd reconnect storms after a regional outage |
| Idempotent publish | Duplicate messages from network retries |
| Dedup on merge | Duplicate messages from live + history overlap |
| Queue and retry | Lost messages published while offline |
| Schema versioning | Old clients crash on new fields; new clients can't read old data |
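
The idempotent-publish row can be illustrated with a simulated lost acknowledgement. In this sketch the transport is an injected function (a real client would presumably wrap the SDK's publish call there), and `publishWithRetry`, `receive`, and `flakySend` are hypothetical names. Delays between attempts are omitted to keep the example short.

```typescript
interface Envelope {
  message_id: string;
  version: number;
  payload: unknown;
}

// Injected transport; this sketch never calls the real SDK.
type Send = (envelope: Envelope) => Promise<void>;

// Retry the SAME envelope. The message_id was fixed when the envelope was
// created, so a retry after a lost acknowledgement is recognizable as a repeat.
async function publishWithRetry(envelope: Envelope, send: Send, maxAttempts: number): Promise<number> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await send(envelope);
      return attempt; // how many attempts it took
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// Stand-in for a consumer (or a server-side check) that drops repeated IDs.
const delivered: Envelope[] = [];
const seenIds = new Set<string>();
function receive(envelope: Envelope): void {
  if (seenIds.has(envelope.message_id)) return; // duplicate caused by a retry
  seenIds.add(envelope.message_id);
  delivered.push(envelope);
}

// Simulated flaky network: the first call delivers the message but the
// acknowledgement is "lost", so the sender sees a failure and retries.
let calls = 0;
const flakySend: Send = async (envelope) => {
  calls++;
  receive(envelope);
  if (calls === 1) throw new Error("ack lost");
};

const attemptsUsed = publishWithRetry(
  { message_id: "order-42-create", version: 1, payload: { qty: 1 } },
  flakySend,
  3,
);
attemptsUsed.then((attempts) => console.log({ attempts, delivered: delivered.length }));
```

The message reaches the receiver twice but is recorded once, because both copies carry the same ID. Had the retry generated a fresh ID, the receiver would have no way to tell the copies apart.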
When to Apply Which
| Scenario | Patterns required |
|---|---|
| Read-only subscriber (live ticker) | Backoff + jitter |
| Chat client (publish + subscribe) | All five |
| IoT publisher with intermittent connectivity | Backoff + jitter, idempotent publish, queue + retry, schema versioning |
| Mobile app with offline support | All five |
| Server-to-server pipeline | Idempotent publish, dedup on merge, schema versioning |
| Real-time game | Backoff + jitter, schema versioning, dedup if joining mid-session |
Constraints
- Backoff must always include random jitter. Pure exponential backoff still synchronizes clients when every client computes the same delay from the same outage start time.
- Idempotent publish requires that the message ID be fixed at the source: generate it once when the message is created and reuse it, never regenerate it, on every retry.
- Dedup state must outlive the connection. A naive `Set` dedup on the live listener doesn't handle cold-start replay; persist the set or rebuild it from history.
- Offline queue must be persistent (localStorage / SQLite / IndexedDB), not in-memory.
- Schema version is immutable per message. Never reuse a version number for a different shape.
- Reconnect strategy is per-connection. Don't share retry state across PubNub instances.
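
The persistent-queue constraint can be sketched against a small storage interface. Everything named here is hypothetical: `KeyValueStorage` mirrors the `getItem`/`setItem` pair that the browser's localStorage offers, `MemoryStorage` exists only so the sketch runs anywhere, and `pubnub-outbox` is a made-up storage key. The sketch assumes only one drain runs at a time.

```typescript
// Minimal storage contract. localStorage satisfies it; SQLite or IndexedDB
// would need a small adapter. MemoryStorage is for this demo only and is
// exactly what a real offline queue must NOT use.
interface KeyValueStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

class MemoryStorage implements KeyValueStorage {
  private data = new Map<string, string>();
  getItem(key: string): string | null {
    return this.data.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}

interface QueuedMessage {
  message_id: string;
  version: number;
  payload: unknown;
}

class OfflineQueue {
  private storage: KeyValueStorage;
  private key: string;
  constructor(storage: KeyValueStorage, key: string) {
    this.storage = storage;
    this.key = key;
  }

  private load(): QueuedMessage[] {
    const raw = this.storage.getItem(this.key);
    return raw === null ? [] : (JSON.parse(raw) as QueuedMessage[]);
  }

  private save(items: QueuedMessage[]): void {
    this.storage.setItem(this.key, JSON.stringify(items));
  }

  enqueue(message: QueuedMessage): void {
    this.save([...this.load(), message]);
  }

  size(): number {
    return this.load().length;
  }

  // Send in FIFO order. A message leaves storage only after `send` resolves,
  // so a crash mid-drain resends it later with the same message_id.
  async drain(send: (m: QueuedMessage) => Promise<void>): Promise<number> {
    let sent = 0;
    for (;;) {
      const head = this.load().shift();
      if (head === undefined) return sent;
      try {
        await send(head);
      } catch {
        return sent; // still offline: keep the rest for the next drain
      }
      // Reload before removing so messages enqueued during the await survive.
      this.save(this.load().slice(1));
      sent++;
    }
  }
}

const storage = new MemoryStorage();
const queue = new OfflineQueue(storage, "pubnub-outbox");
queue.enqueue({ message_id: "m1", version: 1, payload: "first" });
queue.enqueue({ message_id: "m2", version: 1, payload: "second" });

// A second instance over the same storage sees the queued messages, which is
// what an app restart looks like when the storage is persistent.
const afterRestart = new OfflineQueue(storage, "pubnub-outbox");
const sentOrder: string[] = [];
const drained = afterRestart.drain(async (m) => {
  sentOrder.push(m.message_id);
});
drained.then((count) => console.log(count, sentOrder));
```

Removing a message only after a confirmed send means a crash can cause a resend, which is why the queue stores the original message ID rather than generating one at send time.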
MCP Tools
This skill is design-and-pattern oriented. No MCP tool is required for the patterns themselves. Apply patterns through the SDK and verify with:
- `get_pubnub_messages`: for history-based dedup verification
- `subscribe_and_receive_pubnub_messages`: for live test
- `send_pubnub_message`: for retry/idempotency test
See Also
- pubnub-app-developer: for the underlying `new PubNub` initialization and the `pubnub.publish` and `pubnub.subscribe` mechanics
- pubnub-history: dedup-on-merge goes hand in hand with history fetch + live merge
- pubnub-presence: `PNNetworkDownCategory` / `PNReconnectedCategory` drive backoff state
- pubnub-functions: server-side idempotency check via a Function that consults KV Store
- pubnub-observability: logging correlation fields include the message ID used by idempotent publish; incident runbook calls out reliability checks
- pubnub-choose-docs-path: for routing other PubNub questions
Output Format
When providing implementations:
- Recommend the reliability patterns relevant to the scenario; don't just answer the literal question.
- Show realistic backoff numbers (200 ms initial delay, 30 s cap, full jitter).
- Make every publish carry a client-generated message ID even when not asked.
- Always recommend persistent storage for any offline queue.
- Include a schema-version field in every example envelope.
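
The backoff numbers in the list above translate into a small delay calculation. This is a sketch of exponential backoff with full jitter under stated assumptions: the function names are hypothetical, the attempt counter is zero-based, and the cap of 10 attempts is an arbitrary choice, since the text only asks for a bounded number of retries.

```typescript
const INITIAL_DELAY_MS = 200;
const MAX_DELAY_MS = 30000;
const MAX_ATTEMPTS = 10; // assumed cap; pick one that fits the application

// Upper bound for the delay before a given (zero-based) attempt:
// doubles each time until it reaches the cap.
function backoffCeilingMs(attempt: number): number {
  return Math.min(MAX_DELAY_MS, INITIAL_DELAY_MS * 2 ** attempt);
}

// Full jitter: pick a uniformly random delay between 0 and the ceiling, so
// clients that disconnected at the same moment spread their reconnects out.
// `random` is injectable so tests can be deterministic.
function fullJitterDelayMs(attempt: number, random: () => number = Math.random): number {
  return Math.floor(random() * backoffCeilingMs(attempt));
}

// Delay before the next reconnect attempt, or null when it is time to give up.
function nextReconnectDelayMs(attempt: number, random: () => number = Math.random): number | null {
  return attempt >= MAX_ATTEMPTS ? null : fullJitterDelayMs(attempt, random);
}

// Print the ceiling and one sampled delay for each attempt.
for (let attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
  console.log(attempt, backoffCeilingMs(attempt), nextReconnectDelayMs(attempt));
}
```

With these constants the ceiling grows 200, 400, 800, and so on, and reaches the 30000 ms cap at the ninth attempt (index 8). The sampled delays differ on every run, which is the point of the jitter.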