algo-social-virality
Viral Spread Models
Overview
Compartmental models (SIR, SIS, SEIR) describe how content or information spreads through a population. Susceptible → Infected → Recovered mirrors unaware → sharing → stopped sharing. Key metric: R0 (basic reproduction number). The model is a system of ODEs, integrated numerically in O(T × N) time for T timesteps and N compartments (here N counts compartments, not the population size N used in the algorithm below).
When to Use
Trigger conditions:
- Modeling how content spreads through a social network
- Estimating whether a campaign will achieve viral threshold
- Analyzing post-hoc spread dynamics of viral events
When NOT to use:
- When predicting individual user behavior (use influence scoring)
- When measuring engagement metrics (use engagement rate calculator)
Algorithm
IRON LAW: Viral Spread Occurs ONLY When R0 > 1
R0 = transmission rate (β) / recovery rate (γ).
Below R0 = 1, content dies out regardless of initial seed size.
Above R0 = 1, exponential growth phase begins before saturation.
Design interventions (seeding, incentives) to push R0 above threshold.
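The threshold follows directly from the infected-compartment equation of the SIR model defined in Phase 2. A short derivation, using only the document's own symbols:

```latex
\frac{dI}{dt} \;=\; \frac{\beta S I}{N} - \gamma I
\;=\; \gamma I \left( R_0 \frac{S}{N} - 1 \right),
\qquad R_0 = \frac{\beta}{\gamma}.
```

At launch S ≈ N, so dI/dt > 0 only if R0 > 1. Growth stops once S/N has fallen to 1/R0, which is where the number of simultaneous sharers peaks.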
Phase 1: Input Validation
Define: population size (N), initial seed size (I₀), transmission rate (β — probability of sharing upon exposure), recovery rate (γ — rate of losing interest).
Gate: N > 0, 0 < I₀ ≤ N, β ≥ 0, and γ > 0 (so that R0 = β/γ is defined). β and γ are estimated from historical data or assumed.
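A minimal sketch of this gate in Python. The function name `validate_sir_inputs` and its argument names are hypothetical. Requiring γ > 0 and I₀ ≥ 1 is an assumption that is slightly stricter than "non-negative", made so that R0 is defined and there is at least one seed.

```python
def validate_sir_inputs(population, initial_seed, beta, gamma):
    """Check the Phase 1 gate and return R0 = beta / gamma.

    Hypothetical helper: names and error messages are illustrative only.
    """
    if population <= 0:
        raise ValueError("population (N) must be positive")
    if not 0 < initial_seed <= population:
        raise ValueError("initial seed (I0) must satisfy 0 < I0 <= N")
    if beta < 0:
        raise ValueError("transmission rate (beta) must be non-negative")
    if gamma <= 0:
        # Assumption: gamma = 0 is rejected because R0 would be undefined.
        raise ValueError("recovery rate (gamma) must be positive")
    return beta / gamma


r0 = validate_sir_inputs(population=10000, initial_seed=10, beta=0.3, gamma=0.1)
print(f"R0 = {r0:.2f}, viral = {r0 > 1}")
```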
Phase 2: Core Algorithm
SIR Model: dS/dt = -βSI/N, dI/dt = βSI/N - γI, dR/dt = γI
- Initialize: S=N-I₀, I=I₀, R=0
- Iterate using Euler method or RK4 at discrete timesteps
- Track peak infected (maximum simultaneous sharers) and total ever-infected
SIS variant: No recovery to immune state — recovered become susceptible again (recurring content).
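The SIR steps above can be sketched in plain Python with a classic RK4 integrator. This is an illustration under stated assumptions, not a reference implementation: the names `sir_rhs`, `simulate_sir`, and `summarize`, the `(t, S, I, R)` tuple layout, and the step size `dt=0.1` are all hypothetical choices.

```python
def sir_rhs(s, i, beta, gamma, n):
    """Return (dS/dt, dI/dt, dR/dt) for the SIR equations."""
    new_shares = beta * s * i / n   # susceptible users who start sharing
    stops = gamma * i               # sharers who lose interest
    return (-new_shares, new_shares - stops, stops)


def simulate_sir(n, i0, beta, gamma, days, dt=0.1):
    """Integrate SIR with RK4; returns a list of (t, S, I, R) tuples."""
    s, i, r = float(n - i0), float(i0), 0.0
    series = [(0.0, s, i, r)]
    for step in range(1, int(round(days / dt)) + 1):
        k1 = sir_rhs(s, i, beta, gamma, n)
        k2 = sir_rhs(s + 0.5 * dt * k1[0], i + 0.5 * dt * k1[1], beta, gamma, n)
        k3 = sir_rhs(s + 0.5 * dt * k2[0], i + 0.5 * dt * k2[1], beta, gamma, n)
        k4 = sir_rhs(s + dt * k3[0], i + dt * k3[1], beta, gamma, n)
        # Weighted RK4 average of the four slope estimates.
        ds, di, dr = (dt * (a + 2 * b + 2 * c + d) / 6
                      for a, b, c, d in zip(k1, k2, k3, k4))
        s, i, r = s + ds, i + di, r + dr
        series.append((step * dt, s, i, r))
    return series


def summarize(series, n):
    """Peak simultaneous sharers, time of the peak, and total ever-infected."""
    peak_t, _, peak_i, _ = max(series, key=lambda row: row[2])
    return {
        "peak_infected": peak_i,
        "peak_day": peak_t,
        "total_infected": n - series[-1][1],  # everyone who left S
    }


series = simulate_sir(n=10000, i0=10, beta=0.3, gamma=0.1, days=200)
print(summarize(series, 10000))
```

RK4 is used here rather than Euler because it tolerates larger steps for the same accuracy. With a small enough `dt`, Euler gives the same qualitative picture.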
Phase 3: Verification
Check: S+I+R = N at all timesteps (conservation). Peak and final sizes plausible for given R0.
Gate: Population conserved, dynamics consistent with R0.
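One way to make "plausible for given R0" concrete is to compare the simulated numbers against the closed-form SIR results for peak and final size. The sketch below assumes the time series is a list of `(t, S, I, R)` tuples and that R starts at 0. All function names are hypothetical.

```python
import math


def check_conservation(series, n, tol=1e-6):
    """True if S + I + R stays within tol * n of n at every timestep."""
    return all(abs(s + i + r - n) <= tol * n for _, s, i, r in series)


def expected_final_fraction(r0, s0_fraction, iterations=200):
    """Fraction ever infected, x = 1 - S_inf/N.

    Solves the SIR final-size relation x = 1 - s0 * exp(-r0 * x)
    by fixed-point iteration.
    """
    x = 1.0
    for _ in range(iterations):
        x = 1.0 - s0_fraction * math.exp(-r0 * x)
    return x


def expected_peak_fraction(r0, s0_fraction, i0_fraction):
    """Analytic peak of I/N; equals the seed fraction when r0 * s0 <= 1."""
    if r0 * s0_fraction <= 1.0:
        return i0_fraction  # no growth phase: I only declines
    return i0_fraction + s0_fraction - (1.0 + math.log(r0 * s0_fraction)) / r0


# Example: N = 10000, I0 = 10, R0 = 3.0
print(round(expected_peak_fraction(3.0, 0.999, 0.001) * 10000))
print(round(expected_final_fraction(3.0, 0.999) * 10000))
```

A simulated peak or total that differs from these values by more than a few percent usually points to a timestep that is too large or a horizon that is too short.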
Phase 4: Output
Return time series of compartments and summary metrics.
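A sketch of how the return value might be assembled, assuming the simulation yields a list of `(t, S, I, R)` tuples. The helper name `build_output` is hypothetical, and the three-row series is made-up toy data rather than a simulation result.

```python
import json


def build_output(series, beta, gamma, n, model="SIR"):
    """Package a (t, S, I, R) series into time_series/summary/metadata."""
    peak_t, _, peak_i, _ = max(series, key=lambda row: row[2])
    return {
        "time_series": [{"t": t, "S": s, "I": i, "R": r} for t, s, i, r in series],
        "summary": {
            "R0": beta / gamma,
            "peak_infected": peak_i,
            "peak_day": peak_t,
            "total_infected": n - series[-1][1],  # everyone who left S
        },
        "metadata": {"model": model, "beta": beta, "gamma": gamma, "population": n},
    }


# Toy data for illustration only.
toy_series = [(0, 90.0, 10.0, 0.0), (1, 70.0, 25.0, 5.0), (2, 60.0, 20.0, 20.0)]
print(json.dumps(build_output(toy_series, beta=0.5, gamma=0.25, n=100), indent=2))
```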
Output Format
```json
{
  "time_series": [{"t": 0, "S": 9900, "I": 100, "R": 0}],
  "summary": {"R0": 2.5, "peak_infected": 3200, "peak_day": 12, "total_infected": 8500},
  "metadata": {"model": "SIR", "beta": 0.5, "gamma": 0.2, "population": 10000}
}
```
Examples
Sample I/O
Input: N=10000, I₀=10, β=0.3, γ=0.1 (R0=3.0)
Expected: Exponential growth, peak ~3000 simultaneous sharers at roughly day 35-40, total infected ~9400
Edge Cases
| Input | Expected | Why |
|---|---|---|
| R0 = 0.8 | Rapid decay | Below threshold, dies out |
| I₀ = 1 | Slower start but same eventual dynamics | Single seed takes longer to ignite |
| β = γ (R0=1) | No growth; sharers decline slowly from I₀ | Critical threshold: dI/dt = γI(S/N − 1) ≤ 0, so no outbreak takes off |
Gotchas
- Homogeneous mixing assumption: SIR assumes everyone interacts equally. Real networks have hubs, clusters, and weak ties. Use network-based models for realistic spread.
- Parameter estimation: β and γ are hard to estimate for social content. Use early spread data to fit parameters, then project.
- Content ≠ disease: Unlike diseases, content sharing is voluntary and influenced by content quality, platform algorithms, and trends. Models give rough dynamics, not precise predictions.
- Platform algorithms: Social media algorithms amplify or suppress content. The "transmission rate" is partly determined by the platform, not just user behavior.
- Temporal dynamics: Content virality often has a much shorter lifecycle than disease (hours-days vs weeks-months). Adjust timescales accordingly.
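For the parameter-estimation point, one common shortcut is to fit only the early exponential phase. While S ≈ N, the number of active sharers grows roughly as I(t) ≈ I₀·e^((β − γ)t), so the slope of log I against time estimates β − γ. The sketch below assumes γ is known or assumed separately. The function names are hypothetical and the counts are synthetic.

```python
import math


def fit_growth_rate(times, sharers):
    """Least-squares slope of log(sharers) against time (estimates beta - gamma)."""
    logs = [math.log(count) for count in sharers]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    var = sum((t - mean_t) ** 2 for t in times)
    return cov / var


def estimate_beta(times, sharers, gamma):
    """beta = early growth rate + assumed gamma."""
    return fit_growth_rate(times, sharers) + gamma


# Synthetic early data generated with beta - gamma = 0.2 per day.
days = [0, 1, 2, 3, 4]
active = [10 * math.exp(0.2 * d) for d in days]
beta_hat = estimate_beta(days, active, gamma=0.1)
print(f"beta ~ {beta_hat:.3f}, R0 ~ {beta_hat / 0.1:.2f}")
```

The fit applies to counts of currently active sharers, not cumulative shares, and it degrades once a noticeable fraction of the audience has already seen the content.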
References
- For network-based epidemic models, see references/network-sir.md
- For parameter estimation from early data, see references/parameter-fitting.md