data-cohort-analysis
Cohort Analysis
Framework
IRON LAW: Aggregate Metrics Hide Cohort Differences
A 70% monthly retention rate OVERALL can mask that the January cohort retains
at 85% while the June cohort retains at 50%. Aggregate metrics blend improving
and deteriorating cohorts together, hiding both problems and progress.
ALWAYS analyze by cohort before drawing conclusions.
Core Concepts
Cohort: A group of users who share a common characteristic in a specific time period. Most common: acquisition cohort (grouped by signup month).
Retention Matrix: Rows = cohorts (by signup month), Columns = time periods after signup (Month 0, 1, 2...). Cells = % of cohort still active.
| Cohort | Month 0 | Month 1 | Month 2 | Month 3 |
|---|---|---|---|---|
| Jan cohort | 100% | 65% | 48% | 40% |
| Feb cohort | 100% | 60% | 42% | 35% |
| Mar cohort | 100% | 70% | 55% | 48% ← Improvement! |
Retention Types
| Type | Definition | Use Case |
|---|---|---|
| N-day | % active on exactly day N | Games, daily-use apps |
| N-day bounded | % active within first N days | General product usage |
| Week/Month | % active in week/month N | SaaS, subscriptions |
| Unbounded | % who return on day N or any later day | Low-frequency products |
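
To make the four definitions in the table concrete, the sketch below applies each one to the same small set of users. The input shape (a set of day offsets since signup per user), the sample data, and all function names are illustrative assumptions, not part of any library. Two boundary choices are also assumptions: the bounded variant ignores day 0, and the unbounded variant counts day N itself.

```python
# Hypothetical input: for each user, the day offsets (0 = signup day) on which
# they performed the action that counts as "active".
ACTIVITY = {
    "a": {0, 1, 7},
    "b": {0, 3},
    "c": {0},
    "d": {0, 10},
    "e": {0, 2, 20},
}


def n_day(days, n):
    """N-day: active on exactly day n."""
    return n in days


def n_day_bounded(days, n):
    """N-day bounded: active at least once on days 1..n.

    Assumption: day 0 is excluded, since every user is active on signup day.
    """
    return any(1 <= d <= n for d in days)


def in_period(days, k, period_len=7):
    """Week/Month: active at least once in period k (k = 0 is the signup period)."""
    start = k * period_len
    return any(start <= d < start + period_len for d in days)


def unbounded(days, n):
    """Unbounded: active on day n or any later day (inclusive reading, an assumption)."""
    return any(d >= n for d in days)


def rate(users, predicate):
    """Share of users for whom the predicate holds."""
    return sum(1 for days in users.values() if predicate(days)) / len(users)


day7 = rate(ACTIVITY, lambda days: n_day(days, 7))
bounded7 = rate(ACTIVITY, lambda days: n_day_bounded(days, 7))
week1 = rate(ACTIVITY, lambda days: in_period(days, 1))
unbounded7 = rate(ACTIVITY, lambda days: unbounded(days, 7))

print(f"N-day (7): {day7:.0%}")
print(f"N-day bounded (7): {bounded7:.0%}")
print(f"Week 1: {week1:.0%}")
print(f"Unbounded (7): {unbounded7:.0%}")
```

The same five users produce 20%, 60%, 40%, and 60% depending on the definition, which is why the retention type should be stated next to every retention number.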
Analysis Steps
Phase 1: Define Cohort and Activity
- Cohort definition: signup date, first purchase date, or other milestone
- Activity definition: login, purchase, specific action — must match the product's core value
- Time granularity: daily (for daily-use products), weekly, or monthly
Phase 2: Build Retention Matrix
- Group users into cohorts
- For each cohort, calculate retention at each time period
- Visualize as a heatmap (darker = higher retention)
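
Phases 1 and 2 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: cohorts are signup months, the hypothetical `signups` and `events` inputs hold one date per signup and one `(user, date)` pair per qualifying action, and signup itself counts as Month 0 activity so that M0 is always 100%. The function and variable names are made up for this example.

```python
from collections import defaultdict
from datetime import date


def month_index(d):
    """Map a date to a running month number so month offsets are a subtraction."""
    return d.year * 12 + d.month


def retention_matrix(signups, events):
    """Return {(year, month): [retention at M0, M1, ...]} as fractions of the cohort."""
    cohort_users = defaultdict(set)
    active = defaultdict(set)  # (cohort, month offset) -> distinct active users

    for user, signup in signups.items():
        cohort = (signup.year, signup.month)
        cohort_users[cohort].add(user)
        active[(cohort, 0)].add(user)  # assumption: signup counts as M0 activity

    for user, day in events:
        signup = signups.get(user)
        if signup is None:
            continue  # event from a user with no known signup
        offset = month_index(day) - month_index(signup)
        if offset >= 0:
            active[((signup.year, signup.month), offset)].add(user)

    matrix = {}
    for cohort in sorted(cohort_users):
        size = len(cohort_users[cohort])
        last = max(o for (c, o) in active if c == cohort)
        matrix[cohort] = [len(active.get((cohort, o), ())) / size for o in range(last + 1)]
    return matrix


# Hypothetical sample data
signups = {
    "u1": date(2024, 1, 5),
    "u2": date(2024, 1, 20),
    "u3": date(2024, 2, 3),
    "u4": date(2024, 2, 10),
}
events = [
    ("u1", date(2024, 2, 1)),
    ("u1", date(2024, 3, 15)),
    ("u2", date(2024, 2, 25)),
    ("u3", date(2024, 3, 9)),
]

matrix = retention_matrix(signups, events)
for (year, month), row in matrix.items():
    print(f"{year}-{month:02d}", " ".join(f"{r:.0%}" for r in row))
# 2024-01 100% 100% 50%
# 2024-02 100% 50%
```

Newer cohorts have shorter rows because fewer periods have elapsed for them; a heatmap of this structure is therefore triangular.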
Phase 3: Identify Patterns
- Retention curve shape: Does it flatten (good — stable core users) or keep declining (bad — everyone eventually churns)?
- Cohort comparison: Are newer cohorts retaining better or worse than older ones?
- Drop-off cliff: Is there a specific period where retention drops sharply? (e.g., Day 1 → Day 7 drops 50%)
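
The three pattern checks can be expressed as small helpers. The sketch below reuses the Jan/Feb/Mar numbers from the Core Concepts example; the function names and the 5-point flattening threshold are assumptions chosen for illustration, not established cutoffs.

```python
# Retention in percent, columns = Month 0..3, rows ordered oldest to newest.
MATRIX = {
    "Jan": [100, 65, 48, 40],
    "Feb": [100, 60, 42, 35],
    "Mar": [100, 70, 55, 48],
}


def steepest_drop(curve):
    """Drop-off cliff: (period, loss in points) for the largest period-over-period loss."""
    drops = [(i + 1, curve[i] - curve[i + 1]) for i in range(len(curve) - 1)]
    return max(drops, key=lambda item: item[1])


def is_flattening(curve, tolerance=5):
    """Curve shape: True if the most recent loss is below `tolerance` points.

    The threshold is an arbitrary assumption; pick one that suits the product.
    """
    return len(curve) >= 2 and (curve[-2] - curve[-1]) < tolerance


def cohort_delta(matrix, period):
    """Cohort comparison: newest minus oldest cohort at `period`, in percentage points."""
    cohorts = [row for row in matrix.values() if len(row) > period]
    return cohorts[-1][period] - cohorts[0][period]


for name, curve in MATRIX.items():
    period, loss = steepest_drop(curve)
    print(f"{name}: steepest drop going into M{period} ({loss} pts), "
          f"flattening={is_flattening(curve)}")

print(f"M3 delta, newest vs oldest: {cohort_delta(MATRIX, 3):+d} pp")
```

In this sample, every cohort loses the most between Month 0 and Month 1, none has flattened by Month 3 under the 5-point threshold, and the newest cohort is 8 points ahead of the oldest at Month 3.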
Phase 4: Connect to Actions
- What changed for the improving/deteriorating cohorts? (product update, marketing channel shift, onboarding change)
- Can you isolate the cause through A/B test or event analysis?
Phase 5: LTV Projection
- Use cohort retention curves to project future revenue per cohort
- LTV = Σ (retention_month_n × ARPU_month_n) for all future months
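
The LTV sum can be sketched as below. Two things here are assumptions rather than part of the formula: the flat $10 ARPU is a made-up figure, and `extend_curve` projects unobserved months by repeating the last observed month-over-month ratio (geometric decay). A curve that flattens would call for a different extrapolation.

```python
def ltv(retention, arpu):
    """LTV per acquired user: sum of retention_n * ARPU_n over the supplied months."""
    if len(retention) != len(arpu):
        raise ValueError("retention and arpu must cover the same months")
    return sum(r * a for r, a in zip(retention, arpu))


def extend_curve(retention, horizon):
    """Project a retention curve out to `horizon` months.

    Assumption: future months keep decaying at the last observed
    month-over-month ratio. This is one simple choice among many.
    """
    if len(retention) < 2 or retention[-2] <= 0:
        raise ValueError("need at least two observed months with non-zero retention")
    curve = list(retention)
    ratio = curve[-1] / curve[-2]
    while len(curve) < horizon:
        curve.append(curve[-1] * ratio)
    return curve


# Jan cohort from the Core Concepts example, as fractions (Month 0..3).
jan = [1.00, 0.65, 0.48, 0.40]
ARPU = 10.0  # hypothetical flat monthly revenue per active user

observed = ltv(jan, [ARPU] * len(jan))
projected_curve = extend_curve(jan, horizon=6)
projected = ltv(projected_curve, [ARPU] * len(projected_curve))

print(f"Observed 4-month LTV:  ${observed:.2f}")
print(f"Projected 6-month LTV: ${projected:.2f}")
```

Keeping the observed and projected figures separate makes it clear how much of the LTV rests on the extrapolation assumption.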
Output Format
Cohort Analysis: {Product}
Cohort Definition
- Cohort: {signup month / first purchase}
- Activity: {what counts as "active"}
- Period: {daily / weekly / monthly}
Retention Matrix
| Cohort | M0 | M1 | M2 | M3 | M4 | M5 | M6 |
|---|---|---|---|---|---|---|---|
| {month} | 100% | {%} | {%} | {%} | {%} | {%} | {%} |
Key Findings
- {retention curve shape}
- {cohort trend: improving or deteriorating}
- {critical drop-off point}
Cohort Comparison
| Metric | Oldest Cohort | Newest Cohort | Delta |
|---|---|---|---|
| M1 retention | {%} | {%} | {±pp} |
| M3 retention | {%} | {%} | {±pp} |
| Projected LTV | ${X} | ${X} | {%} |
Recommendations
- {action to improve retention at critical drop-off point}
Gotchas
- Define "active" carefully: Login ≠ value delivery. A user who logs in but doesn't complete the core action (purchase, send message, create document) shouldn't count as "retained."
- Cohort size matters: A cohort of 10 users with 50% retention is meaningless (5 users). Ensure cohorts have statistically meaningful sizes.
- Survivorship bias in aggregates: "Average retention is improving" may just mean that new users, who always sit at M0 = 100%, make up a larger share of the user base and inflate the blended average.
- Seasonal cohorts behave differently: December cohorts (holiday shoppers) often retain worse than March cohorts (organic discovery). Compare same-season cohorts YoY.
- Retention ≠ engagement depth: A user who returns once a month for a 5-hour session and a user who returns daily for 30 seconds count the same under monthly retention, yet their engagement is very different. Layer in activity depth metrics.
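
The cohort-size gotcha can be quantified with a confidence interval around the observed retention rate. The sketch below uses the Wilson score interval at roughly 95% confidence (z = 1.96); choosing Wilson over other interval methods is an assumption made for this example, and the function name is illustrative.

```python
import math


def wilson_interval(retained, cohort_size, z=1.96):
    """Wilson score interval for a retention proportion (z = 1.96 is about 95%)."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    p = retained / cohort_size
    denom = 1 + z ** 2 / cohort_size
    center = (p + z ** 2 / (2 * cohort_size)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / cohort_size + z ** 2 / (4 * cohort_size ** 2)
    ) / denom
    return center - half_width, center + half_width


small = wilson_interval(5, 10)      # 50% retention in a cohort of 10
large = wilson_interval(500, 1000)  # 50% retention in a cohort of 1,000

print(f"n=10:   {small[0]:.1%} to {small[1]:.1%}")
print(f"n=1000: {large[0]:.1%} to {large[1]:.1%}")
```

With 10 users, an observed 50% is compatible with true retention anywhere from roughly 24% to 76%; with 1,000 users the range narrows to roughly 47% to 53%.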
References
- For SQL retention query templates, see references/retention-sql.md
- For LTV projection from cohort data, see references/cohort-ltv.md