ads-test
A/B Test Design & Experiment Planning
<!-- Created: 2026-04-13 | v1.5 -->
<!-- Source: OpenClaudia/openclaudia-skills (ab-test-setup concept) -->
Process
- Understand what the user wants to test (creative, audience, bidding, landing page)
- Build structured hypothesis using the framework below
- Calculate required sample size and estimated duration
- Recommend platform-specific test setup
- Define success criteria and measurement plan
Hypothesis Framework
Every test must start with a structured hypothesis:
IF we [change/action]
THEN [metric] will [increase/decrease] by [estimated %]
BECAUSE [reasoning based on data or insight]
Example:
IF we replace polished product shots with UGC creator videos
THEN Meta CTR will increase by 25-40%
BECAUSE Andromeda prioritizes diverse creative formats and UGC consistently outperforms polished in 2025-2026 benchmarks
Hypothesis Quality Checklist
- Single variable being tested (isolate the change)
- Specific metric defined (not "performance")
- Estimated effect size stated (needed for sample size calculation)
- Timeframe defined
- Success/failure criteria clear before launch
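The framework and checklist above can be captured in a small data structure. This is a minimal sketch, not part of the original skill: the `Hypothesis` class, its field names, and the `timeframe_days` and `success_criteria` values in the usage example are illustrative assumptions. The "single variable" item is not checked, since it needs human judgment.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One IF / THEN / BECAUSE hypothesis (hypothetical field names)."""
    change: str            # the single variable being changed
    metric: str            # specific metric, e.g. "Meta CTR"
    direction: str         # "increase" or "decrease"
    effect_pct: str        # estimated effect size, e.g. "25-40%"
    reasoning: str         # data or insight behind the expectation
    timeframe_days: int    # planned test length in days
    success_criteria: str  # decided before launch

    def checklist_issues(self) -> list[str]:
        """Return checklist items that look unmet (simple heuristics only)."""
        issues = []
        if self.metric.strip().lower() in {"", "performance", "results"}:
            issues.append("Specific metric not defined")
        if self.direction not in {"increase", "decrease"}:
            issues.append("Direction must be 'increase' or 'decrease'")
        if not self.effect_pct.strip():
            issues.append("Estimated effect size missing")
        if self.timeframe_days <= 0:
            issues.append("Timeframe not defined")
        if not self.success_criteria.strip():
            issues.append("Success/failure criteria missing")
        return issues

    def render(self) -> str:
        """Format the hypothesis in the IF / THEN / BECAUSE layout."""
        return (
            f"IF we {self.change}\n"
            f"THEN {self.metric} will {self.direction} by {self.effect_pct}\n"
            f"BECAUSE {self.reasoning}"
        )


# Usage with the example from the text (timeframe and criteria are made up).
h = Hypothesis(
    change="replace polished product shots with UGC creator videos",
    metric="Meta CTR",
    direction="increase",
    effect_pct="25-40%",
    reasoning="Andromeda prioritizes diverse creative formats",
    timeframe_days=14,
    success_criteria="CTR lift of 25%+ at 95% confidence",
)
print(h.render())
print(h.checklist_issues())
```

An empty list from `checklist_issues()` only means the mechanical checks passed; whether the change truly isolates one variable still has to be reviewed by hand.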
Statistical Significance Calculator
Required Sample Size (per variant):
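n = (Z_alpha + Z_beta)^2 × 2 × p × (1-p) / (p × MDE)^2
Where:
- Z_alpha = 1.96 (for 95% confidence, two-sided)
- Z_beta = 0.84 (for 80% power)
- p = baseline conversion rate (as a fraction, e.g. 0.02 for 2%)
- MDE = minimum detectable effect, relative to baseline (e.g. 0.20 for a 20% lift), so p × MDE is the absolute difference to detect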
Simplified lookup:
| Baseline CVR | 5% MDE | 10% MDE | 20% MDE | 30% MDE |
|---|---|---|---|---|
| 1% | 612,000 | 153,000 | 38,300 | 17,000 |
| 2% | 302,400 | 75,600 | 18,900 | 8,400 |
| 5% | 116,800 | 29,200 | 7,300 | 3,200 |
| 10% | 55,200 | 13,800 | 3,450 | 1,530 |
| 20% | 24,600 | 6,150 | 1,540 | 680 |
Per variant, 95% confidence, 80% power
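The sample size calculation can be sketched in Python as follows. The function name `sample_size_per_variant` is hypothetical. It assumes MDE is entered as a relative lift and converted to an absolute difference (p × MDE) before squaring, which is the reading that lands close to the lookup table. The table values appear to be rounded, so expect small differences.

```python
import math

Z_ALPHA = 1.96  # 95% confidence (two-sided), as in the text
Z_BETA = 0.84   # 80% power, as in the text


def sample_size_per_variant(baseline_cvr: float, relative_mde: float) -> int:
    """Approximate visitors needed per variant.

    baseline_cvr: baseline conversion rate as a fraction (0.02 for 2%).
    relative_mde: relative lift to detect as a fraction (0.20 for 20%).
    """
    if not 0 < baseline_cvr < 1:
        raise ValueError("baseline_cvr must be between 0 and 1")
    if relative_mde <= 0:
        raise ValueError("relative_mde must be positive")
    # Assumption: relative MDE is converted to an absolute difference.
    absolute_delta = baseline_cvr * relative_mde
    variance_term = 2 * baseline_cvr * (1 - baseline_cvr)
    n = (Z_ALPHA + Z_BETA) ** 2 * variance_term / absolute_delta ** 2
    return math.ceil(n)


# 2% baseline CVR, 20% relative MDE
print(sample_size_per_variant(0.02, 0.20))
```

For a 2% baseline and a 20% relative MDE this gives roughly 19,200 per variant, in line with the 18,900 shown in the lookup table.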
Test Duration Estimator
Duration = Required Sample Size / Daily Traffic per Variant
Minimum duration: 7 days (capture weekly patterns)
Maximum recommended: 28 days (avoid seasonal drift)
Learning phase: Google 7-14 days, Meta 3-7 days, LinkedIn 7-14 days
Inputs needed:
- Daily impressions or clicks
- Number of variants (2 = A/B, 3+ = A/B/n, sometimes loosely called multivariate; each extra variant splits daily traffic further)
- Baseline conversion rate
- Minimum detectable effect desired
Duration Quick Estimates
| Daily Clicks (per variant) | 2% CVR, 20% MDE | 5% CVR, 20% MDE | 10% CVR, 20% MDE |
|---|---|---|---|
| 100 | 189 days | 73 days | 35 days |
| 500 | 38 days | 15 days | 7 days |
| 1,000 | 19 days | 8 days | 4 days* |
| 5,000 | 4 days* | 2 days* | 1 day* |
*Minimum 7 days recommended regardless of sample sufficiency
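The duration rule can be expressed as a short helper. This is a sketch under the assumptions stated in the text: traffic is counted per variant, 7 days is the floor, and 28 days is the recommended ceiling. The function name and the returned keys are hypothetical.

```python
import math


def estimate_duration_days(
    sample_per_variant: int,
    daily_traffic_per_variant: float,
    min_days: int = 7,
    max_days: int = 28,
) -> dict:
    """Estimate test length from required sample and daily traffic."""
    if daily_traffic_per_variant <= 0:
        raise ValueError("daily_traffic_per_variant must be positive")
    # Round up: a partial day still has to be run in full.
    raw_days = math.ceil(sample_per_variant / daily_traffic_per_variant)
    return {
        "raw_days": raw_days,
        # Never run shorter than the weekly-pattern minimum.
        "recommended_days": max(raw_days, min_days),
        # Flag tests that would run past the recommended maximum.
        "exceeds_max": raw_days > max_days,
    }


# 2% CVR, 20% MDE (18,900 per variant) at 100 clicks/day per variant
print(estimate_duration_days(18900, 100))
# 10% CVR, 20% MDE (3,450 per variant) at 1,000 clicks/day per variant
print(estimate_duration_days(3450, 1000))
```

When `exceeds_max` is true, the usual options are to accept a larger MDE, test a higher-traffic placement, or choose a metric with a higher baseline rate.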
Platform-Specific Test Setup
Meta Experiments
- Use Ads Manager > Experiments tab (not manual ad set duplication)
- Automatic audience splitting ensures no overlap
- Supported test types: A/B (creative, audience, placement), Holdout, Brand Survey
- Meta's Incremental Attribution (April 2025) provides AI-powered holdout testing for measuring real causal impact
- Budget: split evenly across variants; minimum $100/day per variant recommended
- Duration: 7-14 days typical; Meta auto-determines winner at 95% confidence
Google Experiments
- Campaign Experiments (custom experiments) or Ad Variations
- Create experiment from existing campaign > select experiment type
- Traffic split: 50/50 recommended for fastest results
- Supported: bidding strategy, ad copy, landing page, audience
- Metrics: choose primary metric (conversions, CPA, ROAS) before launch
- Duration: 14-30 days recommended; minimum 2 weeks for bidding tests
LinkedIn A/B Testing
- Built into Campaign Manager for Sponsored Content
- Duplicate ad set with single variable change
- Target: same audience segment with automatic rotation
- Minimum budget: $50/day per variant
- Key metrics: CTR (>0.44% benchmark), CPL, Lead Form CVR (13% benchmark)
- Duration: 14-21 days (LinkedIn's smaller daily volumes require longer tests)
TikTok Split Testing
- Available in TikTok Ads Manager > Create A/B Test
- Test types: targeting, bidding, creative
- Auto-splits audience to avoid contamination
- Minimum 7 days, recommended 14 days
- Budget: minimum $20/day per ad group
- Creative tests: isolate hook (first 2-3 seconds) as the primary variable
- TikTok's enhanced split testing supports modular test variables (targeting, creative, budget, placement) via Smart+ since 2025
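The budget and duration figures from the four platform sections can be combined into a rough minimum-spend estimate. This is only a sketch: `PLATFORM_GUIDE` and `minimum_test_budget` are hypothetical names, the values are copied from the notes above, Google has no stated minimum budget so it returns `None`, and each TikTok variant is assumed to run in its own ad group.

```python
from typing import Optional, Tuple

# Minimum daily budget (USD) and typical duration range (days) per platform,
# copied from the platform notes. Google has no stated minimum budget.
PLATFORM_GUIDE = {
    "meta": {"min_daily_budget": 100, "days": (7, 14)},
    "google": {"min_daily_budget": None, "days": (14, 30)},
    "linkedin": {"min_daily_budget": 50, "days": (14, 21)},
    "tiktok": {"min_daily_budget": 20, "days": (7, 14)},
}


def minimum_test_budget(
    platform: str, variants: int = 2
) -> Optional[Tuple[int, int]]:
    """Return (low, high) total spend in USD, or None if no minimum is stated."""
    guide = PLATFORM_GUIDE[platform.lower()]
    if guide["min_daily_budget"] is None:
        return None
    shortest, longest = guide["days"]
    daily_total = guide["min_daily_budget"] * variants
    return daily_total * shortest, daily_total * longest


print(minimum_test_budget("meta"))         # two variants on Meta
print(minimum_test_budget("linkedin", 3))  # three variants on LinkedIn
```

These are floors implied by the stated minimums, not a forecast; the required sample size usually drives the real budget.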
What to Test (Priority Order)
High Impact (test first)
- Creative concept (different messaging angles, not just color changes)
- Hook/first 3 seconds (video opening on Meta, TikTok, YouTube)
- Offer structure (pricing, discount type, free trial length)
- Landing page (headline, CTA, form length)
- Bidding strategy (tCPA vs tROAS vs Maximize Conversions)
Medium Impact
- Audience targeting (interest vs lookalike vs broad)
- Ad format (static vs video vs carousel)
- CTA button (Learn More vs Sign Up vs Shop Now)
- Campaign structure (CBO vs ABO, consolidated vs segmented)
Low Impact (test last)
- Ad scheduling (time of day, day of week)
- Device targeting (mobile vs desktop)
- Minor copy variations (word substitutions without concept change)
Common Testing Mistakes to Avoid
- Testing too many variables at once (no clear winner attribution)
- Ending tests too early (before statistical significance)
- Testing during atypical periods (holidays, launches, incidents)
- Comparing unequal time periods
- Not documenting learnings (build institutional knowledge)
- Testing small changes when big changes are needed (optimize vs innovate)
- Ignoring learning phase on automated platforms
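To guard against ending a test before it reaches significance, results can be checked with a two-sided two-proportion z-test. This is one common method, offered here as an assumption; the source does not specify a test, and each ad platform applies its own methodology. The function name is hypothetical and the counts in the usage example are invented.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) comparing variant B against control A."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0  # no conversions (or all conversions) in both arms
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Invented example: 200/10,000 conversions (control) vs 260/10,000 (variant)
z, p = two_proportion_z_test(200, 10_000, 260, 10_000)
print(f"z = {z:.2f}, p = {p:.4f}, significant at 95%: {p < 0.05}")
```

Checking the p-value repeatedly while the test is running inflates false positives, so the check belongs at the planned end date, once the required sample size has been reached.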
Output Format
A/B Test Plan
Hypothesis
IF [change]
THEN [metric] will [direction] by [amount]
BECAUSE [reasoning]
Test Design
| Parameter | Value |
|---|---|
| Platform | [platform] |
| Test Type | [A/B / Multivariate] |
| Variable | [what's being changed] |
| Control | [current state] |
| Variant | [proposed change] |
| Primary Metric | [KPI] |
| Traffic Split | [50/50 / other] |
Sample Size & Duration
| Metric | Value |
|---|---|
| Baseline CVR | [X%] |
| MDE | [X%] |
| Required Sample | [N per variant] |
| Daily Traffic | [N clicks/day] |
| Est. Duration | [X days] |
| Min Duration | 7 days |
Success Criteria
- Winner declared at 95% confidence
- [Primary metric] improvement of [X%]+ sustained over [Y] days
- No negative impact on [secondary metric]
Setup Instructions
[Platform-specific step-by-step]