ab-test-designer

A/B Test Designer

When to Use

Testing a new feature or design variation
Validating a hypothesis before full rollout
Optimizing conversion rates or key metrics
Choosing between multiple design approaches
Making a data-driven decision about a change

What This Skill Does

Helps you design rigorous A/B tests with clear hypotheses, success metrics, sample size calculations, and analysis plans.

Instructions

Help me design an A/B test for [feature/change]. Include:

Hypothesis
- Current situation and metrics
- Proposed change
- Expected impact and why
Test Design
- Primary success metric
- Secondary metrics
- Sample size needed
- Test duration
- User segments to include/exclude
Variants
- Control (A): current experience
- Variant (B): new experience
- Any additional variants (C, D, etc.)
Risks and Controls
- Potential negative impacts
- Guardrail metrics
- When to stop the test early
Analysis Plan
- Statistical significance threshold
- How to handle edge cases
- Decision criteria

Feature context: [Add context about the change you want to test]
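
The "Analysis Plan" items in the template above (significance threshold, decision criteria) can be made concrete with a small script. The following is a minimal sketch, assuming the primary metric is a conversion rate and that a pooled two-proportion z-test is an acceptable analysis. The function names `two_proportion_z_test` and `decide`, the 0.05 default threshold, and the decision labels are illustrative assumptions, not part of the skill itself.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided pooled z-test comparing conversion rates of A and B.

    conv_a / conv_b: number of converting users in each variant.
    n_a / n_b: number of users exposed to each variant.
    Returns (z, p_value); z > 0 means B converted better than A.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; rates are all 0 or all 1")
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def decide(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Apply a decision rule that was fixed before the test started.

    The labels below are hypothetical; real criteria should also
    check guardrail metrics before shipping.
    """
    z, p_value = two_proportion_z_test(conv_a, n_a, conv_b, n_b)
    if p_value < alpha:
        return "ship B" if z > 0 else "keep A"
    return "inconclusive"


if __name__ == "__main__":
    # Hypothetical counts: 400/4000 conversions for A, 480/4000 for B.
    print(two_proportion_z_test(400, 4000, 480, 4000))
    print(decide(400, 4000, 480, 4000))
```

Writing the rule as a function before the test runs is one way to keep the decision criteria from drifting once results start coming in.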

Best Practices

Start with a clear, falsifiable hypothesis
Choose one primary metric to avoid multiple comparison issues
Calculate sample size upfront from the baseline rate, the smallest effect you want to detect, the significance level, and the desired power
Run tests for full weekly cycles to account for day-of-week effects
Set a minimum test duration (usually 1-2 weeks)
Define success criteria before running the test
Monitor guardrail metrics (revenue, errors, performance)
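
The sample-size and duration practices above can be sketched in a few lines. This is a rough illustration, assuming a conversion-rate metric, a two-sided two-proportion z-test, and the common normal-approximation formula for sample size. The function names, the defaults (alpha 0.05, power 0.80), and the traffic figures in the usage example are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, min_detectable_lift,
                            alpha=0.05, power=0.80):
    """Approximate users needed per variant.

    baseline_rate: current conversion rate, e.g. 0.10.
    min_detectable_lift: absolute lift to detect, e.g. 0.02 (10% -> 12%).
    """
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_lift
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("rates must lie in (0, 1) and differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided threshold
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def duration_days(n_per_variant, daily_eligible_users,
                  n_variants=2, min_weeks=1):
    """Days to run, rounded up to whole weeks to cover day-of-week effects."""
    total_needed = n_per_variant * n_variants
    days = math.ceil(total_needed / daily_eligible_users)
    weeks = max(min_weeks, math.ceil(days / 7))
    return weeks * 7


if __name__ == "__main__":
    # Hypothetical inputs: 10% baseline, detect a 2-point absolute lift,
    # 1,000 eligible users per day split across two variants.
    n = sample_size_per_variant(0.10, 0.02)
    print(n, duration_days(n, daily_eligible_users=1000))
```

Rounding the duration up to whole weeks means the test may collect more users than strictly required, which trades a slightly longer run for balanced weekday and weekend coverage.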

Example

Input: Testing new onboarding flow vs current 3-step process
Output: Hypothesis (new 1-step flow will increase co...