Data-Driven Feature Development
Build features guided by data insights, A/B testing, and continuous measurement using specialized agents for analysis, implementation, and experimentation.
[Extended thinking: This workflow orchestrates a comprehensive data-driven development process from initial data analysis and hypothesis formulation through feature implementation with integrated analytics, A/B testing infrastructure, and post-launch analysis. Each phase leverages specialized agents to ensure features are built based on data insights, properly instrumented for measurement, and validated through controlled experiments. The workflow emphasizes modern product analytics practices, statistical rigor in testing, and continuous learning from user behavior.]
Use this skill when
- Working on data-driven feature development tasks or workflows
- Needing guidance, best practices, or checklists for data-driven feature development
Do not use this skill when
- The task is unrelated to data-driven feature development
- You need a different domain or tool outside this scope
Instructions
- Clarify goals, constraints, and required inputs.
- Apply relevant best practices and validate outcomes.
- Provide actionable steps and verification.
- If detailed examples are required, open resources/implementation-playbook.md.
Phase 1: Data Analysis and Hypothesis Formation
1. Exploratory Data Analysis
- Use Task tool with subagent_type="machine-learning-ops::data-scientist"
- Prompt: "Perform exploratory data analysis for feature: $ARGUMENTS. Analyze existing user behavior data, identify patterns and opportunities, segment users by behavior, and calculate baseline metrics. Use modern analytics tools (Amplitude, Mixpanel, Segment) to understand current user journeys, conversion funnels, and engagement patterns."
- Output: EDA report with visualizations, user segments, behavioral patterns, baseline metrics
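The funnel portion of the EDA can be sketched in pandas. This is a minimal illustration, not the data scientist's required method; the `user_id`/`event_name` columns and the event names in the usage note are hypothetical, and for simplicity it ignores event ordering within a session.

```python
# Funnel conversion sketch: count users reaching each step, assuming an
# events DataFrame with hypothetical columns user_id and event_name.
import pandas as pd

def funnel_conversion(events: pd.DataFrame, steps: list) -> pd.DataFrame:
    """Users reaching each funnel step and step-over-step conversion rate."""
    users_per_step = []
    # Users who fired the entry event are the eligible population.
    eligible = set(events.loc[events["event_name"] == steps[0], "user_id"])
    users_per_step.append(len(eligible))
    for step in steps[1:]:
        step_users = set(events.loc[events["event_name"] == step, "user_id"])
        eligible &= step_users  # keep only users who also completed this step
        users_per_step.append(len(eligible))
    result = pd.DataFrame({"step": steps, "users": users_per_step})
    result["conversion"] = result["users"] / result["users"].shift(1)
    return result
```

For example, `funnel_conversion(events, ["product_viewed", "cart_added", "purchase_completed"])` yields the baseline conversion rates that later serve as the control benchmark.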
2. Business Hypothesis Development
- Use Task tool with subagent_type="business-analytics::business-analyst"
- Context: Data scientist's EDA findings and behavioral patterns
- Prompt: "Formulate business hypotheses for feature: $ARGUMENTS based on data analysis. Define clear success metrics, expected impact on key business KPIs, target user segments, and minimum detectable effects. Create measurable hypotheses using frameworks like ICE scoring or RICE prioritization."
- Output: Hypothesis document, success metrics definition, expected ROI calculations
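The RICE scoring mentioned above is simple enough to show directly: score = (Reach × Impact × Confidence) / Effort. The hypothesis names and numbers below are illustrative placeholders, not outputs from the analysis.

```python
# RICE prioritization sketch. Conventional scales: Reach = users per
# quarter, Impact = 0.25-3, Confidence = 0-1, Effort = person-months.
from dataclasses import dataclass

@dataclass
class Hypothesis:
    name: str
    reach: float
    impact: float
    confidence: float
    effort: float

    @property
    def rice(self) -> float:
        return (self.reach * self.impact * self.confidence) / self.effort

hypotheses = [
    Hypothesis("one-click reorder", reach=8000, impact=2.0, confidence=0.8, effort=4),
    Hypothesis("saved filters", reach=3000, impact=1.0, confidence=0.9, effort=1),
]
ranked = sorted(hypotheses, key=lambda h: h.rice, reverse=True)
for h in ranked:
    print(f"{h.name}: RICE = {h.rice:.0f}")
```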
3. Statistical Experiment Design
- Use Task tool with subagent_type="machine-learning-ops::data-scientist"
- Context: Business hypotheses and success metrics
- Prompt: "Design statistical experiment for feature: $ARGUMENTS. Calculate required sample size for statistical power, define control and treatment groups, specify randomization strategy, and plan for multiple testing corrections. Consider Bayesian A/B testing approaches for faster decision making. Design for both primary and guardrail metrics."
- Output: Experiment design document, power analysis, statistical test plan
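The power analysis for a binary metric has a standard closed form, sketched below with only the Python standard library (libraries like statsmodels offer equivalent solvers). The baseline rate and minimum detectable effect in the example are placeholders.

```python
# Closed-form sample size per arm for a two-proportion z-test:
# n = (z_{1-a/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p2 - p1)^2
import math
from statistics import NormalDist

def sample_size_per_arm(p_baseline: float, mde_abs: float,
                        alpha: float = 0.05, power: float = 0.8) -> int:
    """Users needed per group to detect an absolute lift of mde_abs."""
    p_treatment = p_baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p_baseline * (1 - p_baseline) + p_treatment * (1 - p_treatment)
    n = (z_alpha + z_power) ** 2 * variance / mde_abs ** 2
    return math.ceil(n)

# e.g. 5% baseline conversion, detect a +1-point absolute lift
print(sample_size_per_arm(0.05, 0.01))
```

This per-arm figure, doubled for control plus treatment, is what determines whether the configured `min_sample_size` and runtime are adequate.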
Phase 2: Feature Architecture and Analytics Design
4. Feature Architecture Planning
- Use Task tool with subagent_type="data-engineering::backend-architect"
- Context: Business requirements and experiment design
- Prompt: "Design feature architecture for: $ARGUMENTS with A/B testing capability. Include feature flag integration (LaunchDarkly, Split.io, or Optimizely), gradual rollout strategy, circuit breakers for safety, and clean separation between control and treatment logic. Ensure architecture supports real-time configuration updates."
- Output: Architecture diagrams, feature flag schema, rollout strategy
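The "clean separation between control and treatment logic" can look like the sketch below. `FlagClient` is a hypothetical wrapper interface, not a real SDK: substitute the variation call from your provider (LaunchDarkly, Split, Optimizely, Unleash).

```python
# Feature-flag gating sketch: one decision point, control path untouched,
# fail-safe default if the flag service is unreachable.
from typing import Protocol

class FlagClient(Protocol):
    def variation(self, flag_key: str, user_id: str, default: str) -> str: ...

def render_checkout(flags: FlagClient, user_id: str) -> str:
    # Defaulting to "control" means an outage degrades to the known-good path.
    variant = flags.variation("checkout-redesign", user_id, default="control")
    if variant == "treatment":
        return one_page_checkout(user_id)
    return legacy_checkout(user_id)

def one_page_checkout(user_id: str) -> str:
    return f"one-page checkout for {user_id}"

def legacy_checkout(user_id: str) -> str:
    return f"legacy checkout for {user_id}"
```

Keeping the flag check at a single decision point (rather than scattering it) is what makes later cleanup and rollback cheap.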
5. Analytics Instrumentation Design
- Use Task tool with subagent_type="data-engineering::data-engineer"
- Context: Feature architecture and success metrics
- Prompt: "Design comprehensive analytics instrumentation for: $ARGUMENTS. Define event schemas for user interactions, specify properties for segmentation and analysis, design funnel tracking and conversion events, plan cohort analysis capabilities. Implement using modern SDKs (Segment, Amplitude, Mixpanel) with proper event taxonomy."
- Output: Event tracking plan, analytics schema, instrumentation guide
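A proper event taxonomy is easier to enforce if the tracking wrapper validates against the declared schema before sending. The event names and properties below are illustrative, and the commented-out `send` stands in for the real SDK call (e.g. Segment's track).

```python
# Schema-validated tracking sketch: reject malformed events at the source
# rather than discovering gaps during experiment analysis.
from datetime import datetime, timezone

EVENT_SCHEMAS = {
    "checkout_started": {"cart_value_usd", "item_count", "variant"},
    "checkout_completed": {"cart_value_usd", "item_count", "variant", "payment_method"},
}

def track(user_id: str, event: str, properties: dict) -> dict:
    required = EVENT_SCHEMAS.get(event)
    if required is None:
        raise ValueError(f"unknown event: {event}")
    missing = required - properties.keys()
    if missing:
        raise ValueError(f"{event} missing properties: {sorted(missing)}")
    payload = {
        "user_id": user_id,
        "event": event,
        "properties": properties,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # send(payload)  # hand off to the analytics SDK (Segment/Amplitude/Mixpanel)
    return payload
```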
6. Data Pipeline Architecture
- Use Task tool with subagent_type="data-engineering::data-engineer"
- Context: Analytics requirements and existing data infrastructure
- Prompt: "Design data pipelines for feature: $ARGUMENTS. Include real-time streaming for live metrics (Kafka, Kinesis), batch processing for detailed analysis, data warehouse integration (Snowflake, BigQuery), and feature store for ML if applicable. Ensure proper data governance and GDPR compliance."
- Output: Pipeline architecture, ETL/ELT specifications, data flow diagrams
Phase 3: Implementation with Instrumentation
7. Backend Implementation
- Use Task tool with subagent_type="backend-development::backend-architect"
- Context: Architecture design and feature requirements
- Prompt: "Implement backend for feature: $ARGUMENTS with full instrumentation. Include feature flag checks at decision points, comprehensive event tracking for all user actions, performance metrics collection, error tracking and monitoring. Implement proper logging for experiment analysis."
- Output: Backend code with analytics, feature flag integration, monitoring setup
8. Frontend Implementation
- Use Task tool with subagent_type="frontend-mobile-development::frontend-developer"
- Context: Backend APIs and analytics requirements
- Prompt: "Build frontend for feature: $ARGUMENTS with analytics tracking. Implement event tracking for all user interactions, session recording integration if applicable, performance metrics (Core Web Vitals), and proper error boundaries. Ensure consistent experience between control and treatment groups."
- Output: Frontend code with analytics, A/B test variants, performance monitoring
9. ML Model Integration (if applicable)
- Use Task tool with subagent_type="machine-learning-ops::ml-engineer"
- Context: Feature requirements and data pipelines
- Prompt: "Integrate ML models for feature: $ARGUMENTS if needed. Implement online inference with low latency, A/B testing between model versions, model performance tracking, and automatic fallback mechanisms. Set up model monitoring for drift detection."
- Output: ML pipeline, model serving infrastructure, monitoring setup
Phase 4: Pre-Launch Validation
10. Analytics Validation
- Use Task tool with subagent_type="data-engineering::data-engineer"
- Context: Implemented tracking and event schemas
- Prompt: "Validate analytics implementation for: $ARGUMENTS. Test all event tracking in staging, verify data quality and completeness, validate funnel definitions, ensure proper user identification and session tracking. Run end-to-end tests for data pipeline."
- Output: Validation report, data quality metrics, tracking coverage analysis
11. Experiment Setup
- Use Task tool with subagent_type="cloud-infrastructure::deployment-engineer"
- Context: Feature flags and experiment design
- Prompt: "Configure experiment infrastructure for: $ARGUMENTS. Set up feature flags with proper targeting rules, configure traffic allocation (start with 5-10%), implement kill switches, set up monitoring alerts for key metrics. Test randomization and assignment logic."
- Output: Experiment configuration, monitoring dashboards, rollout plan
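The randomization and assignment logic to test is typically deterministic hash bucketing, sketched below. Salting the hash with the experiment name keeps assignments independent across concurrent experiments; the 10% default mirrors the suggested starting allocation.

```python
# Deterministic assignment sketch: stable per user, auditable, no state.
import hashlib

def assign(experiment: str, user_id: str, treatment_pct: float = 0.10) -> str:
    """Return 'treatment' for roughly treatment_pct of users, else 'control'."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # roughly uniform in [0, 1]
    return "treatment" if bucket < treatment_pct else "control"
```

Because the same inputs always map to the same bucket, ramping traffic from 5% to 10% only moves control users into treatment, never the reverse, which keeps exposure histories clean.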
Phase 5: Launch and Experimentation
12. Gradual Rollout
- Use Task tool with subagent_type="cloud-infrastructure::deployment-engineer"
- Context: Experiment configuration and monitoring setup
- Prompt: "Execute gradual rollout for feature: $ARGUMENTS. Start with internal dogfooding, then beta users (1-5%), gradually increase to target traffic. Monitor error rates, performance metrics, and early indicators. Implement automated rollback on anomalies."
- Output: Rollout execution, monitoring alerts, health metrics
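The "automated rollback on anomalies" can start as a simple guardrail comparison between arms; the 0.1-point threshold below is an illustrative default, not a prescription.

```python
# Rollback guard sketch: trip the kill switch when the treatment error
# rate exceeds control by more than an absolute threshold.
def should_rollback(control_errors: int, control_total: int,
                    treatment_errors: int, treatment_total: int,
                    max_abs_increase: float = 0.001) -> bool:
    control_rate = control_errors / max(control_total, 1)
    treatment_rate = treatment_errors / max(treatment_total, 1)
    return treatment_rate - control_rate > max_abs_increase
```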
13. Real-time Monitoring
- Use Task tool with subagent_type="observability-monitoring::observability-engineer"
- Context: Deployed feature and success metrics
- Prompt: "Set up comprehensive monitoring for: $ARGUMENTS. Create real-time dashboards for experiment metrics, configure alerts for statistical significance, monitor guardrail metrics for negative impacts, track system performance and error rates. Use tools like Datadog, New Relic, or custom dashboards."
- Output: Monitoring dashboards, alert configurations, SLO definitions
Phase 6: Analysis and Decision Making
14. Statistical Analysis
- Use Task tool with subagent_type="machine-learning-ops::data-scientist"
- Context: Experiment data and original hypotheses
- Prompt: "Analyze A/B test results for: $ARGUMENTS. Calculate statistical significance with confidence intervals, check for segment-level effects, analyze secondary metrics impact, investigate any unexpected patterns. Use both frequentist and Bayesian approaches. Account for multiple testing if applicable."
- Output: Statistical analysis report, significance tests, segment analysis
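For the frequentist side, the core computation on a binary metric is a two-proportion z-test with a confidence interval on the absolute lift, sketched below with the standard library only (a pooled standard error for the test statistic, unpooled for the interval).

```python
# Two-proportion z-test sketch for conversion counts per arm.
import math
from statistics import NormalDist

def analyze(conv_c: int, n_c: int, conv_t: int, n_t: int, alpha: float = 0.05) -> dict:
    p_c, p_t = conv_c / n_c, conv_t / n_t
    lift = p_t - p_c
    # Pooled standard error for the hypothesis test.
    p_pool = (conv_c + conv_t) / (n_c + n_t)
    se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
    z = lift / se_pool
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Unpooled standard error for the confidence interval on the lift.
    se = math.sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    margin = NormalDist().inv_cdf(1 - alpha / 2) * se
    return {"lift": lift, "p_value": p_value, "ci": (lift - margin, lift + margin)}
```

When several metrics or segments are tested, the resulting p-values still need the multiple-testing correction called for in the prompt (e.g. Bonferroni or Benjamini-Hochberg).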
15. Business Impact Assessment
- Use Task tool with subagent_type="business-analytics::business-analyst"
- Context: Statistical analysis and business metrics
- Prompt: "Assess business impact of feature: $ARGUMENTS. Calculate actual vs expected ROI, analyze impact on key business metrics, evaluate cost-benefit including operational overhead, project long-term value. Make recommendation on full rollout, iteration, or rollback."
- Output: Business impact report, ROI analysis, recommendation document
16. Post-Launch Optimization
- Use Task tool with subagent_type="machine-learning-ops::data-scientist"
- Context: Launch results and user feedback
- Prompt: "Identify optimization opportunities for: $ARGUMENTS based on data. Analyze user behavior patterns in treatment group, identify friction points in user journey, suggest improvements based on data, plan follow-up experiments. Use cohort analysis for long-term impact."
- Output: Optimization recommendations, follow-up experiment plans
Configuration Options
```yaml
experiment_config:
  min_sample_size: 10000
  confidence_level: 0.95
  runtime_days: 14
  traffic_allocation: "gradual"  # gradual, fixed, or adaptive

analytics_platforms:
  - amplitude
  - segment
  - mixpanel

feature_flags:
  provider: "launchdarkly"  # launchdarkly, split, optimizely, unleash

statistical_methods:
  - frequentist
  - bayesian

monitoring:
  real_time_metrics: true
  anomaly_detection: true
  automatic_rollback: true
```
Success Criteria
- Data Coverage: 100% of user interactions tracked with proper event schema
- Experiment Validity: Proper randomization, sufficient statistical power, no sample ratio mismatch
- Statistical Rigor: Clear significance testing, proper confidence intervals, multiple testing corrections
- Business Impact: Measurable improvement in target metrics without degrading guardrail metrics
- Technical Performance: No degradation in p95 latency, error rates below 0.1%
- Decision Speed: Clear go/no-go decision within planned experiment runtime
- Learning Outcomes: Documented insights for future feature development
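The "no sample ratio mismatch" criterion is checkable with a chi-square goodness-of-fit test on observed assignment counts; the sketch below uses the one-degree-of-freedom identity P(χ² > x) = 2(1 − Φ(√x)) so it needs only the standard library. The 0.001 alert threshold is a common convention, not a requirement.

```python
# Sample ratio mismatch (SRM) guard sketch for a two-arm experiment.
import math
from statistics import NormalDist

def srm_p_value(n_control: int, n_treatment: int,
                expected_treatment_share: float = 0.5) -> float:
    total = n_control + n_treatment
    exp_t = total * expected_treatment_share
    exp_c = total - exp_t
    chi2 = ((n_treatment - exp_t) ** 2 / exp_t
            + (n_control - exp_c) ** 2 / exp_c)
    # Chi-square with 1 d.o.f. is the square of a standard normal.
    return 2 * (1 - NormalDist().cdf(math.sqrt(chi2)))

# Alert (and distrust the experiment) if srm_p_value(...) < 0.001.
```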
Coordination Notes
- Data scientists and business analysts collaborate on hypothesis formation
- Engineers implement with analytics as first-class requirement, not afterthought
- Feature flags enable safe experimentation without full deployments
- Real-time monitoring allows for quick iteration and rollback if needed
- Statistical rigor balanced with business practicality and speed to market
- Continuous learning loop feeds back into next feature development cycle
Feature to develop with data-driven approach: $ARGUMENTS