screenshot-feature-extractor

Screenshot Analyzer (Multi-Agent)

Extract product features from UI screenshots using a coordinated multi-agent analysis pipeline.
Core principle: Describe WHAT to build (features/interactions), NOT HOW (no tech stack).

Multi-Agent Architecture

This skill orchestrates 5 specialized agents for comprehensive analysis:
                    ┌─────────────────┐
                    │   Coordinator   │
                    │   (this skill)  │
                    └────────┬────────┘
         ┌───────────────────┼───────────────────┐
         │                   │                   │
         ▼                   ▼                   ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│  UI Analyzer    │ │  Interaction    │ │   Business      │
│  (parallel)     │ │   Analyzer      │ │    Analyzer     │
│                 │ │  (parallel)     │ │   (parallel)    │
└────────┬────────┘ └────────┬────────┘ └────────┬────────┘
         │                   │                   │
         └───────────────────┼───────────────────┘
                    ┌─────────────────┐
                    │   Synthesizer   │
                    │   (sequential)  │
                    └────────┬────────┘
                    ┌─────────────────┐
                    │    Reviewer     │
                    │   (sequential)  │
                    └─────────────────┘
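The fan-out and fan-in shape in the diagram can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: `run_agent` is a hypothetical stand-in for launching a sub-agent (the skill itself uses the Task tool), and the payload keys are assumptions.

```python
import asyncio

# Hypothetical stand-in for launching a sub-agent. A real coordinator would
# invoke the Task tool here instead of returning a canned dict.
async def run_agent(name: str, payload: dict) -> dict:
    await asyncio.sleep(0)  # yield control, mimicking an asynchronous call
    return {"agent": name, "inputs": sorted(payload)}

ANALYZERS = (
    "screenshot-ui-analyzer",
    "screenshot-interaction-analyzer",
    "screenshot-business-analyzer",
)

async def analyze(screenshot: str) -> dict:
    # Fan out: the three analyzers run concurrently on the same screenshot.
    ui, interaction, business = await asyncio.gather(
        *(run_agent(name, {"screenshot": screenshot}) for name in ANALYZERS)
    )
    # Fan in: the synthesizer needs all three results, so it runs afterwards.
    task_list = await run_agent(
        "screenshot-synthesizer",
        {"ui": ui, "interaction": interaction, "business": business},
    )
    # The reviewer validates the synthesized list against the screenshot.
    review = await run_agent(
        "screenshot-reviewer",
        {"screenshot": screenshot, "task_list": task_list},
    )
    return {"task_list": task_list, "review": review}

result = asyncio.run(analyze("screens/home.png"))
```

The two sequential stages are awaited one after the other because each depends on the previous stage's output, while the analyzers share no dependencies and can be gathered.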

Process

Phase 1: Screenshot Collection

Gather all screenshots to analyze:
  1. Read the screenshot file(s) provided by the user
  2. For each screenshot, note the file path and any context provided
  3. If there are multiple screenshots, determine whether they are from the same product
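One possible way to record what the collection steps gather is sketched below. The `Screenshot` record, the accepted file suffixes, and the `product` field are all assumptions made for illustration; the skill does not prescribe a data structure, and this sketch does not open the image files themselves.

```python
from dataclasses import dataclass
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}  # assumed set of formats

@dataclass
class Screenshot:
    path: Path
    context: str = ""  # any note the user supplied about this screen
    product: str = ""  # left empty when the product is unknown

def collect(entries: list[tuple[str, str, str]]) -> list[Screenshot]:
    """Turn (path, context, product) tuples into records, skipping non-images."""
    shots = []
    for path, context, product in entries:
        p = Path(path)
        if p.suffix.lower() in IMAGE_SUFFIXES:
            shots.append(Screenshot(p, context, product))
    return shots

def same_product(shots: list[Screenshot]) -> bool:
    """True when every screenshot with a known product names the same one."""
    return len({s.product for s in shots if s.product}) <= 1

shots = collect([
    ("screens/home.png", "landing page", "Acme"),
    ("screens/settings.PNG", "", "Acme"),
    ("notes.txt", "not an image", ""),
])
```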

Phase 2: Parallel Analysis

Launch THREE Task agents IN PARALLEL for each screenshot:

Agent 1: screenshot-ui-analyzer
Analyze this screenshot for UI components, layout structure, and design patterns.
Screenshot: [file path]
Return your analysis as JSON.

Agent 2: screenshot-interaction-analyzer
Analyze this screenshot for user interactions, navigation flows, and state transitions.
Screenshot: [file path]
Return your analysis as JSON.

Agent 3: screenshot-business-analyzer
Analyze this screenshot for business functions, data entities, and domain logic.
Screenshot: [file path]
Return your analysis as JSON.

IMPORTANT: Use the Task tool with THREE parallel calls in a single message to maximize efficiency.
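The three prompts differ only in their focus area, so they can be generated from one template. The sketch below is illustrative: the `FOCUS` mapping and `build_prompts` helper are hypothetical names, while the agent names and prompt wording are taken from the text above.

```python
# Focus area per analyzer, copied from the prompts above.
FOCUS = {
    "screenshot-ui-analyzer":
        "UI components, layout structure, and design patterns",
    "screenshot-interaction-analyzer":
        "user interactions, navigation flows, and state transitions",
    "screenshot-business-analyzer":
        "business functions, data entities, and domain logic",
}

def build_prompts(screenshot_path: str) -> dict[str, str]:
    """Return one prompt per analyzer agent for a single screenshot."""
    return {
        agent: (
            f"Analyze this screenshot for {focus}.\n"
            f"Screenshot: {screenshot_path}\n"
            "Return your analysis as JSON."
        )
        for agent, focus in FOCUS.items()
    }

prompts = build_prompts("screens/home.png")
```

All three prompts would then be issued together in a single message, one Task call each.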

Phase 3: Synthesis

After all parallel analyses complete, launch the synthesizer agent:
Agent 4: screenshot-synthesizer
Synthesize these analysis results into a unified development task list.

UI Analysis:
[paste UI analyzer result]

Interaction Analysis:
[paste Interaction analyzer result]

Business Analysis:
[paste Business analyzer result]

Product Name: [product name]
Output file: docs/plans/YYYY-MM-DD-<product>-features.md
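The output file name follows the pattern docs/plans/YYYY-MM-DD-<product>-features.md. A small helper for filling it in might look like the sketch below. The slug rule (lowercase, non-alphanumeric runs collapsed to hyphens) is an assumption; the skill does not say how the product name should be normalized.

```python
import re
from datetime import date

def output_path(product: str, today: date) -> str:
    """Fill in the docs/plans/YYYY-MM-DD-<product>-features.md pattern."""
    # Assumed normalization: lowercase, runs of other characters become "-".
    slug = re.sub(r"[^a-z0-9]+", "-", product.lower()).strip("-")
    return f"docs/plans/{today.isoformat()}-{slug}-features.md"

path = output_path("Acme Notes", date(2025, 1, 15))
print(path)  # docs/plans/2025-01-15-acme-notes-features.md
```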

Phase 4: Review

Launch the reviewer agent to validate the output:
Agent 5: screenshot-reviewer
Review this task list for completeness and quality.

Original screenshot(s): [file paths]
Task list: [synthesized output]

If issues found, provide corrections.

Phase 5: Output

  1. Write the final task list to docs/plans/YYYY-MM-DD-<product>-features.md
  2. Use the format from references/output-format.md
  3. Present a summary to the user

Key Guidelines

  • Use - [ ] checkbox format for all tasks
  • Break features into small, executable subtasks
  • Focus on user interactions, not implementation details
  • For multiple screenshots: deduplicate features across all screens
  • For competitive analysis: highlight unique features and gaps
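The checkbox and deduplication guidelines can be illustrated with a short sketch. The authoritative layout lives in references/output-format.md, which is not shown here, so the nesting and the case-insensitive matching rule below are assumptions, and `dedupe` and `render` are hypothetical helper names.

```python
def dedupe(features: list[str]) -> list[str]:
    """Drop repeats across screens, ignoring case and extra whitespace."""
    seen = set()
    unique = []
    for feature in features:
        key = " ".join(feature.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(feature.strip())
    return unique

def render(feature: str, subtasks: list[str]) -> str:
    """Render a feature and its subtasks as nested - [ ] checkboxes."""
    lines = [f"- [ ] {feature}"]
    lines += [f"  - [ ] {subtask}" for subtask in subtasks]
    return "\n".join(lines)

features = dedupe(["Search bar", "search  bar", "Profile menu"])
block = render("Search bar", ["Show suggestions while typing", "Clear input"])
```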

Benefits of Multi-Agent Approach

  1. Thoroughness - Three specialized perspectives catch more details
  2. Speed - Parallel analysis reduces total time
  3. Quality - The synthesis and review steps check the output for coherence and completeness
  4. Specialization - Each agent focuses on its domain expertise