plan-task
Refine Task Workflow
Role
You are a task refinement orchestrator. Take a draft task file created by `/add-task` and refine it through a coordinated multi-agent workflow with quality gates after each phase.
Goal
This workflow command refines an existing draft task through:
- Parallel Analysis - Research, codebase analysis, and business analysis in parallel
- Architecture Synthesis - Combine findings into architectural overview
- Decomposition - Break into implementation steps with risks
- Parallelize - Reorganize steps for maximum parallel execution
- Verify - Add LLM-as-Judge verification sections
- Promote - Move refined task from `draft/` to `todo/`
All phases include judge validation to prevent error propagation and ensure quality thresholds are met.
User Input
    $ARGUMENTS
Command Arguments
Parse the following arguments from `$ARGUMENTS`:
Argument Definitions
| Argument | Format | Default | Description |
|---|---|---|---|
| Task file | Path to task file | Required | Path to draft task file (e.g., a file under `.specs/tasks/draft/`) |
| `--continue` | `--continue [stage]` | None | Continue refining from a specific stage. Stage is optional - resolve from context if not provided. |
| `--target-quality` | Number | `3.5` | Target threshold value (out of 5.0) for judge pass/fail decisions. |
| `--max-iterations` | Integer | `3` | Maximum implementation + judge retry cycles per phase before moving to next stage (regardless of pass/fail). |
| `--included-stages` | Comma-separated stage names | All stages | Comma-separated list of stages to include. |
| `--skip` | Comma-separated stage names | None | Comma-separated list of stages to exclude. |
| `--fast` | Flag | N/A | Alias for `--target-quality 3.0 --max-iterations 1 --included-stages business analysis,decomposition,verifications` |
| `--one-shot` | Flag | N/A | Alias for `--included-stages business analysis,decomposition --skip-judges` |
| `--human-in-the-loop` | Comma-separated phases | None | Phases after which to pause for human verification. |
| `--skip-judges` | Flag | `false` | Skip all judge validation checks - phases proceed without quality gates. |
| `--refine` | Flag | `false` | Incremental refinement mode - detect changes against git and re-run only affected stages (top-to-bottom propagation). |
Stage Names (for `--included-stages` / `--skip`)
| Stage Name | Phase | Description |
|---|---|---|
| `research` | 2a | Gather relevant resources, documentation, libraries |
| `codebase analysis` | 2b | Identify affected files, interfaces, integration points |
| `business analysis` | 2c | Refine description and create acceptance criteria |
| `architecture synthesis` | 3 | Synthesize research and analysis into architecture |
| `decomposition` | 4 | Break into implementation steps with risks |
| `parallelize` | 5 | Reorganize steps for parallel execution |
| `verifications` | 6 | Add LLM-as-Judge verification rubrics |
Configuration Resolution
Parse `$ARGUMENTS` and resolve configuration as follows:
Extract task file path (first positional argument, required)
TASK_FILE = first argument that is a file path (must exist in .specs/tasks/draft/)
Parse alias flags first (they set multiple defaults)
if --fast present:
    THRESHOLD = 3.0
    MAX_ITERATIONS = 1
    INCLUDED_STAGES = ["business analysis", "decomposition", "verifications"]
if --one-shot present:
    INCLUDED_STAGES = ["business analysis", "decomposition"]
    SKIP_JUDGES = true
Initialize defaults
THRESHOLD ?= --target-quality || 3.5
MAX_ITERATIONS ?= --max-iterations || 3
INCLUDED_STAGES ?= --included-stages || ["research", "codebase analysis", "business analysis", "architecture synthesis", "decomposition", "parallelize", "verifications"]
SKIP_STAGES = --skip || []
HUMAN_IN_THE_LOOP_PHASES = --human-in-the-loop || []
SKIP_JUDGES ?= --skip-judges || false
REFINE_MODE = --refine || false
CONTINUE_STAGE = null
if --continue [stage] present:
    CONTINUE_STAGE = stage or resolve from context
Compute final active stages
ACTIVE_STAGES = INCLUDED_STAGES - SKIP_STAGES
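The resolution order above can be sketched in Python. This is a minimal illustration, not the command's actual implementation: `resolve_config` and `split_csv` are hypothetical names, the input is assumed to be already tokenized with each flag value in a single token, and `?=` is read as "assign only if an alias has not already set the value". The sketch applies that same reading to `SKIP_JUDGES`, so the value set by `--one-shot` survives.

```python
ALL_STAGES = [
    "research", "codebase analysis", "business analysis",
    "architecture synthesis", "decomposition", "parallelize", "verifications",
]
VALUE_FLAGS = {"--target-quality", "--max-iterations", "--included-stages",
               "--skip", "--human-in-the-loop"}
BOOL_FLAGS = {"--fast", "--one-shot", "--skip-judges", "--refine"}


def split_csv(value):
    """Split a comma-separated value into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def resolve_config(argv):
    """Resolve workflow settings from an already-tokenized argument list."""
    flags, task_file, i = {}, None, 0
    while i < len(argv):
        tok = argv[i]
        if tok in VALUE_FLAGS:
            if i + 1 >= len(argv):
                raise ValueError(f"{tok} expects a value")
            flags[tok] = argv[i + 1]
            i += 2
        elif tok in BOOL_FLAGS:
            flags[tok] = True
            i += 1
        elif tok == "--continue":
            # The stage is optional; only consume the next token if it names a stage.
            has_stage = i + 1 < len(argv) and argv[i + 1] in ALL_STAGES
            flags[tok] = argv[i + 1] if has_stage else None
            i += 2 if has_stage else 1
        elif task_file is None:
            task_file = tok  # first positional argument
            i += 1
        else:
            raise ValueError(f"unexpected argument: {tok}")
    if task_file is None:
        raise ValueError("task file path is required")

    cfg = {"task_file": task_file}
    # Alias flags first: they set several values at once.
    if "--fast" in flags:
        cfg.update(threshold=3.0, max_iterations=1,
                   included_stages=["business analysis", "decomposition", "verifications"])
    if "--one-shot" in flags:
        cfg.update(included_stages=["business analysis", "decomposition"], skip_judges=True)

    # "?=" semantics: fill only what the aliases left unset.
    cfg.setdefault("threshold", float(flags.get("--target-quality", 3.5)))
    cfg.setdefault("max_iterations", int(flags.get("--max-iterations", 3)))
    cfg.setdefault("included_stages",
                   split_csv(flags["--included-stages"])
                   if "--included-stages" in flags else list(ALL_STAGES))
    cfg.setdefault("skip_judges", "--skip-judges" in flags)

    cfg["skip_stages"] = split_csv(flags.get("--skip", ""))
    cfg["human_phases"] = split_csv(flags.get("--human-in-the-loop", ""))
    cfg["refine_mode"] = "--refine" in flags
    cfg["continue_requested"] = "--continue" in flags
    cfg["continue_stage"] = flags.get("--continue")
    # ACTIVE_STAGES = INCLUDED_STAGES - SKIP_STAGES, keeping the original order.
    cfg["active_stages"] = [s for s in cfg["included_stages"]
                            if s not in cfg["skip_stages"]]
    return cfg
```

Note that under this reading an alias wins over an explicit `--target-quality` given alongside it; if the opposite precedence is intended, the explicit flags would need to be applied after the aliases.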
Context Resolution for `--continue`
When `--continue` is used without explicit stage:
- Stage Resolution:
  - Parse the task file for completion markers (e.g., `[x]` checkboxes)
  - Identify the last completed phase/judge
  - Resume from the next incomplete phase
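The stage resolution steps can be illustrated with a short Python sketch. It assumes completion markers are Markdown task-list items such as `- [x] Phase 2a` (the exact marker layout in a task file is an assumption here), and `next_incomplete` is a hypothetical helper name.

```python
import re

# Matches "- [x] label" or "- [ ] label" (also "*" bullets); layout is assumed.
CHECKBOX = re.compile(r"^\s*[-*]\s*\[( |x|X)\]\s*(.+?)\s*$")


def next_incomplete(task_text):
    """Return the label of the item right after the last checked box, or None."""
    items = []
    for line in task_text.splitlines():
        match = CHECKBOX.match(line)
        if match:
            items.append((match.group(1).lower() == "x", match.group(2)))
    # Index of the last completed item; -1 when nothing is checked yet.
    last_done = max((i for i, (done, _) in enumerate(items) if done), default=-1)
    nxt = last_done + 1
    return items[nxt][1] if nxt < len(items) else None
```

Resuming from the item after the last checked box, rather than from the first unchecked one, follows the "identify the last completed phase/judge" wording above.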
Refine Mode Behavior (`--refine`)
When `--refine` is used:
- Change Detection:
  - First check file status: `git status --porcelain -- <TASK_FILE>`
  - Compare current task file against last git commit: `git diff HEAD -- <TASK_FILE>`
    - This captures both staged and unstaged changes vs HEAD
  - If file is untracked or has no git history, compare against the original task structure
  - Identify which sections have been modified by the user
  - Look for `//` comment markers indicating user feedback/corrections
- Top-to-Bottom Propagation:
  - Determine the earliest modified section (highest in document)
  - Re-run only stages that correspond to or come after the modified section
  - Earlier stages (above the modification) are preserved as-is
- Section-to-Stage Mapping:
  | Modified Section | Re-run From Stage |
  |---|---|
  | Description / Acceptance Criteria (Phase 2c) | `business analysis` |
  | Architecture Overview (Phase 3) | `architecture synthesis` |
  | Implementation Process / Steps (Phase 4) | `decomposition` |
  | Parallelization / Dependencies (Phase 5) | `parallelize` |
  | Verification sections (Phase 6) | `verifications` |
- Refine Execution:
  - Skip research (2a) and codebase analysis (2b) unless explicitly requested
  - Pass user modifications and `//` comments as additional context to agents
  - Agents should incorporate user feedback while preserving unchanged content
- Example:
      # User edited the Architecture Overview section
      /plan .specs/tasks/todo/my-task.feature.md --refine
      # Detects Architecture section changed → re-runs from Phase 3 onwards
      # Skips: research, codebase analysis, business analysis
      # Runs: architecture synthesis, decomposition, parallelize, verifications
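As a rough illustration of change detection and top-to-bottom propagation, the Python sketch below classifies one line of `git status --porcelain` output and maps modified sections to the stages to re-run. The function names and the section keys in `SECTION_TO_STAGE` are assumptions for illustration; real task files may use different headings.

```python
STAGE_ORDER = [
    "research", "codebase analysis", "business analysis",
    "architecture synthesis", "decomposition", "parallelize", "verifications",
]

# Section names are hypothetical keys modelled on the mapping table above.
SECTION_TO_STAGE = {
    "Description": "business analysis",
    "Acceptance Criteria": "business analysis",
    "Architecture Overview": "architecture synthesis",
    "Implementation Process": "decomposition",
    "Parallelization": "parallelize",
    "Verification": "verifications",
}


def classify_status(porcelain_output):
    """Classify the output of `git status --porcelain -- <file>` for one file."""
    line = porcelain_output.rstrip("\n")  # keep leading spaces: they are part of the code
    if not line:
        return "no-changes"
    code = line[:2]  # two-character XY status code
    if code == "??":
        return "untracked"
    if "M" in code:
        return "modified"
    return "other"


def stages_to_rerun(modified_sections):
    """Return every stage from the earliest affected one onwards."""
    indexes = [STAGE_ORDER.index(SECTION_TO_STAGE[name])
               for name in modified_sections if name in SECTION_TO_STAGE]
    if not indexes:
        return []
    return STAGE_ORDER[min(indexes):]
```

Because the earliest mapped stage is `business analysis`, research and codebase analysis never appear in the result, which matches the rule that they are skipped unless explicitly requested.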
Human-in-the-Loop Behavior
Human verification checkpoints occur:
- Trigger Conditions:
  - After implementation + judge verification PASS for a phase in `HUMAN_IN_THE_LOOP_PHASES`
  - After implementation + judge + implementation retry (before the next judge retry)
- At Checkpoint:
  - Display current phase results summary
  - Display generated artifacts with paths
  - Display judge score and feedback
  - Ask user: "Review phase output. Continue? [Y/n/feedback]"
  - If user provides feedback, incorporate into next iteration
  - If user says "n", pause workflow
- Checkpoint Message Format:
      ---
      ## 🔍 Human Review Checkpoint - Phase X
      **Phase:** {phase name}
      **Judge Score:** {score}/{THRESHOLD} threshold
      **Status:** ✅ PASS / ⚠️ RETRY {n}/{MAX_ITERATIONS}
      **Artifacts:**
      - {artifact_path_1}
      - {artifact_path_2}
      **Judge Feedback:**
      {feedback summary}
      **Action Required:** Review the above artifacts and provide feedback or continue.
      > Continue? [Y/n/feedback]:
      ---
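The three possible replies to the checkpoint prompt can be sketched as a small Python function. `interpret_reply` is a hypothetical name, and treating an empty reply as "continue" is an assumption based on the capitalised `Y` in `[Y/n/feedback]`.

```python
def interpret_reply(reply):
    """Map a checkpoint reply to (action, feedback_text)."""
    text = reply.strip()
    if text == "" or text.lower() in ("y", "yes"):
        return ("continue", None)   # proceed to the next phase
    if text.lower() in ("n", "no"):
        return ("pause", None)      # pause the workflow
    return ("feedback", text)       # anything else feeds the next iteration
```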
Usage Examples
    # Refine a draft task with all stages
    /plan .specs/tasks/draft/add-validation.feature.md
    # Fast refinement with minimal stages
    /plan .specs/tasks/draft/quick-fix.bug.md --fast
    # Continue from a specific stage
    /plan .specs/tasks/draft/complex-feature.feature.md --continue decomposition
    # High-quality refinement with checkpoints
    /plan .specs/tasks/draft/critical-api.feature.md --target-quality 4.5 --human-in-the-loop 2,3,4,5,6
    # Incremental refinement after user edits (re-runs only affected stages)
    /plan .specs/tasks/todo/my-task.feature.md --refine
Pre-Flight Checks
Before starting workflow:
- Validate task file exists:
  - If `REFINE_MODE` is false: Check that `TASK_FILE` exists in `.specs/tasks/draft/`
  - If `REFINE_MODE` is true: Check that `TASK_FILE` exists in `.specs/tasks/todo/` or `.specs/tasks/draft/`
  - If not found, show error and exit
- Parse and display resolved configuration:
      ### Configuration
      | Setting | Value |
      |---------|-------|
      | **Task File** | {TASK_FILE} |
      | **Target Quality** | {THRESHOLD}/5.0 |
      | **Max Iterations** | {MAX_ITERATIONS} |
      | **Active Stages** | {ACTIVE_STAGES as comma-separated list} |
      | **Human Checkpoints** | Phase {HUMAN_IN_THE_LOOP_PHASES as comma-separated} |
      | **Skip Judges** | {SKIP_JUDGES} |
      | **Refine Mode** | {REFINE_MODE} |
      | **Continue From** | {CONTINUE_STAGE} or "Start" |
- Handle `--continue` mode: If `CONTINUE_STAGE` is set:
  - Read the task file to get current state
  - Identify completed phases from task file content
  - Skip to `CONTINUE_STAGE` (or auto-detected next incomplete stage)
  - Pre-populate captured values from existing artifacts
  - Resume workflow from the appropriate phase
- Handle `--refine` mode: If `REFINE_MODE` is true:
  - Check file status: `git status --porcelain -- <TASK_FILE>`
    - `M ` (staged) or ` M` (unstaged) or `MM` (both) → proceed with diff
    - `??` (untracked) → error: "File not tracked by git, cannot detect changes"
    - Empty output → no changes detected
  - Run `git diff HEAD -- <TASK_FILE>` to get all changes (staged + unstaged) vs last commit
  - Parse diff to identify modified sections
  - Collect any `//` comment markers as user feedback
  - Determine earliest modified section using Section-to-Stage Mapping
  - Set `ACTIVE_STAGES` to include only stages from the determined starting point onwards
  - Pass detected changes and user comments as additional context to agents
  - If no changes detected, inform user: "No changes detected in task file. Edit the file first, then run --refine." and exit
- Extract task info from file:
  - Read task file to extract title and type from filename
  - Parse frontmatter for title and depends_on
- Initialize workflow progress tracking using TodoWrite: Only include todos for phases in `ACTIVE_STAGES`. If continuing, mark completed phases as `completed`.
      {
        "todos": [
          {"content": "Ensure directories exist", "status": "pending", "activeForm": "Ensuring directories exist"},
          {"content": "Phase 2a: Research relevant resources and documentation", "status": "pending", "activeForm": "Researching resources"},
          {"content": "Judge 2a: PASS research quality (> {THRESHOLD})", "status": "pending", "activeForm": "Validating research"},
          {"content": "Phase 2b: Analyze codebase impact and affected files", "status": "pending", "activeForm": "Analyzing codebase impact"},
          {"content": "Judge 2b: PASS codebase analysis (> {THRESHOLD})", "status": "pending", "activeForm": "Validating codebase analysis"},
          {"content": "Phase 2c: Business analysis and acceptance criteria", "status": "pending", "activeForm": "Analyzing business requirements"},
          {"content": "Judge 2c: PASS business analysis (> {THRESHOLD})", "status": "pending", "activeForm": "Validating business analysis"},
          {"content": "Phase 3: Architecture synthesis from research and analysis", "status": "pending", "activeForm": "Synthesizing architecture"},
          {"content": "Judge 3: PASS architecture synthesis (> {THRESHOLD})", "status": "pending", "activeForm": "Validating architecture"},
          {"content": "Phase 4: Decompose into implementation steps", "status": "pending", "activeForm": "Decomposing into steps"},
          {"content": "Judge 4: PASS decomposition (> {THRESHOLD})", "status": "pending", "activeForm": "Validating decomposition"},
          {"content": "Phase 5: Parallelize implementation steps", "status": "pending", "activeForm": "Parallelizing steps"},
          {"content": "Judge 5: PASS parallelization (> {THRESHOLD})", "status": "pending", "activeForm": "Validating parallelization"},
          {"content": "Phase 6: Define verification rubrics", "status": "pending", "activeForm": "Defining verifications"},
          {"content": "Judge 6: PASS verifications (> {THRESHOLD})", "status": "pending", "activeForm": "Validating verifications"},
          {"content": "Move task to todo folder", "status": "pending", "activeForm": "Promoting task"},
          {"content": "Human checkpoint reviews", "status": "pending", "activeForm": "Awaiting human review"}
        ]
      }
  Note: Filter todos based on configuration:
  - If `SKIP_JUDGES` is true, omit ALL Judge todos (Judge 2a, 2b, 2c, 3, 4, 5, 6)
  - If `research` not in `ACTIVE_STAGES`, omit Phase 2a and Judge 2a todos
  - If `codebase analysis` not in `ACTIVE_STAGES`, omit Phase 2b and Judge 2b todos
  - If `business analysis` not in `ACTIVE_STAGES`, omit Phase 2c and Judge 2c todos
  - If `architecture synthesis` not in `ACTIVE_STAGES`, omit Phase 3 and Judge 3 todos
  - If `decomposition` not in `ACTIVE_STAGES`, omit Phase 4 and Judge 4 todos
  - If `parallelize` not in `ACTIVE_STAGES`, omit Phase 5 and Judge 5 todos
  - If `verifications` not in `ACTIVE_STAGES`, omit Phase 6 and Judge 6 todos
  - If `HUMAN_IN_THE_LOOP_PHASES` is empty, omit human checkpoint todo
- Ensure directories exist: Run the folder creation script to create task directories and configure gitignore:
      bash ${CLAUDE_PLUGIN_ROOT}/scripts/create-folders.sh
  This creates:
  - `.specs/tasks/draft/` - New tasks awaiting analysis
  - `.specs/tasks/todo/` - Tasks ready to implement
  - `.specs/tasks/in-progress/` - Currently being worked on
  - `.specs/tasks/done/` - Completed tasks
  - `.specs/scratchpad/` - Temporary working files (gitignored)
  - `.specs/analysis/` - Codebase impact analysis files
  - `.claude/skills/` - Reusable skill documents
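The todo filtering rules from the pre-flight checks can be sketched in Python as follows. `filter_todos` and `STAGE_TO_PHASE` are hypothetical names; the sketch relies on the `Phase N:` / `Judge N:` prefixes used in the todo list and is not part of TodoWrite itself.

```python
import re

STAGE_TO_PHASE = {
    "research": "2a", "codebase analysis": "2b", "business analysis": "2c",
    "architecture synthesis": "3", "decomposition": "4",
    "parallelize": "5", "verifications": "6",
}
PREFIX = re.compile(r"^(Phase|Judge) (\w+):")


def filter_todos(todos, active_stages, skip_judges, human_phases):
    """Keep only the todos that apply to the resolved configuration."""
    active_phases = {STAGE_TO_PHASE[stage] for stage in active_stages}
    kept = []
    for todo in todos:
        content = todo["content"]
        match = PREFIX.match(content)
        if match:
            kind, phase = match.groups()
            if phase not in active_phases:
                continue  # stage excluded: drop both Phase and Judge todos
            if kind == "Judge" and skip_judges:
                continue  # judges disabled: drop every Judge todo
        elif content == "Human checkpoint reviews" and not human_phases:
            continue      # no checkpoints configured
        kept.append(todo)
    return kept
```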
Update each todo to `in_progress` when starting a phase and `completed` when judge passes.
CRITICAL
- Do not mark PASS for any judge if it did not pass the rubric. Retry the judge after each implementation change until it passes the check or `MAX_ITERATIONS` is reached!
- Do not read task files in .claude or .specs directories; your job is to orchestrate agents that will do the work, not to do it yourself!
- Use `THRESHOLD` (default 3.5) for all judge pass/fail decisions, not hardcoded values!
- Use `MAX_ITERATIONS` (default 3) for retry limits, not hardcoded values!
- After `MAX_ITERATIONS` reached: PROCEED to next stage automatically - do NOT ask user unless phase is in `HUMAN_IN_THE_LOOP_PHASES`!
- Skip phases not in `ACTIVE_STAGES` entirely - do not launch agents for excluded stages!
- Trigger human-in-the-loop checkpoints ONLY after phases in `HUMAN_IN_THE_LOOP_PHASES`!
- If `SKIP_JUDGES` is true: Skip ALL judge validation - proceed directly to next phase after each implementation phase completes!
- Task file must exist in `.specs/tasks/draft/` before running this command (unless `--refine` mode)!
- If `REFINE_MODE` is true: Detect changes via git diff, skip unchanged stages, pass user feedback to agents!
Execution & Evaluation Rules
- Use foreground agents only: Do not use background agents. Launch parallel agents when possible. Background agents constantly run into permission issues and other errors.
Relaunch the judge until you get valid results if any of the following happens:
- Reject Long Reports: If an agent returns a very long report instead of using the scratchpad as requested, reject the result. This indicates the agent failed to follow the "use scratchpad" instruction.
- Judge Score 5.0 is a Hallucination: If a judge returns a score of 5.0/5.0, treat it as a hallucination or lazy evaluation. Reject it and re-run the judge. Perfect scores are practically impossible in this rigorous framework.
- Reject Missing Scores: If a judge report is missing the numerical score, reject it. This indicates the judge failed to read or follow the rubric instructions.
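The two score-related rejection rules can be expressed as a small validity check. This Python sketch assumes the judge report states its score as `<score>/5` or `<score>/5.0` somewhere in the text; the actual report format is defined by the judge prompt and is not shown here, and `check_judge_report` is a hypothetical name.

```python
import re

# Assumed score notation, e.g. "4.2/5.0" or "4/5".
SCORE = re.compile(r"(\d(?:\.\d+)?)\s*/\s*5(?:\.0)?\b")


def check_judge_report(report):
    """Return (is_valid, reason, score) for a judge report."""
    match = SCORE.search(report)
    if match is None:
        return (False, "missing numerical score", None)
    score = float(match.group(1))
    if score >= 5.0:
        return (False, "perfect score treated as unreliable", score)
    return (True, "", score)
```

An invalid result means the judge is relaunched; it does not count as a FAIL of the phase being judged.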
Workflow Execution
You MUST launch a separate agent for each step instead of performing all steps yourself.
CRITICAL: For each agent you MUST:
- Use the Agent type and Model specified in the step
- Provide the task file path and user input as context
- Provide the value of `${CLAUDE_PLUGIN_ROOT}` so agents can resolve paths like `@${CLAUDE_PLUGIN_ROOT}/scripts/create-scratchpad.sh`
- Require agent to implement exactly that step, not more, not less
- After each sub-phase, launch a judge agent to validate quality before proceeding
Complete Workflow Overview
Note: Phases not in `ACTIVE_STAGES` are skipped. If `SKIP_JUDGES` is true, all judge steps are skipped entirely. Human checkpoints (🔍) occur after phases in `HUMAN_IN_THE_LOOP_PHASES`.
    Input: Draft Task File (.specs/tasks/draft/*.md)
        │
        ▼
    Phase 2: Parallel Analysis
        │
        ├───────────────────────────┬───────────────────────────┐
        ▼                           ▼                           ▼
    Phase 2a:                   Phase 2b:                   Phase 2c:
    Research                    Codebase Analysis           Business Analysis
    [sdd:researcher sonnet]     [sdd:code-explorer sonnet]  [sdd:business-analyst opus]
    Judge 2a                    Judge 2b                    Judge 2c
    (pass: >THRESHOLD)          (pass: >THRESHOLD)          (pass: >THRESHOLD)
        │                           │                           │
        └───────────────────────────┴───────────────────────────┘
        │
        ▼
    Phase 3: Architecture Synthesis
    [sdd:software-architect opus]
    Judge 3 (pass: >THRESHOLD)
        │
        ▼
    Phase 4: Decomposition
    [sdd:tech-lead opus]
    Judge 4 (pass: >THRESHOLD)
        │
        ▼
    Phase 5: Parallelize
    [sdd:team-lead opus]
    Judge 5 (pass: >THRESHOLD)
        │
        ▼
    Phase 6: Verifications
    [sdd:qa-engineer opus]
    Judge 6 (pass: >THRESHOLD)
        │
        ▼
    Move task: draft/ → todo/
        │
        ▼
    Complete
Phase 2: Parallel Analysis
Phase 2 launches three analysis phases in parallel, each with its own judge validation.
Phase 2a/2b/2c: Parallel Sub-Phases
Launch these three phases in parallel immediately:
Phase 2a: Research
Model: `sonnet`
Agent: `sdd:researcher`
Depends on: Task file exists
Purpose: Gather relevant resources, documentation, libraries, and prior art. Creates or updates a reusable skill.
Launch agent:
- Description: "Research task resources and create/update skill"
- Prompt:
      CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
      Task File: <TASK_FILE>
      Task Title: <title from task file>
      CRITICAL: DO NOT OUTPUT YOUR RESEARCH, ONLY CREATE THE SCRATCHPAD AND SKILL FILE.
Capture:
- Skill file path (e.g., `.claude/skills/<skill-name>/SKILL.md`)
- Skill action (Created new / Updated existing)
- Scratchpad file path (e.g., `.specs/scratchpad/<hex-id>.md`)
- Number of resources gathered
- Key recommendation summary
CRITICAL: If expected files are not created, launch the agent again with the same prompt.
Phase 2b: Codebase Impact Analysis
Model: `sonnet`
Agent: `sdd:code-explorer`
Depends on: Task file exists
Purpose: Identify affected files, interfaces, and integration points
Launch agent:
- Description: "Analyze codebase impact"
- Prompt:
      CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
      Task File: <TASK_FILE>
      Task Title: <title from task file>
      CRITICAL: DO NOT OUTPUT YOUR ANALYSIS, ONLY CREATE THE SCRATCHPAD AND ANALYSIS FILE.
Capture:
- Analysis file path (e.g., `.specs/analysis/analysis-{name}.md`)
- Scratchpad file path (e.g., `.specs/scratchpad/<hex-id>.md`)
- Files affected count (modify/create/delete)
- Risk level assessment
- Key integration points
CRITICAL: If expected files are not created, launch the agent again with the same prompt.
Phase 2c: Business Analysis
Model: `opus`
Agent: `sdd:business-analyst`
Depends on: Task file exists
Purpose: Refine description and create acceptance criteria
Launch agent:
- Description: "Business analysis"
- Prompt:
      CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
      Read ${CLAUDE_PLUGIN_ROOT}/skills/plan-task/analyse-business-requirements.md and execute it exactly as is!
      Task File: <TASK_FILE>
      Task Title: <title from task file>
      CRITICAL: DO NOT OUTPUT YOUR BUSINESS ANALYSIS, ONLY CREATE THE SCRATCHPAD AND UPDATE THE TASK FILE.
Capture:
- Scratchpad file path (e.g., `.specs/scratchpad/<hex-id>.md`)
- Acceptance criteria count
- Scope defined (yes/no)
- User scenarios documented
Judge 2a/2b/2c: Validate Parallel Phases
After each parallel phase completes, launch its respective judge with the same agent type and model.
Judge 2a: Validate Research/Skill
Model: `sonnet`
Agent: `sdd:researcher`
Depends on: Phase 2a completion
Purpose: Validate skill completeness and relevance
Launch judge:
- Description: "Judge skill quality"
- Prompt:
      CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
      Read @${CLAUDE_PLUGIN_ROOT}/prompts/judge.md for evaluation methodology and execute.
      ### Artifact Path
      {path to skill file from Phase 2a}
      ### Context
      This is a skill document for task: {task title}. Evaluate comprehensiveness and reusability.
      ### Rubric
      1. Resource Coverage (weight: 0.30)
         - Documentation and references gathered?
         - Libraries and tools identified with recommendations?
         - 1=Missing critical resources, 2=Basic coverage, 3=Adequate, 4=Comprehensive, 5=Excellent
      2. Pattern Relevance (weight: 0.25)
         - Are identified patterns applicable?
         - Are recommendations actionable?
         - 1=Irrelevant, 2=Somewhat useful, 3=Adequate, 4=Well-targeted, 5=Perfect fit
      3. Issue Anticipation (weight: 0.20)
         - Common pitfalls identified with solutions?
         - 1=None identified, 2=Few issues, 3=Adequate, 4=Good coverage, 5=Comprehensive
      4. Reusability (weight: 0.15)
         - Is the skill general enough to help multiple tasks?
         - Does it avoid task-specific details?
         - 1=Too specific, 2=Limited reuse, 3=Adequate, 4=Good, 5=Highly reusable
      5. Task Integration (weight: 0.10)
         - Was task file updated with skill reference?
         - 1=Not updated, 3=Updated, 5=Updated with clear instructions
CRITICAL: Use the prompt exactly as is; do not add anything else, including the output of the implementation agent!
Decision Logic:
- PASS (score >= `THRESHOLD`): Research complete, proceed
- FAIL (score < `THRESHOLD`): Re-launch Phase 2a with feedback
- MAX_ITERATIONS reached: Proceed to next stage regardless of score (log warning)
Judge 2b: Validate Codebase Analysis
评审2b:验证代码库分析
Model:
Agent:
Depends on: Phase 2b completion
Purpose: Validate file identification accuracy and integration mapping
sonnetsdd:code-explorerLaunch judge:
-
Description: "Judge codebase analysis quality"
-
Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT} Read @${CLAUDE_PLUGIN_ROOT}/prompts/judge.md for evaluation methodology and execute. ### Artifact Path {path to analysis file from Phase 2b} ### Context This is codebase impact analysis for task: {task title}. Evaluate accuracy and completeness. ### Rubric 1. File Identification Accuracy (weight: 0.35) - All affected files identified with specific paths? - New files and modifications distinguished? - 1=Major files missing, 2=Mostly correct, 3=Adequate, 4=Precise, 5=Complete 2. Interface Documentation (weight: 0.25) - Key functions/classes documented with signatures? - Change requirements clear? - 1=Missing, 2=Partial, 3=Adequate, 4=Good, 5=Complete 3. Integration Point Mapping (weight: 0.25) - Integration points identified with impact? - Similar patterns in codebase found? - 1=Missing, 2=Partial, 3=Adequate, 4=Good, 5=Comprehensive 4. Risk Assessment (weight: 0.15) - High risk areas identified with mitigations? - 1=No assessment, 2=Basic, 3=Adequate, 4=Good, 5=Thorough
CRITICAL: Use the prompt exactly as written. Do not add anything else, including the output of the implementation agent.
Decision Logic:
- PASS (score >= THRESHOLD): Analysis complete, proceed
- FAIL (score < THRESHOLD): Re-launch Phase 2b with feedback
- MAX_ITERATIONS reached: Proceed to next stage regardless of score (log warning)
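The rubric weights and the PASS/FAIL/MAX_ITERATIONS rules can be illustrated with a small sketch. It assumes the judge combines per-criterion scores as a weighted sum, which this document does not confirm (the actual method lives in prompts/judge.md). The function names, the criterion keys, and the 4.0 threshold used in the example are hypothetical.

```python
from typing import Mapping

# Weights copied from the Judge 2b rubric; the key names are illustrative only.
JUDGE_2B_WEIGHTS = {
    "file_identification_accuracy": 0.35,
    "interface_documentation": 0.25,
    "integration_point_mapping": 0.25,
    "risk_assessment": 0.15,
}


def weighted_score(scores: Mapping[str, int], weights: Mapping[str, float]) -> float:
    """Combine 1-5 criterion scores into one value (assumed: weighted sum)."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same criteria")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    for name, value in scores.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5")
    return sum(scores[name] * weights[name] for name in weights)


def decide(score: float, threshold: float, iteration: int, max_iterations: int) -> str:
    """Apply the decision logic: pass, retry, or proceed with a logged warning."""
    if score >= threshold:
        return "PASS"
    if iteration >= max_iterations:
        return "PROCEED_WITH_WARNING"
    return "RETRY"


if __name__ == "__main__":
    example = {
        "file_identification_accuracy": 4,
        "interface_documentation": 3,
        "integration_point_mapping": 4,
        "risk_assessment": 5,
    }
    total = weighted_score(example, JUDGE_2B_WEIGHTS)
    print(round(total, 2), decide(total, threshold=4.0, iteration=1, max_iterations=3))
```

The same two functions would apply to the other judges by swapping in their rubric weights.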
Judge 2c: Validate Business Analysis
Model: opus
Agent: sdd:business-analyst
Depends on: Phase 2c completion
Purpose: Validate acceptance criteria quality and scope definition
Launch judge:
- Description: "Judge business analysis quality"
- Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
Read @${CLAUDE_PLUGIN_ROOT}/prompts/judge.md for evaluation methodology and execute.
### Artifact Path
{path to task file from Phase 2c}
### Context
This is business analysis output. Evaluate description clarity and acceptance criteria quality.
### Rubric
1. Description Clarity (weight: 0.30)
   - What/Why clearly explained?
   - Scope boundaries defined?
   - 1=Vague, 2=Basic, 3=Adequate, 4=Clear, 5=Excellent
2. Acceptance Criteria Quality (weight: 0.35)
   - Criteria specific and testable?
   - Given/When/Then format for complex criteria?
   - 1=Missing/vague, 2=Basic, 3=Adequate, 4=Good, 5=Excellent
3. Scenario Coverage (weight: 0.20)
   - Primary flow documented?
   - Error scenarios considered?
   - 1=Missing, 2=Basic, 3=Adequate, 4=Good, 5=Comprehensive
4. Scope Definition (weight: 0.15)
   - In-scope/out-of-scope explicit?
   - No implementation details in description?
   - 1=Missing, 2=Partial, 3=Adequate, 4=Good, 5=Clear
CRITICAL: Use the prompt exactly as written. Do not add anything else, including the output of the implementation agent.
Decision Logic:
- PASS (score >= THRESHOLD): Business analysis complete, proceed
- FAIL (score < THRESHOLD): Re-launch Phase 2c with feedback
- MAX_ITERATIONS reached: Proceed to next stage regardless of score (log warning)
Synchronization Point
Wait until all three parallel phases (2a, 2b, 2c) have completed and each of their judges has either returned PASS or reached MAX_ITERATIONS before proceeding to Phase 3.
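The synchronization point is a join over three concurrent branches. The sketch below uses Python's asyncio to show the shape of that join. `run_phase_with_judge` is a hypothetical stand-in for "launch the phase agent, then its judge", and treating a judge that hit MAX_ITERATIONS as settled is an assumption taken from the decision logic of the individual judges.

```python
import asyncio
from typing import Dict, Tuple

# Verdicts that allow the workflow to move on (assumption, see lead-in).
SETTLED = {"PASS", "PROCEED_WITH_WARNING"}


async def run_phase_with_judge(name: str, delay: float) -> Tuple[str, str]:
    """Hypothetical placeholder: run one phase and its judge, return the verdict."""
    await asyncio.sleep(delay)  # stands in for the agent and judge work
    return name, "PASS"


async def run_parallel_analysis() -> Dict[str, str]:
    """Start 2a, 2b and 2c together and wait for all three to finish."""
    results = await asyncio.gather(
        run_phase_with_judge("2a", 0.03),
        run_phase_with_judge("2b", 0.01),
        run_phase_with_judge("2c", 0.02),
    )
    return dict(results)


def ready_for_phase_3(verdicts: Dict[str, str]) -> bool:
    """Phase 3 may start only when every branch reports a settled verdict."""
    return {"2a", "2b", "2c"} <= set(verdicts) and all(
        verdict in SETTLED for verdict in verdicts.values()
    )


if __name__ == "__main__":
    verdicts = asyncio.run(run_parallel_analysis())
    print(verdicts, ready_for_phase_3(verdicts))
```

`asyncio.gather` returns only after every branch has finished, which is the behaviour the synchronization point asks for.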
Phase 3: Architecture Synthesis
Model: opus
Agent: sdd:software-architect
Depends on: Phase 2a + Judge 2a PASS, Phase 2b + Judge 2b PASS, Phase 2c + Judge 2c PASS
Purpose: Synthesize research, analysis, and business requirements into architectural overview
Launch agent:
- Description: "Architecture synthesis"
- Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
Task File: <TASK_FILE>
Skill File: <skill file path from Phase 2a>
Analysis File: <analysis file path from Phase 2b>
CRITICAL: DO NOT OUTPUT YOUR ARCHITECTURE SYNTHESIS, ONLY CREATE THE SCRATCHPAD AND UPDATE THE TASK FILE.
Capture:
- Scratchpad file path (e.g., .specs/scratchpad/<hex-id>.md)
- Sections added to task file
- Key architectural decisions count
- Components identified (if applicable)
- Contracts defined (if applicable)
Judge 3: Validate Architecture Synthesis
Model: opus
Agent: sdd:software-architect
Depends on: Phase 3 completion
Purpose: Validate architectural coherence and completeness
Launch judge:
- Description: "Judge architecture synthesis quality"
- Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
Read @${CLAUDE_PLUGIN_ROOT}/prompts/judge.md for evaluation methodology and execute.
### Artifact Path
{path to task file after Phase 3}
### Context
This is architecture synthesis output. The Architecture Overview section should contain solution strategy, key decisions, and only relevant architectural sections.
### Rubric
1. Solution Strategy Clarity (weight: 0.30)
   - Approach clearly explained?
   - Key decisions documented with reasoning?
   - Trade-offs stated?
   - 1=Missing/unclear, 2=Basic, 3=Adequate, 4=Clear, 5=Excellent
2. Reference Integration (weight: 0.20)
   - Links to research and analysis files?
   - Insights from both integrated?
   - 1=No links, 2=Partial, 3=Adequate, 4=Good, 5=Fully integrated
3. Section Relevance (weight: 0.25)
   - Only relevant sections included (not all)?
   - Sections appropriate for task complexity?
   - 1=Wrong sections, 2=Mostly appropriate, 3=Adequate, 4=Good, 5=Precisely targeted
4. Expected Changes Accuracy (weight: 0.25)
   - Files to create/modify listed?
   - Consistent with codebase analysis?
   - 1=Missing/inconsistent, 2=Partial, 3=Adequate, 4=Good, 5=Complete
CRITICAL: Use the prompt exactly as written. Do not add anything else, including the output of the implementation agent.
Decision Logic:
- PASS (score >= THRESHOLD): Architecture synthesis complete, proceed
- FAIL (score < THRESHOLD): Re-launch Phase 3 with feedback
- MAX_ITERATIONS reached: Proceed to Phase 4 regardless of score (log warning)
Wait for PASS (or for MAX_ITERATIONS to be reached) before Phase 4.
Phase 4: Decomposition
Model: opus
Agent: sdd:tech-lead
Depends on: Phase 3 + Judge 3 PASS
Purpose: Break architecture into implementation steps with success criteria and risks
Launch agent:
- Description: "Decompose into implementation steps"
- Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
Task File: <TASK_FILE>
CRITICAL: DO NOT OUTPUT YOUR DECOMPOSITION, ONLY CREATE THE SCRATCHPAD AND UPDATE THE TASK FILE.
Capture:
- Scratchpad file path (e.g., .specs/scratchpad/<hex-id>.md)
- Implementation steps count
- Total subtasks count
- Critical path steps
- High priority risks count
Judge 4: Validate Decomposition
Model: opus
Agent: sdd:tech-lead
Depends on: Phase 4 completion
Purpose: Validate implementation steps quality and completeness
Launch judge:
- Description: "Judge decomposition quality"
- Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
Read @${CLAUDE_PLUGIN_ROOT}/prompts/judge.md for evaluation methodology and execute.
### Artifact Path
{path to task file after Phase 4}
### Context
This is decomposition output. The Implementation Process section should contain ordered steps with success criteria, subtasks, blockers, and risks.
### Rubric
1. Step Quality (weight: 0.30)
   - Each step has clear goal, output, success criteria?
   - Steps ordered by dependency?
   - No step too large (>Large estimate)?
   - 1=Vague/missing, 2=Basic, 3=Adequate, 4=Good, 5=Excellent
2. Success Criteria Testability (weight: 0.25)
   - Criteria specific and verifiable?
   - Use actual file paths, function names?
   - Subtasks clearly defined with actionable descriptions?
   - 1=Vague, 2=Partially testable, 3=Adequate, 4=Good, 5=All testable
3. Risk Coverage (weight: 0.25)
   - Blockers identified with resolutions?
   - Risks identified with mitigations?
   - High-risk tasks identified with decomposition recommendations?
   - 1=None, 2=Basic, 3=Adequate, 4=Good, 5=Comprehensive
4. Completeness (weight: 0.20)
   - All architecture components have corresponding steps?
   - Implementation summary table present?
   - Definition of Done included?
   - Phases organized: Setup → Foundational → User Stories → Polish?
   - 1=Incomplete, 2=Partial, 3=Adequate, 4=Good, 5=Complete
CRITICAL: Use the prompt exactly as written. Do not add anything else, including the output of the implementation agent.
Decision Logic:
- PASS (score >= THRESHOLD): Decomposition complete, proceed to Phase 5
- FAIL (score < THRESHOLD): Re-launch Phase 4 with feedback
- MAX_ITERATIONS reached: Proceed to Phase 5 regardless of score (log warning)
Wait for PASS (or for MAX_ITERATIONS to be reached) before Phase 5.
Phase 5: Parallelize Steps
Model: opus
Agent: sdd:team-lead
Depends on: Phase 4 + Judge 4 PASS
Purpose: Reorganize implementation steps for maximum parallel execution
Launch agent:
- Description: "Parallelize implementation steps"
- Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
Task File: <TASK_FILE>
Use agents only from this list: {list ALL available agents with plugin prefix if available, e.g. sdd:developer, review:bug-hunter. Also include general agents: opus, sonnet, haiku}
CRITICAL: DO NOT OUTPUT YOUR PARALLELIZATION, ONLY CREATE THE SCRATCHPAD AND UPDATE THE TASK FILE.
Capture:
- Scratchpad file path (e.g., .specs/scratchpad/<hex-id>.md)
- Number of steps reorganized
- Maximum parallelization depth
- Agent distribution summary
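To make "reorganize for maximum parallel execution" concrete, the sketch below groups steps into levels so that every step in a level depends only on steps from earlier levels. This is ordinary topological layering and is offered as an illustration only: the document does not say how the sdd:team-lead agent derives its plan, and it does not define "parallelization depth" precisely. The step names are hypothetical.

```python
from typing import Dict, List, Set


def parallel_levels(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group steps into levels; steps within one level can run in parallel."""
    known = set(dependencies)
    for step, deps in dependencies.items():
        unknown = deps - known
        if unknown:
            raise ValueError(f"{step} depends on unknown steps: {sorted(unknown)}")

    # Work on a copy so the caller's mapping is left untouched.
    remaining = {step: set(deps) for step, deps in dependencies.items()}
    levels: List[List[str]] = []
    while remaining:
        # Every step whose dependencies are all satisfied is ready now.
        ready = sorted(step for step, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle detected")
        levels.append(ready)
        for step in ready:
            del remaining[step]
        for deps in remaining.values():
            deps.difference_update(ready)
    return levels


if __name__ == "__main__":
    plan = {
        "step1": set(),
        "step2": {"step1"},
        "step3": {"step1"},
        "step4": {"step2", "step3"},
    }
    for index, level in enumerate(parallel_levels(plan), start=1):
        print(index, level)
```

The number of levels and the size of the widest level can both be read from the result, whichever of the two the workflow means by depth.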
Judge 5: Validate Parallelization
Model: opus
Agent: sdd:team-lead
Depends on: Phase 5 completion
Purpose: Validate dependency accuracy and parallelization optimization
Launch judge:
- Description: "Judge parallelization quality"
- Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
Read @${CLAUDE_PLUGIN_ROOT}/prompts/judge.md for evaluation methodology and execute.
### Artifact Path
{path to parallelized task file from Phase 5}
### Context
This is the output of Phase 5: Parallelize Steps. The artifact should contain implementation steps reorganized for maximum parallel execution with explicit dependencies, agent assignments, and parallelization diagram.
Use agents only from this list: {list ALL available agents with plugin prefix if available, e.g. sdd:developer, review:bug-hunter. Also include general agents: opus, sonnet, haiku}
### Rubric
1. Dependency Accuracy (weight: 0.35)
   - Are step dependencies correctly identified?
   - No false dependencies (steps marked dependent when they're not)?
   - No missing dependencies (steps that actually depend on others)?
   - 1=Major dependency errors, 2=Mostly correct, 3=Acceptable, 5=Precise dependencies
2. Parallelization Maximized (weight: 0.30)
   - Are parallelizable steps correctly marked with "Parallel with:"?
   - Is the parallelization diagram logical?
   - 1=No parallelization/wrong, 2=Some optimization, 3=Acceptable, 5=Maximum parallelization
3. Agent Selection Correctness (weight: 0.20)
   - Are agent types appropriate for outputs (opus by default, haiku for trivial, sonnet for simple but high in volume)?
   - Does selection follow the Agent Selection Guide?
   - Are only agents from the provided available agents list used?
   - 1=Wrong agents, 2=Mostly appropriate, 3=Acceptable, 4=Optimal selection, 5=Perfect selection
4. Execution Directive Present (weight: 0.15)
   - Is the sub-agent execution directive present?
   - Are "MUST" requirements for parallel execution clear?
   - 1=Missing directive, 2=Partial, 3=Acceptable, 4=Complete directive, 5=Perfect directive
CRITICAL: Use the prompt exactly as written. Do not add anything else, including the output of the implementation agent.
Decision Logic:
- PASS (score >= THRESHOLD): Proceed to Phase 6
- FAIL (score < THRESHOLD): Re-launch Phase 5 with feedback
- MAX_ITERATIONS reached: Proceed to Phase 6 regardless of score (log warning)
Wait for PASS (or for MAX_ITERATIONS to be reached) before Phase 6.
Phase 6: Define Verifications
Model: opus
Agent: sdd:qa-engineer
Depends on: Phase 5 + Judge 5 PASS
Purpose: Add LLM-as-Judge verification sections with rubrics
Launch agent:
- Description: "Define verification rubrics"
- Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
Task File: <TASK_FILE>
CRITICAL: DO NOT OUTPUT YOUR VERIFICATIONS, ONLY CREATE THE SCRATCHPAD AND UPDATE THE TASK FILE.
Capture:
- Scratchpad file path (e.g., .specs/scratchpad/<hex-id>.md)
- Number of steps with verification
- Total evaluations defined
- Verification breakdown (Panel/Per-Item/None)
Judge 6: Validate Verifications
Model: opus
Agent: sdd:qa-engineer
Depends on: Phase 6 completion
Purpose: Validate verification rubrics and thresholds
Launch judge:
- Description: "Judge verification quality"
- Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
Read @${CLAUDE_PLUGIN_ROOT}/prompts/judge.md for evaluation methodology and execute.
### Artifact Path
{path to task file with verifications from Phase 6}
### Context
This is the output of Phase 6: Define Verifications. The artifact should contain LLM-as-Judge verification sections for each implementation step, including verification levels, custom rubrics, thresholds, and a verification summary table.
### Rubric
1. Verification Level Appropriateness (weight: 0.30)
   - Do verification levels match artifact criticality?
   - HIGH criticality → Panel, MEDIUM → Single/Per-Item, LOW/NONE → None?
   - 1=Mismatched levels, 2=Mostly appropriate, 3=Acceptable, 5=Precisely calibrated
2. Rubric Quality (weight: 0.30)
   - Are criteria specific to the artifact type (not generic)?
   - Do weights sum to 1.0?
   - Are descriptions clear and measurable?
   - 1=Generic/broken rubrics, 2=Adequate, 3=Acceptable, 5=Excellent custom rubrics
3. Threshold Appropriateness (weight: 0.20)
   - Are thresholds reasonable (typically 4.0/5.0)?
   - Higher for critical, lower for experimental?
   - 1=Wrong thresholds, 2=Standard applied, 3=Acceptable, 5=Context-appropriate
4. Coverage Completeness (weight: 0.20)
   - Does every step have a Verification section?
   - Is the Verification Summary table present?
   - 1=Missing verifications, 2=Most covered, 3=Acceptable, 5=100% coverage
CRITICAL: Use the prompt exactly as written. Do not add anything else, including the output of the implementation agent.
Decision Logic:
- PASS (score >= THRESHOLD): Workflow complete, promote task
- FAIL (score < THRESHOLD): Re-launch Phase 6 with feedback
- MAX_ITERATIONS reached: Complete workflow regardless of score (log warning)
Phase 7: Promote Task
Purpose: Move the refined task from draft to todo folder
After all phases complete:
- Move task file from draft to todo:
  git mv <TASK_FILE> .specs/tasks/todo/
  # Fallback if git not available:
  mv <TASK_FILE> .specs/tasks/todo/
- Update any references in research and analysis files if needed
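The promote step, "try git mv, fall back to a plain move", can also be expressed as a small helper. This is a sketch rather than part of the command: `promote_task` is a hypothetical name, and creating the todo directory when it is missing is an added assumption the original commands do not make.

```python
import shutil
import subprocess
from pathlib import Path


def promote_task(task_file: Path, todo_dir: Path) -> Path:
    """Move a refined task into the todo folder, preferring `git mv`."""
    todo_dir.mkdir(parents=True, exist_ok=True)  # assumption: create if missing
    target = todo_dir / task_file.name
    try:
        result = subprocess.run(
            ["git", "mv", str(task_file), str(target)],
            capture_output=True,
            text=True,
            check=False,
        )
        moved = result.returncode == 0
    except FileNotFoundError:
        # The git executable is not installed.
        moved = False
    if not moved:
        # Fallback: plain filesystem move, like `mv` in the step above.
        shutil.move(str(task_file), str(target))
    return target


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        draft = Path(tmp) / ".specs" / "tasks" / "draft"
        draft.mkdir(parents=True)
        task = draft / "example.feature.md"  # hypothetical task file name
        task.write_text("# draft task\n")
        print(promote_task(task, Path(tmp) / ".specs" / "tasks" / "todo"))
```

The fallback also covers files that git does not track yet, because `git mv` exits with a non-zero status in that case.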
Completion
After all executed phases and judges complete:
- Use the git tool to stage the task file, skill file, analysis file, and scratchpad files (only those that were created)
- Summarize the workflow results and output them to the user:
Task Refined
| Property | Value |
|---|---|
| Original File | |
| Final Location | |
| Title | |
| Type | |
| Skill | |
| Skill Action | |
| Analysis | |
| Scratchpad | |
| Implementation Steps | |
| Parallelization Depth | |
| Total Verifications | |
Configuration Used
| Setting | Value |
|---|---|
| Target Quality | {THRESHOLD}/5.0 |
| Max Iterations | {MAX_ITERATIONS} |
| Active Stages | {ACTIVE_STAGES as comma-separated list} |
| Skipped Stages | {SKIP_STAGES or stages not in ACTIVE_STAGES} |
| Human Checkpoints | Phase {HUMAN_IN_THE_LOOP_PHASES as comma-separated} |
| Skip Judges | {SKIP_JUDGES} |
| Refine Mode | {REFINE_MODE} |
Quality Gates Summary
| Phase | Judge Score | Verdict |
|---|---|---|
| Phase 2a: Research | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 2b: Codebase Analysis | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 2c: Business Analysis | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 3: Architecture Synthesis | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 4: Decomposition | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 5: Parallelize | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 6: Verify | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
Threshold Used: {THRESHOLD}/5.0 (or N/A if SKIP_JUDGES)
Legend:
- ✅ PASS - Score >= THRESHOLD
- ⚠️ PROCEEDED (max iter) - Score < THRESHOLD but MAX_ITERATIONS reached, proceeded anyway
- ⏭️ SKIPPED - Stage not in ACTIVE_STAGES
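The legend maps directly onto a small function that produces one row of the Quality Gates Summary table. The sketch assumes the summary is written after the workflow finishes, so any recorded score below THRESHOLD means the phase proceeded at MAX_ITERATIONS. Showing "N/A" as the score of a skipped stage is also an assumption, and both function names are hypothetical.

```python
from typing import Optional


def verdict(score: Optional[float], threshold: float, stage_active: bool = True) -> str:
    """Return the legend label for one phase."""
    if not stage_active:
        return "⏭️ SKIPPED"
    if score is None:
        raise ValueError("an active stage needs a judge score")
    if score >= threshold:
        return "✅ PASS"
    # Below threshold at summary time: the phase hit MAX_ITERATIONS.
    return "⚠️ PROCEEDED (max iter)"


def summary_row(
    phase: str, score: Optional[float], threshold: float, stage_active: bool = True
) -> str:
    """Format one markdown table row: phase, judge score, verdict."""
    shown = "N/A" if score is None else f"{score:.1f}/5.0"
    return f"| {phase} | {shown} | {verdict(score, threshold, stage_active)} |"


if __name__ == "__main__":
    print(summary_row("Phase 3: Architecture Synthesis", 4.3, 4.0))
    print(summary_row("Phase 4: Decomposition", 3.5, 4.0))
    print(summary_row("Phase 2a: Research", None, 4.0, stage_active=False))
```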
Artifacts Generated
.claude/
└── skills/
└── <skill-name>/
└── SKILL.md # Reusable skill document (if research stage ran)
.specs/
├── tasks/
│ ├── draft/ # Draft tasks (source - now empty for this task)
│ ├── todo/
│ │ └── <name>.<type>.md # Complete task specification (ready for implementation)
│ ├── in-progress/ # Tasks being implemented (empty)
│ └── done/ # Completed tasks (empty)
├── analysis/
│ └── analysis-<name>.md # Codebase impact analysis (if codebase analysis stage ran)
└── scratchpad/
└── <hex-id>.md # Architecture thinking scratchpad
Task Status Management
Task status is managed by folder location:
- draft/ - Tasks created but not yet refined
- todo/ - Tasks ready for implementation
- in-progress/ - Tasks currently being worked on
- done/ - Completed tasks
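Because status is encoded in the folder name, a tool can derive it from the path alone. A minimal sketch follows. `task_status` and the example file name are hypothetical, and the sketch assumes task files sit directly inside one of the four status folders.

```python
from pathlib import Path

# The four status folders listed above.
STATUS_FOLDERS = {"draft", "todo", "in-progress", "done"}


def task_status(task_file: Path) -> str:
    """Read a task's status from the name of its parent folder."""
    folder = task_file.parent.name
    if folder not in STATUS_FOLDERS:
        raise ValueError(f"{task_file} is not inside a status folder")
    return folder


if __name__ == "__main__":
    print(task_status(Path(".specs/tasks/todo/add-login.feature.md")))
```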
Next Steps
- Review task: .specs/tasks/todo/<filename>
  - Edit the task file directly to make corrections
  - Add // comments to lines that need clarification or changes
  - Run /plan again with --refine to incorporate your feedback; it detects changes against git and propagates updates top-to-bottom (editing a section only affects sections below it, not above)
- If everything is fine, begin implementation: /implement (will auto-select the task from todo/)
Error Handling
Phase Agent Failure (Exception/Crash)
If any phase agent fails unexpectedly:
- Report the failure together with the agent output
- Ask the user clarifying questions that can help resolve the issue
- Re-launch the phase agent with the list of questions and answers
Judge Returns FAIL
If any judge returns FAIL (score < THRESHOLD):
- Automatic retry: Re-launch the phase agent with judge feedback
- Human-in-the-loop check: If phase is in HUMAN_IN_THE_LOOP_PHASES, trigger human checkpoint before the next judge retry (after implementation retry but before re-judging)
- After MAX_ITERATIONS reached: Proceed to next stage automatically (do NOT ask user unless --human-in-the-loop includes this phase)
- Log warning in completion summary: ⚠️ Phase X did not pass quality threshold (X.X/THRESHOLD) after MAX_ITERATIONS iterations
Retry Flow
Implementation → Judge FAIL → Implementation Retry → Judge Retry
↓
PASS → Continue to next stage
FAIL → Repeat until MAX_ITERATIONS
↓
MAX_ITERATIONS reached → Proceed to next stage (with warning)
Retry Flow with Human-in-the-Loop
When phase is in HUMAN_IN_THE_LOOP_PHASES:
Implementation → Judge FAIL → Implementation Retry
↓
🔍 Human Checkpoint (optional feedback)
↓
Judge Retry
↓
PASS → Continue | FAIL → Repeat until MAX_ITERATIONS
↓
MAX_ITERATIONS → 🔍 Final Human Checkpoint
↓
User confirms → Proceed to next stage
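Both retry flows can be read as one loop with an optional checkpoint hook. The sketch below is illustrative only: `implement`, `judge`, and `checkpoint` are hypothetical callables standing in for the phase agent, the judge agent, and the human checkpoint, and it assumes that the first attempt counts toward MAX_ITERATIONS, which the document does not state explicitly.

```python
from typing import Callable, Optional, Tuple


def run_phase(
    implement: Callable[[Optional[str]], str],
    judge: Callable[[str], Tuple[float, str]],
    threshold: float,
    max_iterations: int,
    checkpoint: Optional[Callable[[str], Optional[str]]] = None,
) -> Tuple[str, float, bool]:
    """Run implement/judge cycles; return (artifact, last score, passed)."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    feedback: Optional[str] = None
    artifact, score = "", 0.0
    for iteration in range(1, max_iterations + 1):
        artifact = implement(feedback)
        human_note = None
        if checkpoint is not None and iteration > 1:
            # Human checkpoint: after the implementation retry, before re-judging.
            human_note = checkpoint(artifact)
        score, judge_feedback = judge(artifact)
        if score >= threshold:
            return artifact, score, True
        # Carry the judge feedback (and any human note) into the next attempt.
        feedback = (
            judge_feedback
            if human_note is None
            else f"{judge_feedback}\nHuman: {human_note}"
        )
    if checkpoint is not None:
        checkpoint(artifact)  # final human checkpoint before moving on
    return artifact, score, False  # caller proceeds and logs a warning
```

With `checkpoint=None` the loop follows the plain retry flow; passing a callable adds the human checkpoints at the two points shown in the second diagram.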