Refine, parallelize, and verify a draft task specification into a fully planned implementation-ready task
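npx skill4agent add neolabhq/context-engineering-kit plan-task
Refines a draft task created by `/add-task` and moves it from `draft/` to `todo/`. User input arrives as `$ARGUMENTS`; parse `$ARGUMENTS` as follows:
| Argument | Format | Default | Description |
|---|---|---|---|
| Task file | Path to task file | Required | Path to draft task file (e.g., …) |
| `--continue` | `--continue [stage]` | None | Continue refining from a specific stage. Stage is optional; resolve it from context if not provided. |
| `--target-quality` | | `3.5` | Target threshold value (out of 5.0) for judge pass/fail decisions. |
| `--max-iterations` | | `3` | Maximum implementation + judge retry cycles per phase before moving to the next stage (regardless of pass/fail). |
| `--included-stages` | | All stages | Comma-separated list of stages to include. |
| `--skip` | | None | Comma-separated list of stages to exclude. |
| `--fast` | | N/A | Alias for `--target-quality 3.0 --max-iterations 1` with included stages business analysis, decomposition, verifications. |
| `--one-shot` | | N/A | Alias for included stages business analysis, decomposition with `--skip-judges`. |
| `--human-in-the-loop` | | None | Phases after which to pause for human verification. |
| `--skip-judges` | | `false` | Skip all judge validation checks; phases proceed without quality gates. |
| `--refine` | | `false` | Incremental refinement mode: detect changes against git and re-run only affected stages (top-to-bottom propagation). |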
Stage names accepted by `--included-stages` and `--skip`:
| Stage Name | Phase | Description |
|---|---|---|
| `research` | 2a | Gather relevant resources, documentation, libraries |
| `codebase analysis` | 2b | Identify affected files, interfaces, integration points |
| `business analysis` | 2c | Refine description and create acceptance criteria |
| `architecture synthesis` | 3 | Synthesize research and analysis into architecture |
| `decomposition` | 4 | Break into implementation steps with risks |
| `parallelize` | 5 | Reorganize steps for parallel execution |
| `verifications` | 6 | Add LLM-as-Judge verification rubrics |
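As a rough illustration of how these flags and stage names might combine, here is a minimal Python sketch. It mirrors the alias values and defaults given in this document's parsing pseudocode; the `resolve_config` function, the `PlanConfig` type, and the `flags` dictionary shape are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Dict, List

ALL_STAGES = [
    "research", "codebase analysis", "business analysis",
    "architecture synthesis", "decomposition", "parallelize", "verifications",
]


@dataclass
class PlanConfig:
    threshold: float
    max_iterations: int
    active_stages: List[str]
    skip_judges: bool


def resolve_config(flags: Dict[str, object]) -> PlanConfig:
    """Resolve already-parsed flags (hypothetical dict shape) into a config."""
    threshold = max_iterations = included = skip_judges = None

    # Alias flags are applied first, so their values win over explicit flags.
    if flags.get("--fast"):
        threshold, max_iterations = 3.0, 1
        included = ["business analysis", "decomposition", "verifications"]
    if flags.get("--one-shot"):
        included = ["business analysis", "decomposition"]
        skip_judges = True

    # Anything still unset falls back to the explicit flag, then the default.
    if threshold is None:
        threshold = flags.get("--target-quality", 3.5)
    if max_iterations is None:
        max_iterations = flags.get("--max-iterations", 3)
    if included is None:
        included = flags.get("--included-stages", list(ALL_STAGES))
    if skip_judges is None:
        skip_judges = bool(flags.get("--skip-judges", False))

    # Active stages = included stages minus skipped ones, order preserved.
    skipped = set(flags.get("--skip", []))
    active = [stage for stage in included if stage not in skipped]
    return PlanConfig(threshold, max_iterations, active, skip_judges)
```

For example, `resolve_config({"--skip": ["research"]})` keeps the default threshold and drops only the research stage from the active list.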
$ARGUMENTS
# Extract task file path (first positional argument, required)
TASK_FILE = first argument that is a file path (must exist in .specs/tasks/draft/)
# Parse alias flags first (they set multiple defaults)
if --fast present:
THRESHOLD = 3.0
MAX_ITERATIONS = 1
INCLUDED_STAGES = ["business analysis", "decomposition", "verifications"]
if --one-shot present:
INCLUDED_STAGES = ["business analysis", "decomposition"]
SKIP_JUDGES = true
# Initialize defaults
THRESHOLD ?= --target-quality || 3.5
MAX_ITERATIONS ?= --max-iterations || 3
INCLUDED_STAGES ?= --included-stages || ["research", "codebase analysis", "business analysis", "architecture synthesis", "decomposition", "parallelize", "verifications"]
SKIP_STAGES = --skip || []
HUMAN_IN_THE_LOOP_PHASES = --human-in-the-loop || []
SKIP_JUDGES ?= --skip-judges || false   # ?= keeps the value already set by --one-shot
REFINE_MODE = --refine || false
CONTINUE_STAGE = null
if --continue [stage] present:
CONTINUE_STAGE = stage or resolve from context
# Compute final active stages
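ACTIVE_STAGES = INCLUDED_STAGES - SKIP_STAGES
With `--continue`, resume from the given stage, or resolve the stage from context when none is given.
With `--refine`, detect edits using `git status --porcelain -- <TASK_FILE>` and `git diff HEAD -- <TASK_FILE>`, including lines marked with `//` comments. The modified section determines the stage to re-run from:
| Modified Section | Re-run From Stage |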
|---|---|
| Description / Acceptance Criteria | |
| Architecture Overview | |
| Implementation Process / Steps | |
| Parallelization / Dependencies | |
| Verification sections | |
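A minimal Python sketch of the top-to-bottom propagation described above. The section-to-stage mapping is an assumption: only the Architecture Overview row is confirmed by the example in this document, and the other rows are inferred from the stage descriptions. `RERUN_FROM`, `stages_to_rerun`, and `has_uncommitted_changes` are hypothetical names.

```python
import subprocess
from typing import Iterable, List

STAGES = [
    "research", "codebase analysis", "business analysis",
    "architecture synthesis", "decomposition", "parallelize", "verifications",
]

# ASSUMPTION: inferred mapping; only "Architecture Overview" is confirmed.
RERUN_FROM = {
    "Description / Acceptance Criteria": "business analysis",
    "Architecture Overview": "architecture synthesis",
    "Implementation Process / Steps": "decomposition",
    "Parallelization / Dependencies": "parallelize",
    "Verification sections": "verifications",
}


def stages_to_rerun(modified_sections: Iterable[str]) -> List[str]:
    """Return the earliest affected stage and every stage after it."""
    indexes = [STAGES.index(RERUN_FROM[s]) for s in modified_sections]
    if not indexes:
        return []  # nothing changed, nothing to re-run
    return STAGES[min(indexes):]


def has_uncommitted_changes(task_file: str) -> bool:
    """True when `git status --porcelain` reports the task file as changed."""
    result = subprocess.run(
        ["git", "status", "--porcelain", "--", task_file],
        capture_output=True, text=True, check=True,
    )
    return bool(result.stdout.strip())
```

When several sections are edited, the earliest stage wins, so an edit to the architecture and the verification sections re-runs everything from architecture synthesis onward.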
# User edited the Architecture Overview section
/plan .specs/tasks/todo/my-task.feature.md --refine
# Detects Architecture section changed → re-runs from Phase 3 onwards
# Skips: research, codebase analysis, business analysis
# Runs: architecture synthesis, decomposition, parallelize, verifications
After each phase listed in `HUMAN_IN_THE_LOOP_PHASES`, pause and display a checkpoint:
---
## 🔍 Human Review Checkpoint - Phase X
**Phase:** {phase name}
**Judge Score:** {score}/{THRESHOLD} threshold
**Status:** ✅ PASS / ⚠️ RETRY {n}/{MAX_ITERATIONS}
**Artifacts:**
- {artifact_path_1}
- {artifact_path_2}
**Judge Feedback:**
{feedback summary}
**Action Required:** Review the above artifacts and provide feedback or continue.
> Continue? [Y/n/feedback]:
---
# Refine a draft task with all stages
/plan .specs/tasks/draft/add-validation.feature.md
# Fast refinement with minimal stages
/plan .specs/tasks/draft/quick-fix.bug.md --fast
# Continue from a specific stage
/plan .specs/tasks/draft/complex-feature.feature.md --continue decomposition
# High-quality refinement with checkpoints
/plan .specs/tasks/draft/critical-api.feature.md --target-quality 4.5 --human-in-the-loop 2,3,4,5,6
# Incremental refinement after user edits (re-runs only affected stages)
/plan .specs/tasks/todo/my-task.feature.md --refine
When `REFINE_MODE` is false, `TASK_FILE` must be in `.specs/tasks/draft/`. When `REFINE_MODE` is true, `TASK_FILE` may be in `.specs/tasks/todo/` or `.specs/tasks/draft/`.
### Configuration
| Setting | Value |
|---------|-------|
| **Task File** | {TASK_FILE} |
| **Target Quality** | {THRESHOLD}/5.0 |
| **Max Iterations** | {MAX_ITERATIONS} |
| **Active Stages** | {ACTIVE_STAGES as comma-separated list} |
| **Human Checkpoints** | Phase {HUMAN_IN_THE_LOOP_PHASES as comma-separated} |
| **Skip Judges** | {SKIP_JUDGES} |
| **Refine Mode** | {REFINE_MODE} |
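| **Continue From** | {CONTINUE_STAGE} or "Start" |
With `--continue`, skip the stages before `CONTINUE_STAGE`. With `--refine` (`REFINE_MODE`), inspect `git status --porcelain -- <TASK_FILE>` (status codes such as `M`, `MM`, `??`) and `git diff HEAD -- <TASK_FILE>` for edited sections and `//` comments, then narrow `ACTIVE_STAGES` to the affected stages.
Track progress with a todo list, marking each item `completed` as it finishes:
{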
"todos": [
{"content": "Ensure directories exist", "status": "pending", "activeForm": "Ensuring directories exist"},
{"content": "Phase 2a: Research relevant resources and documentation", "status": "pending", "activeForm": "Researching resources"},
{"content": "Judge 2a: PASS research quality (> {THRESHOLD})", "status": "pending", "activeForm": "Validating research"},
{"content": "Phase 2b: Analyze codebase impact and affected files", "status": "pending", "activeForm": "Analyzing codebase impact"},
{"content": "Judge 2b: PASS codebase analysis (> {THRESHOLD})", "status": "pending", "activeForm": "Validating codebase analysis"},
{"content": "Phase 2c: Business analysis and acceptance criteria", "status": "pending", "activeForm": "Analyzing business requirements"},
{"content": "Judge 2c: PASS business analysis (> {THRESHOLD})", "status": "pending", "activeForm": "Validating business analysis"},
{"content": "Phase 3: Architecture synthesis from research and analysis", "status": "pending", "activeForm": "Synthesizing architecture"},
{"content": "Judge 3: PASS architecture synthesis (> {THRESHOLD})", "status": "pending", "activeForm": "Validating architecture"},
{"content": "Phase 4: Decompose into implementation steps", "status": "pending", "activeForm": "Decomposing into steps"},
{"content": "Judge 4: PASS decomposition (> {THRESHOLD})", "status": "pending", "activeForm": "Validating decomposition"},
{"content": "Phase 5: Parallelize implementation steps", "status": "pending", "activeForm": "Parallelizing steps"},
{"content": "Judge 5: PASS parallelization (> {THRESHOLD})", "status": "pending", "activeForm": "Validating parallelization"},
{"content": "Phase 6: Define verification rubrics", "status": "pending", "activeForm": "Defining verifications"},
{"content": "Judge 6: PASS verifications (> {THRESHOLD})", "status": "pending", "activeForm": "Validating verifications"},
{"content": "Move task to todo folder", "status": "pending", "activeForm": "Promoting task"},
{"content": "Human checkpoint reviews", "status": "pending", "activeForm": "Awaiting human review"}
]
}
Adjust the list before starting: drop the judge items when `SKIP_JUDGES` is set; drop each phase whose stage (`research`, `codebase analysis`, `business analysis`, `architecture synthesis`, `decomposition`, `parallelize`, `verifications`) is not in `ACTIVE_STAGES`; drop the checkpoint item when `HUMAN_IN_THE_LOOP_PHASES` is empty.
Run `bash ${CLAUDE_PLUGIN_ROOT}/scripts/create-folders.sh` to ensure these directories exist: `.specs/tasks/draft/`, `.specs/tasks/todo/`, `.specs/tasks/in-progress/`, `.specs/tasks/done/`, `.specs/scratchpad/`, `.specs/analysis/`, `.claude/skills/`.
Mark each todo `in_progress` when it starts and `completed` when it finishes. Judges compare scores against `THRESHOLD`, phases retry up to `MAX_ITERATIONS`, and phases in `HUMAN_IN_THE_LOOP_PHASES` pause for review.
Scratchpad files are created via `${CLAUDE_PLUGIN_ROOT}/scripts/create-scratchpad.sh`.
Input: Draft Task File (.specs/tasks/draft/*.md)
    │
    ▼
Phase 2: Parallel Analysis
    │
    ├──────────────────────────────┬──────────────────────────────┐
    ▼                              ▼                              ▼
Phase 2a:                      Phase 2b:                      Phase 2c:
Research                       Codebase Analysis              Business Analysis
[sdd:researcher sonnet]        [sdd:code-explorer sonnet]     [sdd:business-analyst opus]
Judge 2a                       Judge 2b                       Judge 2c
(pass: >THRESHOLD)             (pass: >THRESHOLD)             (pass: >THRESHOLD)
    │                              │                              │
    ├──────────────────────────────┴──────────────────────────────┘
    │
    ▼
Phase 3: Architecture Synthesis
[sdd:software-architect opus]
Judge 3 (pass: >THRESHOLD)
    │
    ▼
Phase 4: Decomposition
[sdd:tech-lead opus]
Judge 4 (pass: >THRESHOLD)
    │
    ▼
Phase 5: Parallelize
[sdd:team-lead opus]
Judge 5 (pass: >THRESHOLD)
    │
    ▼
Phase 6: Verifications
[sdd:qa-engineer opus]
Judge 6 (pass: >THRESHOLD)
    │
    ▼
Move task: draft/ → todo/
    │
    ▼
Complete
Phase 2a: Research (model: `sonnet`, agent: `sdd:researcher`). Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
Task File: <TASK_FILE>
Task Title: <title from task file>
CRITICAL: DO NOT OUTPUT YOUR RESEARCH, ONLY CREATE THE SCRATCHPAD AND SKILL FILE.
Expected output: `.claude/skills/<skill-name>/SKILL.md`, `.specs/scratchpad/<hex-id>.md`
Phase 2b: Codebase Analysis (model: `sonnet`, agent: `sdd:code-explorer`). Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
Task File: <TASK_FILE>
Task Title: <title from task file>
CRITICAL: DO NOT OUTPUT YOUR ANALYSIS, ONLY CREATE THE SCRATCHPAD AND ANALYSIS FILE.
Expected output: `.specs/analysis/analysis-{name}.md`, `.specs/scratchpad/<hex-id>.md`
Phase 2c: Business Analysis (model: `opus`, agent: `sdd:business-analyst`). Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
Read ${CLAUDE_PLUGIN_ROOT}/skills/plan-task/analyse-business-requirements.md and execute it exactly as is!
Task File: <TASK_FILE>
Task Title: <title from task file>
CRITICAL: DO NOT OUTPUT YOUR BUSINESS ANALYSIS, ONLY CREATE THE SCRATCHPAD AND UPDATE THE TASK FILE.
Expected output: `.specs/scratchpad/<hex-id>.md`
Judge 2a: Research (model: `sonnet`, agent: `sdd:researcher`). Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
Read @${CLAUDE_PLUGIN_ROOT}/prompts/judge.md for evaluation methodology and execute.
### Artifact Path
{path to skill file from Phase 2a}
### Context
This is a skill document for task: {task title}. Evaluate comprehensiveness and reusability.
### Rubric
1. Resource Coverage (weight: 0.30)
- Documentation and references gathered?
- Libraries and tools identified with recommendations?
- 1=Missing critical resources, 2=Basic coverage, 3=Adequate, 4=Comprehensive, 5=Excellent
2. Pattern Relevance (weight: 0.25)
- Are identified patterns applicable?
- Are recommendations actionable?
- 1=Irrelevant, 2=Somewhat useful, 3=Adequate, 4=Well-targeted, 5=Perfect fit
3. Issue Anticipation (weight: 0.20)
- Common pitfalls identified with solutions?
- 1=None identified, 2=Few issues, 3=Adequate, 4=Good coverage, 5=Comprehensive
4. Reusability (weight: 0.15)
- Is the skill general enough to help multiple tasks?
- Does it avoid task-specific details?
- 1=Too specific, 2=Limited reuse, 3=Adequate, 4=Good, 5=Highly reusable
5. Task Integration (weight: 0.10)
- Was task file updated with skill reference?
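- 1=Not updated, 3=Updated, 5=Updated with clear instructions
Compare the judge score against `THRESHOLD`: on pass, continue; below `THRESHOLD`, retry the phase.
Judge 2b: Codebase Analysis (model: `sonnet`, agent: `sdd:code-explorer`). Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}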
Read @${CLAUDE_PLUGIN_ROOT}/prompts/judge.md for evaluation methodology and execute.
### Artifact Path
{path to analysis file from Phase 2b}
### Context
This is codebase impact analysis for task: {task title}. Evaluate accuracy and completeness.
### Rubric
1. File Identification Accuracy (weight: 0.35)
- All affected files identified with specific paths?
- New files and modifications distinguished?
- 1=Major files missing, 2=Mostly correct, 3=Adequate, 4=Precise, 5=Complete
2. Interface Documentation (weight: 0.25)
- Key functions/classes documented with signatures?
- Change requirements clear?
- 1=Missing, 2=Partial, 3=Adequate, 4=Good, 5=Complete
3. Integration Point Mapping (weight: 0.25)
- Integration points identified with impact?
- Similar patterns in codebase found?
- 1=Missing, 2=Partial, 3=Adequate, 4=Good, 5=Comprehensive
4. Risk Assessment (weight: 0.15)
- High risk areas identified with mitigations?
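- 1=No assessment, 2=Basic, 3=Adequate, 4=Good, 5=Thorough
Compare the judge score against `THRESHOLD`: on pass, continue; below `THRESHOLD`, retry the phase.
Judge 2c: Business Analysis (model: `opus`, agent: `sdd:business-analyst`). Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}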
Read @${CLAUDE_PLUGIN_ROOT}/prompts/judge.md for evaluation methodology and execute.
### Artifact Path
{path to task file from Phase 2c}
### Context
This is business analysis output. Evaluate description clarity and acceptance criteria quality.
### Rubric
1. Description Clarity (weight: 0.30)
- What/Why clearly explained?
- Scope boundaries defined?
- 1=Vague, 2=Basic, 3=Adequate, 4=Clear, 5=Excellent
2. Acceptance Criteria Quality (weight: 0.35)
- Criteria specific and testable?
- Given/When/Then format for complex criteria?
- 1=Missing/vague, 2=Basic, 3=Adequate, 4=Good, 5=Excellent
3. Scenario Coverage (weight: 0.20)
- Primary flow documented?
- Error scenarios considered?
- 1=Missing, 2=Basic, 3=Adequate, 4=Good, 5=Comprehensive
4. Scope Definition (weight: 0.15)
- In-scope/out-of-scope explicit?
- No implementation details in description?
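- 1=Missing, 2=Partial, 3=Adequate, 4=Good, 5=Clear
Compare the judge score against `THRESHOLD`: on pass, continue; below `THRESHOLD`, retry the phase.
Phase 3: Architecture Synthesis (model: `opus`, agent: `sdd:software-architect`). Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}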
Task File: <TASK_FILE>
Skill File: <skill file path from Phase 2a>
Analysis File: <analysis file path from Phase 2b>
CRITICAL: DO NOT OUTPUT YOUR ARCHITECTURE SYNTHESIS, ONLY CREATE THE SCRATCHPAD AND UPDATE THE TASK FILE.
Expected output: `.specs/scratchpad/<hex-id>.md`
Judge 3: Architecture Synthesis (model: `opus`, agent: `sdd:software-architect`). Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
Read @${CLAUDE_PLUGIN_ROOT}/prompts/judge.md for evaluation methodology and execute.
### Artifact Path
{path to task file after Phase 3}
### Context
This is architecture synthesis output. The Architecture Overview section should contain
solution strategy, key decisions, and only relevant architectural sections.
### Rubric
1. Solution Strategy Clarity (weight: 0.30)
- Approach clearly explained?
- Key decisions documented with reasoning?
- Trade-offs stated?
- 1=Missing/unclear, 2=Basic, 3=Adequate, 4=Clear, 5=Excellent
2. Reference Integration (weight: 0.20)
- Links to research and analysis files?
- Insights from both integrated?
- 1=No links, 2=Partial, 3=Adequate, 4=Good, 5=Fully integrated
3. Section Relevance (weight: 0.25)
- Only relevant sections included (not all)?
- Sections appropriate for task complexity?
- 1=Wrong sections, 2=Mostly appropriate, 3=Adequate, 4=Good, 5=Precisely targeted
4. Expected Changes Accuracy (weight: 0.25)
- Files to create/modify listed?
- Consistent with codebase analysis?
- 1=Missing/inconsistent, 2=Partial, 3=Adequate, 4=Good, 5=Complete
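To make the weights concrete, the sketch below combines the four criterion scores of this rubric into a single number and compares it with the threshold. It assumes the judge aggregates with a plain weighted sum and passes at `score >= THRESHOLD`, as the legend later in this document states; the actual methodology lives in `prompts/judge.md`, which is not shown here. `weighted_score` and `passes` are hypothetical helper names, and the sample scores are made up.

```python
from typing import Dict

# Weights copied from the Phase 3 rubric above.
ARCHITECTURE_WEIGHTS = {
    "Solution Strategy Clarity": 0.30,
    "Reference Integration": 0.20,
    "Section Relevance": 0.25,
    "Expected Changes Accuracy": 0.25,
}


def weighted_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of 1-5 criterion scores; weights must sum to 1.0."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("rubric weights must sum to 1.0")
    return sum(scores[name] * weight for name, weight in weights.items())


def passes(score: float, threshold: float = 3.5) -> bool:
    """ASSUMPTION: pass means score >= threshold (3.5 is the documented default)."""
    return score >= threshold


# Illustrative scores only: 4*0.30 + 3*0.20 + 4*0.25 + 5*0.25 is about 4.05.
example_scores = {
    "Solution Strategy Clarity": 4,
    "Reference Integration": 3,
    "Section Relevance": 4,
    "Expected Changes Accuracy": 5,
}
example_total = weighted_score(example_scores, ARCHITECTURE_WEIGHTS)
```

With these sample scores the total clears the default threshold of 3.5 but would fail a run started with `--target-quality 4.5`.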
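Compare the judge score against `THRESHOLD`: on pass, continue; below `THRESHOLD`, retry the phase.
Phase 4: Decomposition (model: `opus`, agent: `sdd:tech-lead`). Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}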
Task File: <TASK_FILE>
CRITICAL: DO NOT OUTPUT YOUR DECOMPOSITION, ONLY CREATE THE SCRATCHPAD AND UPDATE THE TASK FILE.
Expected output: `.specs/scratchpad/<hex-id>.md`
Judge 4: Decomposition (model: `opus`, agent: `sdd:tech-lead`). Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
Read @${CLAUDE_PLUGIN_ROOT}/prompts/judge.md for evaluation methodology and execute.
### Artifact Path
{path to task file after Phase 4}
### Context
This is decomposition output. The Implementation Process section should contain
ordered steps with success criteria, subtasks, blockers, and risks.
### Rubric
1. Step Quality (weight: 0.30)
- Each step has clear goal, output, success criteria?
- Steps ordered by dependency?
- No step too large (>Large estimate)?
- 1=Vague/missing, 2=Basic, 3=Adequate, 4=Good, 5=Excellent
2. Success Criteria Testability (weight: 0.25)
- Criteria specific and verifiable?
- Use actual file paths, function names?
- Subtasks clearly defined with actionable descriptions?
- 1=Vague, 2=Partially testable, 3=Adequate, 4=Good, 5=All testable
3. Risk Coverage (weight: 0.25)
- Blockers identified with resolutions?
- Risks identified with mitigations?
- High-risk tasks identified with decomposition recommendations?
- 1=None, 2=Basic, 3=Adequate, 4=Good, 5=Comprehensive
4. Completeness (weight: 0.20)
- All architecture components have corresponding steps?
- Implementation summary table present?
- Definition of Done included?
- Phases organized: Setup → Foundational → User Stories → Polish?
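- 1=Incomplete, 2=Partial, 3=Adequate, 4=Good, 5=Complete
Compare the judge score against `THRESHOLD`: on pass, continue; below `THRESHOLD`, retry the phase.
Phase 5: Parallelize (model: `opus`, agent: `sdd:team-lead`). Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}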
Task File: <TASK_FILE>
Use agents only from this list: {list ALL available agents with plugin prefix if available, e.g. sdd:developer, review:bug-hunter. Also include general agents: opus, sonnet, haiku}
CRITICAL: DO NOT OUTPUT YOUR PARALLELIZATION, ONLY CREATE THE SCRATCHPAD AND UPDATE THE TASK FILE.
Expected output: `.specs/scratchpad/<hex-id>.md`
Judge 5: Parallelize (model: `opus`, agent: `sdd:team-lead`). Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
Read @${CLAUDE_PLUGIN_ROOT}/prompts/judge.md for evaluation methodology and execute.
### Artifact Path
{path to parallelized task file from Phase 5}
### Context
This is the output of Phase 5: Parallelize Steps. The artifact should contain implementation steps
reorganized for maximum parallel execution with explicit dependencies, agent assignments, and
parallelization diagram.
Use agents only from this list: {list ALL available agents with plugin prefix if available, e.g. sdd:developer, review:bug-hunter. Also include general agents: opus, sonnet, haiku}
### Rubric
1. Dependency Accuracy (weight: 0.35)
- Are step dependencies correctly identified?
- No false dependencies (steps marked dependent when they're not)?
- No missing dependencies (steps that actually depend on others)?
- 1=Major dependency errors, 2=Mostly correct, 3=Acceptable, 5=Precise dependencies
2. Parallelization Maximized (weight: 0.30)
- Are parallelizable steps correctly marked with "Parallel with:"?
- Is the parallelization diagram logical?
- 1=No parallelization/wrong, 2=Some optimization, 3=Acceptable, 5=Maximum parallelization
3. Agent Selection Correctness (weight: 0.20)
- Are agent types appropriate for outputs (opus by default, haiku for trivial, sonnet for simple but high in volume)?
- Does selection follow the Agent Selection Guide?
- Are only agents from the provided available agents list used?
- 1=Wrong agents, 2=Mostly appropriate, 3=Acceptable, 4=Optimal selection, 5=Perfect selection
4. Execution Directive Present (weight: 0.15)
- Is the sub-agent execution directive present?
- Are "MUST" requirements for parallel execution clear?
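- 1=Missing directive, 2=Partial, 3=Acceptable, 4=Complete directive, 5=Perfect directive
Compare the judge score against `THRESHOLD`: on pass, continue; below `THRESHOLD`, retry the phase.
Phase 6: Verifications (model: `opus`, agent: `sdd:qa-engineer`). Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}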
Task File: <TASK_FILE>
CRITICAL: DO NOT OUTPUT YOUR VERIFICATIONS, ONLY CREATE THE SCRATCHPAD AND UPDATE THE TASK FILE.
Expected output: `.specs/scratchpad/<hex-id>.md`
Judge 6: Verifications (model: `opus`, agent: `sdd:qa-engineer`). Prompt:
CLAUDE_PLUGIN_ROOT=${CLAUDE_PLUGIN_ROOT}
Read @${CLAUDE_PLUGIN_ROOT}/prompts/judge.md for evaluation methodology and execute.
### Artifact Path
{path to task file with verifications from Phase 6}
### Context
This is the output of Phase 6: Define Verifications. The artifact should contain LLM-as-Judge
verification sections for each implementation step, including verification levels, custom rubrics,
thresholds, and a verification summary table.
### Rubric
1. Verification Level Appropriateness (weight: 0.30)
- Do verification levels match artifact criticality?
- HIGH criticality → Panel, MEDIUM → Single/Per-Item, LOW/NONE → None?
- 1=Mismatched levels, 2=Mostly appropriate, 3=Acceptable, 5=Precisely calibrated
2. Rubric Quality (weight: 0.30)
- Are criteria specific to the artifact type (not generic)?
- Do weights sum to 1.0?
- Are descriptions clear and measurable?
- 1=Generic/broken rubrics, 2=Adequate, 3=Acceptable, 5=Excellent custom rubrics
3. Threshold Appropriateness (weight: 0.20)
- Are thresholds reasonable (typically 4.0/5.0)?
- Higher for critical, lower for experimental?
- 1=Wrong thresholds, 2=Standard applied, 3=Acceptable, 5=Context-appropriate
4. Coverage Completeness (weight: 0.20)
- Does every step have a Verification section?
- Is the Verification Summary table present?
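- 1=Missing verifications, 2=Most covered, 3=Acceptable, 5=100% coverage
Compare the judge score against `THRESHOLD`: on pass, continue; below `THRESHOLD`, retry the phase.
After the final phase, move the task file to `todo/`:
git mv <TASK_FILE> .specs/tasks/todo/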
# Fallback if git not available: mv <TASK_FILE> .specs/tasks/todo/
### Task Refined
| Property | Value |
|----------|-------|
| **Original File** | `<original TASK_FILE path>` |
| **Final Location** | `.specs/tasks/todo/<filename>` (ready for implementation) |
| **Title** | `<task title>` |
| **Type** | `<feature/bug/refactor/test/docs/chore/ci>` (from filename) |
| **Skill** | `<skill file path or "Skipped">` |
| **Skill Action** | `<Created new / Updated existing / Skipped>` |
| **Analysis** | `<analysis file path or "Skipped">` |
| **Scratchpad** | `<scratchpad file path>` |
| **Implementation Steps** | `<count or "N/A">` |
| **Parallelization Depth** | `<max parallel agents or "N/A">` |
| **Total Verifications** | `<count or "N/A">` |
### Configuration Used
| Setting | Value |
|---------|-------|
| **Target Quality** | {THRESHOLD}/5.0 |
| **Max Iterations** | {MAX_ITERATIONS} |
| **Active Stages** | {ACTIVE_STAGES as comma-separated list} |
| **Skipped Stages** | {SKIP_STAGES or stages not in ACTIVE_STAGES} |
| **Human Checkpoints** | Phase {HUMAN_IN_THE_LOOP_PHASES as comma-separated} |
| **Skip Judges** | {SKIP_JUDGES} |
| **Refine Mode** | {REFINE_MODE} |
### Quality Gates Summary
| Phase | Judge Score | Verdict |
|-------|-------------|---------|
| Phase 2a: Research | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 2b: Codebase Analysis | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 2c: Business Analysis | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 3: Architecture Synthesis | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 4: Decomposition | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 5: Parallelize | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
| Phase 6: Verify | X.X/5.0 | ✅ PASS / ⚠️ PROCEEDED (max iter) / ⏭️ SKIPPED |
**Threshold Used:** {THRESHOLD}/5.0 (or N/A if SKIP_JUDGES)
**Legend:**
- ✅ PASS - Score >= THRESHOLD
- ⚠️ PROCEEDED (max iter) - Score < THRESHOLD but MAX_ITERATIONS reached, proceeded anyway
- ⏭️ SKIPPED - Stage not in ACTIVE_STAGES
### Artifacts Generated
### Task Status Management
Task status is managed by folder location:
- `draft/` - Tasks created but not yet refined
- `todo/` - Tasks ready for implementation
- `in-progress/` - Tasks currently being worked on
- `done/` - Completed tasks
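Because status is encoded purely by folder, promoting a task is a file move. The sketch below shows the draft-to-todo move with the `git mv` and plain `mv` fallback this document describes. The `promote_task` helper is hypothetical, and falling back whenever `git mv` fails (not only when git is missing) is an assumption made for this example.

```python
import shutil
import subprocess
from pathlib import Path


def promote_task(task_file: str, tasks_root: str = ".specs/tasks") -> Path:
    """Move a task file into <tasks_root>/todo/, preferring `git mv`."""
    src = Path(task_file)
    dest_dir = Path(tasks_root) / "todo"
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name
    try:
        # Keeps git history when the file is tracked in a repository.
        subprocess.run(
            ["git", "mv", str(src), str(dest)],
            check=True, capture_output=True,
        )
    except (subprocess.CalledProcessError, FileNotFoundError):
        # ASSUMPTION: fall back to a plain move if git is missing or refuses.
        shutil.move(str(src), str(dest))
    return dest
```

The same pattern would apply to the later `todo/` to `in-progress/` and `in-progress/` to `done/` transitions, with a different destination folder.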
### Next Steps
1. Review task: `.specs/tasks/todo/<filename>`
- Edit the task file directly to make corrections
- Add `//` comments to lines that need clarification or changes
- Run `/plan` again with `--refine` to incorporate your feedback — it detects changes against git and propagates updates **top-to-bottom** (editing a section only affects sections below it, not above)
2. If everything is fine, begin implementation: `/implement` (will auto-select the task from todo/)
If a phase is still below `THRESHOLD` after `MAX_ITERATIONS` and is not listed in `HUMAN_IN_THE_LOOP_PHASES` (set with `--human-in-the-loop`), proceed with a warning:
⚠️ Phase X did not pass quality threshold (X.X/THRESHOLD) after MAX_ITERATIONS iterations
Retry flow without human checkpoints:
Implementation → Judge FAIL → Implementation Retry → Judge Retry
        ↓
    PASS → Continue to next stage
    FAIL → Repeat until MAX_ITERATIONS
        ↓
    MAX_ITERATIONS reached → Proceed to next stage (with warning)
Retry flow for phases in `HUMAN_IN_THE_LOOP_PHASES`:
Implementation → Judge FAIL → Implementation Retry
        ↓
    🔍 Human Checkpoint (optional feedback)
        ↓
    Judge Retry
        ↓
    PASS → Continue | FAIL → Repeat until MAX_ITERATIONS
        ↓
    MAX_ITERATIONS → 🔍 Final Human Checkpoint
        ↓
    User confirms → Proceed to next stage
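The retry flow without human checkpoints can be sketched as a bounded loop. This is an illustrative outline only: `run_phase`, `implement`, and `judge` are hypothetical callables standing in for the sub-agent launches, the pass rule `score >= threshold` follows the legend in this document, and `max_iterations` is assumed to be at least 1. The human checkpoint variant is omitted.

```python
from typing import Callable, Tuple


def run_phase(
    implement: Callable[[str], None],
    judge: Callable[[], Tuple[float, str]],
    threshold: float,
    max_iterations: int,
) -> Tuple[str, float]:
    """Run implementation + judge cycles until pass or the iteration cap."""
    feedback = ""  # the first attempt has no judge feedback yet
    score = 0.0
    for _ in range(max_iterations):
        implement(feedback)          # (re)produce the phase artifact
        score, feedback = judge()    # score it and collect feedback
        if score >= threshold:
            return "PASS", score
    # Cap reached without passing: proceed anyway, flagged as a warning.
    return "PROCEEDED (max iter)", score
```

Returning the verdict string alongside the last score keeps enough information to fill in the Quality Gates Summary table.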