Verify Task
Pick up an existing Task as a QA engineer. Core value: uses the Gherkin scenarios as the executable test script — each scenario is a concrete test case with Given/When/Then steps to run against the implementation.
Read `references/qa-verdict-guide.md` before starting — it defines the status symbols, verdict options, and the expected Test Mapping table format used throughout this skill.
Process
0. GitHub CLI setup
Run steps 1–2 of `../references/gh-setup.md` (install check and auth check). Stop if `gh` is not installed or not authenticated. Extensions are not required for this skill. Skip this step if invoked from another skill that already ran gh-setup this session.
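The two preflight checks can be sketched as follows — a minimal POSIX-shell sketch, not the gh-setup reference itself:

```shell
# Preflight sketch: install check, then auth check. Stop on either failure.
if ! command -v gh >/dev/null 2>&1; then
  status="gh not installed — stop"
elif ! gh auth status >/dev/null 2>&1; then
  status="gh not authenticated — stop"
else
  status="gh ready"
fi
echo "$status"
```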
1. Identify the verification scope
- : "Are you verifying a single Task or a full Feature?"
- : "Scope"
- :
[{label: "Single Task", description: "Verify one Task's Gherkin scenarios"}, {label: "Full Feature", description: "Verify all Tasks under a Feature using its sub-issues"}]
If Single Task:
Search for recent open Task issues by their lifecycle labels to populate options. Prompt the user with question: "Which Task are you testing?", a short header, and options pre-filled with 1–2 likely open Task issue references.
Fetch the Task first, extract the Feature number from its Context section, then fetch the Feature:
```bash
gh issue view <task_number>     # Gherkin, Contracts, Edge Cases, Test Mapping, DoD — also yields the feature number
# Extract the feature number, then:
gh issue view <feature_number>  # ACs, edge cases for additional probe scenarios
```
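Extracting the Feature number can be sketched like this; the `Parent feature: #142` line is a hypothetical body format — real Task bodies may phrase the Feature link differently:

```shell
# Sketch: recover the parent Feature number from the Task body's Context section.
task_body='## Context
Parent feature: #142'
# First "#<digits>" reference in the body, digits only:
feature_number=$(printf '%s\n' "$task_body" | grep -oE '#[0-9]+' | head -n1 | tr -d '#')
echo "$feature_number"
```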
Check the Task's labels. If the label marking it as implemented is absent, warn and prompt the user with:
- question: "This task hasn't been implemented yet. How would you like to proceed?"
- options: [{label: "Implement first", description: "Go back and run wtf:implement-task (default)"}, {label: "Verify anyway", description: "Skip and proceed with verification"}]
Then:
- Implement first → follow the wtf:implement-task process, passing the Task number in as context.
- Verify anyway → proceed with verification.
If Full Feature:
Prompt the user with question: "Which Feature are you verifying?", a short header, and options pre-filled from open Feature issues.
Fetch all sub-issues of the Feature using the `gh sub-issue` extension:

```bash
gh sub-issue list <feature_number>
```
This returns the authoritative list of Tasks — do not search by label or title matching. Spawn one sub-agent per Task in parallel using the Agent tool, each running steps 3–9 independently. Pass the task number and feature context to each sub-agent so it does not need to re-fetch. Wait for all sub-agents to complete, then aggregate the results into a feature-level summary (total tasks, pass/fail/blocked counts) and present it.
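The final aggregation can be sketched as a simple tally, assuming each sub-agent reports exactly one of `pass` / `fail` / `blocked` (the verdict words here are illustrative):

```shell
# Sketch: fold per-task verdicts into a feature-level summary line.
verdicts='pass
fail
pass
blocked'
total=$(printf '%s\n' "$verdicts" | wc -l)
pass=$(printf '%s\n' "$verdicts" | grep -c '^pass$')
fail=$(printf '%s\n' "$verdicts" | grep -c '^fail$')
blocked=$(printf '%s\n' "$verdicts" | grep -c '^blocked$')
echo "tasks: $total — pass: $pass, fail: $fail, blocked: $blocked"
```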
2. Load the QA steering document
Use the Read tool to attempt reading `docs/steering/QA.md`.
If the file exists: keep its content in context. Use its test strategy, coverage thresholds, definition of done, and known flaky areas to inform every verification decision in this session. Do not surface it to the user — just apply it silently.
If the file does not exist, prompt the user with:
- question: "docs/steering/QA.md doesn't exist yet. This document captures your test strategy, coverage thresholds, and definition of done. Would you like to create it now?"
- header: "QA steering doc missing"
- options: [{label: "Create it now", description: "Run steer-qa before continuing (recommended)"}, {label: "Skip for this session", description: "Continue without it — QA decisions won't reference project standards"}]
Then:
- Create it now → follow the steer-qa process, then return to this skill and continue from step 3.
- Skip for this session → continue without it.
3. Establish the test surface
From the Task, extract and present:
- All Gherkin scenarios (these are the test cases)
- The contracts (request/response schemas to verify against)
- Edge Cases & Risks (additional scenarios to probe)
- Observability requirements (logs, metrics, alerts to verify)
Prompt the user with question: "I found [n] Gherkin scenarios and [m] edge cases to cover. Does this match what you expect?" (replace [n] and [m] with actual counts), a short header, and options: [{label: "Yes — that's everything", description: "Proceed to testing"}, {label: "There are more scenarios", description: "I want to add some"}].
4. Walk through each Gherkin scenario
For each scenario, one at a time:
- Present it as a concrete test case — restate the Given/When/Then in plain language.
- Prompt the user with:
- question: "Did this scenario pass?"
- header: "Result"
- options: [{label: "Yes ✅", description: "Scenario passed"}, {label: "No ❌", description: "Scenario failed"}, {label: "Blocked 🚫", description: "Could not test due to dependency or environment issue"}, {label: "N/A or Conditional ⚠️", description: "Not applicable, or passes only under a specific condition"}]
- Yes ✅ → mark ✅ in the running Test Mapping table. Set Bug Filed to —.
- No ❌ → prompt with question: "What actually happened?", header: "Failure details", and options pre-filled with 1–2 plausible failure modes inferred from the scenario (e.g. "No error shown", "Wrong data returned"). Record findings with repro steps. Then ask question: "Would you like to file a bug report now?" with options: [{label: "File now", description: "Run report-bug immediately (default)"}, {label: "Continue and file later", description: "Defer and move to the next scenario"}] — if "File now", follow the report-bug process immediately with the task number and scenario details before moving on. Mark Bug Filed as yes (filed now) or no (deferred).
- Blocked 🚫 → prompt with question: "What dependency or environment issue prevented testing?", a short header, and options pre-filled with common blockers inferred from the task context (e.g. "Missing test environment", "Depends on unmerged task"). Set Bug Filed to —.
- N/A or Conditional ⚠️ → prompt with question: "Is this N/A, or does it pass only under a condition?", a short header, and options: [{label: "N/A — not applicable", description: "This scenario does not apply"}, {label: "Conditional — specify the condition", description: "Passes only under a specific circumstance"}]. Record the result appropriately. Set Bug Filed to — (track the condition separately).
- After recording the result, immediately update the Task issue with the current state of the Test Mapping table (do not wait until all scenarios are done). The table must include a Bug Filed column.
The running Test Mapping table format (update after every scenario):

| Scenario | Result | Bug Filed |
|---|---|---|
| <scenario name> | ✅ / ❌ / 🚫 / ⚠️ / N/A | yes / no / — |
```bash
gh issue view <task_number> --json body -q .body > /tmp/updated-task-body.md
```

Programmatically replace the Test Mapping table section in `/tmp/updated-task-body.md` using the Write or Edit tool, preserving all other sections unchanged. Then push:

```bash
gh issue edit <task_number> --body-file /tmp/updated-task-body.md
```
- Keep a running tally. After updating, confirm: "Updated. Moving to next scenario..."
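If you prefer doing the table splice from the shell rather than the Edit tool, here is a minimal awk sketch — the section headings and the sample row are hypothetical, and it assumes the table sits under a "## Test Mapping" heading that runs until the next "## " heading:

```shell
# Build a demo body to splice into (stands in for the fetched issue body).
cat > /tmp/task-body-demo.md <<'EOF'
## Gherkin
Scenario: user logs in
## Test Mapping
(stale table)
## Definition of Done
- [ ] all scenarios pass
EOF
# Replace the Test Mapping section, leaving every other section untouched.
awk -v new="## Test Mapping\n| Scenario | Result | Bug Filed |\n|---|---|---|\n| user logs in | ✅ | — |" '
  /^## Test Mapping/ { print new; skip = 1; next }  # emit fresh section, start skipping
  /^## / && skip     { skip = 0 }                   # next heading ends the skipped region
  !skip              { print }
' /tmp/task-body-demo.md > /tmp/task-body-updated.md
cat /tmp/task-body-updated.md
```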
5. Probe the edge cases
For each Edge Case listed in the Task (and the parent Feature), one at a time:
- Derive a concrete test action from the edge case description.
- Prompt the user with:
- question: "Did this edge case pass?"
- header: "Result"
- options: [{label: "Yes ✅", description: "Edge case passed"}, {label: "No ❌", description: "Edge case failed"}, {label: "Blocked 🚫", description: "Could not test"}, {label: "N/A", description: "Not applicable"}]
- No ❌ → prompt with question: "What actually happened?", header: "Failure details", and options pre-filled with 1–2 plausible failure modes inferred from the edge case. Record findings with repro steps, then offer to file a bug report as in step 4.
- After each result, update the Task issue — append an Edge Cases section (or update it if present) with the same table format used in step 4.
6. Verify observability
For each item in the Observability section (logs, metrics, alerts), one at a time:
- Prompt the user with:
- question: "Was this observability item present and correct?"
- header: "Result"
- options: [{label: "Yes ✅", description: "Present and correct"}, {label: "No ❌", description: "Missing or incorrect"}, {label: "N/A", description: "Not applicable to this task"}]
- Record the result. On ❌, ask for details and offer to file a bug report as in step 4.
- After each result, update the Task issue with an Observability Results section.
7. Finalize results and post QA summary
The Test Mapping table has been updated after each scenario (step 4). Now do a final update: check off DoD items that passed; leave failing ones unchecked.
```bash
gh issue view <task_number> --json body -q .body > /tmp/verify-final-body.md
```

Programmatically update the DoD checklist in `/tmp/verify-final-body.md` using the Write or Edit tool. Then push:

```bash
gh issue edit <task_number> --body-file /tmp/verify-final-body.md
```
Post a QA summary comment:
```bash
gh issue comment <task_number> --body "<qa_summary>"
```
The QA summary must include:
- Total scenarios tested and pass/fail/conditional count
- Any findings with repro steps
- Conditional passes: list each ⚠️ scenario with its required condition
- Clear verdict: ✅ Ready for merge / ❌ Needs fixes / ⚠️ Conditional pass (list conditions)
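A sketch of what the summary body might look like — every number, finding, and flag name here is hypothetical — written to a file so it can be posted with `gh issue comment <task_number> --body-file /tmp/qa-summary.md`:

```shell
# Example QA summary body (all contents hypothetical).
cat > /tmp/qa-summary.md <<'EOF'
## QA Summary
- Scenarios: 6 tested — 4 ✅ / 1 ❌ / 1 ⚠️
- Finding: double-clicking Save creates two records (repro: fill form, double-click Save)
- Conditional pass: "CSV export" passes only when the exports feature flag is on
- Verdict: ⚠️ Conditional pass
EOF
cat /tmp/qa-summary.md
```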
If the verdict is ✅ or ⚠️, add the `verified` lifecycle label:

```bash
gh issue edit <task_number> --add-label "verified"
```
Print the updated Task issue URL.
8. Offer to open a PR and close the issue
If the verdict is ✅ or ⚠️, prompt the user with:
- question: "Task verified. What would you like to do next?"
- options: [{label: "Open PR now", description: "Create a pull request — the task closes automatically when the PR is merged (recommended)"}, {label: "Skip for now", description: "Exit — I'll open the PR later"}]
Then:
- Open PR now → follow the PR-creation process, passing the Task number in as context. The Task (and its Feature / Epic) will be closed automatically when a PR containing a closing keyword (e.g. `Closes #<task_number>`) is merged — do not close issues directly.
- Skip for now → continue.
Closing policy: issues are only "closed as completed" via a merged PR containing a closing keyword (e.g. `Closes #<n>`). Never call `gh issue close` for completed work. Direct closes are reserved for:
- `gh issue close <n> --reason "not planned"` — won't implement
- `gh issue close <n> --reason "duplicate"` — duplicate of another issue
9. Offer bug reports for remaining failures
Check all result tables (Gherkin scenarios from step 4, edge cases from step 5, observability from step 6): find every row where Result is ❌ and Bug Filed is no. These are the unfiled failures.
If none exist, skip this step entirely.
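The filter can be sketched over the step 4 table format; the table contents here are hypothetical:

```shell
# Sketch: pull scenario names where Result is ❌ and Bug Filed is "no".
table='| Scenario | Result | Bug Filed |
|---|---|---|
| Login fails closed | ❌ | no |
| Happy path | ✅ | — |
| Rate limit returns 429 | ❌ | yes |'
# Split on "|": $2 = scenario, $3 = result, $4 = bug-filed.
unfiled=$(printf '%s\n' "$table" | awk -F'|' '$3 ~ /❌/ && $4 ~ /^ *no *$/ { print $2 }')
echo "$unfiled"
```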
If unfiled failures exist, present them as a numbered list, then prompt the user with:
- question: "[n] failing scenario(s) without a bug report. How would you like to handle them?" (replace [n] with the actual count)
- options: [{label: "File separately", description: "File one bug report per failing scenario (default)"}, {label: "File combined", description: "File one combined bug report for all failures"}, {label: "Skip", description: "Exit — I'll handle it manually"}]
- File separately → spawn one sub-agent per failing scenario in parallel using the Agent tool, each running the report-bug process with the task number and the specific failing scenario. Wait for all sub-agents to complete before exiting.
- File combined → follow the report-bug process once, passing in the task number and all failing scenarios together.
- Skip → exit without filing reports.