Agent Spec Tool-First Workflow

Version: 3.1.0 | Last Updated: 2026-03-08

You are an expert at using

agent-spec

as a CLI tool for contract-driven AI coding. Help users by:

Planning: Render task contracts before coding with
```
contract
```
Implementing: Follow contract Intent, Decisions, Boundaries
Verifying: Run
```
lifecycle
```
/
```
guard
```
to check code against specs
Reviewing: Use
```
explain
```
for human-readable summaries,
```
stamp
```
for git trailers
Debugging: Interpret verification failures and fix code accordingly

Core Mental Model

The key shift: Review point displacement. Human attention moves from "reading code diffs" to "writing contracts".

Traditional:  Write Issue (10%) → Agent codes (0%) → Read diff (80%) → Approve (10%)
agent-spec:   Write Contract (60%) → Agent codes (0%) → Read explain (30%) → Approve (10%)

Humans define "what is correct" (Contract). Machines verify "is the code correct" (lifecycle). Humans do final "Contract Acceptance" — not Code Review.

Quick Reference

Command	Purpose	When to Use
`agent-spec init`	Scaffold new spec	Starting a new task
`agent-spec contract <spec>`	Render Task Contract	Before coding - read the execution plan
`agent-spec lint <files>`	Spec quality check	After writing spec, before giving to Agent
`agent-spec lifecycle <spec> --code .`	Full lint + verify pipeline	After edits - main quality gate
`agent-spec guard --spec-dir specs --code .`	Repo-wide check	Pre-commit / CI - all specs at once
`agent-spec explain <spec> --format markdown`	PR-ready review summary	Contract Acceptance - paste into PR
`agent-spec explain <spec> --history`	Execution history	See how many retries the Agent needed
`agent-spec stamp <spec> --dry-run`	Preview git trailers	Before committing - traceability
`agent-spec verify <spec> --code .`	Raw verification only	When you want verify without lint gate
`agent-spec checkpoint status`	VCS-aware status	Check uncommitted state

Documentation

Refer to the local files for detailed command patterns:

```
./references/commands.md
```
- Complete CLI command reference with all flags

IMPORTANT: Documentation Completeness Check

Before answering questions, Claude MUST:

Read
```
./references/commands.md
```
for exact command syntax
If file read fails: Inform user "references/commands.md is missing, answering from SKILL.md patterns"
Still answer based on SKILL.md patterns + built-in knowledge

The Seven-Step Workflow

Step 1: Human writes Task Contract (human attention: 60%)

Not a vague Issue — a structured Contract with Intent, Decisions, Boundaries, Completion Criteria.

bash

agent-spec init --level task --lang zh --name "用户注册API"
# Then fill in the four elements in the generated .spec file

Key principle: Exception scenarios >= happy path scenarios. 1 happy + 3 error paths forces you to think through edge cases before coding begins.

Step 2: Contract quality gate

Check Contract quality before handing to Agent. Like "code review" but for the Contract itself.

bash

agent-spec lint specs/user-registration.spec --min-score 0.7

Catches: vague verbs, unquantified constraints, non-deterministic wording, missing test selectors, sycophancy bias, uncovered constraints.

Optional: team "Contract Review" — review 50-80 lines of natural language instead of 500 lines of code diff.

Step 3: Agent reads Contract and codes

Agent consumes the structured contract:

bash

agent-spec contract specs/user-registration.spec

Agent is triple-constrained:

Decisions tell it "how to do it" (no technology shopping)
Boundaries tell it "what to touch" (no unauthorized file changes)
Completion Criteria tell it "when it's done" (all bound tests must pass)

Step 4: Agent self-checks with lifecycle (automatic retry loop)

bash

agent-spec lifecycle specs/user-registration.spec \
  --code . --change-scope worktree --format json --run-log-dir .agent-spec/runs

Four verification layers run in sequence:

lint — re-check Contract quality (prevent spec tampering)
StructuralVerifier — pattern match Must NOT constraints against code
BoundariesVerifier — check changed files are within Allowed Changes
TestVerifier — execute tests bound to each scenario

Agent retry loop (no human needed):
  Code → lifecycle → FAIL (2/5) → read failure_summary → fix → lifecycle → FAIL (4/5) → fix → lifecycle → PASS (5/5) ✓

Run logs record this history — "this Contract took 3 tries to pass".

Retry Protocol

When lifecycle fails, follow this exact sequence:

Run:

agent-spec lifecycle <spec> --code . --format json

Parse JSON output, find each scenario's
```
verdict
```
and
```
evidence
```
For
```
fail
```
: the bound test ran and failed — read evidence to understand why, fix code
For
```
skip
```
: the bound test was not found — check
```
Test:
```
selector matches a real test name
For
```
uncertain
```
: AI verification pending — review manually or enable AI backend
Fix code based on evidence. Do NOT modify the spec file — changing the Contract to make verification pass is sycophancy, not a fix
Re-run lifecycle
After 3 consecutive failures on the same scenario, stop and escalate to the human

Critical rule: The spec defines "what is correct". If the code doesn't match, fix the code. If the spec itself is wrong, switch to authoring mode and update the Contract explicitly — never silently weaken acceptance criteria.

Step 5: Guard gate (pre-commit / CI)

bash

# Pre-commit hook
agent-spec guard --spec-dir specs --code . --change-scope staged

# CI (GitHub Actions)
agent-spec guard --spec-dir specs --code . --change-scope worktree

Runs lint + verify on ALL specs against current changes. Blocks commit/PR if any spec fails.

Step 6: Contract Acceptance replaces Code Review (human attention: 30%)

Human reviews a Contract-level summary, not a code diff:

bash

agent-spec explain specs/user-registration.spec --code . --format markdown

Reviewer judges two questions:

Is the Contract definition correct? (Intent, Decisions, Boundaries make sense?)
Did all verifications pass? (4/4 pass including error paths?)

If both "yes" → approve. This is 10x faster than reading code diffs.

Check retry history if needed:

bash

agent-spec explain specs/user-registration.spec --code . --history

Assisting Contract Acceptance

When helping a human review a completed task:

Run

agent-spec explain <spec> --code . --format markdown

and present the output

If human asks about retry history: run with
```
--history
```
flag
If human asks about specific failures: run
```
agent-spec lifecycle <spec> --code . --format json
```
and extract the relevant scenario results
If human approves: run
```
agent-spec stamp <spec> --code . --dry-run
```
and present the trailers

Step 7: Stamp and archive

bash

agent-spec stamp specs/user-registration.spec --dry-run
# Output: Spec-Name: 用户注册API
#         Spec-Passing: true
#         Spec-Summary: 4/4 passed, 0 failed, 0 skipped, 0 uncertain

Establishes Contract → Commit traceability chain.

Verdict Interpretation

Verdict	Meaning	Action
`pass`	Scenario verified	No action needed
`fail`	Scenario failed verification	Read evidence, fix code
`skip`	Test not found or not run	Add missing test or fix selector
`uncertain`	AI stub / manual review needed	Review manually or enable AI backend

Key rule:
skip
!=
pass
. All four verdicts are distinct.

VCS Awareness

agent-spec auto-detects the VCS from the project root. Behavior differs between git and jj:

Condition	Behavior
`.jj/` exists (even with `.git/` )	Use `--change-scope jj` instead of `worktree`
jj repo	Do NOT run `git add` or `git commit` — jj auto-snapshots all changes
jj repo	`stamp` output includes `Spec-Change:` trailer with jj change ID
jj repo	`explain --history` shows file-level diffs between runs (via operation IDs)
Only `.git/`	Use standard git commands ( `--change-scope staged` or `worktree` )
Neither	Change scope detection unavailable; use `--change <path>` explicitly

Change Set Options

Flag	Behavior	Default
`--change <path>`	Explicit file/dir for boundary checking	(none)
`--change-scope staged`	Git staged files	guard default
`--change-scope worktree`	All git working tree changes	(none)
`--change-scope jj`	Jujutsu VCS changes	(none)
`--change-scope none`	No change detection	lifecycle/verify default

Advanced Features

Verification Layers

bash

# Run only specific layers
agent-spec lifecycle specs/task.spec --code . --layers lint,boundary,test
# Available: lint, boundary, test, ai

Run Logging

bash

agent-spec lifecycle specs/task.spec --code . --run-log-dir .agent-spec/runs
agent-spec explain specs/task.spec --history

AI Mode

bash

agent-spec verify specs/task.spec --code . --ai-mode off      # default - no AI
agent-spec verify specs/task.spec --code . --ai-mode stub      # testing only
agent-spec lifecycle specs/task.spec --code . --ai-mode caller # agent-as-verifier

AI Verification: Caller Mode

When

--ai-mode caller

is used, the calling Agent acts as the AI verifier. This is a two-step protocol:

Step 1: Emit AI requests

bash

agent-spec lifecycle specs/task.spec --code . --ai-mode caller --format json

If any scenarios are skipped (no mechanical verifier covered them), the output JSON includes:

```
"ai_pending": true
```

"ai_requests_file": ".agent-spec/pending-ai-requests.json"

The pending requests file contains

AiRequest

objects with scenario context, code paths, contract intent, and constraints.

Step 2: Resolve with external decisions

The Agent reads the pending requests, analyzes each scenario, then writes decisions:

json

[
  {
    "scenario_name": "场景名称",
    "model": "claude-agent",
    "confidence": 0.92,
    "verdict": "pass",
    "reasoning": "All steps verified by code analysis"
  }
]

Then merges them back:

bash

agent-spec resolve-ai specs/task.spec --code . --decisions decisions.json

This produces a final merged report where Skip verdicts are replaced with the Agent's AI decisions.

When to use caller mode:

When the calling Agent (Claude, Codex, etc.) can read and reason about code
For scenarios that can't be verified by tests alone (design intent, code quality)
When you want the Agent to be both implementor and verifier

When to Use / When NOT to Use

Scenario	Use agent-spec?	Why
Clear feature with defined inputs/outputs	Yes	Contract can express deterministic acceptance criteria
Bug fix with reproducible steps	Yes	Great for "given bug X, when fixed, then Y"
Exploratory prototyping	No	You don't know "what is done" yet - vibe code first
Large architecture refactor	No	Boundaries hard to define, "better architecture" isn't testable
Security/compliance rules	Yes (org.spec)	Encode rules once, enforce mechanically everywhere

Gradual Adoption

Week 1-2:  Pick 2-3 clear bug fixes, write Contracts for them
Week 3-4:  Expand to new feature development
Week 5-8:  Create project.spec with team coding standards
Month 3+:  Consider org.spec for cross-project governance

Common Errors

Error	Cause	Solution
Guard reports N specs failing	Specs have lint or verify issues	Run `lifecycle` on each failing spec individually
`skip` verdict on scenario	Test selector doesn't match any test	Check `Test:` / `Package:` / `Filter:` in spec
Quality score below threshold	Too many lint warnings	Fix vague verbs, add quantifiers, improve testability
Boundary violation detected	Changed file outside allowed paths	Either update Boundaries or revert the change
`uncertain` on all AI scenarios	Using `--ai-mode stub` or no backend	Expected — review manually
Agent keeps failing lifecycle	Contract criteria too vague or too strict	Improve Completion Criteria specificity

Command Priority

Preference	Use	Instead of
`contract`	Render task contract	`brief` (legacy alias)
`lifecycle`	Full pipeline	`verify` alone (misses lint)
`guard`	Repo-wide	Multiple individual `lifecycle` calls
`--change`	Explicit paths known	`--change-scope` when paths are known
CLI commands	Tool-first approach	`spec-gateway` library API

When to Switch to Authoring Mode

During implementation, if you discover:

A missing exception path that should be in Completion Criteria
A Boundary that's too restrictive (need to modify more files than allowed)
A Decision that needs to change (technology choice was wrong)

Switch to

agent-spec-authoring

skill, update the Contract FIRST, re-run

agent-spec lint

to validate the change, then resume implementation. Do NOT silently work outside the Contract's boundaries.

Escalation

Switch to library integration only when:

Embedding
```
agent-spec
```
into another Rust agent runtime
Testing
```
spec-gateway
```
internals

Injecting a host

AiBackend

via

verify_with_backend(Arc<dyn AiBackend>)

agent-spec-tool-first

NPX Install

Tags

SKILL.md Content

Agent Spec Tool-First Workflow

Core Mental Model

Quick Reference

Documentation

IMPORTANT: Documentation Completeness Check

The Seven-Step Workflow

Step 1: Human writes Task Contract (human attention: 60%)

Step 2: Contract quality gate

Step 3: Agent reads Contract and codes

Step 4: Agent self-checks with lifecycle (automatic retry loop)

Retry Protocol

Step 5: Guard gate (pre-commit / CI)

Step 6: Contract Acceptance replaces Code Review (human attention: 30%)

Assisting Contract Acceptance

Step 7: Stamp and archive

Verdict Interpretation

VCS Awareness

Change Set Options

Advanced Features

Verification Layers

Run Logging

AI Mode

AI Verification: Caller Mode

When to Use / When NOT to Use

Gradual Adoption

Common Errors

Command Priority

When to Switch to Authoring Mode

Escalation