# Skill Optimizer

Reduce skill token cost without losing coverage. Every token in the SKILL.md body is paid per conversation; `references/` files are loaded on demand.
## Optimization Workflow

### Phase 1: Analyze
Measure the current skill before changing anything.
- Count SKILL.md body lines (exclude frontmatter) and estimate tokens (~4.5 tokens/line for mixed code/prose).
- Count description characters.
- List every file with line counts.
- Identify duplication: for each body section (at any heading level), check if the same concept or procedure is also covered in a reference file. Count those body lines and divide by total body lines for the overlap percentage.
- List nouns from the description that appear verbatim in a sibling skill's description; these need domain qualification in Phase 2.
If the skill has no `references/` directory, optimization may require creating reference files first. See the playbook for guidance on this edge case.
Output a table:
| Metric | Current |
|---------------------|---------|
| Description chars | ??? |
| Body lines | ??? |
| Body tokens (est.) | ??? |
| Duplication % | ??? |
| Reference files | ??? |
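The mechanical metrics in this table can be collected with a small script. This is a minimal sketch, assuming the conventional layout (YAML frontmatter delimited by `---` lines, a single-line `description:` field, and references under `references/`); the ~4.5 tokens/line heuristic comes from the workflow above, and duplication % is left to manual judgment:

```python
import re
from pathlib import Path

def analyze(skill_dir: str) -> dict:
    """Collect Phase 1 metrics for one skill directory (layout assumed)."""
    root = Path(skill_dir)
    text = (root / "SKILL.md").read_text()
    # Split YAML frontmatter (between the first pair of --- lines) from the body.
    m = re.match(r"---\n(.*?)\n---\n", text, re.DOTALL)
    frontmatter = m.group(1) if m else ""
    body = text[m.end():] if m else text
    body_lines = [ln for ln in body.splitlines() if ln.strip()]
    # Single-line `description:` only; block-scalar descriptions need a YAML parser.
    desc = re.search(r"^description:\s*(.+)$", frontmatter, re.MULTILINE)
    ref_dir = root / "references"
    refs = sorted(ref_dir.glob("*.md")) if ref_dir.is_dir() else []
    return {
        "description_chars": len(desc.group(1)) if desc else 0,
        "body_lines": len(body_lines),
        "body_tokens_est": round(len(body_lines) * 4.5),  # ~4.5 tokens/line heuristic
        "reference_files": {p.name: len(p.read_text().splitlines()) for p in refs},
    }
```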
### Phase 2: Plan
Decide what stays in the body, what moves to references, and what gets compressed.
Body retention criteria: keep a section in the body ONLY if it meets at least one of the following:
- Complex multi-step pattern requiring coordination across multiple sections or files
- Non-obvious logic, parameters, or decision rules that agents frequently get wrong without inline guidance
- A concept unique to this skill with no external documentation
- Primary use case the skill exists for (the thing agents reach for most often)
Everything else belongs in the appropriate `references/` file. See the playbook decision tree for concrete examples of what typically stays vs. moves.
Description compression rules:
- Lead with the package/tool name and a one-line identity
- Replace enumerations of 4+ specific names (APIs, checks, steps) with category-based phrasing (e.g., "hooks for auth, sessions, tokens" instead of listing each hook name)
- Qualify generic keywords with the skill's domain to reduce false positives (e.g., "MyLib integrations with Redis" not "Redis integration")
- Merge items that share a theme into a single line (e.g., "error handling" + "retry logic" → "Error handling and retry logic in MyLib")
- Verify every original trigger category maps 1:1 to the compressed version — no categories dropped
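As a hedged illustration of these rules, here is a before/after frontmatter sketch; "MyLib" and every hook name are invented for the example:

```yaml
# Before: enumerates every hook name, generic "Redis integration" keyword.
description: >
  MyLib auth toolkit. Use for useLogin, useLogout, useSession, useToken,
  useRefresh, Redis integration, error handling, retry logic.

# After: category-based phrasing, domain-qualified keywords, merged themes.
# Same trigger categories (auth hooks, Redis, errors/retries), fewer chars.
description: >
  MyLib auth toolkit. Hooks for auth, sessions, and tokens; MyLib
  integrations with Redis; error handling and retry logic in MyLib.
```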
Plan the Reference Guide section — for each reference file, write a one-line description of when to read it. This section is load-bearing: it tells agents which file to consult.
Target metrics:
- Body: under ~250 lines
- Description: under ~700 characters
- Duplication with references: 0%
### Phase 3: Execute

Apply the plan. Work in this order:
1. Compress the description: rewrite the YAML field. Keep all trigger categories; do not remove any "when to use" signals.
2. Remove duplicate sections from the body: delete sections already covered in references.
3. Add the Reference Guide section: add explicit pointers to each reference file with descriptions. See the playbook for the recommended format.
4. Add a Maintenance Note: add a note at the bottom of the body with (a) the body-line budget (~250 lines), (b) a pointer to the ADR if one exists, and (c) a one-sentence rationale for the split. See the playbook template.
5. Bump the version: increment the minor version if the skill uses versioning.
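A sketch of what the Reference Guide and Maintenance Note additions might look like; the playbook's recommended format is not reproduced here, so this layout and the file name are assumptions:

```markdown
## Reference Guide

- `references/optimization-playbook.md`: checklists, before/after examples,
  and the full validation methodology. Read before planning or validating.

## Maintenance Note

Body budget: ~250 lines. The workflow and decision rules stay in the body;
expanded examples live in references. See the ADR for the split rationale.
```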
Do NOT:
- Move "when to use" triggers from description to body (description is the only field read for triggering)
- Remove code examples from retained body sections (they are the value)
- Create new reference files just to move content — use existing files when possible
- Add content that duplicates what is already in references
### Phase 4: Validate

Spawn a subagent (opus model) to challenge coverage. Provide it:
- The full SKILL.md (body + frontmatter)
- All reference files
- A list of 15-25 validation questions. Prompt the user to provide questions the skill must answer. If the user declines or has none, derive them from the description trigger categories; see the playbook for derivation rules.
The subagent evaluates each question:
- From SKILL.md alone: YES / PARTIAL / NO
- From SKILL.md + references: YES / PARTIAL / NO
- Gap: content missing from ALL files
Pass criteria:
- 0 regressions (nothing answerable before that isn't answerable after)
- All trigger categories in the description still present
- Body under ~250 lines
If gaps are found, determine whether they are pre-existing (never covered) or regressions (lost during optimization). Only regressions require fixes — restore or rewrite the missing content in the body or appropriate reference file, then re-evaluate only the affected questions.
Fallback: If subagent spawning is unavailable, self-evaluate: for each question, attempt to answer it using only the optimized files and rate confidence as HIGH / MEDIUM / LOW. Any LOW-confidence answer on a question that was previously answerable is a regression.
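The fallback's regression rule can be tallied mechanically. A minimal sketch, assuming each question's result is recorded as a dict (the field names here are invented for the example):

```python
def find_regressions(results: list[dict]) -> list[str]:
    """Return questions that regressed under the self-evaluation fallback.

    Each result dict is assumed to carry:
      'question'          - the validation question text
      'answerable_before' - bool: answerable from the pre-optimization files
      'confidence_after'  - 'HIGH' | 'MEDIUM' | 'LOW' on the optimized files
    A LOW-confidence answer on a previously answerable question is a regression.
    """
    return [
        r["question"]
        for r in results
        if r["answerable_before"] and r["confidence_after"] == "LOW"
    ]
```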
## Output
After validation, produce a summary table:
| Metric | Before | After | Change |
|-------------------|--------|--------|--------|
| Description chars | ??? | ??? | -??% |
| Body lines | ??? | ??? | -??% |
| Body tokens (est.)| ??? | ??? | -??% |
| Duplication % | ??? | 0% | -??% |
| Regressions | n/a | 0 | |
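One way to compute the Change column, as a minimal sketch (a negative percentage means a reduction):

```python
def pct_change(before: float, after: float) -> str:
    """Format the Change column entry for the summary table."""
    if before == 0:
        return "n/a"  # no meaningful percentage from a zero baseline
    return f"{(after - before) / before:+.0%}"
```

For example, `pct_change(700, 350)` yields `-50%` for a description halved in size.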
## Reference
For detailed checklists, before/after examples, and the full validation methodology, see optimization-playbook.md.
## Maintenance Note
Body budget: ~120 lines (general target for optimized skills: ~250). The optimization workflow and decision rules are the core value and stay in the body; expanded examples, checklists, and the decision tree live in the playbook reference.