audit-context
Diagnose a nao context. Find gaps, MECE violations, failure root causes, and bloat. Output is a short in-conversation report ending in a prioritized plan.
Diagnose only; never fix. Route fixes to the skills named in the plan, e.g. write-context-rules or create-context-tests.
Run any time: right after a sync, mid-build, before a release, or when the agent's behavior gets surprising.
Six checks (run in order)
1. Synced context
Read the synced-context config. What's wired in (warehouse, repos, Notion, semantic layer, MCPs)? What's missing (dbt repo, ETL configs, BI repo, internal docs)? Has the sync actually run, and are its output files populated?
Scope check: <100 tables is the hard ceiling, ≤20 is the target. Better 12 well-documented tables than 80 half-documented ones. Flag oversized scope explicitly — it's the biggest predictor of reliability failure.
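To put a number on scope width, a count against the warehouse's information schema is usually enough. A minimal sketch, assuming standard `information_schema` views and invented schema names (adapt both to the actual warehouse):

```sql
-- Scope-width sanity check (schema names are hypothetical).
-- Target: <= 20 tables in scope; hard ceiling: < 100.
SELECT table_schema, COUNT(*) AS n_tables
FROM information_schema.tables
WHERE table_schema IN ('ANALYTICS', 'MARTS')  -- assumed in-scope schemas
  AND table_type = 'BASE TABLE'
GROUP BY table_schema
ORDER BY n_tables DESC;
```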
2. RULES.md vs target structure
Six standard sections (from the write-context-rules template): Business overview, Data architecture, Core data models (Most Used + Tables detail), Key Metrics Reference, Date filtering, Analysis Process. Per section, mark present / missing / thin. Flag placeholders, TODO markers, and metric entries with no source-of-truth pointer.
3. Context coverage (per table)
For every table in scope: is it in RULES.md? Does it have a Tables-detail block? Is there dbt context (`repos/<dbt>/models/**/schema.yml`)? Any extra `.md` doc?
Then per-table gaps: undocumented columns the agent will reference, calculated fields with no explanation, foreign keys with no relation, common WHERE filters not mentioned anywhere. A table with no docs anywhere is a high-priority finding.
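Where the warehouse stores column comments in its information schema (Snowflake does; others vary), undocumented columns can be listed directly. A hedged sketch with invented schema and table names; dbt descriptions live in `schema.yml` and need a separate pass:

```sql
-- Columns with no warehouse-level comment (Snowflake-style metadata;
-- 'ANALYTICS' and 'FCT_ORDERS' are illustrative names).
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'ANALYTICS'
  AND table_name = 'FCT_ORDERS'
  AND (comment IS NULL OR comment = '')
ORDER BY ordinal_position;
```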
4. Data model consistency (MECE)
- Mutually exclusive? Two tables computing the same metric differently (worst issue: the agent picks one unpredictably; see the sketch after this list).
- Collectively exhaustive? Metrics users ask for that no in-scope table can answer.
- Duplicated columns? Same logical field under different names (e.g. `user_id` / `customer_id` / `client_id`).
- Ambiguous columns? E.g. an `amount` with no unit, a `status` with no enum values.
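To make the mutually-exclusive case concrete, here is a sketch of the worst-issue pattern with invented table and column names; nothing below comes from a real context:

```sql
-- Two in-scope tables that both "answer" revenue, differently. Without a
-- rule naming the source of truth, the agent picks one unpredictably.
SELECT SUM(amount) AS revenue          -- gross, completed orders only
FROM fct_orders
WHERE status = 'completed';

SELECT SUM(net_amount) AS revenue      -- net of refunds, every payment
FROM fct_payments;
```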
5. Test coverage
If the tests folder is empty → recommend create-context-tests. Otherwise read the latest results (most recent run) and categorize each failure:
| Category | Looks like | Fix |
|---|---|---|
| Data model | Wrong column / wrong table | Add column descriptions; clarify granularity |
| Date selection | Wrong period / week start | Add DO/DON'T SQL to the Date filtering section |
| Test issue | Test SQL itself is wrong | Fix the test, not the context |
| Interpretation | Reasonable but different reading | Add to naming conventions or the Analysis Process section |
| Metric definition | Wrong formula / source | Tighten the Key Metrics Reference entry or add a semantic layer |
Propose the smallest rule change per failure. Sort by impact (tests affected).
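For the date-selection category, the smallest rule change is typically a DO/DON'T pair added to the Date filtering section. A sketch assuming weeks start on Monday and a `created_at` column (both are assumptions, not rules from any real context):

```sql
-- DO: let the warehouse truncate to the configured week start.
SELECT DATE_TRUNC('week', created_at) AS week_start, COUNT(*) AS n
FROM fct_orders
GROUP BY 1;

-- DON'T: hand-roll week boundaries with day arithmetic; an off-by-one on
-- the week start is the classic date-selection test failure.
```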
6. Token optimization
- Files >40KB (flag).
- Tables-detail blocks exceeding the 10-column cap.
- Duplication between RULES.md and the per-table .md docs.
- In-scope tables with no mention in any test or recent question (trim candidates; see the sketch after this list).
- Raw / staging tables that snuck into scope.
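Trim candidates can be pre-screened from query history where the warehouse keeps one. A heavily hedged Snowflake-specific sketch: `my_audit.scope_tables` is an assumed one-column list of in-scope names, and `ILIKE` matching is approximate:

```sql
-- In-scope tables no query has touched in 30 days (Snowflake-specific).
SELECT s.table_name
FROM my_audit.scope_tables AS s          -- hypothetical helper table
WHERE NOT EXISTS (
  SELECT 1
  FROM snowflake.account_usage.query_history AS q
  WHERE q.start_time > DATEADD('day', -30, CURRENT_TIMESTAMP())
    AND q.query_text ILIKE '%' || s.table_name || '%'
);
```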
If RULES.md is bloated, suggest moving per-table detail into separate per-table .md files and keeping only a one-line pointer in RULES.md. For multi-domain bloat, propose a per-domain file map referenced from RULES.md. Show the proposed structure before moving anything.
Output (in conversation, not a file)
Lead with a one-paragraph summary: sync state | scope width (N tables vs the <100 ceiling and ≤20 target) | rules quality (N/6 sections substantive) | test coverage (N tests, X% passing).
Then deep-dive only the sections with findings. Skip clean ones. Format hints:
- Synced / RULES.md / token bloat → bulleted gaps.
- Context coverage → table: `Table | RULES.md | dbt docs | Extra .md | Gap`.
- MECE → bullets.
- Test failures → table: `Test | Category | Proposed fix`.
End with a prioritized plan (easiest-win → biggest-work), each item naming the skill that does the work:
## Plan
1. (easy / 5 min) ... → write-context-rules
2. (small / 30 min) ... → create-context-tests
3. (medium / 1-2 hr) ... → audit-context (rerun after)
4. (large / multi-session) ... → add-semantic-layer
Guardrails
- Apply one change at a time. Re-run tests between fixes.
- Tests are the source of truth. If the user says "it's working," ask for the latest pass rate first.
- Don't move or split files without confirmation. Show the file map first.
- Don't fix in this skill — diagnose only.