Improve Codebase Architecture
Expose architecture friction and propose deepening opportunities—refactors that turn shallow modules into deep modules. The goals are testability and AI-navigability.
Glossary
Use these terms precisely in every recommendation. Language consistency is key—do not drift into using "component", "service", "API", or "boundary". See LANGUAGE.md for full definitions.
- Module — Anything with an interface and implementation (function, class, package, slice).
- Interface — Everything a caller must know to use the module correctly: types, invariants, error modes, ordering, config. Not just a type signature.
- Implementation — Internal code.
- Depth — Leverage on the interface: a small interface with a large amount of behaviour behind it. Deep = high leverage. Shallow = interface is nearly as complex as the implementation.
- Seam — The location of an interface; a place where behaviour can be changed without editing in-place. (Use this term instead of "boundary".)
- Adapter — A concrete thing that satisfies an interface at a seam.
- Leverage — What callers gain from depth.
- Locality — What maintainers gain from depth: changes, bugs, and knowledge are concentrated in one place.
Key Principles (full list in LANGUAGE.md):
- Deletion test: Imagine deleting this module. If complexity disappears, it's just a pass-through. If complexity re-emerges in N callers, it's providing value.
- The interface is the test surface.
- One adapter = hypothetical seam. Two adapters = real seam.
This skill references the project's domain model. Domain language names good seams; ADRs record decisions that the skill should not re-debate.
Process
1. Explore
First, read the project's domain glossary and any ADRs for the areas you will touch.
Then use the Agent tool with
to traverse the codebase. Don't rigidly follow heuristics; explore naturally and note where you feel friction:
- Does understanding a concept require jumping back and forth between many small modules?
- Which modules are shallow, with an interface nearly as complex as the implementation?
- Which pure functions were extracted solely for testability, but the real bugs lie in how they're called (lacking locality)?
- Which tightly-coupled modules leak across seams?
- Which parts of the codebase are untested, or hard to test via the current interface?
Apply the deletion test to anything suspected of being shallow: Would deleting it concentrate complexity, or just move it? "Concentrates" is the signal you're looking for.
2. Present candidates
Present a numbered list of deepening opportunities. Each candidate includes:
- Files — Which files/modules are involved
- Problem — Why the current architecture causes friction
- Solution — Describe what will change in plain language
- Benefits — Explain using locality and leverage, and how tests will improve
Use domain vocabulary from CONTEXT.md and architecture vocabulary from LANGUAGE.md. If
defines "Order", say "Order intake module" instead of "FooBarHandler" or "Order service".
ADR Conflicts: If a candidate conflicts with an existing ADR, only propose it if the friction is significant enough to warrant reopening the ADR. Mark it clearly (e.g., "contradicts ADR-0007 — but worth reopening because…"). Do not list every refactor that is theoretically prohibited by an ADR.
Do not propose interfaces without asking the user first. Ask the user: "Which one would you like to explore?"
3. Grilling loop
Once the user selects a candidate, enter a grilling conversation. Walk through the entire design tree with them: constraints, dependencies, the shape of the deepened module, what's behind the seam, and which tests will survive changes.
Generate inline side effects as decisions take shape:
- Naming a deepened module with a concept not in ? Add the term to , following the same discipline as (see CONTEXT-FORMAT.md). Lazily create the file if it doesn't exist.
- Tightened an ambiguous term during the conversation? Update immediately.
- User rejected a candidate with a substantial reason? Propose an ADR, phrased as: "Would you like me to record this as an ADR to prevent future architecture reviews from suggesting it again?" Only propose this if future explorers will actually need this reason to avoid repeating the suggestion; skip short-term reasons ("not worth it now") and obvious reasons. See ADR-FORMAT.md.
- Want to explore alternative interfaces for the deepened module? See INTERFACE-DESIGN.md.