Codebase Analyzer Skill
Operator Context
This skill operates as an operator for statistical codebase analysis, configuring Claude's behavior for measurement-based rule discovery from Go codebases. It implements a Measure, Don't Read methodology -- Python scripts count patterns to avoid LLM training bias override, then statistics are interpreted to derive confidence-scored rules.
Hardcoded Behaviors (Always Apply)
- CLAUDE.md Compliance: Read and follow repository CLAUDE.md files before execution. Project instructions override default behaviors.
- Over-Engineering Prevention: Scripts perform pure statistical measurement only. No feature additions beyond counting patterns. No speculative metrics or flexibility that was not requested.
- Measurement-Only Analysis: Scripts count and measure; NEVER interpret or judge code quality during data collection phase. The LLM is a calculator, not a judge.
- No Training Bias: Analysis MUST avoid LLM interpretation of "good" vs "bad" patterns during measurement. What IS in the code is the local standard.
- Confidence Gating: Only derive rules from patterns with >70% consistency. Below that threshold, report statistics without creating rules.
- Separate Measurement from Interpretation: Run scripts first (mechanical), then interpret statistics second (analytical). Never combine these steps.
Default Behaviors (ON unless disabled)
- Communication Style: Report facts without self-congratulation. Show complete statistics rather than describing them. Be concise but informative.
- Temporary File Cleanup: Analysis scripts do not create temporary files (single-pass processing). Any debug outputs or iteration files should be removed at completion.
- Verbose Output: Display summary statistics to stderr, full JSON to stdout or file.
- Confidence Thresholds: HIGH (>85%), MEDIUM (70-85%), below 70% not extracted as rule.
- Vendor Filtering: Automatically skip vendor/, testdata/, and generated code to avoid polluting statistics with external patterns.
Optional Behaviors (OFF unless enabled)
- Cross-Repository Analysis: Compare patterns across multiple repos (requires explicit request).
- Historical Tracking: Re-analyze same repo over time to track pattern evolution (requires explicit request).
- Custom Metric Addition: Add new measurement categories beyond the 100 standard metrics (requires explicit request).
What This Skill CAN Do
- Extract implicit coding rules through statistical analysis of Go codebases
- Measure 100 metrics across 25 categories using Python scripts
- Derive confidence-scored rules from pattern frequency data
- Produce a 10-dimensional Style Vector quality fingerprint (0-100 scores)
- Discover shadow constitution rules (linter suppressions teams accept)
- Compare patterns across multiple repositories for team-wide standards
What This Skill CANNOT Do
- Judge code quality subjectively (measures patterns, not "good" vs "bad")
- Analyze non-Go codebases (scripts are Go-specific)
- Derive rules from codebases with fewer than 50 Go files (insufficient sample)
- Replace code review or linting (produces rules, not enforcement)
- Skip measurement and rely on LLM "reading" the code
Instructions
Phase 1: CONFIGURE (Do NOT proceed without validated target)
Goal: Validate target and select analyzer variant.
Step 1: Validate the target
- Confirm path points to a Go repository root with .go files
- Check for standard structure (cmd/, internal/, pkg/)
- Verify sufficient file count (50+ files for meaningful rules, 100+ ideal)
Step 2: Select cartographer variant
| Variant | Script | Metrics | Use When |
|---|
| Omni (recommended) | (not yet implemented) | 100 across 25 categories | Full codebase profiling |
| Basic | (not yet implemented) | ~15 categories | Quick pattern overview |
| Ultimate | | 6 focused categories | Performance pattern detection |
Step 3: Verify environment
- Python 3.7+ available
- No external dependencies needed (uses only Python standard library)
- Output directories exist or can be created
===============================================================
PHASE 1: CONFIGURE
===============================================================
Target Repository:
- Path: [/path/to/repo]
- Go Files: [N files found]
- Structure: [cmd/ | internal/ | pkg/ | flat]
Variant Selected: [Omni | Basic | Ultimate]
Reason: [why this variant]
Validation:
- [ ] Path exists and contains .go files
- [ ] File count >= 50 (actual: N)
- [ ] Python 3.7+ available
- [ ] Output directory writable
CONFIGURE complete. Proceeding to MEASURE...
===============================================================
Gate: Target directory exists, contains 50+ Go files, variant selected. Proceed only when gate passes.
Phase 2: MEASURE (Do NOT interpret during this phase)
Goal: Run statistical analysis scripts. Pure measurement -- no interpretation yet.
Step 1: Execute the cartographer
bash
# TODO: scripts/cartographer_omni.py not yet implemented
# Manual alternative: use grep/find to count patterns across Go files
# Example: count error wrapping patterns
grep -rn 'fmt.Errorf.*%w' ~/repos/my-project --include="*.go" | wc -l
# Example: count constructor patterns
grep -rn 'func New' ~/repos/my-project --include="*.go" | wc -l
Step 2: Verify output integrity
- Confirm JSON output is valid and complete
- Check file count matches expectations (no vendor pollution)
- Verify all three lenses produced data
- Confirm derived_rules section exists in output
Step 3: Check for data quality issues
- File count suspiciously high? Vendor code may be included
- File count suspiciously low? Subdirectories may be missed
- All percentages near 50%? May indicate mixed codebase or insufficient data
===============================================================
PHASE 2: MEASURE
===============================================================
Script Executed: [cartographer_omni.py (not yet implemented — use manual pattern counting)]
Target: [/path/to/repo]
Results:
- Files analyzed: [N]
- Total lines: [N]
- Categories measured: [N of 25]
- Derived rules: [N auto-extracted]
Data Quality:
- [ ] JSON output valid
- [ ] File count reasonable (no vendor pollution)
- [ ] All three lenses have data
- [ ] No unexpected zeros in major categories
Output saved to: [path/to/output.json]
MEASURE complete. Proceeding to INTERPRET...
===============================================================
Gate: Script completed without errors, JSON output is valid, file count is reasonable. Proceed only when gate passes.
Phase 3: INTERPRET (Now the LLM analyzes)
Goal: Derive rules from statistics. This is where LLM interpretation happens -- AFTER measurement is complete.
Step 1: Review the three lenses
| Lens | Question | Measures |
|---|
| Consistency (Frequency) | "How often do they use X?" | Imports, test frameworks, logging, modern features |
| Signature (Structure) | "How do they name/structure things?" | Constructors, receivers, parameter order, variables |
| Idiom (Implementation) | "How do they implement patterns?" | Error handling, control flow, context usage, defer |
For detailed lens explanations, see
references/three-lenses.md
.
Step 2: Extract rules by confidence
| Confidence | Threshold | Action | Example |
|---|
| HIGH | >85% consistency | Extract as enforceable rule | "96% use err not e" -> MUST use err |
| MEDIUM | 70-85% consistency | Extract as recommendation | "78% guard clauses" -> SHOULD prefer guards |
| Below 70% | Not extracted | Report as observation only | "55% single-letter receivers" -> No rule |
Step 3: Review Style Vector (Omni only)
- 10 composite scores (0-100): Consistency, Modernization, Safety, Idiomaticity, Documentation, Testing Maturity, Architecture, Performance, Observability, Production Readiness
- Identify strengths (scores >75) and gaps (scores <50)
- Note shadow constitution entries (accepted linter suppressions)
Step 4: Cross-reference lenses
- Pattern confirmed across multiple lenses = higher confidence
- Pattern in one lens only = standard confidence
- Contradictions between lenses = investigate further
Gate: Rules extracted with evidence and confidence levels. Style Vector reviewed. Proceed only when gate passes.
Phase 4: DELIVER (Do NOT mark complete without artifacts)
Goal: Produce actionable output artifacts.
Step 1: Save statistical report
cartography_data/{repo_name}_cartography.json
Step 2: Generate derived rules document
derived_rules/{repo_name}_rules.md
Format each rule as:
markdown
## Rule: [Statement]
**Confidence**: HIGH/MEDIUM
**Evidence**: [X% consistency across N occurrences]
**Category**: [error_handling | naming | control_flow | architecture | ...]
**Lens**: [Consistency | Signature | Idiom | Multiple]
Step 3: Summarize Style Vector (Omni only)
markdown
## Style Vector Summary
|-----------|-------|------------|
| Consistency | [0-100] | [Strength/Gap/Neutral] |
| Modernization | [0-100] | [Strength/Gap/Neutral] |
| ... | ... | ... |
Step 4: Recommend next steps
- Compare with pr-miner data if available (explicit vs implicit rules)
- Suggest CLAUDE.md updates for high-confidence rules
- Identify golangci-lint rules that could enforce discovered patterns
- Suggest quarterly re-analysis schedule
===============================================================
PHASE 4: DELIVER
===============================================================
Artifacts:
- [ ] JSON report: [path]
- [ ] Rules document: [path]
- [ ] Style Vector summary: [included in rules doc]
Results Summary:
- HIGH confidence rules: [N]
- MEDIUM confidence rules: [N]
- Observations (below threshold): [N]
- Style Vector overall: [strong/mixed/weak]
Next Steps:
1. [Specific recommendation]
2. [Specific recommendation]
3. [Specific recommendation]
DELIVER complete. Analysis finished.
===============================================================
Gate: JSON report saved, rules document generated, next steps documented. Analysis complete.
Complementary Skills
| Skill | Extracts | Combined Value |
|---|
| pr-miner | Explicit rules (what people argue about in reviews) | Agreement = HIGH confidence; Silence + consistency = implicit rule |
| codebase-analyzer | Implicit rules (what they actually do) | pr-miner says X but code does Y = rule not followed |
Reconciliation Matrix
| pr-miner | codebase-analyzer | Conclusion |
|---|
| Says X | Shows X at >85% | Confirmed rule (both explicit and practiced) |
| Silent | Shows X at >85% | Implicit rule (nobody argues because everyone agrees) |
| Says X | Shows Y at >85% | Rule stated but not followed (needs enforcement or is outdated) |
| Mixed signals | Inconsistent | No standard yet (opportunity to establish one) |
Examples
Example 1: Single Repository Analysis
User says: "What conventions does this repo follow?"
Actions:
- Validate target has 100+ Go files (CONFIGURE)
- Run pattern counting against the repo (MEASURE)
- Extract rules from statistics: error wrapping 89%, guard clauses 5.2x, New{Type} 94% (INTERPRET)
- Save JSON report and rules document (DELIVER)
Result: 30+ rules extracted with confidence levels, Style Vector produced
Example 2: Team-Wide Standards Discovery
User says: "Find our team's coding patterns across all services"
Actions:
- Validate all target repos, confirm 50+ files each (CONFIGURE)
- Run cartographer on each repo separately (MEASURE)
- Cross-reference patterns: error wrapping 87-91% across all repos = team standard (INTERPRET)
- Produce team-wide rules document with per-repo breakdowns (DELIVER)
Result: Team-wide standards with cross-repo evidence
Example 3: Onboarding New Developer
User says: "I just joined the team, what coding patterns should I follow?"
Actions:
- Identify main team repos, validate Go file counts (CONFIGURE)
- Run omni-cartographer on primary service (MEASURE)
- Extract top 10 HIGH confidence rules as onboarding checklist (INTERPRET)
- Produce concise rules doc focusing on error handling, naming, and control flow (DELIVER)
Result: Evidence-based onboarding guide with concrete examples from actual codebase
Error Handling
Error: "No Go files found"
Cause: Path does not point to a Go repository root, or .go files are in subdirectories not being scanned
Solution:
- Verify path points to repository root with or
find . -name "*.go" | head
- If Go files are nested, point to parent directory
- Confirm vendor/ is not the only directory containing Go files
Error: "No rules derived"
Cause: Codebase too small (<50 files) or patterns genuinely inconsistent
Solution:
- Check file count -- if <50, combine analysis across multiple repos from same team
- If >50 files but no rules, team genuinely lacks consistent patterns
- Lower threshold to 60% to find emerging patterns (note reduced confidence)
Error: "Statistics dominated by vendor/generated code"
Cause: Vendor directory or generated files not filtered, polluting pattern data
Solution:
- Verify scripts are filtering vendor/, testdata/, and _test files for core patterns
- If non-standard structure, analyze specific directories manually
- Check for generated code markers (Code generated by...) and exclude those files
Anti-Patterns
Anti-Pattern 1: LLM Reading Instead of Script Measuring
What: Using Claude to "read the codebase and find patterns" instead of running cartographer scripts
Why wrong: LLM applies training bias -- reports what "should be" instead of what IS. When the LLM sees
it reports "not wrapping errors properly" even if that IS the local standard.
Do instead: Run the cartographer script first (measurement), then interpret the statistics (analysis). Two separate steps, never combined.
Anti-Pattern 2: Rules from Low-Confidence Patterns
What: Creating enforceable rules from patterns below 70% consistency (e.g., "45% use fmt.Errorf with %w" becomes "All errors must use fmt.Errorf")
Why wrong: Forces consistency where the team has not achieved it organically. Causes false positives in reviews. Team may be transitioning between patterns.
Do instead: Only derive rules from HIGH confidence (>85%). For 70-85%, suggest "consider standardizing." Below 70%, report as observation only.
Anti-Pattern 3: Analyzing Insufficient Sample Size
What: Running analysis on a repo with <50 Go files and treating results as definitive patterns
Why wrong: Small sample size produces high variance. Patterns that appear consistent at 20 files may be coincidence. Cannot distinguish signal from noise.
Do instead: Require 50+ files minimum. For small repos, combine analysis across multiple team repos. For monorepos, analyze the full tree.
Anti-Pattern 4: One-Time Analysis Without Follow-Up
What: Analyzing once, extracting rules, never re-running as the codebase evolves
Why wrong: Coding patterns evolve with team growth and new Go versions. One-time snapshot becomes stale within months. Cannot measure impact of standardization efforts.
Do instead: Re-analyze quarterly. Compare Style Vector scores over time. Track pattern adoption (e.g., "Did Modernization score improve after Go 1.21 adoption?").
Anti-Pattern 5: Mixing Measurement and Interpretation
What: Having the LLM "read" code files and count patterns manually instead of running the deterministic Python scripts
Why wrong: LLM counting is unreliable at scale -- misses files, double-counts, applies inconsistent criteria. Python scripts produce deterministic, reproducible results across runs.
Do instead: ALWAYS run the cartographer script for measurement (Phase 2). The LLM's role begins at interpretation (Phase 3), working from the script's JSON output.
References
This skill uses these shared patterns:
- Anti-Rationalization - Prevents shortcut rationalizations
- Verification Checklist - Pre-completion checks
Domain-Specific Anti-Rationalization
| Rationalization | Why It's Wrong | Required Action |
|---|
| "I can read the code and find patterns" | Reading applies training bias; measures what "should be" not what IS | Run cartographer scripts for measurement |
| "Small repo is fine for analysis" | <50 files produces unreliable statistics | Combine repos or accept limited confidence |
| "This 55% pattern should be a rule" | Below 70% is noise, not signal | Only extract rules above confidence threshold |
| "Analysis was done last year, still valid" | Patterns evolve with team and language | Re-analyze quarterly |
Reference Files
${CLAUDE_SKILL_DIR}/references/three-lenses.md
: Detailed explanation of the three analysis lenses
${CLAUDE_SKILL_DIR}/references/examples.md
: Real-world analysis examples and workflows
${CLAUDE_SKILL_DIR}/references/metrics-catalog.md
: Complete 100-metric catalog across 25 categories
Prerequisites
- Python 3.7+
- Go codebase to analyze (50+ files recommended)
- No external dependencies (uses only Python standard library)