Lean 4 Theorem Proving
Use this skill whenever you're editing Lean 4 proofs, debugging Lean builds, formalizing mathematics in Lean, or learning Lean 4 concepts. It prioritizes LSP-based inspection and mathlib search, with scripted primitives for sorry analysis, axiom checking, and error parsing.
Core Principles
Search before prove. Many mathematical facts already exist in mathlib. Search exhaustively before writing tactics.
Build incrementally. Lean's type checker is your test suite—if it compiles with no sorries and standard axioms only, the proof is sound.
Respect scope. Follow the user's preference: fill one sorry, its transitive dependencies, all sorries in a file, or everything. Ask if unclear.
Use 100-character line width for Lean files. Do not wrap lines at 80 characters — Lean and mathlib convention is 100. If a line fits within 100 characters, keep it on one line. See mathlib-style for breaking strategies when lines exceed 100.
Never change statements or add axioms without explicit permission. Theorem/lemma statements, type signatures, and docstrings are off-limits unless the user requests changes. Inline comments may be adjusted; docstrings may not (they're part of the API). Custom axioms require explicit approval: if a proof seems to need one, stop and discuss. Exception: within synthesis wrappers, session-generated declarations may be redrafted under the outer-loop statement-safety rules; see cycle-engine.md.
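As a minimal sketch of how these principles combine, assume a hypothetical declaration `my_add_comm` (an illustration, not taken from any real project). The docstring and statement stay fixed, only the proof body changes, and a search shows the fact already exists in core Lean as `Nat.add_comm`, so no hand-written tactic proof is needed.

```lean
-- Before: the skeleton as handed over (hypothetical example).
namespace Before

/-- Addition on `Nat` is commutative. -/
theorem my_add_comm (a b : Nat) : a + b = b + a := by
  sorry

end Before

-- After: docstring and statement are untouched; only the body was edited.
namespace After

/-- Addition on `Nat` is commutative. -/
theorem my_add_comm (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b

end After
```

The two namespaces exist only so that both versions can live in one file; in a real edit the `sorry` line would simply be replaced in place.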
Commands
| Command | Purpose |
|---|---|
| | Draft Lean declaration skeletons from informal claims |
| | Interactive formalization: drafting plus guided proving |
| | Autonomous end-to-end formalization from informal sources |
| | Guided cycle-by-cycle theorem proving with explicit checkpoints |
| | Autonomous multi-cycle theorem proving with explicit stop budgets |
| | Save progress with a safe commit checkpoint |
| | Read-only code review of Lean proofs |
| | Leverage mathlib, extract helpers, simplify proof strategies |
| | Improve Lean proofs for directness, clarity, performance, and brevity |
| | Interactive teaching and mathlib exploration |
| | Diagnostics, cleanup, and migration help |
This plugin ships a host-agnostic parser that covers the parser-decidable startup rules of the six parameter-heavy commands. A small set of documented startup rules in these commands depends on runtime context (repo-level search, interactive prompting) and is applied by the command after reading the parser's output. The other commands remain model-parsed.
When a host adapter installs the hook, the parser runs before the model sees a prompt matching one of the six covered commands, injects a block into context, and rejects invalid invocations at the hook level; invocations of the other commands pass through unchanged. Hosts without the hook fall back to model-parsed startup via the shared command-invocation.md contract.
Commands always announce resolved inputs, reject invalid startup configs before doing work, and treat wall-clock budgets as best-effort.
Which Command?
| Situation | Command |
|---|---|
| Draft a Lean skeleton (skeleton by default) | |
| Draft + prove interactively | |
| Filling sorries (interactive) | |
| Filling sorries (unattended) | |
| Save point (per-file + project build, best-effort axiom scan, commit) | |
| Quality check (read-only) | |
| Simplify proof strategies (mathlib leverage, helpers) | |
| Optimizing compiled proofs | |
| New to this project / exploring | |
| Navigating mathlib for a topic | /lean4:learn --mode=mathlib |
| Something not working | |
| Formalize + prove end-to-end (unattended) | /lean4:autoformalize --source=... --claim-select=first --out=... |
Contributing (lean4-contribute plugin)
If the plugin is installed, you may
suggest these commands at natural stopping points. Rules:
- Suggest first, never invoke unprompted. Offer a one-line question; do not start the command flow.
- Only invoke after explicit user opt-in in the current conversation. Silence, topic change, or implicit frustration do not count as consent.
- At most once per topic per session unless the user engages.
- Never mid-proof. Wait for a natural stopping point.
| Situation | Suggest |
|---|---|
| Problem appears to be in lean4-skills itself (wrong command behavior, contradictory docs, broken lint, bad guardrail, confusing plugin UX), not ordinary Lean/mathlib/user-proof problems | "This looks like a lean4-skills bug. Want me to draft a bug report?" → /lean4-contribute:bug-report |
| User wants a workflow the plugin doesn't support, says a command should behave differently, or you must recommend awkward manual steps due to a missing feature | "This looks like a plugin workflow gap. Want me to draft a feature request?" → /lean4-contribute:feature-request |
| Result seems reusable beyond the current task (tactic-selection heuristic, mathlib search pattern, anti-pattern, documentation gap with a clear lesson), not one-off theorem facts or private repo details | "That seems reusable beyond this task. Want me to draft a shareable insight?" → /lean4-contribute:share-insight |
If the plugin is not installed and the user clearly hit a lean4-skills bug, workflow gap, or reusable insight (same criteria as above — not ordinary Lean/mathlib issues), you may offer the install hint once:
- At most once per session. Do not repeat if the user declined, ignored it, or moved on.
- Never mid-proof or during an active debugging loop.
- One short line, not a pitch: "If you want, install the plugin and I can draft that report for you here." See the lean4-contribute README for setup.
Typical Workflow
┌─ Entry points (pick one) ──────────────────────────────────────────────────────────┐
│ /lean4:draft Skeleton by default (--mode=attempt for shallow proof) │
│ /lean4:formalize Interactive: draft + guided proving │
│ /lean4:autoformalize Autonomous: draft + autonomous proving │
└────────────────────────────────────────────────────────────────────────────────────┘
↓ (if sorries remain)
/lean4:prove / autoprove Proof engines (sorry filling, no header edits)
↓
/lean4:refactor Leverage mathlib, extract helpers (optional)
↓
/lean4:golf Improve proofs (optional)
↓
/lean4:checkpoint Save point (per-file + project build)
Repo structure and mathlib can be explored at any point. The three entry points above cover skeletons, interactive synthesis (draft + guided proving), and unattended source-to-proof.
Notes:
- prove asks before each cycle; autoprove loops autonomously with explicit stop budgets
- Both trigger reviews at configured intervals
- When reviews run, they act as gates: review → replan → continue. In prove, replan requires user approval; in autoprove, replan auto-continues
- Review supports a default mode and a triage mode; review is always read-only
- /lean4:autoformalize wraps draft+autoprove in a single command (source → claims → skeletons → proofs); replaces autoprove --formalize=auto
- Proof engines (prove / autoprove) never modify declaration headers (header fence)
- If you hit environment issues, run the diagnostics command
LSP Tools (Preferred)
Sub-second feedback and search tools (LeanSearch, Loogle, LeanFinder) via Lean LSP MCP:
lean_goal(file, line) # See exact goal
lean_hover_info(file, line, col) # Understand types
lean_local_search("keyword") # Fast local + mathlib (unlimited)
lean_leanfinder("goal or query") # Semantic, goal-aware (10/30s)
lean_leansearch("natural language") # Semantic search (3/30s)
lean_loogle("?a → ?b → _") # Type-pattern (unlimited if local mode)
lean_hammer_premise(file, line, col) # Premise suggestions for simp/aesop/grind (3/30s)
lean_state_search(file, line, col) # Goal-conditioned lemma search (3/30s)
lean_multi_attempt(file, line, snippets=[...]) # Test multiple tactics
lean_diagnostic_messages(file) # Per-file error/warning check
lean_code_actions(file, line) # Resolve "Try this" suggestions to edits
is for isolated scratch experiments, not a substitute for live proof-state inspection via
/
/
. Prefer live-file tools when the question depends on actual file context.
Capabilities
| Capability | Required | Check | Fallback |
|---|---|---|---|
| Lean / Lake | yes | , | none; run |
| Python 3 | yes (scripts) | set by bootstrap | none for script-dependent operations |
| | yes (set by bootstrap) | | run |
| Lean LSP MCP | no | try on any file | scripts + (file-level only) |
| | no | try calling it | on temp file |
| | no | try calling it | manual "Try this" application |
| Subagent dispatch | no | host-dependent | run work in main thread |
| Slash commands | no | host-dependent | follow skill instructions directly |
Operating Profiles
The skill adapts to what's available. Determine your profile by checking capabilities above, then follow the corresponding guidance.
full (all capabilities)
MCP + subagents + commands. Full workflow with live goal inspection, tactic testing, and parallel subagent dispatch (requires disjoint owned-file sets per agent, or separate worktrees). Subagents get pre-collected MCP context per
cycle-engine.md § Pre-flight Context. If
is unavailable, use
scratch files with
for isolated experiments.
mcp_main_only (MCP available, no subagent dispatch)
MCP works in the main thread. Run all proof work directly — do not delegate to subagents. All cycle-engine phases execute in-thread. If
is unavailable, use
scratch files with
for isolated experiments.
scripts_only (no MCP, no subagents)
Use
for search and
/
for validation.
Key limitations in this mode:
- No live goal inspection: you can read the file and check compilation output, but cannot see proof state at a specific line
- No tactic testing: edits must be validated by compiling the file
- No real-time diagnostics: compile the file from the project root (lake env lean <path/to/File.lean>) to get compilation errors, but feedback is file-level, not line-level
- Search is script-based: $LEAN4_SCRIPTS/smart_search.sh replaces LSP search tools
This mode is functional for straightforward proofs but significantly slower and less precise than MCP-backed workflows.
review_only (read-only, no edits)
Read proof state and assess quality. No edits, no commits, no subagent dispatch.
File Handling Rules
Scratch-work ladder (in preference order):
- Live file + MCP tools (, , )
- for isolated experiments
- scratch files only when is unavailable and the experiment must not touch the live file
- Never create scratch files in the repo root
File inspection: Use Read and Grep to view source files. Never write Python scripts, temp files, or use
pipelines just to read lines from a file you already have access to.
Staging: Stage only files touched during the current session. Never use
or broad glob patterns. Print the exact staged set before committing.
See sorry-filling.md for the full scratch-work preference order.
Core Primitives
| Script | Purpose | Output |
|---|---|---|
| | Find sorries with context | text (default), json, markdown, summary |
| | Best-effort axiom scan (top-level declarations) | text |
| | Multi-source mathlib search | text |
| | Detect optimization patterns | JSON |
| | Find declaration usages | text |
Usage: Invoked by commands automatically. See references/ for details.
Invocation contract: Never run bare script names. Always use:
- Python: ${LEAN4_PYTHON_BIN:-python3} "$LEAN4_SCRIPTS/script.py" ...
- Shell: bash "$LEAN4_SCRIPTS/script.sh" ...
- Report-only calls: add --report-only (as in the sorry_analyzer.py call under Troubleshooting) to suppress exit 1 on findings; real errors still exit 1. Do not use it in gate commands.
- Keep stderr visible for Lean scripts (no redirection), so real errors are not hidden.
If $LEAN4_SCRIPTS is unset or missing, run the diagnostics command and stay LSP-only until resolved.
Automation
- prove — guided, asks before each cycle. Ideal for interactive sessions.
- autoprove — autonomous, loops with explicit stop budgets. Ideal for unattended runs.
Both share the same cycle engine (plan → work → checkpoint → review → replan → continue/stop) and follow the LSP-first protocol: LSP tools are normative for discovery and search; script fallback only when LSP is unavailable or exhausted. Compiler-guided repair is escalation-only — not the first response to build errors. For complex proofs, they may delegate to internal workflows for deep sorry-filling (with snapshot, rollback, and scope budgets), proof repair, or axiom elimination. You don't invoke these directly.
Skill-Only Behavior
When editing Lean files without invoking a command, the skill runs one bounded pass:
- Read the goal or error via the LSP tools
- Search mathlib with up to 2 LSP search tools
- Try the Automation Tactics cascade
- Validate the edit (no project gate in this mode)
- No looping, no deep escalation, no multi-cycle behavior, no commits
- End with suggestions: use /lean4:prove for guided cycle-by-cycle help, or /lean4:autoprove for autonomous cycles with stop safeguards
Quality Gate
A proof is complete when:
- The build passes
- Zero sorries in agreed scope
- Only standard axioms
- No statement changes without permission
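The sorry and axiom conditions can be checked together with Lean's built-in `#print axioms` command. The sketch below uses two hypothetical declarations, `helper` and `uses_helper`, to show why a text search for `sorry` in a single proof is not enough: the second theorem contains no `sorry` itself but inherits one transitively. In Lean 4 and mathlib, the axioms usually treated as standard are `propext`, `Classical.choice`, and `Quot.sound`; anything else in the report deserves a closer look.

```lean
-- Hypothetical helper that is still unfinished.
theorem helper (n : Nat) : n + 0 = n := by
  sorry

-- No literal `sorry` appears here, but the proof relies on `helper`.
theorem uses_helper (n : Nat) : n + 0 = n :=
  helper n

-- Lists every axiom the proof depends on, including transitive ones.
-- The report should mention `sorryAx`, because `helper` is unfinished.
#print axioms uses_helper
```

Once `helper` is proved properly, rerunning the command is a quick way to confirm that `sorryAx` has disappeared from the report.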
Verification ladder: lean_diagnostic_messages(file) per-edit → lake env lean <path/to/File.lean> file gate (run from project root) → project build (project gate only). See cycle-engine: Build Target Policy.
Common Fixes
See compilation-errors for error-by-error guidance (type mismatch, unknown identifier, failed to synthesize, timeout, etc.).
Type Class Patterns
```lean
-- Local instance for this proof block
haveI : MeasurableSpace Ω := inferInstance
letI : Fintype α := ⟨...⟩
-- Scoped instances (affects current section)
open scoped Topology MeasureTheory
```
Order matters: provide outer structures before inner ones.
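One reading of the ordering rule, sketched under the assumption that mathlib is available: `BorelSpace Ω` is stated in terms of a `MeasurableSpace Ω` instance, so the measurable structure has to be in scope first. The example type `Ω` and the choice of the Borel σ-algebra are illustrative only.

```lean
import Mathlib

-- Hypothetical setup: `Ω` starts out with only a topology.
example (Ω : Type) [TopologicalSpace Ω] : True := by
  -- First the structure that later instances refer to.
  -- `letI` keeps the definition visible, so it can be unfolded below.
  letI : MeasurableSpace Ω := borel Ω
  -- Then the instance that mentions it: `BorelSpace Ω` asserts that the
  -- measurable structure in scope equals `borel Ω`, which holds by `rfl`.
  haveI : BorelSpace Ω := ⟨rfl⟩
  trivial
```

Swapping the two lines would fail, because `BorelSpace Ω` cannot even be stated until a `MeasurableSpace Ω` instance exists. `letI` is used for the data-carrying instance and `haveI` for the proposition-valued one.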
Automation Tactics
Try the cascade in order and stop on first success.
Note:
/
query mathlib (slow).
and
are powerful but may timeout. See
grind-tactic for interactive workflows, annotation strategy, and simproc escalation.
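The "stop on first success" idea can be sketched with Lean's `first` combinator, which tries each alternative in turn and keeps the first one that succeeds. The three tactics below are an illustrative selection, assumed for this example; they are not the skill's actual cascade, and the goal is a hypothetical one.

```lean
import Mathlib

example (a b : ℕ) (h : a ≤ b) : a < b + 1 := by
  first
    | rfl     -- cheap attempt; fails here because `<` is not reflexive
    | omega   -- linear arithmetic over ℕ; closes the goal using `h`
    | aesop   -- heavier search; not reached once `omega` succeeds
```

An alternative only counts as a success if it runs without error, so a tactic that makes progress without closing the goal (a bare `simp`, for instance) can end the cascade early and leave the goal open. Listing only goal-closing tactics, as above, avoids that.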
Troubleshooting
If LSP tools aren't responding, check your operating profile above. In scripts_only mode, $LEAN4_SCRIPTS/smart_search.sh provides search and lake env lean <path/to/File.lean> provides file-level compilation feedback, but live goal inspection, tactic testing, and line-level diagnostics are unavailable. If environment variables such as LEAN4_SCRIPTS or LEAN4_PYTHON_BIN are missing, run the diagnostics command.
Script environment check:
```bash
echo "$LEAN4_SCRIPTS"
ls -l "$LEAN4_SCRIPTS/sorry_analyzer.py"
# One-pass discovery for troubleshooting (human-readable default text):
${LEAN4_PYTHON_BIN:-python3} "$LEAN4_SCRIPTS/sorry_analyzer.py" . --report-only
# Structured output (optional): --format=json
# Counts only (optional): --format=summary
```
Cold start / fresh worktree:
- Fresh worktree or after ? Prime the cache in that worktree before the first real build.
- Use the project's cache command: on newer Lake, or where the project still uses the mathlib cache executable.
- If Lean LSP is cold or timing out on first use, run one to bootstrap the workspace.
- After bootstrap, return to the normal verification ladder: lean_diagnostic_messages(file) → lake env lean <path/to/File.lean> (from project root) → project build only at checkpoint/final gate.
- Do not symlink another worktree's ; use Lake cache/artifact mechanisms instead.
References
Cycle Engine: cycle-engine — shared prove/autoprove logic (stuck, deep mode, falsification, safety)
LSP Tools: lean-lsp-server (quick start),
lean-lsp-tools-api (full API — grep
for tool names)
Search: mathlib-guide (read when searching for existing lemmas), lean-phrasebook (math→Lean translations)
Errors: compilation-errors (read first for any build error),
instance-pollution (typeclass conflicts — grep
for patterns),
compiler-guided-repair (escalation-only repair — not first-pass)
Tactics: tactics-reference (tactic lookup — grep
),
grind-tactic (SMT-style automation — when simp can't close),
simp-reference (simp hygiene + custom simprocs),
tactic-patterns,
calc-patterns
Proof Development: proof-templates, proof-refactoring (28K — grep by topic), proof-simplification (strategy-level: mathlib search, congr lemmas, helper extraction), sorry-filling
Optimization: proof-golfing (includes safety rules, bounded LSP lemma replacement, bulk rewrites, anti-patterns; escalates to axiom-eliminator), proof-golfing-patterns, performance-optimization (grep by symptom), profiling-workflows (diagnose slow builds/proofs)
Domain: domain-patterns (25K — grep
),
measure-theory (28K),
axiom-elimination
Style: mathlib-style, verso-docs (Verso doc comment roles and fixups)
Custom Syntax: lean4-custom-syntax (read when building notations, macros, elaborators, or DSLs), metaprogramming-patterns (MetaM/TacticM API — composable blocks, elaborators), scaffold-dsl (copy-paste DSL template), json-patterns (json% syntax + ToJson)
Quality: linter-authoring (project-specific linter rules),
ffi-interop (FFI,
, init, symbol linkage)
Workflows: agent-workflows, subagent-workflows, command-examples, learn-pathways (intent taxonomy, game tracks, source handling)
Internals: review-hook-schema, compiler-internals (attributes, specialization, pipeline)