Token-efficient persistent memory system for Claude Code that extends your session limits by 3-5x. Layered architecture with progressive loading, compact encoding, branch-aware context, smart compression, session diffing, conflict detection, session continuation protocol, and recovery mode. Activates at session start (if MEMORY.md exists), on "remember this", "pick up where we left off", "what were we doing", "wrap up", "save progress", "don't forget", "switch context", "hand off", "memory health", "save state", "continue where I left off", "context budget", "how much context left", or any session start on a project with existing memory files. This skill solves two problems at once: Claude forgetting everything between sessions, AND sessions hitting context limits too fast. It replaces thousands of wasted re-explanation tokens with a compact, structured memory load that gives Claude full project context in under 2,000 tokens.
Install:

```shell
npx skill4agent add nagendhra-web/memory-bank memory-bank
```

Memory is organized in three layers:

```
┌─────────────────────────────────────────────┐
│ Layer 2: GLOBAL MEMORY                      │
│ ~/.claude/GLOBAL-MEMORY.md                  │
│ Cross-project patterns, user preferences,   │
│ reusable decisions. Permanent.              │
├─────────────────────────────────────────────┤
│ Layer 1: PROJECT MEMORY                     │
│ ./MEMORY.md (+ branch overlays)             │
│ Architecture, decisions, active work.       │
│ Lives as long as the project.               │
├─────────────────────────────────────────────┤
│ Layer 0: SESSION CONTEXT                    │
│ In-conversation only.                       │
│ Current task focus, scratch notes.          │
│ Dies when session ends (persisted to L1).   │
└─────────────────────────────────────────────┘
```

See `references/memory-layers.md` for full architecture details.
| Trigger | Action |
|---|---|
| Session start (memory files exist) | Full load sequence |
| "remember this", "don't forget" | Mid-session update |
| "wrap up", "save progress" | Full session write |
| "pick up where we left off", "what were we doing" | Load + summarize |
| "switch context" | Branch-aware load |
| "memory health" | Health check |
| "hand off" | Generate handoff doc |
| Memory file over 150 lines | Run compression |
| Session start with git history but no MEMORY.md | Recovery mode |
| "continue where I left off" | Session continuation protocol |
| "context budget", "how much context left" | Context budget check |
| "save state" | Emergency save + continuation file |
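The trigger routing amounts to a phrase match against the user's message. A minimal sketch with a subset of the phrases (the dictionary and function names are illustrative, not part of the skill):

```python
# Hypothetical sketch: map a few trigger phrases to their memory actions.
TRIGGERS = {
    "remember this": "mid-session update",
    "wrap up": "full session write",
    "pick up where we left off": "load + summarize",
    "save state": "emergency save + continuation file",
}

def route(message: str) -> str:
    """Return the memory action for the first trigger phrase found in the message."""
    text = message.lower()
    for phrase, action in TRIGGERS.items():
        if phrase in text:
            return action
    return "no memory action"
```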
Step 1: Detect memory files
└─ Check for MEMORY.md in project root
└─ Check for ~/.claude/GLOBAL-MEMORY.md
└─ Check for MEMORY-ARCHIVE.md (has history been archived?)
Step 2: Detect git context
└─ Current branch name
└─ Check for .memory/branches/<branch>.md overlay
└─ Days since last session (from "Last updated" field)
Step 3: Session diff (if git available)
└─ Commits since last memory update
└─ Files changed since last session
└─ Any conflicts between memory and current code state
Step 4: Health check
└─ Score memory freshness (see Health Scoring below)
└─ Flag stale entries
└─ Flag referenced files that no longer exist
Step 5: Context-aware greeting
└─ Summarize where we left off (2-3 sentences, specific)
└─ Report any drift detected (code changed, memory stale)
└─ State the next immediate action
└─ Ask: "Ready to continue, or has the plan changed?"

Example greeting:

> "Welcome back! Last session you finished the Stripe webhook handler in `src/api/webhooks/stripe.ts` and were about to write integration tests. The `handlePaymentSuccess()` function is complete but `handleRefund()` is stubbed out. 3 commits have landed since — all yours, no surprises. Ready to pick up with the integration tests?"
"Welcome back! Your memory is from 5 days ago on, but you're now onmain. I found a branch overlay from 3 days ago with context about the profile avatar upload. However,feature/user-profilesreferenced in memory was renamed tosrc/components/Avatar.tsx. Want me to update memory with the current state before we continue?"ProfileImage.tsx
Mid-session updates ("remember this") touch individual `MEMORY.md` sections: Key Decisions, Active Work, Completed, Where We Left Off, Blockers, and Notes.

The full session write ("wrap up", "save progress") runs this sequence:

Step 1: Audit the session
└─ What was accomplished? (be specific: files, functions, lines)
└─ What decisions were made and why?
└─ What's blocked or unresolved?
└─ What should happen next? (crystal clear next step)
Step 2: Compress completed work
└─ Move finished items to Completed with one-line summaries
└─ Remove resolved blockers
└─ Archive stale notes
Step 3: Update memory health metadata
└─ Update "Last updated" timestamp
└─ Increment session counter
└─ Update file reference table (verify paths still exist)
Step 4: Write MEMORY.md
└─ Full overwrite with current state
└─ Verify the file was written successfully
Step 5: Check compression threshold
└─ If > 150 lines, suggest compression
└─ If > 200 lines, auto-compress (see Smart Compression)
Step 6: Prompt for global memory
└─ Any cross-project learnings worth saving to Layer 2?
└─ New user preferences discovered?

`MEMORY.md` template:

```markdown
# Project Memory
Last updated: [DATE] | Session [N] | Branch: [BRANCH]
Memory health: [SCORE]/10

## Project Overview
[1-2 sentences. What this is, what stack, what stage.]

## Where We Left Off
- **Current task:** [specific task with file/function reference]
- **Status:** [done | in progress | blocked]
- **Next immediate step:** [so clear Claude can start without asking anything]
- **Open question:** [decision pending, if any]

## Completed
- [DATE] [one-line summary with key files touched]
- [DATE] [one-line summary]

## Active Work
- [ ] [task — specific file, function, or component]
- [ ] [task]
- [x] [recently completed, will archive on next compression]

## Blockers
- [blocker with context on what's needed to unblock]

## Key Decisions
| Date | Decision | Reasoning | Affects |
|------|----------|-----------|---------|
| [DATE] | [what was decided] | [why] | [files/areas impacted] |

## Key Files
| File | Purpose | Last Modified |
|------|---------|---------------|
| [path] | [what it does] | [session N] |

## Architecture Notes
[Non-obvious design choices, data flow, system boundaries]

## Known Issues
- [issue, severity, and workaround if any]

## Session Log
| Session | Date | Summary |
|---------|------|---------|
| [N] | [DATE] | [one-line summary of what happened] |

## User Preferences
[How the user likes to work — discovered across sessions]

## External Context
[APIs, services, env setup — NO secrets, NO credentials, NEVER]
```

Branch overlay layout:

```
MEMORY.md                      <- Base project memory (main/trunk)
.memory/
  branches/
    feature-auth.md            <- Overlay for feature/auth branch
    feature-payments.md        <- Overlay for feature/payments branch
    bugfix-race-condition.md   <- Overlay for bugfix branch
```

On a non-trunk branch, Claude loads `MEMORY.md` first, then applies the `.memory/branches/<branch-slug>.md` overlay on top. For example, on `feature/auth`, both `MEMORY.md` and `.memory/branches/<branch>.md` (here `feature-auth.md`) are loaded.

See `references/branch-aware-memory.md` for merge strategies.
`MEMORY-ARCHIVE.md` example:

```markdown
# Memory Archive
Archived sessions from Project Memory.

## Sessions 1-8 Summary
[Paragraph summary of early project work]

## Key Milestones
- Session 2: Initial project scaffolding complete
- Session 5: Auth system shipped
- Session 8: Database migration to Prisma complete
```

See `references/smart-compression.md` for the full compression algorithm.
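The compression thresholds used throughout this skill (suggest at 150 lines, auto-compress at 200) can be sketched in a few lines (function name is mine, not part of the skill):

```python
def compression_action(line_count: int) -> str:
    """Over 200 lines: auto-compress. Over 150: suggest. Otherwise leave alone."""
    if line_count > 200:
        return "auto-compress"
    if line_count > 150:
        return "suggest compression"
    return "none"
```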
```shell
# Get the date from MEMORY.md "Last updated" field
# Then check what happened since
git log --oneline --since="[last-updated-date]"
git diff --stat HEAD~[commits-since]
```

Example session diff report:

> "Since your last session (3 days ago), there have been 7 commits: 4 by you, 3 by @teammate. Key changes: `src/api/users.ts` was refactored, `package.json` has 2 new dependencies (zod, @tanstack/query). Your memory references `src/api/users.ts` — I'll verify it's still accurate."

Conflict detection compares memory claims against the current state of files such as `package.json`, `src/auth/login.ts`, and `.env`.

See `references/session-diffing.md` for conflict resolution strategies.
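Step 2 of the load sequence derives "days since last session" from the `Last updated` field. A sketch assuming an ISO-format date (the exact field format is not specified by this document):

```python
from datetime import date

def days_since(last_updated: str, today: date) -> int:
    """Days between the memory's 'Last updated' ISO date (e.g. '2025-04-01') and today."""
    year, month, day = map(int, last_updated.split("-"))
    return (today - date(year, month, day)).days
```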
| Dimension | Weight | Score 10 | Score 1 |
|---|---|---|---|
| Freshness | 30% | Updated today | > 14 days old |
| Relevance | 30% | All referenced files exist | Most files missing/renamed |
| Completeness | 20% | All sections filled, next step clear | Missing key sections |
| Actionability | 20% | Can start working immediately | Need to ask 3+ questions |
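The weighting in the table reduces to a short helper (illustrative, not part of the skill):

```python
def memory_health(freshness: int, relevance: int, completeness: int, actionability: int) -> float:
    """Weighted health score: 30% freshness, 30% relevance,
    20% completeness, 20% actionability. Each dimension is 1-10."""
    score = (0.30 * freshness + 0.30 * relevance
             + 0.20 * completeness + 0.20 * actionability)
    return round(score, 1)
```

With the subscores in the example report that follows (9, 7, 8, 9), this yields 8.2, which the report presents as 8/10.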
Memory health: 8/10
Freshness: 9/10 (updated yesterday)
Relevance: 7/10 (2 file paths changed)
Completeness: 8/10 (all sections present)
Actionability: 9/10 (next step is crystal clear)

Recovery mode rebuilds memory when `MEMORY.md` is missing but the project has history:

Step 1: Scan the project
└─ Read package.json / pyproject.toml / go.mod (detect stack)
└─ Read README.md and CLAUDE.md (project context)
└─ List key directories and recent files
Step 2: Read git history
└─ Last 20 commits (who, what, when)
└─ Current branch and recent branches
└─ Any open/recent PRs
Step 3: Reconstruct memory
└─ Build Project Overview from package.json + README
└─ Build Key Files from most-modified files in git log
└─ Build Key Decisions from commit messages and code patterns
└─ Set "Where We Left Off" from most recent commits
└─ Flag confidence level: "Reconstructed from code — verify with user"
Step 4: Present and confirm
└─ Show reconstructed memory to user
└─ Ask for corrections
└─ Write verified MEMORY.md

Handoff document template ("hand off"):

```markdown
# Project Handoff: [Project Name]
Generated: [DATE] | By: [user] via Claude Code

## Quick Start
1. Clone: `git clone [repo]`
2. Install: `[package manager] install`
3. Setup: [env vars, database, etc.]
4. Run: `[dev command]`

## Current State
[Where the project is right now — what works, what doesn't]

## Architecture
[System diagram, key components, data flow]

## Active Work
[What's in progress, what's next, what's blocked]

## Key Decisions & Why
[Decisions that a new developer would question — with the reasoning]

## Gotchas
[Things that will bite you if you don't know about them]

## Who to Ask
[People, channels, or docs for domain-specific questions]
```

Session start WITHOUT memory-bank:
User: "Let's continue working on the app"
Claude: "What app? What stack? What were we doing?"
User: "It's a Next.js e-commerce app with Prisma and Stripe..."
[400+ tokens explaining the project]
User: "We were building the checkout flow..."
[300+ tokens explaining current state]
User: "The key files are..."
[200+ tokens listing files]
User: "We decided to use X because..."
[300+ tokens re-explaining decisions]
Total wasted: ~1,200+ tokens EVERY SESSION just to get back to baseline.
Over 10 sessions: ~12,000 tokens wasted on re-explanation alone.

Session start WITH memory-bank:
Claude reads MEMORY.md: ~800 tokens (compact, structured, complete)
Claude greets with full context: ~150 tokens
User: "Let's go"
Total: ~950 tokens. Savings: 60-80% per session start.
Over 10 sessions: ~9,000+ tokens saved on context alone.

Tier 1: ALWAYS load (costs ~200 tokens)
└─ Project Overview (1-2 sentences)
└─ Where We Left Off (current task, status, next step)
└─ Active Blockers
Tier 2: Load on DEMAND (costs ~300 tokens when needed)
└─ Key Decisions (only when a decision comes up)
└─ Key Files (only when working with files not in Tier 1)
└─ Architecture Notes (only when touching architecture)
Tier 3: Load ONLY when asked (costs ~200 tokens when needed)
└─ Session Log (only for velocity/history questions)
└─ User Preferences (only on first session or when relevant)
└─ External Context (only when working with APIs/services)

BAD (38 tokens):
"We made the decision to use Prisma as our ORM instead of Drizzle
because it provides better TypeScript type inference and the team
is already familiar with it from previous projects."
GOOD (14 tokens):
| 2025-04-01 | Prisma over Drizzle | Type inference, team familiarity | All DB |

BAD (scattered prose — 120 tokens for 5 files):
The main checkout route is in src/app/api/checkout/route.ts. The Stripe
client is configured in src/lib/stripe.ts. Cart state management is in...
GOOD (table — 60 tokens for 5 files):
| File | Purpose |
| src/app/api/checkout/route.ts | Stripe session creation |
| src/lib/stripe.ts | Stripe client singleton |
| src/stores/cart.ts | Zustand cart + persistence |

BAD (prose):
We are currently working on the webhook handler, which is partially
complete. We also need to write tests and haven't started yet.
GOOD (checklist):
- [x] Stripe webhook handler — handlePaymentSuccess()
- [ ] handleRefund() — stubbed, needs implementation
- [ ] Integration tests for webhook endpoints

BAD: "The project is essentially a web application that was built for..."
GOOD: "Bakery e-commerce. Next.js 14, Prisma, Stripe. Launching April."

At session start, estimate the context budget:
Available context: ~200,000 tokens (Claude's window)
Memory load: ~800 tokens (Tier 1 + loaded Tiers)
System prompt: ~2,000 tokens
Remaining for work: ~197,200 tokens
At 40% usage (~80,000 tokens consumed):
→ Suggest: "We're at 40% context. Consider compacting soon."
At 60% usage (~120,000 tokens consumed):
→ Save a session checkpoint automatically
→ Suggest: "Context at 60%. Good time to /compact or start fresh."
At 80% usage (~160,000 tokens consumed):
→ Auto-save full state to MEMORY.md
→ Alert: "Context is at 80%. Saving state now — you can continue
in a new session with zero loss. Say 'wrap up' or keep going."

Emergency save sequence ("save state"):

Step 1: EMERGENCY SAVE (before context dies)
└─ Write MEMORY.md with EVERYTHING from current session
└─ Include exact cursor position: file, function, line number
└─ Include any uncommitted mental model (what Claude was thinking)
└─ Include partial work state: what's done, what's half-done, what's next
Step 2: Write CONTINUATION.md (a one-shot warm-up file)
└─ Ultra-compact: under 50 lines, under 500 tokens
└─ Contains ONLY what the next session needs to start immediately
└─ Format:
```markdown
# Continue: [task name]
Resume from: `src/api/webhooks/stripe/route.ts:89` — implementing handleRefund()
## State
- handlePaymentSuccess(): DONE ✓
- handleRefund(): stubbed at line 89, needs Stripe refund.created event
- Tests: NOT STARTED
## Context
- Stripe webhook sig verified in middleware (line 12)
- Using stripe.webhooks.constructEvent() not manual HMAC
- Refund handler follows same pattern as payment handler
## Immediate Next Action
Implement handleRefund() in src/api/webhooks/stripe/route.ts:89
using the stripe.refund.created event payload. Pattern:
extract refund.payment_intent → find order → update status to "refunded"
```
**Trigger phrases:** "save state", "I'm running out of context",
"continue this later", "session is getting long"
### Token Savings By Feature
| Feature | Tokens Saved Per Session | How |
|---------|------------------------|-----|
| Structured memory vs re-explaining | 800-1,500 | Compact format replaces verbal explanation |
| Progressive loading (Tier 1 only) | 300-600 | Don't load what you don't need |
| Compact encoding (tables > prose) | 200-400 | Same info, fewer tokens |
| Session continuation protocol | 500-1,000 | Zero warm-up in new sessions |
| Smart compression | 200-500 | Smaller file = fewer tokens to read |
| Branch-aware selective loading | 100-300 | Skip irrelevant branch context |
| **Total per session** | **2,100-4,300** | |
| **Over 10 sessions** | **21,000-43,000** | |
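The per-session total in the table is just the sum of the six feature ranges, which checks out:

```python
# (low, high) token savings per feature, from the table above.
ranges = [(800, 1500), (300, 600), (200, 400), (500, 1000), (200, 500), (100, 300)]
low = sum(lo for lo, hi in ranges)   # 2100
high = sum(hi for lo, hi in ranges)  # 4300
```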
### Anti-Patterns That Waste Tokens
**Never do these in memory files:**
- Prose paragraphs where a table row would do (decisions, file inventories)
- Narrating status in sentences instead of checklists
- Keeping stale entries "just in case"; they mislead more than they help
- Storing secrets or credentials of any kind

**Always do these:**
- Tables for decisions and files, checklists for work status
- Telegraphic style in the Project Overview
- Compress once the file passes 150 lines
> See `references/context-efficiency.md` for the full token optimization guide.
---
## Rules for Excellent Memory
**Be surgical, not vague.**
Bad: "Working on auth"
Good: "Implementing JWT refresh token rotation in `src/auth/refresh.ts` —
`rotateToken()` is complete, needs Redis TTL logic in `src/cache/tokens.ts:47`"
**The "Next immediate step" is the single most important line.**
It should be so precise that Claude can start coding the instant a session
begins, with zero clarifying questions.
**Capture the "why" behind every decision.**
Future Claude will encounter the same trade-offs and re-litigate them
unless the reasoning is recorded.
**Never store secrets.** No API keys, passwords, tokens, or credentials.
Ever. Not even "temporarily". Reference `.env` or a secrets manager instead.
**Overwrite on session end, surgical update mid-session.**
Session end = full rewrite for consistency. Mid-session = targeted section
updates to avoid losing context.
**Keep it under 150 lines.** Compress aggressively. Stale information is
actively harmful — it misleads more than it helps.
---
## Auto-Setup via CLAUDE.md
For fully automatic memory with all features, add to project `CLAUDE.md`
(or `~/.claude/CLAUDE.md` for all projects):
```markdown
## Memory
At the start of every session:
1. Check for MEMORY.md in the project root
2. Check for ~/.claude/GLOBAL-MEMORY.md
3. Check current git branch and look for .memory/branches/<branch>.md
4. Run session diff — what changed since last memory update
5. Score memory health and flag any issues
6. Greet me with a specific summary and the next immediate step
During sessions:
- Update memory when I say "remember this" or complete a milestone
- Track key decisions with reasoning in the decision table
At session end (when I say "wrap up", "save", "done for now"):
1. Write comprehensive MEMORY.md with full current state
2. Ensure "Next immediate step" is crystal clear
3. Run compression if over 150 lines
4. Confirm what was saved
```

See `references/claude-md-integration.md` for the full integration guide.
Reference docs:
- `references/memory-layers.md`
- `references/branch-aware-memory.md`
- `references/smart-compression.md`
- `references/session-diffing.md`
- `references/advanced-patterns.md`
- `references/context-efficiency.md`
- `references/claude-md-integration.md`

Examples:
- `examples/solo-fullstack.md`
- `examples/team-backend.md`
- `examples/monorepo.md`
- `examples/minimal.md`