memory-bank
Memory Bank
An adaptive memory system that gives Claude Code persistent, intelligent
context across sessions — while cutting token waste so your sessions last
3-5x longer. Not a flat file — a layered architecture that compresses,
branches, diffs, self-heals, and loads only what matters.
Core Architecture
Memory Bank operates on three layers:
┌─────────────────────────────────────────────┐
│ Layer 2: GLOBAL MEMORY │
│ ~/.claude/GLOBAL-MEMORY.md │
│ Cross-project patterns, user preferences, │
│ reusable decisions. Permanent. │
├─────────────────────────────────────────────┤
│ Layer 1: PROJECT MEMORY │
│ ./MEMORY.md (+ branch overlays) │
│ Architecture, decisions, active work. │
│ Lives as long as the project. │
├─────────────────────────────────────────────┤
│ Layer 0: SESSION CONTEXT │
│ In-conversation only. │
│ Current task focus, scratch notes. │
│ Dies when session ends (persisted to L1). │
└─────────────────────────────────────────────┘
Layer 0 (Session): Ephemeral. Tracks what you're doing right now.
Automatically flushed to Layer 1 at session end.
Layer 1 (Project): The primary memory file. Tracks project state,
decisions, active work, blockers. Branch-aware: each git branch can have
its own overlay that merges with the base memory.
Layer 2 (Global): Cross-project knowledge. Your coding preferences,
tool choices, patterns you always use. Lives in `~/.claude/GLOBAL-MEMORY.md`.
Loaded alongside Layer 1 at session start.
See `references/memory-layers.md` for full architecture details.
When to Activate
| Trigger | Action |
|---|---|
| Session starts and a memory file exists | Full load sequence |
| "remember this", "don't forget" | Mid-session update |
| "wrap up", "save progress", "done for now" | Full session write |
| "pick up where we left off", "what were we doing" | Load + summarize |
| "switch to [branch]", "context for [feature]" | Branch-aware load |
| "memory health", "is memory stale" | Health check |
| "handoff", "onboard someone" | Generate handoff doc |
| "compress memory", "clean up memory" | Run compression |
| "rebuild memory" | Recovery mode |
| "save state", "continue this later" | Session continuation protocol |
| "context budget", "how much context is left" | Context budget check |
| "I'm running out of context", "session is getting long" | Emergency save + continuation file |
Workflow
1. Session Start — The Load Sequence
Execute this sequence before doing anything else:
Step 1: Detect memory files
└─ Check for MEMORY.md in project root
└─ Check for ~/.claude/GLOBAL-MEMORY.md
└─ Check for MEMORY-ARCHIVE.md (has history been archived?)
Step 2: Detect git context
└─ Current branch name
└─ Check for .memory/branches/<branch>.md overlay
└─ Days since last session (from "Last updated" field)
Step 3: Session diff (if git available)
└─ Commits since last memory update
└─ Files changed since last session
└─ Any conflicts between memory and current code state
Step 4: Health check
└─ Score memory freshness (see Health Scoring below)
└─ Flag stale entries
└─ Flag referenced files that no longer exist
Step 5: Context-aware greeting
└─ Summarize where we left off (2-3 sentences, specific)
└─ Report any drift detected (code changed, memory stale)
└─ State the next immediate action
└─ Ask: "Ready to continue, or has the plan changed?"
Example greeting (fresh memory, same branch):
"Welcome back! Last session you finished the Stripe webhook handler in `src/api/webhooks/stripe.ts` and were about to write integration tests. The `handlePaymentSuccess()` function is complete but `handleRefund()` is stubbed out. 3 commits have landed since, all yours, no surprises. Ready to pick up with the integration tests?"
Example greeting (stale memory, branch switched):
"Welcome back! Your memory is from 5 days ago on `main`, but you're now on `feature/user-profiles`. I found a branch overlay from 3 days ago with context about the profile avatar upload. However, `src/components/Avatar.tsx` referenced in memory was renamed to `ProfileImage.tsx`. Want me to update memory with the current state before we continue?"
If no MEMORY.md exists:
- Proceed normally
- After first meaningful work, offer: "Want me to start tracking our progress? I'll create a memory file so next session picks up instantly."
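Steps 1 and 2 of the load sequence can be sketched as a small file probe. This is a minimal illustration, not the skill's actual implementation: the function names are hypothetical, the archive is assumed to sit in the project root, and the slug rule (slashes become hyphens) is inferred from overlay names such as `feature-auth.md`.

```python
from pathlib import Path


def branch_slug(branch: str) -> str:
    """Turn a branch name such as 'feature/auth' into a file-safe slug (assumed rule)."""
    return branch.replace("/", "-")


def detect_memory_files(project_root: Path, home: Path, branch: str) -> dict:
    """Report which memory files exist, mirroring Steps 1-2 of the load sequence."""
    candidates = {
        "project": project_root / "MEMORY.md",
        "global": home / ".claude" / "GLOBAL-MEMORY.md",
        # Assumption: the archive lives next to MEMORY.md in the project root.
        "archive": project_root / "MEMORY-ARCHIVE.md",
        "overlay": project_root / ".memory" / "branches" / f"{branch_slug(branch)}.md",
    }
    # Keep only the files that are actually present on disk.
    return {name: path for name, path in candidates.items() if path.is_file()}
```

An empty result corresponds to the "no MEMORY.md exists" case: proceed normally and offer to start tracking later.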
2. Mid-Session Updates
When the user says "remember this" or you complete a significant milestone:
- Read current `MEMORY.md`
- Determine what changed:
  - New decision made? → Update `Key Decisions`
  - Task completed? → Move from `Active Work` to `Completed`, update `Where We Left Off`
  - New blocker? → Add to `Blockers`
  - Important context? → Add to `Notes`
- Write the updated file
- Confirm with specifics: "Saved — added the Zod migration decision and marked the user model as complete."
Do NOT rewrite the entire file on mid-session updates. Only modify
the sections that changed. This preserves context from session start.
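A targeted update of this kind might look like the sketch below. It assumes sections in MEMORY.md are introduced by `## ` headings, which the source does not confirm; `update_section` is a hypothetical helper name.

```python
def update_section(memory_text: str, heading: str, new_lines: list[str]) -> str:
    """Replace the body of one '## heading' section, leaving all others untouched."""
    lines = memory_text.splitlines()
    out, i, replaced = [], 0, False
    while i < len(lines):
        out.append(lines[i])
        if lines[i].strip() == f"## {heading}":
            out.extend(new_lines)
            replaced = True
            i += 1
            # Skip the old body up to the next section heading.
            while i < len(lines) and not lines[i].startswith("## "):
                i += 1
            continue
        i += 1
    if not replaced:
        # Section missing: append it rather than rewriting anything else.
        out.extend([f"## {heading}", *new_lines])
    return "\n".join(out) + "\n"
```

Only the named section changes, so context loaded at session start stays byte-for-byte intact elsewhere in the file.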
3. Session End — The Write Sequence
When wrapping up, execute a full memory write:
Step 1: Audit the session
└─ What was accomplished? (be specific: files, functions, lines)
└─ What decisions were made and why?
└─ What's blocked or unresolved?
└─ What should happen next? (crystal clear next step)
Step 2: Compress completed work
└─ Move finished items to Completed with one-line summaries
└─ Remove resolved blockers
└─ Archive stale notes
Step 3: Update memory health metadata
└─ Update "Last updated" timestamp
└─ Increment session counter
└─ Update file reference table (verify paths still exist)
Step 4: Write MEMORY.md
└─ Full overwrite with current state
└─ Verify the file was written successfully
Step 5: Check compression threshold
└─ If > 150 lines, suggest compression
└─ If > 200 lines, auto-compress (see Smart Compression)
Step 6: Prompt for global memory
└─ Any cross-project learnings worth saving to Layer 2?
└─ New user preferences discovered?
MEMORY.md Template
Project Memory
Last updated: [DATE] | Session [N] | Branch: [BRANCH]
Memory health: [SCORE]/10
Project Overview
[1-2 sentences. What this is, what stack, what stage.]
Where We Left Off
- Current task: [specific task with file/function reference]
- Status: [done | in progress | blocked]
- Next immediate step: [so clear Claude can start without asking anything]
- Open question: [decision pending, if any]
Completed
- [DATE] [one-line summary with key files touched]
- [DATE] [one-line summary]
Active Work
- [task: specific file, function, or component]
- [task]
- [recently completed, will archive on next compression]
Blockers
- [blocker with context on what's needed to unblock]
Key Decisions
| Date | Decision | Reasoning | Affects |
|---|---|---|---|
| [DATE] | [what was decided] | [why] | [files/areas impacted] |
Key Files
| File | Purpose | Last Modified |
|---|---|---|
| [path] | [what it does] | [session N] |
Architecture Notes
[Non-obvious design choices, data flow, system boundaries]
Known Issues
- [issue, severity, and workaround if any]
Session Log
| Session | Date | Summary |
|---|---|---|
| [N] | [DATE] | [one-line summary of what happened] |
User Preferences
[How the user likes to work, discovered across sessions]
External Context
[APIs, services, env setup. NO secrets, NO credentials, NEVER]
---
Branch-Aware Memory
When working across multiple git branches, memory adapts:
MEMORY.md <- Base project memory (main/trunk)
.memory/
branches/
feature-auth.md <- Overlay for feature/auth branch
feature-payments.md <- Overlay for feature/payments branch
bugfix-race-condition.md <- Overlay for bugfix branch
How it works:
- At session start, detect current git branch
- Load base `MEMORY.md` first
- Check for an overlay: `.memory/branches/<branch-slug>.md`
- Merge overlay on top of base (overlay sections take priority)
- At session end, write changes back to the correct layer:
  - Architecture decisions → base `MEMORY.md` (shared across branches)
  - Branch-specific work → `.memory/branches/<branch>.md`
On branch merge:
- When a feature branch merges to main, prompt:
  "The `feature/auth` branch just merged. Want me to fold its memory overlay into the base MEMORY.md and clean up the branch file?"
See `references/branch-aware-memory.md` for merge strategies.
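One plausible reading of "overlay sections take priority" is whole-section replacement, sketched below. The `## ` heading convention and both function names are assumptions made for illustration; the referenced merge-strategy document may define finer-grained behavior.

```python
def split_sections(text: str) -> dict[str, list[str]]:
    """Split markdown into {heading: body lines}, keyed by '## ' headings."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


def merge_overlay(base: str, overlay: str) -> dict[str, list[str]]:
    """Overlay sections replace base sections of the same name; others pass through."""
    merged = split_sections(base)
    merged.update(split_sections(overlay))
    return merged
```

Sections that appear only in the base, such as shared architecture decisions, pass through unchanged, which matches the split between shared and branch-specific content described above.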
Smart Compression
Memory files grow. Smart Compression keeps them useful:
Auto-compress triggers:
- MEMORY.md exceeds 150 lines → suggest compression
- MEMORY.md exceeds 200 lines → auto-compress
- Entries older than 5 sessions → candidates for archival
Compression rules:
- Completed tasks older than 3 sessions → collapse to one-liner in Session Log
- Resolved blockers → remove entirely
- Stale "Active Work" items (no progress in 3+ sessions) → flag for user
- Decision Log entries → NEVER compress (permanent record)
- Architecture Notes → NEVER compress (permanent record)
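The two line-count triggers reduce to a simple check. This sketch uses the 150 and 200 line thresholds stated above; the function name and return labels are hypothetical.

```python
SUGGEST_AT = 150  # lines: suggest compression above this
AUTO_AT = 200     # lines: compress automatically above this


def compression_action(memory_text: str) -> str:
    """Return 'none', 'suggest', or 'auto' based on the file's line count."""
    line_count = len(memory_text.splitlines())
    if line_count > AUTO_AT:
        return "auto"
    if line_count > SUGGEST_AT:
        return "suggest"
    return "none"
```

"Exceeds" is read strictly here, so a file of exactly 150 lines triggers nothing.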
Archival:
When session count exceeds 10, create `MEMORY-ARCHIVE.md`:
Memory Archive
Archived sessions from Project Memory.
Sessions 1-8 Summary
[Paragraph summary of early project work]
Key Milestones
- Session 2: Initial project scaffolding complete
- Session 5: Auth system shipped
- Session 8: Database migration to Prisma complete
> See `references/smart-compression.md` for the full compression algorithm.
---
Session Diffing
At session start, detect what changed since memory was last written:
# Get the date from MEMORY.md "Last updated" field
# Then check what happened since
git log --oneline --since="[last-updated-date]"
git diff --stat HEAD~[commits-since]
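The commands above can be wrapped so that their output feeds a per-author breakdown for the session report. This is a sketch under assumptions: the helper names are hypothetical, and the `|` separator would misparse an author name that itself contains `|`. `--since` and the `%an` / `%s` placeholders are standard `git log` options.

```python
from collections import Counter


def git_log_args(last_updated: str) -> list[str]:
    """Arguments for listing commit authors and subjects since a date."""
    # %an = author name, %s = subject line.
    return ["git", "log", f"--since={last_updated}", "--format=%an|%s"]


def summarize_commits(log_output: str) -> tuple[int, Counter]:
    """Count commits in total and per author from '%an|%s' formatted output."""
    authors = Counter(
        line.split("|", 1)[0] for line in log_output.splitlines() if line.strip()
    )
    return sum(authors.values()), authors
```

In practice the argument list would be passed to `subprocess.run(..., capture_output=True, text=True)` and its `stdout` handed to `summarize_commits`.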
**Report format:**
> "Since your last session (3 days ago), there have been 7 commits:
> 4 by you, 3 by @teammate. Key changes: `src/api/users.ts` was refactored,
> `package.json` has 2 new dependencies (zod, @tanstack/query).
> Your memory references `src/api/users.ts` — I'll verify it's still accurate."
**Conflict detection:**
When session diff reveals changes that contradict memory:
- Memory says "using Express" but `package.json` now has Fastify → flag
- Memory references `src/auth/login.ts` but file was deleted → flag
- Memory says "blocked on API key" but `.env` now has it → update
> See `references/session-diffing.md` for conflict resolution strategies.
---
Memory Health Scoring
Rate memory on a 1-10 scale across four dimensions:
| Dimension | Weight | Score 10 | Score 1 |
|---|---|---|---|
| Freshness | 30% | Updated today | > 14 days old |
| Relevance | 30% | All referenced files exist | Most files missing/renamed |
| Completeness | 20% | All sections filled, next step clear | Missing key sections |
| Actionability | 20% | Can start working immediately | Need to ask 3+ questions |
Display at session start:
Memory health: 8/10
Freshness: 9/10 (updated yesterday)
Relevance: 7/10 (2 file paths changed)
Completeness: 8/10 (all sections present)
Actionability: 9/10 (next step is crystal clear)
If health < 5: Trigger recovery mode or suggest a memory rebuild.
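The overall score follows from the four weights as a weighted average. The sketch below assumes the result is rounded to the nearest integer for display and that the recovery check uses that rounded value; neither detail is stated in the source, and the function names are hypothetical.

```python
# Weights in percent, taken from the dimension table.
WEIGHTS = {"freshness": 30, "relevance": 30, "completeness": 20, "actionability": 20}


def health_score(scores: dict[str, int]) -> int:
    """Weighted average of the four 1-10 dimension scores, rounded to an integer."""
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return round(total / 100)


def needs_recovery(scores: dict[str, int]) -> bool:
    """Health below 5 triggers recovery mode or a rebuild suggestion."""
    return health_score(scores) < 5
```

With the displayed example (9, 7, 8, 9) the weighted sum is 2.7 + 2.1 + 1.6 + 1.8 = 8.2, which rounds to the 8/10 shown.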
Recovery Mode
When memory is severely stale, corrupted, or missing critical context:
Step 1: Scan the project
└─ Read package.json / pyproject.toml / go.mod (detect stack)
└─ Read README.md and CLAUDE.md (project context)
└─ List key directories and recent files
Step 2: Read git history
└─ Last 20 commits (who, what, when)
└─ Current branch and recent branches
└─ Any open/recent PRs
Step 3: Reconstruct memory
└─ Build Project Overview from package.json + README
└─ Build Key Files from most-modified files in git log
└─ Build Key Decisions from commit messages and code patterns
└─ Set "Where We Left Off" from most recent commits
└─ Flag confidence level: "Reconstructed from code — verify with user"
Step 4: Present and confirm
└─ Show reconstructed memory to user
└─ Ask for corrections
└─ Write verified MEMORY.md
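The stack detection in Step 1 can be approximated by checking which manifest files exist. This is a minimal sketch: the ecosystem labels and the `detect_stack` name are illustrative assumptions, and a real scan would also read the manifest contents.

```python
from pathlib import Path

# Manifest file -> ecosystem label. The labels are illustrative, not from the source.
MANIFESTS = {
    "package.json": "Node.js",
    "pyproject.toml": "Python",
    "go.mod": "Go",
}


def detect_stack(project_root: Path) -> list[str]:
    """List the ecosystems whose manifest file exists in the project root."""
    return [label for name, label in MANIFESTS.items() if (project_root / name).is_file()]
```

Anything reconstructed from such signals should still carry the "verify with user" flag from Step 3.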
Handoff Protocol
Generate a developer handoff document that's optimized for humans (not Claude):
Project Handoff: [Project Name]
Generated: [DATE] | By: [user] via Claude Code
Quick Start
- Clone: `git clone [repo]`
- Install: `[package manager] install`
- Setup: [env vars, database, etc.]
- Run: `[dev command]`
Current State
[Where the project is right now: what works, what doesn't]
Architecture
[System diagram, key components, data flow]
Active Work
[What's in progress, what's next, what's blocked]
Key Decisions & Why
[Decisions that a new developer would question, with the reasoning]
Gotchas
[Things that will bite you if you don't know about them]
Who to Ask
[People, channels, or docs for domain-specific questions]
Trigger with: "generate a handoff", "onboard someone to this project",
"write a handoff doc"
---
Context Efficiency Engine
The #1 complaint with Claude Code: sessions hit context limits too fast.
You spend half your tokens re-explaining context, and the other half doing
actual work. Memory Bank flips this ratio.
The Token Problem (Without Memory Bank)
Session start WITHOUT memory-bank:
User: "Let's continue working on the app"
Claude: "What app? What stack? What were we doing?"
User: "It's a Next.js e-commerce app with Prisma and Stripe..."
[400+ tokens explaining the project]
User: "We were building the checkout flow..."
[300+ tokens explaining current state]
User: "The key files are..."
[200+ tokens listing files]
User: "We decided to use X because..."
[300+ tokens re-explaining decisions]
Total wasted: ~1,200+ tokens EVERY SESSION just to get back to baseline.
Over 10 sessions: ~12,000 tokens wasted on re-explanation alone.
The Token Solution (With Memory Bank)
Session start WITH memory-bank:
Claude reads MEMORY.md: ~800 tokens (compact, structured, complete)
Claude greets with full context: ~150 tokens
User: "Let's go"
Total: ~950 tokens. Savings: 60-80% per session start.
Over 10 sessions: ~9,000+ tokens saved on context alone.
But session-start savings are just the beginning.
Progressive Loading
Don't dump everything into context. Load in tiers:
Tier 1: ALWAYS load (costs ~200 tokens)
└─ Project Overview (1-2 sentences)
└─ Where We Left Off (current task, status, next step)
└─ Active Blockers
Tier 2: Load on DEMAND (costs ~300 tokens when needed)
└─ Key Decisions (only when a decision comes up)
└─ Key Files (only when working with files not in Tier 1)
└─ Architecture Notes (only when touching architecture)
Tier 3: Load ONLY when asked (costs ~200 tokens when needed)
└─ Session Log (only for velocity/history questions)
└─ User Preferences (only on first session or when relevant)
└─ External Context (only when working with APIs/services)
Result: Instead of loading 800 tokens of memory at once, load 200
tokens immediately and the rest only when actually needed. Most sessions
never need Tier 3 at all.
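The tiering can be expressed as a lookup from tier to section names. In this sketch the section names come from the MEMORY.md template, while the `requested` set (how a session signals that it needs a section) is a hypothetical mechanism introduced for illustration.

```python
TIERS = {
    1: ["Project Overview", "Where We Left Off", "Blockers"],
    2: ["Key Decisions", "Key Files", "Architecture Notes"],
    3: ["Session Log", "User Preferences", "External Context"],
}


def sections_to_load(requested: set[str]) -> list[str]:
    """Tier 1 is always loaded; Tier 2 and 3 sections only when requested."""
    on_demand = [s for tier in (2, 3) for s in TIERS[tier] if s in requested]
    return TIERS[1] + on_demand
```

Calling `sections_to_load(set())` yields only the Tier 1 sections, which is the default cost for most sessions.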
Compact Encoding Rules
Every line in MEMORY.md is optimized for maximum information per token:
Use structured shorthand, not prose:
BAD (38 tokens):
"We made the decision to use Prisma as our ORM instead of Drizzle
because it provides better TypeScript type inference and the team
is already familiar with it from previous projects."
GOOD (14 tokens):
| 2025-04-01 | Prisma over Drizzle | Type inference, team familiarity | All DB |
Use tables for structured data (they compress well):
BAD (scattered prose — 120 tokens for 5 files):
The main checkout route is in src/app/api/checkout/route.ts. The Stripe
client is configured in src/lib/stripe.ts. Cart state management is in...
GOOD (table — 60 tokens for 5 files):
| File | Purpose |
| src/app/api/checkout/route.ts | Stripe session creation |
| src/lib/stripe.ts | Stripe client singleton |
| src/stores/cart.ts | Zustand cart + persistence |
Use checklists for active work (scannable, dense):
BAD (prose):
We are currently working on the webhook handler, which is partially
complete. We also need to write tests and haven't started yet.
GOOD (checklist):
- [x] Stripe webhook handler — handlePaymentSuccess()
- [ ] handleRefund() — stubbed, needs implementation
- [ ] Integration tests for webhook endpoints
One line, one fact. No filler words:
BAD: "The project is essentially a web application that was built for..."
GOOD: "Bakery e-commerce. Next.js 14, Prisma, Stripe. Launching April."
Context Budget Tracking
Monitor token usage and warn before hitting limits:
At session start, estimate the context budget:
Available context: ~200,000 tokens (Claude's window)
Memory load: ~800 tokens (Tier 1 + loaded Tiers)
System prompt: ~2,000 tokens
Remaining for work: ~197,200 tokens
At 40% usage (~80,000 tokens consumed):
→ Suggest: "We're at 40% context. Consider compacting soon."
At 60% usage (~120,000 tokens consumed):
→ Save a session checkpoint automatically
→ Suggest: "Context at 60%. Good time to /compact or start fresh."
At 80% usage (~160,000 tokens consumed):
→ Auto-save full state to MEMORY.md
→ Alert: "Context is at 80%. Saving state now — you can continue
in a new session with zero loss. Say 'wrap up' or keep going."
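The thresholds above map onto a small decision function. This sketch assumes the current token count is available from somewhere (the source does not say how it is measured); the function names and action labels are hypothetical.

```python
CONTEXT_WINDOW = 200_000  # tokens, the window size quoted in the budget estimate


def remaining_for_work(memory: int = 800, system_prompt: int = 2_000,
                       window: int = CONTEXT_WINDOW) -> int:
    """Tokens left once memory and system prompt are loaded."""
    return window - memory - system_prompt


def budget_action(tokens_used: int, window: int = CONTEXT_WINDOW) -> str:
    """Map current usage onto the 40% / 60% / 80% thresholds."""
    # Integer arithmetic avoids floating-point edge cases at the boundaries.
    if tokens_used * 100 >= 80 * window:
        return "auto-save"
    if tokens_used * 100 >= 60 * window:
        return "checkpoint"
    if tokens_used * 100 >= 40 * window:
        return "suggest-compact"
    return "ok"
```

With the default figures, `remaining_for_work()` reproduces the 197,200 tokens shown in the estimate.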
Session Continuation Protocol
When a session hits context limits or user wants to start fresh:
Step 1: EMERGENCY SAVE (before context dies)
└─ Write MEMORY.md with EVERYTHING from current session
└─ Include exact cursor position: file, function, line number
└─ Include any uncommitted mental model (what Claude was thinking)
└─ Include partial work state: what's done, what's half-done, what's next
Step 2: Write CONTINUATION.md (a one-shot warm-up file)
└─ Ultra-compact: under 50 lines, under 500 tokens
└─ Contains ONLY what the next session needs to start immediately
└─ Format:
```markdown
# Continue: [task name]
Resume from: `src/auth/refresh.ts:47` — writing rotateToken()
## State
- handlePaymentSuccess(): DONE ✓
- handleRefund(): stubbed at line 89, needs Stripe refund.created event
- Tests: NOT STARTED
## Context
- Stripe webhook sig verified in middleware (line 12)
- Using stripe.webhooks.constructEvent() not manual HMAC
- Refund handler follows same pattern as payment handler
## Immediate Next Action
Implement handleRefund() in src/api/webhooks/stripe/route.ts:89
using the stripe.refund.created event payload. Pattern:
extract refund.payment_intent → find order → update status to "refunded"
```
Step 3: GREET AND GO (next session)
└─ Read CONTINUATION.md first (it's the fast-path)
└─ Read MEMORY.md for full context only if needed
└─ Delete CONTINUATION.md after loading
└─ Start working immediately — no questions, no warm-up
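The write-once, read-once life cycle of CONTINUATION.md could be handled as in the sketch below. It enforces only the "under 50 lines" limit (token counting is left out), assumes the file lives in the project root, and uses hypothetical function names.

```python
from pathlib import Path
from typing import Optional

MAX_LINES = 50  # the "under 50 lines" limit from Step 2


def write_continuation(project_root: Path, content: str) -> Path:
    """Write CONTINUATION.md, refusing content that breaks the line limit."""
    if len(content.splitlines()) >= MAX_LINES:
        raise ValueError("continuation file must stay under 50 lines")
    path = project_root / "CONTINUATION.md"
    path.write_text(content, encoding="utf-8")
    return path


def consume_continuation(project_root: Path) -> Optional[str]:
    """Read CONTINUATION.md if present, then delete it so it is used only once."""
    path = project_root / "CONTINUATION.md"
    if not path.is_file():
        return None
    content = path.read_text(encoding="utf-8")
    path.unlink()
    return content
```

Deleting the file on read keeps a stale continuation from overriding MEMORY.md in a later session.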
**Trigger phrases:** "save state", "I'm running out of context",
"continue this later", "session is getting long"
Token Savings By Feature
| Feature | Tokens Saved Per Session | How |
|---|---|---|
| Structured memory vs re-explaining | 800-1,500 | Compact format replaces verbal explanation |
| Progressive loading (Tier 1 only) | 300-600 | Don't load what you don't need |
| Compact encoding (tables > prose) | 200-400 | Same info, fewer tokens |
| Session continuation protocol | 500-1,000 | Zero warm-up in new sessions |
| Smart compression | 200-500 | Smaller file = fewer tokens to read |
| Branch-aware selective loading | 100-300 | Skip irrelevant branch context |
| Total per session | 2,100-4,300 | |
| Over 10 sessions | 21,000-43,000 | |
Anti-Patterns That Waste Tokens
Never do these in memory files:
✗ Verbose prose where a table works
✗ Repeating the same information in multiple sections
✗ Storing code snippets in memory (reference file:line instead)
✗ Long descriptions of completed work (one-line summaries only)
✗ Keeping resolved blockers (delete them)
✗ Storing information that's in README.md or CLAUDE.md already
✗ Using memory for things Git tracks (commit history, diffs, blame)
Always do these:
✓ Tables for structured data (decisions, files, tasks)
✓ Checklists for active work
✓ One sentence for Project Overview (not a paragraph)
✓ File:line references instead of describing code
✓ Delete resolved items (they're in git history)
✓ Reference other files instead of duplicating content
See `references/context-efficiency.md` for the full token optimization guide.
Rules for Excellent Memory
Be surgical, not vague.
Bad: "Working on auth"
Good: "Implementing JWT refresh token rotation in `src/auth/refresh.ts`:
`rotateToken()` is complete, needs Redis TTL logic in `src/cache/tokens.ts:47`"
The "Next immediate step" is the single most important line.
It should be so precise that Claude can start coding the instant a session
begins, with zero clarifying questions.
Capture the "why" behind every decision.
Future Claude will encounter the same trade-offs and re-litigate them
unless the reasoning is recorded.
Never store secrets. No API keys, passwords, tokens, or credentials.
Ever. Not even "temporarily". Reference `.env` or a secrets manager instead.
Overwrite on session end, surgical update mid-session.
Session end = full rewrite for consistency. Mid-session = targeted section
updates to avoid losing context.
Keep it under 150 lines. Compress aggressively. Stale information is
actively harmful — it misleads more than it helps.
Auto-Setup via CLAUDE.md
For fully automatic memory with all features, add to project `CLAUDE.md`
(or `~/.claude/CLAUDE.md` for all projects):
Memory
At the start of every session:
- Check for MEMORY.md in the project root
- Check for ~/.claude/GLOBAL-MEMORY.md
- Check current git branch and look for .memory/branches/<branch>.md
- Run session diff — what changed since last memory update
- Score memory health and flag any issues
- Greet me with a specific summary and the next immediate step
During sessions:
- Update memory when I say "remember this" or complete a milestone
- Track key decisions with reasoning in the decision table
At session end (when I say "wrap up", "save", "done for now"):
- Write comprehensive MEMORY.md with full current state
- Ensure "Next immediate step" is crystal clear
- Run compression if over 150 lines
- Confirm what was saved
> See `references/claude-md-integration.md` for the full integration guide.
---
Reference Files
- `references/memory-layers.md`: Full architecture of the 3-tier memory system with promotion rules and cross-layer interactions
- `references/branch-aware-memory.md`: Git branch integration, overlay merging, and cleanup strategies
- `references/smart-compression.md`: Compression algorithm, archival thresholds, and what to never compress
- `references/session-diffing.md`: Cross-session change detection, conflict resolution, and drift correction
- `references/advanced-patterns.md`: Team workflows, velocity tracking, handoff protocol, and enterprise patterns
- `references/context-efficiency.md`: Token optimization guide, progressive loading details, compact encoding reference
- `references/claude-md-integration.md`: Complete setup guide for automatic triggering across all projects
Examples
- `examples/solo-fullstack.md`: Memory for a solo developer on a Next.js app
- `examples/team-backend.md`: Team-shared memory for a backend service
- `examples/monorepo.md`: Multi-domain memory for a monorepo
- `examples/minimal.md`: 5-line memory for quick prototypes