memory-systems
Memory System Design
Memory provides persistence that allows agents to maintain continuity across sessions and reason over accumulated knowledge.
Memory Architecture Spectrum
| Layer | Latency | Persistence | Use Case |
|---|---|---|---|
| Working Memory | Zero | Volatile | Context window |
| Short-Term | Low | Session | Session state |
| Long-Term | Medium | Persistent | Cross-session knowledge |
| Entity Memory | Medium | Persistent | Entity tracking |
| Temporal KG | Medium | Persistent | Time-aware queries |
Memory System Performance
| System | DMR Accuracy | Retrieval Latency |
|---|---|---|
| Zep (Temporal KG) | 94.8% | 2.58s |
| MemGPT | 93.4% | Variable |
| GraphRAG | 75-85% | Variable |
| Vector RAG | 60-70% | Fast |
| Recursive Summary | 35.3% | Low |
Why Vector Stores Fall Short
Vector stores lose relationship and temporal information:
- Can retrieve "Customer X purchased Product Y"
- Cannot answer "What did customers who bought Y also buy?"
- Cannot distinguish current vs outdated facts
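To make the second limitation concrete, here is a minimal sketch of the "also bought" question answered over explicit purchase edges. The `PURCHASES` data and the `also_bought` function are hypothetical, introduced only for illustration. The point is that the answer comes from joining records through a shared customer, which similarity search over isolated fact embeddings does not do.

```python
from collections import defaultdict

# Hypothetical purchase edges (customer, product); illustrative data only.
PURCHASES = [
    ("customer_1", "Y"),
    ("customer_1", "A"),
    ("customer_2", "Y"),
    ("customer_2", "B"),
    ("customer_3", "C"),
]


def also_bought(purchases, product):
    """Count other products bought by customers who bought `product`."""
    # Group purchases by customer so we can follow the relationship.
    by_customer = defaultdict(set)
    for customer, item in purchases:
        by_customer[customer].add(item)

    # For every customer who bought `product`, count their other items.
    counts = defaultdict(int)
    for items in by_customer.values():
        if product in items:
            for other in items - {product}:
                counts[other] += 1
    return dict(counts)


result = also_bought(PURCHASES, "Y")
```

Here `result` maps each co-purchased product to the number of customers who bought it alongside Y.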
Memory Implementation Patterns
Pattern 1: File-System-as-Memory
```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Simple, no infrastructure needed
def store_fact(entity_id, fact):
    path = Path("memory") / f"{entity_id}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Load existing facts, or start a new list for an unseen entity
    facts = json.loads(path.read_text()) if path.exists() else []
    facts.append({"fact": fact, "timestamp": datetime.now(timezone.utc).isoformat()})
    path.write_text(json.dumps(facts))
```
Pattern 2: Vector RAG with Metadata
```python
# Embed facts with rich metadata
vector_store.add(
    embedding=embed(fact),
    metadata={
        "entity_id": entity_id,
        "valid_from": now(),
        "source": "conversation",
        "confidence": 0.95
    }
)
```
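The metadata only pays off at query time, when it narrows the candidate set before similarity ranking. The sketch below is a toy in-memory stand-in for a real vector database: `TinyVectorStore`, its `search` method, and the `where` parameter are hypothetical names, and the two-dimensional embeddings are hand-written rather than produced by an embedding model.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Toy in-memory store; a stand-in for a real vector database."""

    def __init__(self):
        self.rows = []

    def add(self, embedding, metadata):
        self.rows.append((embedding, metadata))

    def search(self, query, top_k=3, where=None):
        """Apply an exact-match metadata filter, then rank by similarity."""
        where = where or {}
        candidates = [
            (cosine(query, emb), meta)
            for emb, meta in self.rows
            if all(meta.get(k) == v for k, v in where.items())
        ]
        candidates.sort(key=lambda pair: pair[0], reverse=True)
        return [meta for _, meta in candidates[:top_k]]


store = TinyVectorStore()
store.add([1.0, 0.0], {"entity_id": "user_1", "text": "prefers email"})
store.add([0.9, 0.1], {"entity_id": "user_2", "text": "prefers phone"})
store.add([0.0, 1.0], {"entity_id": "user_1", "text": "lives in Oslo"})

# Only user_1's facts are considered, even though user_2's fact is similar.
hits = store.search([1.0, 0.0], top_k=1, where={"entity_id": "user_1"})
```

Filtering on `entity_id` keeps a near-identical fact about a different entity out of the results, which similarity ranking alone would not guarantee.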
Pattern 3: Knowledge Graph
```python
# Preserve relationships
graph.create_relationship(
    from_entity="Customer_123",
    relationship="PURCHASED",
    to_entity="Product_456",
    properties={"date": "2024-01-15", "quantity": 2}
)
```
Pattern 4: Temporal Knowledge Graph
```python
# Time-travel queries
def query_address_at_time(user_id, query_time):
    return graph.query("""
        MATCH (user)-[r:LIVES_AT]->(address)
        WHERE user.id = $user_id
        AND r.valid_from <= $query_time
        AND (r.valid_until IS NULL OR r.valid_until > $query_time)
        RETURN address
    """, {"user_id": user_id, "query_time": query_time})
```
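The same validity-interval logic can be sketched without a graph database, which may help when reasoning about how the intervals should behave on update. `AddressFact`, `address_at`, and `move` are hypothetical names; the half-open interval (`valid_from` inclusive, `valid_until` exclusive) mirrors the query above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AddressFact:
    user_id: str
    address: str
    valid_from: date
    valid_until: Optional[date] = None  # None means the fact is still current


def address_at(facts, user_id, query_time):
    """Return the address valid for `user_id` at `query_time`, or None."""
    for f in facts:
        if (
            f.user_id == user_id
            and f.valid_from <= query_time
            and (f.valid_until is None or f.valid_until > query_time)
        ):
            return f.address
    return None


def move(facts, user_id, new_address, moved_on):
    """Close the open interval for the old address and open a new one."""
    for f in facts:
        if f.user_id == user_id and f.valid_until is None:
            f.valid_until = moved_on
    facts.append(AddressFact(user_id, new_address, moved_on))


facts = [AddressFact("u1", "12 Elm St", date(2020, 1, 1))]
move(facts, "u1", "98 Oak Ave", date(2023, 6, 1))

past = address_at(facts, "u1", date(2021, 5, 1))
current = address_at(facts, "u1", date(2024, 1, 1))
```

Closing the old interval instead of deleting the old fact is what keeps historical queries answerable.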
Entity Memory
Track entities consistently across conversations:
- Entity Identity: "John Doe" in one conversation = same person in another
- Entity Properties: Facts discovered about entities over time
- Entity Relationships: Relationships discovered between entities
```python
def remember_entity(entity_id, properties):
    memory.store({
        "type": "entity",
        "id": entity_id,
        "properties": properties,
        "last_updated": now()
    })
```
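The identity and property points above can be sketched together: mentions are resolved to a stable ID before properties are merged. This is an illustrative sketch only. The `EntityMemory` class is hypothetical, and matching on a lowercased, whitespace-normalized name is a deliberately naive assumption; real entity resolution usually needs more signal than the name.

```python
from datetime import datetime, timezone
from itertools import count


class EntityMemory:
    """Toy entity store keyed by a stable ID (hypothetical design)."""

    def __init__(self):
        self.entities = {}  # entity id -> record
        self.aliases = {}   # normalized mention -> entity id
        self._ids = count(1)

    @staticmethod
    def _normalize(name):
        # Naive assumption: same name modulo case and spacing = same entity.
        return " ".join(name.lower().split())

    def resolve(self, name):
        """Map a mention to a stable ID, creating one on first sight."""
        key = self._normalize(name)
        if key not in self.aliases:
            self.aliases[key] = f"entity_{next(self._ids)}"
        return self.aliases[key]

    def remember(self, name, properties):
        """Merge newly discovered properties into the entity's record."""
        entity_id = self.resolve(name)
        record = self.entities.setdefault(
            entity_id, {"type": "entity", "id": entity_id, "properties": {}}
        )
        record["properties"].update(properties)
        record["last_updated"] = datetime.now(timezone.utc).isoformat()
        return record


memory = EntityMemory()
first = memory.remember("John Doe", {"role": "buyer"})
second = memory.remember("  john   doe ", {"city": "Oslo"})
```

Both calls resolve to the same record, so facts learned in separate conversations accumulate on one entity.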
Memory Consolidation
Trigger consolidation when:
- Memory accumulates significantly
- Retrieval returns too many outdated results
- Periodically on schedule
- Explicit request
Process:
- Identify outdated facts
- Merge related facts
- Update validity periods
- Archive/delete obsolete facts
- Rebuild indexes
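The process above can be sketched as a single pass over a list of facts. The field names (`entity_id`, `key`, `value`, `valid_from`, `valid_until`) and the rule that a newer value for the same entity and key supersedes the older one are assumptions made for illustration, not a prescribed schema.

```python
def consolidate(facts, now):
    """Return (active, archived, index) after one consolidation pass."""
    latest = {}   # (entity_id, key) -> most recent fact seen so far
    archived = []
    for fact in sorted(facts, key=lambda f: f["valid_from"]):
        fact = dict(fact)  # copy so the caller's records are not mutated
        slot = (fact["entity_id"], fact["key"])
        previous = latest.get(slot)
        if previous is not None and previous["value"] == fact["value"]:
            continue  # merge: an identical restatement adds nothing
        if previous is not None:
            previous["valid_until"] = fact["valid_from"]  # update validity
            archived.append(previous)                     # archive obsolete
        latest[slot] = fact

    # Facts whose validity already ended are outdated even if never superseded.
    active = []
    for fact in latest.values():
        expired = fact["valid_until"] is not None and fact["valid_until"] <= now
        (archived if expired else active).append(fact)

    # Rebuild a simple lookup index over the surviving facts.
    index = {}
    for fact in active:
        index.setdefault(fact["entity_id"], []).append(fact)
    return active, archived, index


facts = [
    {"entity_id": "u1", "key": "city", "value": "Oslo", "valid_from": 1, "valid_until": None},
    {"entity_id": "u1", "key": "city", "value": "Oslo", "valid_from": 2, "valid_until": None},
    {"entity_id": "u1", "key": "city", "value": "Bergen", "valid_from": 5, "valid_until": None},
    {"entity_id": "u2", "key": "plan", "value": "trial", "valid_from": 1, "valid_until": 3},
]
active, archived, index = consolidate(facts, now=10)
```

Timestamps are plain integers here to keep the example short; any comparable type would work the same way.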
Choosing Memory Architecture
| Requirement | Architecture |
|---|---|
| Simple persistence | File-system memory |
| Semantic search | Vector RAG with metadata |
| Relationship reasoning | Knowledge graph |
| Temporal validity | Temporal knowledge graph |
Best Practices
- Match architecture to query requirements
- Implement progressive disclosure: load memory into context only as it is needed
- Use temporal validity to prevent conflicts
- Consolidate periodically
- Handle retrieval failures gracefully
- Consider privacy implications
- Implement backup and recovery
- Monitor growth and performance
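As one possible reading of the retrieval-failure practice, the sketch below tries memory backends in order and degrades to an empty result instead of raising. `retrieve_with_fallback` and the two stand-in retrievers are hypothetical; a real system would likely log the failure and catch narrower exception types.

```python
def retrieve_with_fallback(query, retrievers):
    """Try each (name, retriever) in order; return (results, source name).

    Returns ([], None) if every backend fails or comes back empty, so the
    agent can continue without memory instead of crashing.
    """
    for name, retriever in retrievers:
        try:
            results = retriever(query)
        except Exception:
            continue  # a failing backend should not break the agent turn
        if results:
            return results, name
    return [], None


def broken_graph(query):
    # Stand-in for a backend that is currently unavailable.
    raise ConnectionError("graph backend unavailable")


def simple_vector(query):
    # Stand-in for a backend that returns one matching fact.
    return [f"fact about {query}"]


results, source = retrieve_with_fallback(
    "Customer_123", [("graph", broken_graph), ("vector", simple_vector)]
)
```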