openclaw-history-ingest
OpenClaw History Ingest: Session & Memory Mining
You are extracting knowledge from the user's OpenClaw agent history and distilling it into the Obsidian wiki. OpenClaw stores both a structured long-term MEMORY.md and per-session JSONL transcripts; focus on durable knowledge, not operational telemetry.
This skill can be invoked directly or via the `wiki-history-ingest` router (`/wiki-history-ingest openclaw`).
Before You Start
- Read `.env` to get `OBSIDIAN_VAULT_PATH` and `OPENCLAW_HISTORY_PATH` (default to `~/.openclaw` if unset)
- Read `.manifest.json` at the vault root to check what has already been ingested
- Read `index.md` at the vault root to understand what the wiki already contains
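
The setup reads above could be sketched roughly as follows. This is a minimal illustration, not the skill's own loader: it assumes `.env` holds plain `KEY=VALUE` lines, and the helper names `parse_env` and `resolve_paths` are hypothetical.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are ignored."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes (assumed .env convention).
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_paths(env: dict[str, str]) -> tuple[Path, Path]:
    """Return (vault_path, history_path); history falls back to ~/.openclaw."""
    vault = Path(env["OBSIDIAN_VAULT_PATH"]).expanduser()
    history = Path(env.get("OPENCLAW_HISTORY_PATH") or "~/.openclaw").expanduser()
    return vault, history
```

Failing loudly with a `KeyError` when `OBSIDIAN_VAULT_PATH` is missing is a deliberate choice here, since only the history path has a documented default.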
Ingest Modes
Append Mode (default)
Check `.manifest.json` for each source file. Only process:
- Files not in the manifest (new session logs, updated MEMORY.md or daily notes)
- Files whose modification time is newer than `ingested_at` in the manifest
Use this mode for regular syncs.
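
The append-mode check can be sketched as below. It is only an illustration under stated assumptions: the manifest is taken to map each source path to an entry whose `ingested_at` is an ISO-8601 string, and the function name `needs_ingest` and its `full` flag are hypothetical.

```python
from datetime import datetime, timezone


def needs_ingest(rel_path: str, mtime: float, manifest_files: dict, full: bool = False) -> bool:
    """Decide whether a source file should be processed.

    mtime is the file's modification time in POSIX seconds.
    manifest_files maps relative path -> {"ingested_at": ISO-8601 string} (assumed shape).
    """
    if full:
        return True  # full re-ingest ignores the manifest
    entry = manifest_files.get(rel_path)
    if entry is None:
        return True  # not in the manifest yet
    ingested_at = datetime.fromisoformat(entry["ingested_at"])
    if ingested_at.tzinfo is None:
        # Assumption: naive timestamps in the manifest are UTC.
        ingested_at = ingested_at.replace(tzinfo=timezone.utc)
    return mtime > ingested_at.timestamp()
```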
Full Mode
Process everything regardless of manifest. Use after `wiki-rebuild` or if the user explicitly asks for a full re-ingest.
OpenClaw Data Layout
OpenClaw stores all local artifacts under `~/.openclaw/`.
~/.openclaw/
├── openclaw.json                    # Global config
├── credentials/                     # Auth tokens (skip entirely)
├── workspace/                       # Agent workspace
│   ├── MEMORY.md                    # Long-term memory (loaded every session)
│   ├── DREAMS.md                    # Optional dream diary / summaries
│   └── memory/
│       ├── YYYY-MM-DD.md            # Daily notes (today + yesterday auto-loaded)
│       └── ...
└── agents/
    └── <agentId>/
        ├── agent/
        │   └── models.json          # Agent config (skip)
        └── sessions/
            ├── sessions.json        # Session index
            └── <sessionId>.jsonl    # Session transcript (JSONL, append-only)
Key data sources ranked by value
- `workspace/MEMORY.md`: highest signal; long-term durable facts the agent accumulated
- `workspace/memory/YYYY-MM-DD.md`: daily notes; recent entries often contain active project context
- `agents/*/sessions/<id>.jsonl`: session transcripts; rich but noisy
- `agents/*/sessions/sessions.json`: session index for inventory and timestamps
- `workspace/DREAMS.md`: optional summaries; ingest if present
Skip `credentials/` entirely. Skip `agents/*/agent/models.json` (runtime config, not user knowledge).
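
The ranking and skip rules above might be expressed as a small path classifier. This is a sketch only: `classify_source` is a hypothetical helper, and the returned labels reuse the `source_type` values this skill records in the manifest. The `sessions.json` index has no `source_type` of its own in that list, so the sketch leaves it unclassified.

```python
from fnmatch import fnmatchcase

# Checked in ranked order; paths are relative to the OpenClaw root.
RULES = [
    ("workspace/MEMORY.md", "openclaw_memory"),
    ("workspace/memory/*.md", "openclaw_daily_note"),
    ("agents/*/sessions/*.jsonl", "openclaw_session"),
    ("workspace/DREAMS.md", "openclaw_dreams"),
]
SKIP = ["credentials/*", "agents/*/agent/models.json"]


def classify_source(rel_path: str):
    """Return the source_type for a relative path, or None if skipped or unknown."""
    if any(fnmatchcase(rel_path, pattern) for pattern in SKIP):
        return None
    for pattern, source_type in RULES:
        if fnmatchcase(rel_path, pattern):
            return source_type
    return None
```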
Step 1: Survey and Compute Delta
Scan `OPENCLAW_HISTORY_PATH` and compare against `.manifest.json`:
- `~/.openclaw/workspace/MEMORY.md`
- `~/.openclaw/workspace/DREAMS.md` (if present)
- `~/.openclaw/workspace/memory/*.md`
- `~/.openclaw/agents/*/sessions/sessions.json`
- `~/.openclaw/agents/*/sessions/*.jsonl`
Classify each file:
- New: not in manifest
- Modified: in manifest but file is newer than `ingested_at`
- Unchanged: already ingested and unchanged
Report a concise delta summary before deep parsing.
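
One possible shape for the survey step is sketched below. It assumes the caller has already reduced the manifest to a mapping of relative path to `ingested_at` in POSIX seconds; `survey` and `delta_summary` are hypothetical names, not part of OpenClaw or the wiki tooling.

```python
from collections import Counter
from pathlib import Path

# Glob patterns relative to OPENCLAW_HISTORY_PATH, as listed in this step.
PATTERNS = [
    "workspace/MEMORY.md",
    "workspace/DREAMS.md",
    "workspace/memory/*.md",
    "agents/*/sessions/sessions.json",
    "agents/*/sessions/*.jsonl",
]


def survey(root: Path, ingested: dict[str, float]) -> dict[str, str]:
    """Map each source file (relative POSIX path) to new / modified / unchanged."""
    status: dict[str, str] = {}
    for pattern in PATTERNS:
        for path in sorted(root.glob(pattern)):
            rel = path.relative_to(root).as_posix()
            if rel not in ingested:
                status[rel] = "new"
            elif path.stat().st_mtime > ingested[rel]:
                status[rel] = "modified"
            else:
                status[rel] = "unchanged"
    return status


def delta_summary(status: dict[str, str]) -> str:
    """One-line summary to report before any deep parsing."""
    counts = Counter(status.values())
    return ", ".join(f"{k}={counts.get(k, 0)}" for k in ("new", "modified", "unchanged"))
```

Files that do not exist (for example a missing `DREAMS.md`) simply produce no glob match, which covers the "if present" case without special handling.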
Step 2: Parse MEMORY.md First
`MEMORY.md` contains:
- Durable facts about the user's preferences, environment, and recurring patterns
- Decisions and context the agent was told to remember
- Project-specific notes the agent accumulated over many sessions
Read it in full and extract concept-level knowledge. Do not create one wiki page per MEMORY.md entry; cluster by topic.
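
As a starting point for topic clustering, entries can be grouped under their nearest heading. This sketch assumes MEMORY.md is organized with markdown `#` headings, which is not guaranteed by anything in this skill; `split_by_heading` is a hypothetical helper.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def split_by_heading(markdown: str) -> dict[str, list[str]]:
    """Group non-empty lines under the heading they follow.

    Repeated headings are merged, so entries added at different times
    under the same title end up in one cluster.
    """
    sections: dict[str, list[str]] = {}
    current = "(untitled)"  # bucket for lines before the first heading
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            current = match.group(2)
            sections.setdefault(current, [])
        elif line.strip():
            sections.setdefault(current, []).append(line.strip())
    return sections
```

Heading groups are only a first pass; entries under one heading may still need to be split or merged by actual topic.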
Step 3: Parse Daily Notes
`workspace/memory/YYYY-MM-DD.md` files contain:
- Active project context and decisions made
- Patterns or techniques discovered
- Recurring blockers or solved problems
Older daily notes have diminishing signal; summarize in bulk rather than extracting line-by-line.
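
Splitting notes into a "read closely" set and a "summarize in bulk" set could look like this. The 7-day window (`recent_days`) is an arbitrary assumption for illustration, and `partition_daily_notes` is a hypothetical name.

```python
from datetime import date, timedelta


def partition_daily_notes(filenames, today: date, recent_days: int = 7):
    """Split YYYY-MM-DD.md filenames into (recent, older) lists, each sorted by date.

    Files whose stem is not an ISO date are ignored.
    """
    cutoff = today - timedelta(days=recent_days)
    recent, older = [], []
    for name in sorted(filenames):
        try:
            day = date.fromisoformat(name.removesuffix(".md"))
        except ValueError:
            continue  # not a daily note
        (recent if day >= cutoff else older).append(name)
    return recent, older
```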
Step 4: Parse Session JSONL Safely
Each session file is JSONL (append-only, one JSON object per line):
json
{"role": "user", "content": "...", "timestamp": "..."}
{"role": "assistant", "content": "...", "timestamp": "..."}
{"role": "tool", "name": "...", "content": "...", "timestamp": "..."}
Extraction rules
- Prioritize assistant turns that state conclusions, decisions, or patterns
- Extract user intent from high-signal turns; skip low-information follow-ups
- Tool calls are context, not primary knowledge; only extract if the result contains a reusable insight
- Cross-reference the `sessions.json` index to get session names/labels before opening individual transcripts
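
A tolerant reader for these transcripts might look like the sketch below. It assumes only the `role` and `content` fields shown in the sample records; the `min_chars` threshold is a crude, hypothetical stand-in for "low-information", and deciding which turns actually carry conclusions still needs judgment.

```python
import json


def iter_turns(lines):
    """Yield (line_no, record) for every well-formed JSON object line with a role."""
    for line_no, raw in enumerate(lines, start=1):
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            # An append-only file may end in a partially written line.
            continue
        if isinstance(record, dict) and "role" in record:
            yield line_no, record


def candidate_turns(lines, min_chars: int = 40):
    """Keep user/assistant turns with enough text to be worth reading.

    Tool records are dropped here; treat them as context only.
    """
    keep = []
    for _, record in iter_turns(lines):
        content = record.get("content")
        if (
            record["role"] in ("user", "assistant")
            and isinstance(content, str)
            and len(content) >= min_chars
        ):
            keep.append(record)
    return keep
```

Passing an open file handle as `lines` streams the transcript instead of loading it whole, which matters for long sessions.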
Critical privacy filter
Session transcripts can include injected instructions, tool payloads, and sensitive text. Do not ingest verbatim.
- Remove API keys, tokens, passwords, credentials
- Redact private identifiers unless relevant and user-approved
- Summarize; do not quote raw transcripts verbatim
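
A first-pass redaction filter could be sketched as follows. The patterns are illustrative assumptions, not a complete secret detector, and `redact` is a hypothetical helper; anything that still looks sensitive after this pass should be dropped or confirmed with the user.

```python
import re

# (pattern, replacement) pairs, applied in order. Illustrative only.
REDACTIONS = [
    # key=value or key: value style secrets
    (re.compile(r"(?i)\b(api[_-]?key|token|password|secret)(\s*[:=]\s*)\S+"), r"\1\2[REDACTED]"),
    # HTTP bearer tokens
    (re.compile(r"(?i)\b(bearer\s+)[A-Za-z0-9._~+/=-]{8,}"), r"\1[REDACTED]"),
    # email addresses as one example of a private identifier
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[REDACTED_EMAIL]"),
]


def redact(text: str) -> str:
    """Replace likely secrets and identifiers with placeholders."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text
```

Redaction runs before summarizing, so that no raw secret reaches a wiki page even in paraphrased form.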
Step 5: Cluster by Topic
Do not create one wiki page per session or per MEMORY.md entry.
- Group by stable topic (concept, tool, project, technique)
- Split mixed sessions into separate themes
- Merge recurring patterns across dates and agents
- Use session `cwd` or workspace path to infer project scope when available
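
Once each extracted note carries a topic label, merging across sessions is mostly bookkeeping. The sketch below assumes notes are dicts with hypothetical `topic`, `session_id`, and `text` keys assigned during extraction, and that the last component of a session's `cwd` is a usable project name.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def project_from_cwd(cwd):
    """Guess a project name from a session working directory (last path component)."""
    return PurePosixPath(cwd).name if cwd else None


def cluster_by_topic(notes):
    """Merge notes that share a topic, tracking which sessions contributed.

    Duplicate points recurring across sessions are kept once.
    """
    clusters = defaultdict(lambda: {"sessions": set(), "points": []})
    for note in notes:
        cluster = clusters[note["topic"]]
        cluster["sessions"].add(note["session_id"])
        if note["text"] not in cluster["points"]:
            cluster["points"].append(note["text"])
    return dict(clusters)
```

A mixed session naturally splits into several clusters here, because grouping is keyed on topic and not on session.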
Step 6: Distill into Wiki Pages
Route extracted knowledge using existing wiki conventions:
- Project-specific architecture/process → `projects/<name>/...`
- General concepts → `concepts/`
- Recurring techniques/debug playbooks → `skills/`
- Tools/services/frameworks → `entities/`
- Cross-session patterns → `synthesis/`
For each impacted project, create/update `projects/<name>/<name>.md`.
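
The routing table above maps directly onto a lookup. In this sketch the `kind` labels, the `slug` argument, and the `<slug>.md` filename convention are assumptions made for illustration; only the directory names come from this skill.

```python
# Directory per knowledge kind, following the routing list above.
ROUTES = {
    "concept": "concepts",
    "skill": "skills",
    "entity": "entities",
    "synthesis": "synthesis",
}


def target_path(kind: str, slug: str, project=None) -> str:
    """Return the vault-relative page path for a piece of extracted knowledge."""
    if kind == "project":
        if not project:
            raise ValueError("project pages need a project name")
        return f"projects/{project}/{slug}.md"
    return f"{ROUTES[kind]}/{slug}.md"


def project_overview(project: str) -> str:
    """Overview page to create or update for each impacted project."""
    return f"projects/{project}/{project}.md"
```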
Writing rules
- Distill knowledge, not chronology
- Avoid "on date X we discussed..." unless date context is essential
- Add `summary:` frontmatter on each new/updated page (1–2 sentences, ≤ 200 chars)
- Add provenance markers:
  - `^[extracted]` when directly grounded in explicit session/memory content
  - `^[inferred]` when synthesizing patterns across multiple sessions
  - `^[ambiguous]` when sessions conflict
- Add/update `provenance:` frontmatter mix for each changed page
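
The provenance mix can be derived by counting markers in the page body. The exact format of the `provenance:` field is not specified here, so this sketch assumes it is a set of fractions per marker type; `provenance_mix` and `summary_ok` are hypothetical helpers.

```python
import re
from collections import Counter

MARKER = re.compile(r"\^\[(extracted|inferred|ambiguous)\]")


def provenance_mix(body: str) -> dict[str, float]:
    """Fraction of each provenance marker in a page body (assumed 'mix' format)."""
    counts = Counter(MARKER.findall(body))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        kind: round(counts[kind] / total, 2)
        for kind in ("extracted", "inferred", "ambiguous")
    }


def summary_ok(summary: str, limit: int = 200) -> bool:
    """Check the summary length rule (at most 200 characters)."""
    return len(summary) <= limit
```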
Step 7: Update Manifest, Log, and Index
Update .manifest.json
For each processed source file:
- `ingested_at`, `size_bytes`, `modified_at`
- `source_type`: `openclaw_memory` | `openclaw_daily_note` | `openclaw_session` | `openclaw_dreams`
- `agent_id`: agent directory name (when applicable)
- `pages_created`, `pages_updated`
Add/update a top-level summary block:
json
{
  "openclaw": {
    "source_path": "~/.openclaw/",
    "last_ingested": "TIMESTAMP",
    "memory_updated_at": "TIMESTAMP",
    "daily_notes_ingested": 14,
    "sessions_ingested": 23,
    "pages_created": 6,
    "pages_updated": 18
  }
}
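
A per-file manifest entry with the fields listed above could be assembled as in this sketch. The timestamp format (ISO-8601 in UTC) and the choice to store `pages_created` / `pages_updated` as lists of page paths are assumptions; `manifest_entry` is a hypothetical helper.

```python
from datetime import datetime, timezone


def manifest_entry(source_type, size_bytes, mtime, pages_created, pages_updated,
                   agent_id=None, now=None):
    """Build the manifest record for one processed source file.

    mtime is the file's modification time in POSIX seconds.
    """
    now = now or datetime.now(timezone.utc)
    entry = {
        "ingested_at": now.isoformat(),
        "size_bytes": size_bytes,
        "modified_at": datetime.fromtimestamp(mtime, timezone.utc).isoformat(),
        "source_type": source_type,
        "pages_created": pages_created,
        "pages_updated": pages_updated,
    }
    if agent_id is not None:
        # Only session files live under an agent directory.
        entry["agent_id"] = agent_id
    return entry
```

Injecting `now` keeps the function deterministic in tests and lets one ingest run stamp every entry with the same time.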
Update special files
Update `index.md` and `log.md`:
- [TIMESTAMP] OPENCLAW_HISTORY_INGEST memory=updated daily_notes=N sessions=M pages_updated=X pages_created=Y mode=append|full
Privacy and Compliance
- Distill and synthesize; avoid raw memory or transcript dumps
- Default to redaction for anything that looks sensitive
- Ask the user before storing personal or sensitive details
- Keep references to other people minimal and purpose-bound
Reference
See `references/openclaw-data-format.md` for field-level notes and parsing guidance.