wiki-ingest
Personal Wiki — Ingest
Process raw source documents into structured, interlinked wiki pages.
Identify Sources to Process
Determine which files need ingestion:
- If the user specifies a file or files, use those
- If the user says "process new sources", "ingest", or similar without specifying files, run batch detection:
  - List all files in `raw/` (excluding `raw/assets/`)
  - Read `wiki/log.md` and extract all previously ingested source filenames from its `ingest` entries
  - Any file in `raw/` not listed in the log is unprocessed
  - Show the user the list of unprocessed files found
- If no unprocessed files are found, tell the user
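The batch detection steps above can be sketched in Python. This is a minimal illustration, not the skill's actual implementation: the function names are hypothetical, and the regular expressions assume log entries follow the `## [YYYY-MM-DD] ingest | Source Title` heading and `Processed <filename>. Created ...` line shown in the log template later in this document.

```python
import re
from pathlib import Path

# Assumption: log entries follow the template from the "Update wiki/log.md" step.
INGEST_HEADING_RE = re.compile(r"^## \[\d{4}-\d{2}-\d{2}\] ingest \| ")
PROCESSED_RE = re.compile(r"^Processed (.+?)\. Created ")


def ingested_filenames(log_text: str) -> set[str]:
    """Collect source filenames recorded under `ingest` headings in the log."""
    names: set[str] = set()
    in_ingest_entry = False
    for line in log_text.splitlines():
        if line.startswith("## "):
            # A new log entry starts; only `ingest` entries count.
            in_ingest_entry = bool(INGEST_HEADING_RE.match(line))
        elif in_ingest_entry:
            match = PROCESSED_RE.match(line)
            if match:
                names.add(match.group(1))
    return names


def unprocessed_sources(raw_dir: Path, log_path: Path) -> list[Path]:
    """Return files under raw_dir (excluding raw_dir/assets) not named in the log."""
    log_text = log_path.read_text(encoding="utf-8") if log_path.exists() else ""
    done = ingested_filenames(log_text)
    assets = raw_dir / "assets"
    return sorted(
        p
        for p in raw_dir.rglob("*")
        if p.is_file() and assets not in p.parents and p.name not in done
    )
```

Matching on the bare filename keeps the sketch simple; a vault with identically named files in different subfolders of `raw/` would need to compare relative paths instead.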
Process Each Source
For each source file, process autonomously — read, create pages, and report results. No confirmation step between sources.
1. Read the source completely
Read the entire file. If the file contains image references, note them — read the images separately if they contain important information.
For books: support both chapter-by-chapter and whole-book ingestion. When ingesting a chapter, create or update a parent book page in `wiki/sources/` that links to all chapter summaries. When ingesting a whole book, create a single comprehensive source page.
2. Create source summary page
Create a new file in `wiki/sources/` named after the source (slugified). Include:
---
tags: [relevant, tags]
sources: [original-filename.md]
created: YYYY-MM-DD
updated: YYYY-MM-DD
---
# Source Title
**Source:** original-filename.md
**Date ingested:** YYYY-MM-DD
**Type:** article | paper | transcript | notes | book | chapter | etc.
## Summary
Structured summary of the source content.
## Key Claims
- Claim 1
- Claim 2
- ...
## Entities Mentioned
- [[Entity Name]] — brief context
- ...
## Concepts Covered
- [[Concept Name]] — brief context
- ...
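As a rough illustration of the template above, the sketch below slugifies a source filename and renders the page skeleton. The `slugify` rules (ASCII-fold, lowercase, hyphen-separated) and both function names are assumptions; the document only says the file is "slugified" and does not define the scheme.

```python
import re
import unicodedata
from datetime import date
from typing import Optional


def slugify(filename: str) -> str:
    """Hypothetical slug rule: drop the extension, ASCII-fold, lowercase, hyphenate."""
    stem = filename.rsplit(".", 1)[0] if "." in filename else filename
    folded = unicodedata.normalize("NFKD", stem).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", folded.lower()).strip("-")


def render_source_page(
    title: str,
    filename: str,
    source_type: str,
    tags: list[str],
    today: Optional[date] = None,
) -> str:
    """Render the source summary skeleton; section bodies are left empty to fill in."""
    day = (today or date.today()).isoformat()
    return "\n".join(
        [
            "---",
            f"tags: [{', '.join(tags)}]",
            f"sources: [{filename}]",
            f"created: {day}",
            f"updated: {day}",
            "---",
            f"# {title}",
            f"**Source:** {filename}",
            f"**Date ingested:** {day}",
            f"**Type:** {source_type}",
            "## Summary",
            "",
            "## Key Claims",
            "",
            "## Entities Mentioned",
            "",
            "## Concepts Covered",
            "",
        ]
    )
```

For example, `slugify("My Article (2024).md")` yields `my-article-2024`, so the page would be written to `wiki/sources/my-article-2024.md`.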
3. Update entity and concept pages
For each entity (person, organization, product, tool) and concept (idea, framework, theory, pattern) mentioned in the source:
If a wiki page already exists:
- Read the existing page
- Add new information from this source
- Add the source to the frontmatter `sources:` list
- Update the `updated:` date
- Flag any contradictions with existing content using a callout block:
> [!warning] Contradiction
> Source A claims X, but Source B claims Y.
> Sources: [[Source A]], [[Source B]]
If no wiki page exists and the topic is substantive enough:
- Create a new page in the appropriate subdirectory:
  - `wiki/entities/` for people, organizations, products, tools
  - `wiki/concepts/` for ideas, frameworks, theories, patterns
- Include YAML frontmatter with tags, sources, created, and updated fields
- Write a focused summary based on what this source says about the topic
If the topic is only mentioned in passing:
- Use a `[[wikilink]]` without creating a page; the lint pass will flag frequently-mentioned-but-missing pages later
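The existing-page update steps above (extend `sources:`, bump `updated:`, flag contradictions) might look like the following sketch. It assumes the one-line `sources: [a, b]` frontmatter style shown in the source page template and deliberately avoids a YAML library; the function names are hypothetical, and pages using multi-line YAML lists would need a real parser.

```python
import re
from datetime import date


def add_source(page_text: str, source: str, today: date) -> str:
    """Add `source` to the one-line `sources:` list and refresh `updated:`."""
    if not page_text.startswith("---\n"):
        raise ValueError("page has no frontmatter")
    end = page_text.index("\n---", 4)  # closing frontmatter delimiter
    head, body = page_text[4:end], page_text[end:]
    lines = []
    for line in head.split("\n"):
        match = re.fullmatch(r"sources: \[(.*)\]", line)
        if match:
            items = [s.strip() for s in match.group(1).split(",") if s.strip()]
            if source not in items:  # keep the list free of duplicates
                items.append(source)
            line = f"sources: [{', '.join(items)}]"
        elif line.startswith("updated:"):
            line = f"updated: {today.isoformat()}"
        lines.append(line)
    return "---\n" + "\n".join(lines) + body


def contradiction_callout(source_a: str, claim_a: str, source_b: str, claim_b: str) -> str:
    """Format the warning callout used to flag conflicting claims."""
    return (
        "> [!warning] Contradiction\n"
        f"> {source_a} claims {claim_a}, but {source_b} claims {claim_b}.\n"
        f"> Sources: [[{source_a}]], [[{source_b}]]"
    )
```

Calling `add_source` twice with the same filename leaves the `sources:` list unchanged the second time, which keeps re-ingestion from duplicating entries.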
4. Add wikilinks
Ensure all related pages link to each other using `[[wikilink]]` syntax. Every mention of an entity or concept that has its own page should be linked.
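One way to link every mention of a known page is sketched below. The `add_wikilinks` helper is hypothetical, matches titles case-sensitively on whole words, and leaves existing `[[...]]` links untouched; it does not try to skip frontmatter or code blocks, which a fuller version would need to handle.

```python
import re


def add_wikilinks(text: str, page_titles: list[str]) -> str:
    """Wrap unlinked mentions of known page titles in [[...]]."""
    if not page_titles:
        return text
    # Longest titles first so a long title wins over a shorter one it contains.
    ordered = sorted(page_titles, key=len, reverse=True)
    alternation = "|".join(re.escape(title) for title in ordered)
    mention = re.compile(rf"(?<!\w)(?:{alternation})(?!\w)")
    # The capturing group makes re.split keep existing links at odd indexes.
    chunks = re.split(r"(\[\[.*?\]\])", text)
    for i in range(0, len(chunks), 2):  # even indexes are plain, unlinked text
        chunks[i] = mention.sub(lambda m: f"[[{m.group(0)}]]", chunks[i])
    return "".join(chunks)
```

Splitting on existing links first is what prevents double wrapping such as `[[[[Page]]]]` when the function is run again on an already linked page.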
5. Update wiki/index.md
For each new page created, add an entry under the appropriate category header:
- [[Page Name]] — one-line summary (under 120 characters)
6. Update wiki/log.md
Append:
## [YYYY-MM-DD] ingest | Source Title
Processed source-filename.md. Created N new pages, updated M existing pages.
New entities: [[Entity1]], [[Entity2]]. New concepts: [[Concept1]].
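The log entry above can be generated from the ingest results. The following is a sketch with hypothetical function names; writing `none` when a list is empty is an assumption, since the template does not say what to do in that case.

```python
from datetime import date
from pathlib import Path


def log_entry(
    title: str,
    filename: str,
    created: int,
    updated: int,
    new_entities: list[str],
    new_concepts: list[str],
    today: date,
) -> str:
    """Format one ingest entry in the shape of the log template."""

    def links(names: list[str]) -> str:
        # Assumption: an empty list is written as "none".
        return ", ".join(f"[[{name}]]" for name in names) or "none"

    return (
        f"## [{today.isoformat()}] ingest | {title}\n"
        f"Processed {filename}. Created {created} new pages, updated {updated} existing pages.\n"
        f"New entities: {links(new_entities)}. New concepts: {links(new_concepts)}.\n"
    )


def append_log(log_path: Path, entry: str) -> None:
    """Append the entry to the log, separated from earlier entries by a blank line."""
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write("\n" + entry)
```

Keeping the `Processed <filename>.` line in a fixed shape matters because batch detection reads these entries back to decide which files in `raw/` are still unprocessed.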
7. Report results
Tell the user what was done:
- Pages created (with links)
- Pages updated (with what changed)
- New entities and concepts identified
- Any contradictions found with existing content
When processing multiple sources (batch), report aggregate results at the end:
- Total sources processed
- Total pages created and updated
- Summary of new entities and concepts across all sources
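For batch runs, the aggregate figures listed above can be computed from per-source results. The `SourceResult` record and `aggregate` function below are hypothetical; in this sketch a page touched by several sources is counted once per category.

```python
from dataclasses import dataclass, field


@dataclass
class SourceResult:
    """Hypothetical per-source record of what one ingest produced."""

    created: list[str] = field(default_factory=list)
    updated: list[str] = field(default_factory=list)
    entities: list[str] = field(default_factory=list)
    concepts: list[str] = field(default_factory=list)


def aggregate(results: list[SourceResult]) -> dict:
    """Summarize a batch: source count, page totals, and new entities and concepts."""

    def unique(groups) -> list[str]:
        # dict.fromkeys drops duplicates while keeping first-seen order.
        return list(dict.fromkeys(name for group in groups for name in group))

    return {
        "sources_processed": len(results),
        "pages_created": len(unique(r.created for r in results)),
        "pages_updated": len(unique(r.updated for r in results)),
        "new_entities": unique(r.entities for r in results),
        "new_concepts": unique(r.concepts for r in results),
    }
```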
Conventions
- Tags are organic — let them emerge naturally from content. Don't force a domain taxonomy. When a source spans multiple domains, tag with all relevant domains.
- Source summary pages are factual only. Save interpretation and synthesis for concept and synthesis pages.
- A single source typically touches 10-15 wiki pages. This is normal and expected.
- When new information contradicts existing wiki content, add a `> [!warning] Contradiction` callout with both sources cited.
- Prefer updating existing pages over creating new ones. Only create a new page when the topic is substantive enough to warrant it.
- Use `[[wikilinks]]` for all internal references. Never use raw file paths.
What's Next
After ingesting sources, the user can:
- Ask questions with `/wiki-query` to explore what was ingested
- Ingest more sources: clip another article and run `/wiki-ingest` again
- Health-check with `/wiki-lint` after every 10 ingests to catch gaps
- Discover connections with `/wiki-explore` to find cross-domain patterns
- Get inspired with `/wiki-spark` to generate creative prompts