wiki-ingest

Personal Wiki — Ingest

Process raw source documents into structured, interlinked wiki pages.

Identify Sources to Process

Determine which files need ingestion:
  1. If the user specifies a file or files, use those
  2. If the user says "process new sources", "ingest", or similar without specifying files — run batch detection:
    • List all files in raw/ (excluding raw/assets/)
    • Read wiki/log.md and extract all previously ingested source filenames from ingest entries
    • Any file in raw/ not listed in the log is unprocessed
    • Show the user the list of unprocessed files found
  3. If no unprocessed files are found, tell the user
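
The batch detection steps above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: it assumes the log entry layout shown in the "Update wiki/log.md" step (a heading containing `ingest` followed by a `Processed <filename>.` line), and the function names `ingested_filenames` and `unprocessed_sources` are hypothetical.

```python
import re
from pathlib import Path

# Assumed log layout, taken from the "Update wiki/log.md" step:
#   ## [YYYY-MM-DD] ingest | Source Title
#   Processed source-filename.md. Created N new pages, ...
INGEST_HEADING = re.compile(r"^## \[\d{4}-\d{2}-\d{2}\] ingest \|")
PROCESSED_LINE = re.compile(r"^Processed (?P<name>.+?)\. Created ")


def ingested_filenames(log_text: str) -> set[str]:
    """Collect source filenames named in ingest entries of the log."""
    names: set[str] = set()
    in_ingest_entry = False
    for line in log_text.splitlines():
        if line.startswith("## "):
            # A new entry starts; only ingest entries are of interest.
            in_ingest_entry = bool(INGEST_HEADING.match(line))
            continue
        if in_ingest_entry:
            match = PROCESSED_LINE.match(line.strip())
            if match:
                names.add(match.group("name"))
    return names


def unprocessed_sources(root: Path) -> list[Path]:
    """List files under raw/ (skipping raw/assets/) that the log does not mention."""
    raw_dir = root / "raw"
    assets_dir = raw_dir / "assets"
    log_path = root / "wiki" / "log.md"
    log_text = log_path.read_text(encoding="utf-8") if log_path.exists() else ""
    seen = ingested_filenames(log_text)
    pending = []
    for path in sorted(raw_dir.rglob("*")):
        if not path.is_file() or assets_dir in path.parents:
            continue
        if path.name not in seen:
            pending.append(path)
    return pending
```

Matching on the bare filename is a simplification: two files with the same name in different subdirectories of raw/ would be treated as one.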

Process Each Source

For each source file, process autonomously — read, create pages, and report results. No confirmation step between sources.

1. Read the source completely

Read the entire file. If the file contains image references, note them, and read the images separately if they contain important information.
For books: support both chapter-by-chapter and whole-book ingestion. When ingesting a chapter, create or update a parent book page in wiki/sources/ that links to all chapter summaries. When ingesting a whole book, create a single comprehensive source page.

2. Create source summary page

Create a new file in wiki/sources/ named after the source (slugified). Include:
---
tags: [relevant, tags]
sources: [original-filename.md]
created: YYYY-MM-DD
updated: YYYY-MM-DD
---

# Source Title

**Source:** original-filename.md
**Date ingested:** YYYY-MM-DD
**Type:** article | paper | transcript | notes | book | chapter | etc.

## Summary

Structured summary of the source content.

## Key Claims

- Claim 1
- Claim 2
- ...

## Entities Mentioned

- [[Entity Name]] — brief context
- ...

## Concepts Covered

- [[Concept Name]] — brief context
- ...
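
As an illustration, the template above could be filled in programmatically along these lines. The exact slug rule is not specified in this document, so `slugify` below is one common convention (lowercase ASCII, hyphen-separated) and should be read as an assumption; `render_source_page` and its parameters are likewise hypothetical names.

```python
import re
import unicodedata
from datetime import date


def slugify(title: str) -> str:
    """Lowercase, strip accents, and join alphanumeric runs with hyphens."""
    ascii_text = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def render_source_page(title, filename, source_type, summary, claims,
                       entities, concepts, tags, today=None):
    """Fill in the source summary template; entities/concepts map name -> context."""
    day = (today or date.today()).isoformat()
    lines = [
        "---",
        f"tags: [{', '.join(tags)}]",
        f"sources: [{filename}]",
        f"created: {day}",
        f"updated: {day}",
        "---",
        "",
        f"# {title}",
        "",
        f"**Source:** {filename}",
        f"**Date ingested:** {day}",
        f"**Type:** {source_type}",
        "",
        "## Summary",
        "",
        summary,
        "",
        "## Key Claims",
        "",
        *[f"- {claim}" for claim in claims],
        "",
        "## Entities Mentioned",
        "",
        # \u2014 is the dash used between the link and its context in the template.
        *[f"- [[{name}]] \u2014 {context}" for name, context in entities.items()],
        "",
        "## Concepts Covered",
        "",
        *[f"- [[{name}]] \u2014 {context}" for name, context in concepts.items()],
    ]
    return "\n".join(lines) + "\n"
```

The resulting text would be written to `wiki/sources/<slug>.md`, where the slug comes from the source name.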

3. Update entity and concept pages

For each entity (person, organization, product, tool) and concept (idea, framework, theory, pattern) mentioned in the source:
If a wiki page already exists:
  • Read the existing page
  • Add new information from this source
  • Add the source to the sources: frontmatter list
  • Update the updated: date
  • Flag any contradictions with existing content using a callout block:
> [!warning] Contradiction
> Source A claims X, but Source B claims Y.
> Sources: [[Source A]], [[Source B]]
If no wiki page exists and the topic is substantive enough:
  • Create a new page in the appropriate subdirectory:
    • wiki/entities/ for people, organizations, products, tools
    • wiki/concepts/ for ideas, frameworks, theories, patterns
  • Include YAML frontmatter with tags, sources, created, and updated fields
  • Write a focused summary based on what this source says about the topic
If the topic is only mentioned in passing:
  • Use a [[wikilink]] without creating a page; the lint pass will flag frequently mentioned but missing pages later
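
For the existing-page case, the frontmatter bookkeeping and the contradiction callout might look like the sketch below. It assumes the inline-list frontmatter style shown in the source summary template (`sources: [a.md, b.md]`) and uses only the standard library rather than a YAML parser; both function names are hypothetical.

```python
import re
from datetime import date


def add_source_to_frontmatter(page_text: str, source_filename: str, today: date) -> str:
    """Append a source to `sources:` and refresh `updated:`, in the frontmatter only."""
    match = re.match(r"^---\n(.*?)\n---\n", page_text, flags=re.DOTALL)
    if match is None:
        raise ValueError("page has no YAML frontmatter block")
    frontmatter = match.group(1)

    def extend_sources(m: re.Match) -> str:
        existing = [item.strip() for item in m.group(1).split(",") if item.strip()]
        if source_filename not in existing:
            existing.append(source_filename)
        return f"sources: [{', '.join(existing)}]"

    frontmatter = re.sub(
        r"^sources: \[(.*)\]$", extend_sources, frontmatter, flags=re.MULTILINE
    )
    frontmatter = re.sub(
        r"^updated: .*$", f"updated: {today.isoformat()}", frontmatter, flags=re.MULTILINE
    )
    # Everything after the closing --- is left untouched.
    return f"---\n{frontmatter}\n---\n" + page_text[match.end():]


def contradiction_callout(source_a: str, claim_a: str, source_b: str, claim_b: str) -> str:
    """Build the warning callout used to flag conflicting claims."""
    return (
        "> [!warning] Contradiction\n"
        f"> {source_a} claims {claim_a}, but {source_b} claims {claim_b}.\n"
        f"> Sources: [[{source_a}]], [[{source_b}]]"
    )
```

Restricting the substitutions to the frontmatter block avoids rewriting a body line that happens to begin with `updated:`.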

4. Add wikilinks

Ensure all related pages link to each other using [[wikilink]] syntax. Every mention of an entity or concept that has its own page should be linked.
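
One way to apply this rule mechanically is sketched below. It is a simplified illustration under stated assumptions: page names are matched exactly and case-sensitively, existing links are left as they are, and the function name `link_mentions` is hypothetical. It makes no attempt to skip frontmatter or code blocks.

```python
import re


def link_mentions(text: str, page_names: list[str]) -> str:
    """Wrap unlinked mentions of known page names in [[...]]; existing links are kept."""
    if not page_names:
        return text
    # Longest names first so "Ada Lovelace" wins over "Ada".
    names = sorted(page_names, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in names)
    # Match an existing [[...]] link first (left untouched), else a whole-word name.
    pattern = re.compile(rf"\[\[[^\]]*\]\]|(?<!\w)(?:{alternatives})(?!\w)")

    def replace(match: re.Match) -> str:
        found = match.group(0)
        return found if found.startswith("[[") else f"[[{found}]]"

    return pattern.sub(replace, text)
```

For example, `link_mentions("Ada met Grace.", ["Ada"])` returns `"[[Ada]] met Grace."`.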

5. Update wiki/index.md

For each new page created, add an entry under the appropriate category header:
- [[Page Name]] — one-line summary (under 120 characters)
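
A sketch of inserting such an entry follows. The layout of wiki/index.md is not shown in this document, so the code assumes category headers are `## Category` lines and that the 120-character limit applies to the summary text; `add_index_entry` is a hypothetical name.

```python
def add_index_entry(index_text: str, category: str, page_name: str, summary: str) -> str:
    """Insert an entry line at the end of the given category's section."""
    if len(summary) >= 120:
        raise ValueError("summary must be under 120 characters")
    # \u2014 is the dash used between the link and the summary in the entry format.
    entry = f"- [[{page_name}]] \u2014 {summary}"
    lines = index_text.splitlines()
    header = f"## {category}"  # assumed header style
    if header not in lines:
        # Unknown category: append a new section at the end.
        if lines and lines[-1] != "":
            lines.append("")
        lines += [header, "", entry]
        return "\n".join(lines) + "\n"
    start = lines.index(header) + 1
    # The section ends at the next header or at the end of the file.
    end = next(
        (i for i in range(start, len(lines)) if lines[i].startswith("#")), len(lines)
    )
    # Step back over trailing blank lines so the entry joins the existing list.
    insert_at = end
    while insert_at > start and lines[insert_at - 1] == "":
        insert_at -= 1
    lines.insert(insert_at, entry)
    return "\n".join(lines) + "\n"
```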

6. Update wiki/log.md

Append:
## [YYYY-MM-DD] ingest | Source Title
Processed source-filename.md. Created N new pages, updated M existing pages.
New entities: [[Entity1]], [[Entity2]]. New concepts: [[Concept1]].
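
The entry format above could be produced with a small helper such as the following sketch. The function names are hypothetical, and writing "none" when a list is empty is an assumption, since the template does not cover that case.

```python
from datetime import date
from pathlib import Path


def format_log_entry(day: date, title: str, filename: str, created: int, updated: int,
                     new_entities: list[str], new_concepts: list[str]) -> str:
    """Render one ingest entry in the log format shown above."""
    def links(names: list[str]) -> str:
        # "none" for an empty list is an assumption, not part of the template.
        return ", ".join(f"[[{name}]]" for name in names) if names else "none"

    return (
        f"## [{day.isoformat()}] ingest | {title}\n"
        f"Processed {filename}. Created {created} new pages, updated {updated} existing pages.\n"
        f"New entities: {links(new_entities)}. New concepts: {links(new_concepts)}.\n"
    )


def append_log_entry(log_path: Path, entry: str) -> None:
    """Append the entry to the log, separated from earlier entries by a blank line."""
    with log_path.open("a", encoding="utf-8") as log:
        log.write("\n" + entry)
```

Opening the file in append mode keeps earlier entries intact, which matters because batch detection relies on the log as the record of what has already been ingested.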

7. Report results

Tell the user what was done:
  • Pages created (with links)
  • Pages updated (with what changed)
  • New entities and concepts identified
  • Any contradictions found with existing content
When processing multiple sources (batch), report aggregate results at the end:
  • Total sources processed
  • Total pages created and updated
  • Summary of new entities and concepts across all sources
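
The aggregate figures for a batch could be derived from per-source results roughly as follows. `IngestResult` and `aggregate` are hypothetical names for illustration, and counting each page once even if several sources touched it is a design assumption rather than something this document specifies.

```python
from dataclasses import dataclass, field


@dataclass
class IngestResult:
    """What one source produced (illustrative structure)."""
    source: str
    pages_created: list[str] = field(default_factory=list)
    pages_updated: list[str] = field(default_factory=list)
    new_entities: list[str] = field(default_factory=list)
    new_concepts: list[str] = field(default_factory=list)


def aggregate(results: list[IngestResult]) -> dict:
    """Combine per-source results into the batch totals listed above."""
    def unique(items):
        # dict.fromkeys removes duplicates while keeping first-seen order.
        return list(dict.fromkeys(items))

    created = unique(p for r in results for p in r.pages_created)
    updated = unique(p for r in results for p in r.pages_updated)
    return {
        "sources_processed": len(results),
        "pages_created": len(created),
        "pages_updated": len(updated),
        "new_entities": unique(e for r in results for e in r.new_entities),
        "new_concepts": unique(c for r in results for c in r.new_concepts),
    }
```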

Conventions

  • Tags are organic — let them emerge naturally from content. Don't force a domain taxonomy. When a source spans multiple domains, tag with all relevant domains.
  • Source summary pages are factual only. Save interpretation and synthesis for concept and synthesis pages.
  • A single source typically touches 10-15 wiki pages. This is normal and expected.
  • When new information contradicts existing wiki content, add a > [!warning] Contradiction callout with both sources cited.
  • Prefer updating existing pages over creating new ones. Only create a new page when the topic is substantive enough to warrant it.
  • Use [[wikilinks]] for all internal references. Never use raw file paths.

What's Next

After ingesting sources, the user can:
  • Ask questions with /wiki-query to explore what was ingested
  • Ingest more sources: clip another article and run /wiki-ingest again
  • Health-check with /wiki-lint after every 10 ingests to catch gaps
  • Discover connections with /wiki-explore to find cross-domain patterns
  • Get inspired with /wiki-spark to generate creative prompts