wiki-ingest
Ingest — Process source into the wiki
Follow `wiki/CONVENTIONS.md` for format conventions, frontmatter, naming, and language.
Language
Write the artifact in the user's language. Apply correct grammar and any required diacritics or script-specific characters. If the user's language is unclear, ask before generating output.
Query language alignment
When matching a source against existing wiki pages, search in the wiki language, not necessarily in the user's or source's language. Determine the wiki language from `.wiki-guardrails.yml` (`query_language` or `language`), then from wiki frontmatter/index if guardrails are absent. Keep product names, filenames, APIs, schema names, and code identifiers unchanged.
- Translate QMD `intent:`, `vec:`, and `hyde:` into the wiki language.
- Keep exact source terms in `lex:` when they may appear untranslated.
- Write/update wiki artifacts in the wiki language unless the target page clearly uses another language.
- Communicate progress and questions to the user in the user's language.
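The lookup order for the wiki language can be sketched as below. This is a minimal illustration, not part of the skill itself: the helper names `read_top_level_key` and `resolve_wiki_language` are hypothetical, the precedence of `query_language` over `language` is an assumption based on the order in which the two keys are listed, and the line-based parsing only handles simple top-level `key: value` pairs (a real YAML parser would be more robust).

```python
import re
from typing import Optional


def read_top_level_key(yaml_text: str, key: str) -> Optional[str]:
    """Return the scalar value of a top-level `key: value` line, or None.

    Hypothetical helper: handles only flat, single-line scalar values.
    """
    pattern = re.compile(rf"^{re.escape(key)}[ \t]*:[ \t]*(.*)$", re.MULTILINE)
    match = pattern.search(yaml_text)
    if not match:
        return None
    value = match.group(1).split(" #", 1)[0].strip()  # drop a trailing comment
    return value.strip("\"'") or None


def resolve_wiki_language(
    guardrails_text: Optional[str],
    frontmatter_language: Optional[str],
) -> Optional[str]:
    """Pick the wiki language: guardrails first, then frontmatter/index.

    Assumption: `query_language` wins over `language` when both are set.
    Returns None when nothing is configured, so the caller can ask the user.
    """
    if guardrails_text:
        for key in ("query_language", "language"):
            value = read_top_level_key(guardrails_text, key)
            if value:
                return value
    return frontmatter_language
```

Passing the contents of `.wiki-guardrails.yml` (or `None` when the file is absent) keeps the function free of file access, which makes the fallback to frontmatter easy to exercise.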
Retrieval — prefer QMD when available
When searching for existing pages that should absorb the new source, use QMD (local hybrid search) if it is set up for this wiki. Setup is one-time per repo — see `docs/wiki/qmd-setup.md`.
- Use `mcp__qmd__query` (MCP) or `qmd query --json --files` (CLI) instead of grep/glob whenever available.
- Always pass an `intent:` line to QMD describing the topic of the new source — it dramatically improves the matches against existing pages.
- Fall back to `grep`/`Glob` only when QMD is not configured.

After the ingest finishes, remind the owner to run `qmd update` (or `qmd embed` if a new collection was added) so the index reflects the new pages — never run those commands automatically.
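The retrieval preference above could look roughly like the following sketch. It only builds the command and never runs it. The function names are hypothetical, and two details are assumptions rather than documented QMD behavior: that the prefixed lines (`intent:`, `lex:`, `vec:`) are joined into one multi-line query, and that the query is passed to `qmd query` as a single positional argument. Check the QMD setup notes before relying on either.

```python
from typing import List, Optional


def build_qmd_query(
    intent: str,
    vec: Optional[str] = None,
    lex: Optional[str] = None,
) -> str:
    """Assemble a prefixed query; the `intent:` line is always included.

    Assumption: QMD accepts one prefixed field per line.
    """
    lines = [f"intent: {intent}"]
    if lex:
        lines.append(f"lex: {lex}")  # exact source terms, left untranslated
    if vec:
        lines.append(f"vec: {vec}")  # paraphrase in the wiki language
    return "\n".join(lines)


def retrieval_command(topic: str, qmd_configured: bool) -> List[str]:
    """Return the argv to use: QMD when configured, grep otherwise."""
    if qmd_configured:
        # Assumption: the query is a single positional argument.
        return ["qmd", "query", "--json", "--files", build_qmd_query(topic)]
    # Fallback: case-insensitive recursive grep over markdown files in wiki/.
    return ["grep", "-r", "-i", "-l", "--include=*.md", "--", topic, "wiki/"]
```

Returning an argv list instead of executing it leaves the decision to run anything with the caller, which matches the rule that index-changing commands stay with the owner.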
Steps
- Identify the source the user asked to process (comes from the conversation context or the provided path).
- Check if it has already been ingested by consulting `raw/index.md`:
  - If already ingested (has a summary in `wiki/sources/`) → follow the Re-ingest flow (section below).
  - If not → proceed with normal ingest.
- Read the source in `raw/` completely (in parts if needed due to context limits).
  - `.txt` or `.md` transcripts → read directly.
  - If `video-whisper` MCP is available → transcribe video/audio directly.
  - If `pdf-docling` MCP is available → convert PDF to markdown.
- Extract and present in parallel:
  - Identify the 3-5 most important points in the source.
  - For each point, run a QMD query (or grep fallback) to find related existing pages — use the point itself as `intent:` and a short paraphrase as `vec:`.
  - Present the points to the human alongside the matched pages.
  - Ask if the user wants to emphasize or skip anything.
  - Wait for confirmation before proceeding.
- Decide where each rule lands. Respect the audience separation:
  - `wiki/business/` — all business/product rules (audience: business). Pricing, journeys, policies, monetization, privacy/compliance rules, anything customer-facing or contractual.
  - `wiki/apps/` — app-level technical docs (audience: dev). Stack, gotchas, deploy specifics. No business rules here — link out to `wiki/business/` if relevant.
  - `wiki/ops/` — operational procedures (audience: ops).
  - `wiki/data/` — data models and schemas (audience: dev).
  - If the source mixes business and technical content, split it across the right folders rather than mashing everything into one page.
- Create/update wiki pages based on confirmation:
  - If the page already exists → update (add source in `sources:`, revise content, flag contradictions).
  - If it does not exist → create a new page following `wiki/CONVENTIONS.md`.
  - Use standard markdown links: `[text](./path.md)` (not wikilinks).
  - Do NOT duplicate content across pages — use cross-refs.
  - Sibling repos are referenced as `../<repo>/` (relative to the wiki), not absolute paths. Add the GitHub remote link `[github.com/<org>/<repo>](https://github.com/<org>/<repo>)` when first introducing the repo.
- Create the source summary in `wiki/sources/<slug>.md`:
  ```yaml
  ---
  title: "Summary — <Source Title>"
  audience: mixed
  sources:
    - raw/<subdir>/<filename>
  updated: YYYY-MM-DD
  tags: [source, <relevant-tags>]
  status: stable
  ---
  ```
  Summary content:
  - Metadata (date, participants, duration if applicable)
  - Key points summary
  - Wiki pages created/updated with links
  - Insights not captured in other pages (if any)
  - Decisions left pending — surface explicitly if the source raises questions the owner needs to answer rather than silently picking a side.
- Update indexes in parallel:
  - `raw/index.md` — mark the source as ✅ Ingested with a link to the summary.
  - `wiki/index.md` — add new pages to the corresponding table, update descriptions of modified pages.
  - `wiki/log.md` — log the operation at the top (after the header, before existing entries):
    ```
    ## [YYYY-MM-DD] ingest | <Source Title>
    - Pages created: ...
    - Pages updated: ...
    - Summary: wiki/sources/<slug>.md
    ```
- Focused post-ingest lint:
  - Verify cross-refs of created/updated pages (do links point to real targets?).
  - Compare with referenced pages — flag contradictions with an explicit section.
  - Do NOT run a full lint (orphans, global frontmatter, etc.) — that is for `/wiki-lint`.
- Tell the owner to refresh the QMD index (only if QMD is configured for this wiki):
  ```
  qmd update   # picks up changed/new pages
  qmd embed    # only needed if you added/removed a collection
  ```
  Do not run these for the owner — surface them in the final report.
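The `wiki/log.md` step ("at the top, after the header, before existing entries") can be sketched as a pure text transformation. The function names `format_log_entry` and `prepend_log_entry` are hypothetical, the `none` placeholder for empty lists is an illustrative choice, and the sketch assumes every existing entry starts with `## [`, as in the entry template above.

```python
from typing import List


def format_log_entry(
    date: str,
    title: str,
    created: List[str],
    updated: List[str],
    summary_path: str,
) -> str:
    """Render one ingest entry in the log format shown in the steps."""
    return "\n".join([
        f"## [{date}] ingest | {title}",
        f"- Pages created: {', '.join(created) or 'none'}",
        f"- Pages updated: {', '.join(updated) or 'none'}",
        f"- Summary: {summary_path}",
    ])


def prepend_log_entry(log_text: str, entry: str) -> str:
    """Insert `entry` after the header and before the first existing entry.

    Assumption: existing entries are the lines that start with "## [".
    """
    lines = log_text.splitlines()
    first_entry = next(
        (i for i, line in enumerate(lines) if line.startswith("## [")),
        len(lines),
    )
    header, rest = lines[:first_entry], lines[first_entry:]
    while header and header[-1] == "":
        header.pop()  # normalize blank lines between header and entries
    merged = header + ([""] if header else []) + entry.splitlines() + [""] + rest
    return "\n".join(merged).rstrip("\n") + "\n"
```

Working on strings rather than files keeps the newest-first ordering easy to check before anything is written back to `wiki/log.md`.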
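For the focused post-ingest lint, the cross-reference check ("do links point to real targets?") might be approximated as follows. This is a rough sketch under stated assumptions: `broken_links` is a hypothetical helper, it only recognizes inline `[text](target)` links, it skips external URLs and in-page anchors, and it resolves targets relative to the page's own folder.

```python
import re
from pathlib import Path
from typing import List

# Inline markdown links: [text](target). Reference-style links are not covered.
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
URL_SCHEME = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_links(page: Path) -> List[str]:
    """Return link targets in `page` that do not exist on disk."""
    broken = []
    for target in LINK_PATTERN.findall(page.read_text(encoding="utf-8")):
        if URL_SCHEME.match(target) or target.startswith("#"):
            continue  # external URL or in-page anchor: nothing to check on disk
        relative = target.split("#", 1)[0]  # ignore a trailing #section
        if not (page.parent / relative).exists():
            broken.append(target)
    return broken
```

Running it only over the pages created or updated in the current ingest keeps the check focused, leaving orphan detection and global checks to `/wiki-lint`.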
Re-ingest (already cataloged source)
When the source already has `wiki/sources/<slug>.md`:
- Read the source and the existing summary.
- Compare and identify gaps:
  - Concepts/information in the source that are not in the wiki (use QMD with `intent:` set to each gap candidate to confirm absence)
  - Pages that can be expanded
  - Contradictions with existing content
- Present the diagnosis to the human:
  ```
  ## Re-ingest: <Title>
  ### Identified gaps
  - ...
  ### Pages that can be expanded
  - ...
  ```
- Wait for approval and execute.
- Update the summary, `raw/index.md`, and `wiki/log.md`.
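As an illustration of the re-ingest flow, the sketch below picks the flow from the presence of the summary file and renders the diagnosis in the template shown above. Both function names are hypothetical, the `slug` is assumed to be already known, and the `(none found)` placeholder is an illustrative choice for empty sections.

```python
from pathlib import Path
from typing import List


def choose_flow(wiki_root: Path, slug: str) -> str:
    """Return "re-ingest" when wiki/sources/<slug>.md exists, else "ingest"."""
    summary = wiki_root / "sources" / f"{slug}.md"
    return "re-ingest" if summary.exists() else "ingest"


def render_reingest_diagnosis(
    title: str,
    gaps: List[str],
    expandable_pages: List[str],
) -> str:
    """Render the diagnosis that is presented before waiting for approval."""

    def bullets(items: List[str]) -> List[str]:
        return [f"- {item}" for item in items] or ["- (none found)"]

    return "\n".join([
        f"## Re-ingest: {title}",
        "### Identified gaps",
        *bullets(gaps),
        "### Pages that can be expanded",
        *bullets(expandable_pages),
    ])
```

The rendered text is only a proposal for the human; nothing is written until approval is given.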
Rules
- Never modify files in `raw/`.
- One source at a time (unless the user explicitly asks for batch).
- Always use complete YAML frontmatter (see `wiki/CONVENTIONS.md`).
- If you find a contradiction with existing pages, flag it explicitly.
- Always update `raw/index.md`, `wiki/index.md`, and `wiki/log.md`.
- Business rules belong in `wiki/business/`, never inside product/code repos. If a rule is currently sitting in a product repo (somewhere like `../<product>/docs/`), the ingest should migrate it to `wiki/business/` and leave a cross-ref in the product repo's `CLAUDE.md`/`AGENTS.md` if appropriate.
- Never run `qmd embed` / `qmd update` / `qmd collection add` automatically — those are owner-run commands.
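Two of the rules above lend themselves to simple guards, sketched here with hypothetical helper names. One reading is assumed and worth confirming with the wiki owner: since `raw/index.md` must always be updated, it is treated as the single writable file under `raw/`, while every other path there is read-only.

```python
from pathlib import PurePosixPath
from typing import Sequence

# Commands the rules reserve for the owner.
OWNER_ONLY_COMMANDS = [
    ("qmd", "embed"),
    ("qmd", "update"),
    ("qmd", "collection", "add"),
]


def is_write_allowed(repo_relative_path: str) -> bool:
    """Reject writes under raw/, except raw/index.md (assumed exception)."""
    parts = PurePosixPath(repo_relative_path).parts
    if parts and parts[0] == "raw":
        return parts == ("raw", "index.md")
    return True


def is_owner_only_command(argv: Sequence[str]) -> bool:
    """True when argv starts with a command that must not run automatically."""
    return any(tuple(argv[: len(cmd)]) == cmd for cmd in OWNER_ONLY_COMMANDS)
```

A caller could check `is_write_allowed` before every file write and `is_owner_only_command` before every shell call, and surface blocked commands in the final report instead of running them.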