llm-wiki-skill

llm-wiki-skill — Multi-Platform Knowledge Base Builder

Skill by ara.so — Daily 2026 Skills collection.
Build a persistent, interlinked personal knowledge base from URLs, PDFs, markdown files, and raw text. Based on Karpathy's llm-wiki methodology: knowledge is compiled once and maintained, not re-derived from raw docs on every query.

What It Does

  • Ingests articles, tweets, PDFs, YouTube transcripts, WeChat posts, and plain text
  • Routes each source type to the best extraction tool automatically
  • Generates structured wiki pages with [[bidirectional links]]
  • Produces entity pages, topic pages, source summaries, and comparisons
  • Outputs Obsidian-compatible local markdown files
  • Detects orphaned pages, broken links, and contradictions via health checks

Installation

Recommended: Let Your Agent Install It

Give your agent the repo URL and ask it to install for your platform:
https://github.com/sdyckjq-lab/llm-wiki-skill

Manual Installation

Clone the repo anywhere, then run the installer for your platform:
```bash
# Claude Code
bash install.sh --platform claude

# Codex
bash install.sh --platform codex

# OpenClaw
bash install.sh --platform openclaw

# Auto-detect (only if one platform directory exists)
bash install.sh --platform auto

# Custom target directory (OpenClaw non-standard path)
bash install.sh --platform openclaw --target-dir /path/to/your/skills
```

Default Install Locations

| Platform | Path |
| --- | --- |
| Claude Code | ~/.claude/skills/llm-wiki |
| Codex | ~/.codex/skills/llm-wiki |
| OpenClaw | ~/.openclaw/skills/llm-wiki |

Legacy Claude Setup (existing users)

```bash
bash setup.sh
# This is now a compatibility shim for the unified installer
```

Prerequisites

```bash
# Start Chrome in debug mode (needed for web extraction)
google-chrome --remote-debugging-port=9222 &

# Check uv is installed (needed for WeChat + YouTube extraction)
uv --version

# Install bun OR npm (one is enough for web extraction deps)
curl -fsSL https://bun.sh/install | bash
# OR: npm is already available in most environments
```

Platform Entry Points

After installation, read the platform-specific instructions:
  • Claude Code: platforms/claude/CLAUDE.md
  • Codex: platforms/codex/AGENTS.md
  • OpenClaw: platforms/openclaw/README.md

Knowledge Base Structure

your-wiki/
├── raw/                    # Immutable source material
│   ├── articles/           # Web articles
│   ├── tweets/             # X/Twitter
│   ├── wechat/             # WeChat posts
│   ├── xiaohongshu/        # Xiaohongshu (manual paste only)
│   ├── zhihu/              # Zhihu
│   ├── pdfs/               # PDFs
│   ├── notes/              # Notes
│   └── assets/             # Images, attachments
├── wiki/                   # AI-generated knowledge base
│   ├── entities/           # People, concepts, tools
│   ├── topics/             # Topic pages
│   ├── sources/            # Source summaries
│   ├── comparisons/        # Side-by-side analysis
│   └── synthesis/          # Cross-source synthesis
├── index.md                # Master index
├── log.md                  # Operation log
└── .wiki-schema.md         # Config
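
The layout above can also be created by hand. The following is a minimal sketch, assuming a POSIX shell; `scaffold_wiki` is a hypothetical helper name and not part of the skill's own scripts, and the agent may well populate `index.md`, `log.md`, and `.wiki-schema.md` with initial content rather than leaving them empty.

```bash
# Hypothetical helper (not shipped with the skill): create the folder
# layout shown in the tree above under the given root directory.
scaffold_wiki() {
  root="$1"

  # Immutable source material, one folder per source type.
  for d in articles tweets wechat xiaohongshu zhihu pdfs notes assets; do
    mkdir -p "$root/raw/$d"
  done

  # AI-generated wiki pages, one folder per page type.
  for d in entities topics sources comparisons synthesis; do
    mkdir -p "$root/wiki/$d"
  done

  # Top-level files: created empty, and only if they do not exist yet.
  for f in index.md log.md .wiki-schema.md; do
    [ -e "$root/$f" ] || : > "$root/$f"
  done
}
```

Calling `scaffold_wiki ~/my-wiki` would produce the same directory names as the tree; re-running it on an existing knowledge base leaves existing files untouched.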

Core Workflows

Initialize a New Knowledge Base

```bash
# Ask your agent:
#   "Create a new knowledge base at ~/my-wiki"

# The agent will scaffold the directory structure and
# generate index.md, log.md, and .wiki-schema.md
```

Ingest a Web Article

```bash
# Agent command pattern:
#   "Add this article to my wiki: https://example.com/article"

# Under the hood, the agent routes to baoyu-url-to-markdown:
npx baoyu-url-to-markdown https://example.com/article > raw/articles/article-slug.md
```

Ingest a YouTube Video

```bash
# Agent command pattern:
#   "Digest this YouTube video into my knowledge base: https://youtube.com/watch?v=abc123"

# Uses youtube-transcript via uv (URL quoted so the shell does not glob on "?"):
uvx youtube-transcript "https://youtube.com/watch?v=abc123" > raw/articles/video-slug.md
```

Ingest a WeChat Article

```bash
# Agent command pattern:
#   "Add this WeChat article to my wiki: https://mp.weixin.qq.com/s/..."

# Uses wechat-article-to-markdown via uv:
uvx wechat-article-to-markdown https://mp.weixin.qq.com/s/... > raw/wechat/article-slug.md
```

Ingest a PDF

```bash
# Agent command pattern:
#   "Process this PDF into my wiki: /path/to/paper.pdf"
# OR drag a file into the chat

# No external tool needed; the file goes directly into the main pipeline:
cp /path/to/paper.pdf raw/pdfs/paper.pdf
```

Ingest Raw Text or Notes

```bash
# Just paste text to your agent:
#   "Add these notes to my wiki: [paste content]"

# Agent writes directly to:
echo "your content" > raw/notes/note-slug.md
```

Batch Process a Folder

```bash
# Agent command pattern:
#   "Process all files in ~/Downloads/research into my wiki"

# Agent iterates over files and routes each by type
for f in ~/Downloads/research/*; do
  # agent determines type and processes accordingly
  echo "processing: $f"  # placeholder for the per-type handling
done
```
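
For local files, routing by type can be as simple as a lookup on the file extension. The sketch below is an illustration only: `raw_dir_for_file` is a hypothetical function, and sending Markdown and text files to `raw/notes` and images to `raw/assets` is an assumption based on the directory tree, not documented behavior of the skill.

```bash
# Hypothetical helper: map a local file to the raw/ subfolder it would
# most plausibly be stored in. The mapping is an assumption.
raw_dir_for_file() {
  # Lowercase the name so ".PDF" and ".pdf" are treated alike.
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    *.pdf)                     echo "raw/pdfs" ;;
    *.md|*.markdown|*.txt)     echo "raw/notes" ;;
    *.png|*.jpg|*.jpeg|*.gif)  echo "raw/assets" ;;
    *)                         echo "unknown" ;;
  esac
}
```

Inside the loop above, `raw_dir_for_file "$f"` could replace the placeholder line; files reported as `unknown` would be left for the agent to inspect.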

Health Check

```bash
# Agent command pattern:
#   "Run a health check on my knowledge base"

# Agent checks for:
# - Orphaned pages (no incoming links)
# - Broken [[wiki links]]
# - Contradictory information across pages
# - Missing source summaries
```
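
The broken-link check in particular is mechanical enough to run without an agent. The following is a rough sketch, not the skill's actual implementation: `find_broken_links` is a hypothetical function, and the resolution rules (a target resolves relative to `wiki/`, relative to the knowledge base root, or by bare file name anywhere under `wiki/`) are assumptions modeled on how Obsidian-style links usually behave.

```bash
# Hypothetical helper: print every [[link]] under <root>/wiki whose
# target page cannot be found. Resolution rules are assumptions.
find_broken_links() {
  root="$1"
  find "$root/wiki" -type f -name '*.md' | while IFS= read -r page; do
    # Extract each [[...]] occurrence, one per line.
    grep -o '\[\[[^]]*]]' "$page" | while IFS= read -r link; do
      target=${link#??}         # strip leading "[["
      target=${target%??}       # strip trailing "]]"
      target=${target%%'|'*}    # drop an optional "|alias"
      target=${target%%'#'*}    # drop an optional "#heading"
      [ -n "$target" ] || continue

      # Resolved relative to wiki/ (e.g. entities/openai)?
      [ -e "$root/wiki/$target.md" ] && continue
      # Resolved relative to the root (e.g. raw/articles/slug)?
      [ -e "$root/$target.md" ] && continue

      # Otherwise accept any page under wiki/ with the same file name.
      name=${target##*/}
      if [ -z "$(find "$root/wiki" -type f -name "$name.md" | head -n 1)" ]; then
        echo "$page: [[$target]]"
      fi
    done
  done
}
```

Running `find_broken_links ~/my-wiki` prints one `page: [[target]]` line per unresolved link and prints nothing when every link resolves.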

Source Routing Reference

| Source Type | Tool Used | Requires |
| --- | --- | --- |
| Web articles | baoyu-url-to-markdown | Chrome debug mode |
| X/Twitter | baoyu-url-to-markdown | Chrome debug mode + X login |
| Zhihu | baoyu-url-to-markdown | Chrome debug mode |
| WeChat | wechat-article-to-markdown | uv |
| YouTube | youtube-transcript | uv |
| Xiaohongshu | Manual paste | Nothing |
| PDF / Markdown / Text | Direct pipeline | Nothing |

Source registry lives at: scripts/source-registry.tsv
Routing logic lives at: scripts/source-registry.sh
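
To make the table concrete, here is a minimal sketch of URL-based routing. It is not the contents of `scripts/source-registry.sh`: the function name `route_url`, the host patterns (including `youtu.be` and `xiaohongshu.com`), and the returned labels `manual-paste` and `direct-pipeline` are all assumptions chosen to mirror the table.

```bash
# Hypothetical sketch of the routing table; the real logic lives in
# scripts/source-registry.sh and may differ.
route_url() {
  case "$1" in
    *://mp.weixin.qq.com/*)
      echo "wechat-article-to-markdown" ;;
    *://youtube.com/*|*://www.youtube.com/*|*://youtu.be/*)
      echo "youtube-transcript" ;;
    *://xiaohongshu.com/*|*://www.xiaohongshu.com/*)
      echo "manual-paste" ;;
    http://*|https://*)
      # Web articles, X/Twitter, and Zhihu all share the same tool.
      echo "baoyu-url-to-markdown" ;;
    *)
      # Anything that is not a URL (PDF, Markdown, text).
      echo "direct-pipeline" ;;
  esac
}
```

The more specific hosts are listed first so that the generic `http://*|https://*` branch only catches sources without a dedicated tool.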

Wiki Page Conventions

Entity Page (wiki/entities/andrej-karpathy.md)

```markdown
# Andrej Karpathy

## Overview
Former OpenAI/Tesla researcher, creator of llm-wiki methodology.

## Key Ideas
- [[llm-wiki]]: compile knowledge once, maintain over time
- [[nanoGPT]]: minimal GPT implementation for education

## Sources
- [[sources/llm-wiki-gist-2024]]
- [[sources/karpathy-interview-2023]]

## Related
- [[topics/language-models]]
- [[entities/openai]]
```

Topic Page (wiki/topics/retrieval-augmented-generation.md)

```markdown
# Retrieval-Augmented Generation

## Summary
...

## Key Entities
- [[entities/langchain]]
- [[entities/llamaindex]]

## Comparisons
- [[comparisons/rag-vs-finetuning]]

## Sources
- [[sources/rag-paper-2020]]
```

Source Summary (wiki/sources/article-slug.md)

```markdown
# Source: Article Title

## Key Points
1. ...
2. ...

## Entities Mentioned
- [[entities/...]]

**Raw**: [[raw/articles/article-slug]]
```

Troubleshooting

Chrome / Web Extraction Fails

```bash
# Start Chrome with remote debugging enabled
google-chrome --remote-debugging-port=9222 --no-first-run &

# Verify it's running
# For X/Twitter: make sure you're logged in on that Chrome session
# Then retry the extraction
```
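
One way to verify that Chrome is listening is to query the DevTools HTTP endpoint that Chrome serves on the debugging port. This is a sketch under assumptions: `chrome_debug_running` is a hypothetical helper, it assumes `curl` is installed, and it is not necessarily the check the skill itself performs.

```bash
# Hypothetical helper: succeed if a DevTools endpoint answers on the
# given port (default 9222), fail otherwise.
chrome_debug_running() {
  port="${1:-9222}"
  # /json/version is served by Chrome when --remote-debugging-port is set.
  # -f: fail on HTTP errors, -sS: quiet but keep errors, 3 s timeout.
  curl -fsS --max-time 3 "http://127.0.0.1:$port/json/version" >/dev/null 2>&1
}
```

For example, `chrome_debug_running 9222 && echo "Chrome debug port is up"` prints the message only when the endpoint responds.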

WeChat or YouTube Extraction Fails

```bash
# Install uv if missing
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.cargo/env

# Re-run installer to pick up uv
bash install.sh --platform claude  # or your platform

# Verify uv tools work
uvx youtube-transcript --help
uvx wechat-article-to-markdown --help
```

bun/npm Dependency Install Fails

```bash
# The installer auto-selects bun or npm; check which is available
which bun && echo "bun found"
which npm && echo "npm found"

# Manually install web extraction deps with npm
npm install -g baoyu-url-to-markdown
```

Codex Legacy Path Compatibility

```bash
# Old path still supported automatically:
~/.Codex/skills   # capital C; installer handles both
~/.codex/skills   # lowercase; new default
```

Agent Can't Find Installed Skill

```bash
# Verify install location for your platform
ls ~/.claude/skills/llm-wiki/    # Claude Code
ls ~/.codex/skills/llm-wiki/     # Codex
ls ~/.openclaw/skills/llm-wiki/  # OpenClaw

# Re-run installer if directory is missing
bash install.sh --platform <your-platform>
```

Source Registry Lookup

```bash
# Check registered sources and routing
cat scripts/source-registry.tsv

# Test routing for a URL
bash scripts/source-registry.sh route "https://mp.weixin.qq.com/s/example"
```

Key Design Principles

  1. Compile once, maintain: wiki pages are living documents, not ephemeral answers
  2. Bidirectional links: every entity and topic links to related nodes with [[name]]
  3. Immutable raw: source files in raw/ are never modified after ingestion
  4. Graceful degradation: if a tool fails, the agent prompts for manual paste instead of crashing
  5. Platform-agnostic: the same knowledge base works across all supported agents
  6. Obsidian-compatible: open your-wiki/ directly in Obsidian at any time

Quick Reference for Agents

Initialize wiki:     "Create a new wiki at <path>"
Add URL:             "Add <url> to my wiki"
Add file:            "Process <file path> into my wiki"
Add text:            "Add these notes to my wiki: <text>"
Batch process:       "Process all files in <folder> into my wiki"
Health check:        "Check my wiki for broken links and orphans"
Find information:    "What does my wiki say about <topic>"
Update a page:       "Update the [[entity]] page with new info from <source>"