llm-wiki-skill

llm-wiki-skill — Multi-Platform Knowledge Base Builder

Skill by ara.so — Daily 2026 Skills collection.

Build a persistent, interlinked personal knowledge base from URLs, PDFs, markdown files, and raw text. Based on Karpathy's llm-wiki methodology: knowledge is compiled once and maintained, not re-derived from raw docs on every query.

What It Does

Ingests articles, tweets, PDFs, YouTube transcripts, WeChat posts, and plain text
Routes each source type to the best extraction tool automatically
Generates structured wiki pages with `[[bidirectional links]]`
Produces entity pages, topic pages, source summaries, and comparisons
Outputs Obsidian-compatible local markdown files
Detects orphaned pages, broken links, and contradictions via health checks

Installation

Recommended: Let Your Agent Install It

Give your agent the repo URL and ask it to install for your platform:

https://github.com/sdyckjq-lab/llm-wiki-skill

Manual Installation

Clone the repo anywhere, then run the installer for your platform:

# Claude Code
bash install.sh --platform claude

# Codex
bash install.sh --platform codex

# OpenClaw
bash install.sh --platform openclaw

# Auto-detect (only if one platform directory exists)
bash install.sh --platform auto

# Custom target directory (OpenClaw non-standard path)
bash install.sh --platform openclaw --target-dir /path/to/your/skills

Default Install Locations

Platform	Path
Claude Code	`~/.claude/skills/llm-wiki`
Codex	`~/.codex/skills/llm-wiki`
OpenClaw	`~/.openclaw/skills/llm-wiki`
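
The `--platform auto` rule (it only works when exactly one platform directory exists) can be sketched as a small shell function. This is an illustration, not the installer's actual code: the function name `detect_platform` is hypothetical, and it assumes the platform directories are `~/.claude`, `~/.codex`, and `~/.openclaw`, the parents of the default paths in the table.

```bash
# Hypothetical sketch of "--platform auto": succeeds only when exactly one
# platform directory exists. Directory names are an assumption based on the
# default install paths; the real installer may check something else.
detect_platform() {
  home="${1:-$HOME}"   # optional argument makes the function easy to try in a temp dir
  found=""
  count=0
  for p in claude codex openclaw; do
    if [ -d "$home/.$p" ]; then
      found="$p"
      count=$((count + 1))
    fi
  done
  if [ "$count" -eq 1 ]; then
    echo "$found"
  else
    echo "cannot auto-detect: $count platform directories found" >&2
    return 1
  fi
}
```

Calling `detect_platform` with no argument inspects `$HOME`; when zero or several platform directories exist it returns a non-zero status, which is the case where an explicit `--platform` value is needed.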

Legacy Claude Setup (existing users)

bash setup.sh
# This is now a compatibility shim for the unified installer

Prerequisites

# Start Chrome in debug mode (needed for web extraction)
google-chrome --remote-debugging-port=9222 &

# Check uv is installed (needed for WeChat + YouTube extraction)
uv --version

# Install bun OR npm (one is enough for web extraction deps)
curl -fsSL https://bun.sh/install | bash
# OR: npm is already available in most environments
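
The prerequisites above can be checked in one pass. The following is a minimal sketch, assuming the debug port 9222 used above; the helper names `have` and `check_prereqs` are hypothetical and are not part of the repo.

```bash
# Hypothetical helpers, not part of llm-wiki-skill.
have() {
  # True when the named command is on PATH.
  command -v "$1" >/dev/null 2>&1
}

check_prereqs() {
  missing=0
  # uv is needed for WeChat and YouTube extraction.
  if have uv; then echo "uv: ok"; else echo "uv: missing"; missing=1; fi
  # Either bun or npm is enough for the web extraction dependencies.
  if have bun || have npm; then echo "bun/npm: ok"; else echo "bun/npm: missing"; missing=1; fi
  # Chrome in debug mode answers on /json/version (port 9222 assumed).
  if have curl && curl -fsS --max-time 2 http://localhost:9222/json/version >/dev/null 2>&1; then
    echo "chrome-debug: ok"
  else
    echo "chrome-debug: not reachable"; missing=1
  fi
  return "$missing"
}
```

Running `check_prereqs` prints one status line per prerequisite and returns non-zero if any of them is missing.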

Platform Entry Points

After installation, read the platform-specific instructions:

Claude Code: `platforms/claude/CLAUDE.md`
Codex: `platforms/codex/AGENTS.md`
OpenClaw: `platforms/openclaw/README.md`

Knowledge Base Structure

your-wiki/
├── raw/                    # Immutable source material
│   ├── articles/           # Web articles
│   ├── tweets/             # X/Twitter
│   ├── wechat/             # WeChat posts
│   ├── xiaohongshu/        # Xiaohongshu (manual paste only)
│   ├── zhihu/              # Zhihu
│   ├── pdfs/               # PDFs
│   ├── notes/              # Notes
│   └── assets/             # Images, attachments
├── wiki/                   # AI-generated knowledge base
│   ├── entities/           # People, concepts, tools
│   ├── topics/             # Topic pages
│   ├── sources/            # Source summaries
│   ├── comparisons/        # Side-by-side analysis
│   └── synthesis/          # Cross-source synthesis
├── index.md                # Master index
├── log.md                  # Operation log
└── .wiki-schema.md         # Config

Core Workflows

Initialize a New Knowledge Base

Ask your agent:

"Create a new knowledge base at ~/my-wiki"

The agent will scaffold the directory structure and generate index.md, log.md, and .wiki-schema.md.
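
For reference, the scaffolding step amounts to creating the directories and files listed under Knowledge Base Structure. A minimal sketch, assuming empty placeholder files (the source does not specify what the agent writes into index.md, log.md, or .wiki-schema.md); the function name `scaffold_wiki` is hypothetical.

```bash
# Hypothetical helper: creates the layout described under "Knowledge Base Structure".
scaffold_wiki() {
  root="$1"
  for d in articles tweets wechat xiaohongshu zhihu pdfs notes assets; do
    mkdir -p "$root/raw/$d"
  done
  for d in entities topics sources comparisons synthesis; do
    mkdir -p "$root/wiki/$d"
  done
  # Create the three top-level files only if they do not exist yet,
  # so re-running the function never truncates an existing wiki.
  for f in index.md log.md .wiki-schema.md; do
    [ -e "$root/$f" ] || : > "$root/$f"
  done
}
```

For example, `scaffold_wiki ~/my-wiki` produces the tree shown earlier with empty top-level files.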

Ingest a Web Article

Agent command pattern:

"Add this article to my wiki: https://example.com/article"

Under the hood, the agent routes to baoyu-url-to-markdown:

npx baoyu-url-to-markdown https://example.com/article > raw/articles/article-slug.md
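
The `article-slug` part of the output path is a placeholder. The source does not say how the agent derives it, so the following is only one plausible convention: lowercase the title and collapse every run of non-alphanumeric characters into a single hyphen. The function name `slugify` is hypothetical.

```bash
# Hypothetical helper: derives a file-name slug from a title.
# The actual naming rule used by the skill is not documented here.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}
```

With this rule, `slugify "RAG vs. Fine-Tuning"` yields `rag-vs-fine-tuning`. Non-ASCII titles collapse to hyphens, so a real implementation would need a different rule for Chinese sources.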

Ingest a YouTube Video

Agent command pattern:

"Digest this YouTube video into my knowledge base: https://youtube.com/watch?v=abc123"

Uses youtube-transcript via uv (the URL is quoted so the shell does not treat `?` as a glob):

uvx youtube-transcript "https://youtube.com/watch?v=abc123" > raw/articles/video-slug.md

Ingest a WeChat Article

Agent command pattern:

"Add this WeChat article to my wiki: https://mp.weixin.qq.com/s/..."

Uses wechat-article-to-markdown via uv:

uvx wechat-article-to-markdown https://mp.weixin.qq.com/s/... > raw/wechat/article-slug.md

Ingest a PDF

Agent command pattern:

"Process this PDF into my wiki: /path/to/paper.pdf"

Or drag the file into the chat.

No external tool is needed; the file goes directly into the main pipeline:

cp /path/to/paper.pdf raw/pdfs/paper.pdf

Ingest Raw Text or Notes

Paste the text to your agent:

"Add these notes to my wiki: [paste content]"

The agent writes directly to raw/notes/:

echo "your content" > raw/notes/note-slug.md

Batch Process a Folder

Agent command pattern:

"Process all files in ~/Downloads/research into my wiki"

The agent iterates over the files and routes each by type:

for f in ~/Downloads/research/*; do
  # agent determines type and processes accordingly
  :
done
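
The "determines type" step in the loop above can be sketched as a lookup by file extension. This is an assumption about how local files might be classified, not the skill's actual logic: the names `raw_dir_for` and `batch_plan` are hypothetical, and sending Markdown and text files to `raw/notes` is a guess based on the directory layout.

```bash
# Hypothetical classification by file extension; target folders come from
# the knowledge base layout, the mapping itself is an assumption.
raw_dir_for() {
  case "$1" in
    *.pdf|*.PDF)           echo "raw/pdfs" ;;
    *.md|*.markdown|*.txt) echo "raw/notes" ;;
    *)                     echo "unsupported" ;;
  esac
}

batch_plan() {
  # Prints "<file> -> <target dir>" for every regular file in the folder.
  for f in "$1"/*; do
    [ -f "$f" ] || continue
    echo "$f -> $(raw_dir_for "$f")"
  done
}
```

`batch_plan ~/Downloads/research` only prints the plan. Copying is left out on purpose so the sketch can be run safely against any folder.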

Health Check

Agent command pattern:

"Run a health check on my knowledge base"

The agent checks for:

- Orphaned pages (no incoming links)
- Broken [[wiki links]]
- Contradictory information across pages
- Missing source summaries
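
The first two checks are mechanical and can be approximated with standard tools. The sketch below is not the skill's implementation. It assumes a `[[link]]` resolves to a file named after the link's last path segment plus `.md` somewhere under `wiki/`, and it ignores links from `index.md`. All function names are hypothetical. Contradictions and missing source summaries need the agent's judgment and are not covered.

```bash
# Hypothetical check: assumes [[a/b/page]] resolves to wiki/**/page.md.
list_links() {
  # Prints every [[target]] in a file, one per line, reduced to its last path segment.
  { grep -o '\[\[[^]]*]]' "$1" 2>/dev/null || true; } \
    | sed -e 's/^\[\[//' -e 's/]]$//' -e 's/|.*$//' -e 's|.*/||'
}

list_pages() {
  # Prints the page name (file name without .md) of every page under wiki/.
  find "$1/wiki" -type f -name '*.md' | sed -e 's|.*/||' -e 's/\.md$//' | sort -u
}

health_check() {
  root="$1"
  pages="$(list_pages "$root")"
  linked="$(find "$root/wiki" -type f -name '*.md' \
    | while IFS= read -r f; do list_links "$f"; done | sort -u)"
  # Broken links: targets that match no page.
  printf '%s\n' "$linked" | while IFS= read -r t; do
    [ -n "$t" ] || continue
    printf '%s\n' "$pages" | grep -qxF -- "$t" || echo "broken: $t"
  done
  # Orphans: pages that no link points to.
  printf '%s\n' "$pages" | while IFS= read -r p; do
    [ -n "$p" ] || continue
    printf '%s\n' "$linked" | grep -qxF -- "$p" || echo "orphan: $p"
  done
}
```

`health_check ~/my-wiki` prints one `broken:` or `orphan:` line per finding and nothing when the link graph is clean.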

Source Routing Reference

Source Type	Tool Used	Requires
Web articles	`baoyu-url-to-markdown`	Chrome debug mode
X/Twitter	`baoyu-url-to-markdown`	Chrome debug mode + X login
Zhihu	`baoyu-url-to-markdown`	Chrome debug mode
WeChat	`wechat-article-to-markdown`	`uv`
YouTube	`youtube-transcript`	`uv`
Xiaohongshu	Manual paste	Nothing
PDF / Markdown / Text	Direct pipeline	Nothing

The source registry lives at `scripts/source-registry.tsv`.

The routing logic lives at `scripts/source-registry.sh`.
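
The routing table can be restated as a `case` statement. This is a sketch for illustration only; the real logic is in `scripts/source-registry.sh` and may differ. The function name `route_source`, the `youtu.be` and `xiaohongshu.com` host patterns, and the `manual-paste` and `direct` labels are assumptions.

```bash
# Hypothetical restatement of the routing table above.
# Host patterns other than mp.weixin.qq.com and youtube.com are assumptions.
route_source() {
  case "$1" in
    *mp.weixin.qq.com/*)        echo "wechat-article-to-markdown" ;;
    *youtube.com/*|*youtu.be/*) echo "youtube-transcript" ;;
    *xiaohongshu.com/*)         echo "manual-paste" ;;
    http://*|https://*)         echo "baoyu-url-to-markdown" ;;  # web articles, X/Twitter, Zhihu
    *)                          echo "direct" ;;                 # PDF / Markdown / text
  esac
}
```

Order matters here: the site-specific patterns come before the generic `http://*|https://*` fallback, so a WeChat URL never reaches the general web extractor.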

Wiki Page Conventions

Entity Page (wiki/entities/andrej-karpathy.md)

Andrej Karpathy

Overview

Former OpenAI/Tesla researcher, creator of llm-wiki methodology.

Key Ideas

[[llm-wiki]]: compile knowledge once, maintain over time
[[nanoGPT]]: minimal GPT implementation for education

Sources

[[sources/llm-wiki-gist-2024]]
[[sources/karpathy-interview-2023]]

Topic Page

Retrieval-Augmented Generation

Summary

...

Key Entities

[[entities/langchain]]
[[entities/llamaindex]]

Comparisons

[[comparisons/rag-vs-finetuning]]

Sources

[[sources/rag-paper-2020]]

Source Summary (wiki/sources/article-slug.md)

Source: Article Title

URL: https://example.com/article
Date ingested: 2026-04-10
Type: web article

Key Points

Entities Mentioned

[[entities/...]]

Raw: [[raw/articles/article-slug]]
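
A source summary with these fields could be stubbed out by a small generator before the agent fills in the key points and entities. The sketch below is an assumption about how such a stub might be produced: the function name `write_source_summary` and its argument order are hypothetical, and the raw link is hard-coded to `raw/articles/` as in the example.

```bash
# Hypothetical generator for a source summary stub; field names follow the example above.
write_source_summary() {
  root="$1"; slug="$2"; title="$3"; url="$4"; kind="$5"
  page="$root/wiki/sources/$slug.md"
  mkdir -p "$root/wiki/sources"
  cat > "$page" <<EOF
Source: $title

URL: $url
Date ingested: $(date +%Y-%m-%d)
Type: $kind

Key Points

Entities Mentioned

Raw: [[raw/articles/$slug]]
EOF
  # Print the path of the page that was written.
  echo "$page"
}
```

For example, `write_source_summary ~/my-wiki article-slug "Article Title" https://example.com/article "web article"` writes `wiki/sources/article-slug.md` with today's date as the ingestion date.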

Troubleshooting

Chrome / Web Extraction Fails

# Start Chrome with remote debugging enabled
google-chrome --remote-debugging-port=9222 --no-first-run &

# Verify it's running
curl http://localhost:9222/json/version

# For X/Twitter: make sure you're logged in on that Chrome session
# Then retry the extraction

WeChat or YouTube Extraction Fails

# Install uv if missing
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.cargo/env

# Re-run installer to pick up uv
bash install.sh --platform claude  # or your platform

# Verify uv tools work
uvx youtube-transcript --help
uvx wechat-article-to-markdown --help

bun/npm Dependency Install Fails

# The installer auto-selects bun or npm; check which is available
which bun && echo "bun found"
which npm && echo "npm found"

# Manually install web extraction deps with npm
npm install -g baoyu-url-to-markdown

Codex Legacy Path Compatibility

# Old path still supported automatically:
~/.Codex/skills    # capital C: installer handles both
~/.codex/skills    # lowercase: new default

Agent Can't Find Installed Skill

# Verify install location for your platform
ls ~/.claude/skills/llm-wiki/      # Claude Code
ls ~/.codex/skills/llm-wiki/       # Codex
ls ~/.openclaw/skills/llm-wiki/    # OpenClaw

# Re-run installer if directory is missing
bash install.sh --platform <your-platform>

Source Registry Lookup

# Check registered sources and routing
cat scripts/source-registry.tsv

# Test routing for a URL
bash scripts/source-registry.sh route "https://mp.weixin.qq.com/s/example"

Key Design Principles

Compile once, maintain: wiki pages are living documents, not ephemeral answers
Bidirectional links: every entity and topic links to related nodes with `[[name]]`
Immutable raw: source files in `raw/` are never modified after ingestion
Graceful degradation: if a tool fails, the agent prompts for manual paste instead of crashing
Platform-agnostic: the same knowledge base works across all supported agents
Obsidian-compatible: open `your-wiki/` directly in Obsidian at any time
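
Two of the principles above, immutable raw files and graceful degradation, can be combined in one wrapper around any extraction command. This is a sketch of the idea rather than the skill's code; the function name `extract_or_fallback` and the message wording are hypothetical.

```bash
# Hypothetical wrapper: run an extraction command, never overwrite an existing
# raw file, and fall back to a manual-paste prompt when the tool fails.
extract_or_fallback() {
  out="$1"; shift
  # Immutable raw: refuse to touch a file that was already ingested.
  if [ -e "$out" ]; then
    echo "refusing to overwrite $out" >&2
    return 1
  fi
  # "$@" is the extraction command, e.g. uvx youtube-transcript <url>.
  if "$@" > "$out.tmp" 2>/dev/null && [ -s "$out.tmp" ]; then
    mv "$out.tmp" "$out"
    echo "saved: $out"
  else
    # Graceful degradation: clean up and ask for a manual paste instead of crashing.
    rm -f "$out.tmp"
    echo "extraction failed; please paste the content manually" >&2
    return 1
  fi
}
```

A call such as `extract_or_fallback raw/wechat/article-slug.md uvx wechat-article-to-markdown <url>` writes the file only when the tool succeeds and produces non-empty output; a partial download never lands in `raw/`.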

Quick Reference for Agents

Initialize wiki:     "Create a new wiki at <path>"
Add URL:             "Add <url> to my wiki"
Add file:            "Process <file path> into my wiki"
Add text:            "Add these notes to my wiki: <text>"
Batch process:       "Process all files in <folder> into my wiki"
Health check:        "Check my wiki for broken links and orphans"
Find information:    "What does my wiki say about <topic>"
Update a page:       "Update the [[entity]] page with new info from <source>"
