wiki-export
Wiki Export: Knowledge Graph Export
You are exporting the wiki's wikilink graph to structured formats so it can be used in external tools (Gephi, Neo4j, custom scripts, browser visualization).
Before You Start
- Read `.env` to get `OBSIDIAN_VAULT_PATH`
- Confirm the vault has pages to export; if fewer than 5 pages exist, warn the user and stop
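
The two checks above could be sketched as follows. This is a minimal sketch, assuming `.env` contains plain `KEY=value` lines; `parse_env`, `count_pages`, and `has_enough_pages` are hypothetical helper names, not part of any existing tool.

```python
from pathlib import Path

MIN_PAGES = 5  # threshold taken from the checklist above


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, ignoring blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def count_pages(vault: Path) -> int:
    """Count Markdown files anywhere under the vault (no exclusions yet)."""
    return sum(1 for _ in vault.rglob("*.md"))


def has_enough_pages(page_count: int) -> bool:
    """True when the vault is large enough to be worth exporting."""
    return page_count >= MIN_PAGES
```

In use, you would call `parse_env(Path(".env").read_text(encoding="utf-8"))`, build `Path(env["OBSIDIAN_VAULT_PATH"])`, and stop with a warning when `has_enough_pages(count_pages(vault))` is false.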
Step 1: Build the Node and Edge Lists
Glob all `.md` files in the vault (excluding `_archives/`, `_raw/`, `.obsidian/`, `index.md`, `log.md`, `_insights.md`).
For each page, extract from frontmatter:
- `id`: relative path from vault root, without the `.md` extension (e.g. `concepts/transformers`)
- `label`: `title` field from frontmatter, or filename if missing
- `category`: directory prefix (`concepts`, `entities`, `skills`, `references`, `synthesis`, `projects`, or `journal`)
- `tags`: array from frontmatter tags field
- `summary`: frontmatter `summary` field if present
This is your node list.
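
One way to build the node list is sketched below. It assumes frontmatter is a leading `---` block of simple `key: value` lines with inline (`[a, b]`) or `- item` lists; vaults with richer YAML would need a real YAML parser. The helper names are hypothetical, and excluding `index.md`, `log.md`, and `_insights.md` at any depth is an assumption.

```python
from pathlib import Path, PurePosixPath

EXCLUDED_DIRS = {"_archives", "_raw", ".obsidian"}
EXCLUDED_FILES = {"index.md", "log.md", "_insights.md"}  # assumed: at any depth


def parse_frontmatter(text: str) -> dict:
    """Parse a leading '---' block of simple 'key: value' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta, key = {}, None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break
        if stripped.startswith("- ") and key is not None:
            # Block-list item belonging to the most recent key.
            if not isinstance(meta.get(key), list):
                meta[key] = []
            meta[key].append(stripped[2:].strip().strip("\"'"))
        elif ":" in line and not line.startswith((" ", "\t")):
            key, _, value = line.partition(":")
            key, value = key.strip(), value.strip()
            if value.startswith("[") and value.endswith("]"):
                items = value[1:-1].split(",")
                meta[key] = [i.strip().strip("\"'") for i in items if i.strip()]
            else:
                meta[key] = value.strip("\"'")
    return meta


def make_node(rel_path: str, text: str) -> dict:
    """Build one node dict from a vault-relative path and the file text."""
    meta = parse_frontmatter(text)
    path = PurePosixPath(rel_path)
    tags = meta.get("tags") or []
    if isinstance(tags, str):
        tags = [tags]
    node = {
        "id": path.with_suffix("").as_posix(),
        "label": meta.get("title") or path.stem,
        "category": path.parts[0] if len(path.parts) > 1 else "",
        "tags": tags,
    }
    if meta.get("summary"):
        node["summary"] = meta["summary"]
    return node


def build_nodes(vault: Path) -> list[dict]:
    """Walk the vault and return one node per non-excluded page."""
    nodes = []
    for md in sorted(vault.rglob("*.md")):
        rel = md.relative_to(vault)
        if md.name in EXCLUDED_FILES or EXCLUDED_DIRS & set(rel.parts):
            continue
        nodes.append(make_node(rel.as_posix(), md.read_text(encoding="utf-8")))
    return nodes
```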
For each page, Grep the body for `\[\[.*?\]\]` to extract all wikilinks:
- Parse each `[[target]]` or `[[target|display]]`: use the target part only
- Resolve the target to a node id (normalize: lowercase, spaces→hyphens, strip `.md`)
- Skip links that point outside the node list (broken links)
- Each resolved link becomes an edge: `{source: page_id, target: linked_id, relation: "wikilink", confidence: "EXTRACTED"}`
- If the linking sentence ends with `^[inferred]` or `^[ambiguous]`, override `confidence` accordingly
This is your edge list.
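
The link-extraction rules above could look roughly like this. Two assumptions are baked in and are not stated in the source: a "sentence" is approximated by a line of text, and a bare target such as `attention` resolves to a node only when exactly one node id ends in that name. `extract_edges` and `normalize` are hypothetical names.

```python
import re

WIKILINK = re.compile(r"\[\[(.*?)\]\]")
MARKERS = {"^[inferred]": "INFERRED", "^[ambiguous]": "AMBIGUOUS"}


def normalize(target: str) -> str:
    """Lowercase, turn spaces into hyphens, drop a trailing .md."""
    target = target.strip().lower().replace(" ", "-")
    return target[:-3] if target.endswith(".md") else target


def extract_edges(page_id: str, body: str, node_ids: set[str]) -> list[dict]:
    """Return one edge dict per resolvable wikilink in the page body."""
    # Index node ids by their last path segment for bare-name lookups.
    by_name: dict[str, list[str]] = {}
    for nid in node_ids:
        by_name.setdefault(nid.rsplit("/", 1)[-1], []).append(nid)

    edges = []
    for line in body.splitlines():
        # Assumption: one line stands in for one "linking sentence".
        tail = line.rstrip().rstrip(".")
        confidence = "EXTRACTED"
        for marker, level in MARKERS.items():
            if tail.endswith(marker):
                confidence = level
        for raw in WIKILINK.findall(line):
            target = normalize(raw.split("|", 1)[0])  # drop the display part
            if target not in node_ids:
                matches = by_name.get(target, [])
                if len(matches) != 1:
                    continue  # broken link, or name matches several pages
                target = matches[0]
            edges.append({"source": page_id, "target": target,
                          "relation": "wikilink", "confidence": confidence})
    return edges
```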
Step 2: Assign Community IDs
Group pages into communities by tag clustering:
- Pages sharing the same dominant tag belong to the same community
- Dominant tag = the first tag in the page's frontmatter tags array
- Pages with no tags get community id `null`
- Number communities starting from 0, ordered by size descending (largest community = 0)
This enables community-based coloring in the HTML visualization and tools like Gephi.
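
The tag-clustering rule can be expressed in a few lines. This is a sketch with a hypothetical `assign_communities` helper; breaking ties between equally sized communities alphabetically is an assumption, since the source only specifies ordering by size.

```python
from collections import Counter


def assign_communities(nodes: list[dict]) -> list[dict]:
    """Set node['community'] from each page's first tag (None if untagged)."""
    dominant = [n["tags"][0] if n.get("tags") else None for n in nodes]
    counts = Counter(tag for tag in dominant if tag is not None)
    # Largest community gets 0; ties are broken alphabetically (assumption).
    ranked = sorted(counts, key=lambda tag: (-counts[tag], tag))
    community_id = {tag: i for i, tag in enumerate(ranked)}
    for node, tag in zip(nodes, dominant):
        node["community"] = community_id.get(tag)  # None for untagged pages
    return nodes
```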
Step 3: Write the Output Files
Create `wiki-export/` at the vault root if it doesn't exist. Write all four files:

3a. graph.json

NetworkX node_link format, standard for graph tools and scripts:

```json
{
  "directed": false,
  "multigraph": false,
  "graph": {
    "exported_at": "<ISO timestamp>",
    "vault": "<OBSIDIAN_VAULT_PATH>",
    "total_nodes": N,
    "total_edges": M
  },
  "nodes": [
    {
      "id": "concepts/transformers",
      "label": "Transformer Architecture",
      "category": "concepts",
      "tags": ["ml", "architecture"],
      "summary": "The attention-based architecture introduced in Attention Is All You Need.",
      "community": 0
    }
  ],
  "links": [
    {
      "source": "concepts/transformers",
      "target": "entities/vaswani",
      "relation": "wikilink",
      "confidence": "EXTRACTED"
    }
  ]
}
```
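
Assembling that structure from the node and edge lists is mostly bookkeeping. The sketch below uses hypothetical helpers `build_node_link` and `dump_graph_json`, and assumes the timestamp should be UTC.

```python
import json
from datetime import datetime, timezone


def build_node_link(nodes: list[dict], edges: list[dict], vault_path: str) -> dict:
    """Wrap node and edge lists in the node_link envelope shown above."""
    return {
        "directed": False,
        "multigraph": False,
        "graph": {
            "exported_at": datetime.now(timezone.utc).isoformat(),
            "vault": vault_path,
            "total_nodes": len(nodes),
            "total_edges": len(edges),
        },
        "nodes": nodes,
        "links": edges,
    }


def dump_graph_json(graph: dict) -> str:
    """Serialize for graph.json; non-ASCII labels stay human-readable."""
    return json.dumps(graph, indent=2, ensure_ascii=False)
```

Writing the file is then `Path(vault, "wiki-export", "graph.json").write_text(dump_graph_json(graph), encoding="utf-8")`. The dict follows the shape NetworkX's `node_link_graph` reads, though depending on the NetworkX version the edge key name may need to be passed explicitly.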

3b. graph.graphml

GraphML XML format, loadable in Gephi, yEd, and Cytoscape:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/graphml">
  <key id="label" for="node" attr.name="label" attr.type="string"/>
  <key id="category" for="node" attr.name="category" attr.type="string"/>
  <key id="tags" for="node" attr.name="tags" attr.type="string"/>
  <key id="community" for="node" attr.name="community" attr.type="int"/>
  <key id="relation" for="edge" attr.name="relation" attr.type="string"/>
  <key id="confidence" for="edge" attr.name="confidence" attr.type="string"/>
  <graph id="wiki" edgedefault="undirected">
    <node id="concepts/transformers">
      <data key="label">Transformer Architecture</data>
      <data key="category">concepts</data>
      <data key="tags">ml, architecture</data>
      <data key="community">0</data>
    </node>
    <edge source="concepts/transformers" target="entities/vaswani">
      <data key="relation">wikilink</data>
      <data key="confidence">EXTRACTED</data>
    </edge>
  </graph>
</graphml>
```

Write one `<node>` per page and one `<edge>` per wikilink.
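
Generating the GraphML through an XML library rather than string concatenation keeps labels containing `&` or `<` properly escaped. The following sketch uses Python's standard `xml.etree.ElementTree`; `build_graphml` is a hypothetical helper, and omitting the `community` element for untagged pages is an assumption about how to represent `null` in an `int` attribute.

```python
import xml.etree.ElementTree as ET

NS = "http://graphml.graphdrawing.org/graphml"
KEYS = [  # (id, applies to, type), mirroring the <key> declarations above
    ("label", "node", "string"), ("category", "node", "string"),
    ("tags", "node", "string"), ("community", "node", "int"),
    ("relation", "edge", "string"), ("confidence", "edge", "string"),
]


def build_graphml(nodes: list[dict], edges: list[dict]) -> str:
    """Return the GraphML document as a string."""
    root = ET.Element("graphml", xmlns=NS)
    for key_id, target, attr_type in KEYS:
        ET.SubElement(root, "key", {"id": key_id, "for": target,
                                    "attr.name": key_id, "attr.type": attr_type})
    graph = ET.SubElement(root, "graph", id="wiki", edgedefault="undirected")
    for n in nodes:
        el = ET.SubElement(graph, "node", id=n["id"])
        values = {"label": n["label"], "category": n["category"],
                  "tags": ", ".join(n["tags"]), "community": n.get("community")}
        for key, value in values.items():
            if value is not None:  # untagged pages: skip the community element
                ET.SubElement(el, "data", key=key).text = str(value)
    for e in edges:
        el = ET.SubElement(graph, "edge", source=e["source"], target=e["target"])
        for key in ("relation", "confidence"):
            ET.SubElement(el, "data", key=key).text = e[key]
    ET.indent(root)  # pretty-print; requires Python 3.9+
    declaration = '<?xml version="1.0" encoding="UTF-8"?>\n'
    return declaration + ET.tostring(root, encoding="unicode")
```

Write the returned string to `graph.graphml` with `encoding="utf-8"` so the file matches its declaration.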

3c. cypher.txt

Neo4j Cypher `MERGE` statements; paste into Neo4j Browser or run with `cypher-shell`:

```cypher
// Wiki knowledge graph export — <TIMESTAMP>
// Load with: cypher-shell -u neo4j -p password < cypher.txt
// Nodes
MERGE (n:Page {id: "concepts/transformers"}) SET n.label = "Transformer Architecture", n.category = "concepts", n.tags = ["ml","architecture"], n.community = 0;
MERGE (n:Page {id: "entities/vaswani"}) SET n.label = "Ashish Vaswani", n.category = "entities", n.tags = ["person","ml"], n.community = 0;
// Relationships
MATCH (a:Page {id: "concepts/transformers"}), (b:Page {id: "entities/vaswani"}) MERGE (a)-[:WIKILINK {relation: "wikilink", confidence: "EXTRACTED"}]->(b);
```

Write one `MERGE` node statement per page, then one `MATCH`/`MERGE` relationship statement per edge.
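
Page titles may contain quotes or backslashes, so the statements are safer to generate with escaping than to assemble by hand. The sketch below leans on `json.dumps` to produce double-quoted literals, on the assumption that JSON-style escapes are acceptable Cypher for the characters in your vault; the function names are hypothetical.

```python
import json


def cypher_literal(value) -> str:
    """Render a str, int, list, or None as a literal via JSON-style escaping."""
    return json.dumps(value, ensure_ascii=False)


def node_statement(n: dict) -> str:
    """One MERGE statement that creates or updates a Page node."""
    props = ", ".join(f"n.{k} = {cypher_literal(n.get(k))}"
                      for k in ("label", "category", "tags", "community"))
    return f"MERGE (n:Page {{id: {cypher_literal(n['id'])}}}) SET {props};"


def edge_statement(e: dict) -> str:
    """One MATCH/MERGE statement that links two existing Page nodes."""
    return (f"MATCH (a:Page {{id: {cypher_literal(e['source'])}}}), "
            f"(b:Page {{id: {cypher_literal(e['target'])}}}) "
            f"MERGE (a)-[:WIKILINK {{relation: {cypher_literal(e['relation'])}, "
            f"confidence: {cypher_literal(e['confidence'])}}}]->(b);")


def build_cypher(nodes: list[dict], edges: list[dict]) -> str:
    """All node statements first, then all relationship statements."""
    lines = ["// Nodes", *map(node_statement, nodes),
             "// Relationships", *map(edge_statement, edges)]
    return "\n".join(lines) + "\n"
```

Emitting nodes before relationships matters: each `MATCH` only finds pages that an earlier `MERGE` has already created.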

3d. graph.html

A single-file interactive visualization that loads vis.js from a CDN (no local dependencies; the browser needs network access to fetch the script). The user opens this file in any browser; no server needed.
Build the HTML file by:
- Generating a JSON array of node objects for vis.js:

  ```js
  {id: "concepts/transformers", label: "Transformer Architecture", color: {background: "#4E79A7"}, size: <degree * 3 + 8>, title: "concepts | #ml #architecture", community: 0}
  ```

  - Color by community (cycle through: `#4E79A7`, `#F28E2B`, `#E15759`, `#76B7B2`, `#59A14F`, `#EDC948`, `#B07AA1`, `#FF9DA7`, `#9C755F`, `#BAB0AC`)
  - Size by degree (incoming + outgoing link count): `size = degree * 3 + 8`, capped at 60
  - `title` = tooltip text shown on hover: category, tags, summary (if available)

- Generating a JSON array of edge objects for vis.js:

  ```js
  {from: "concepts/transformers", to: "entities/vaswani", dashes: false, width: 1, color: {color: "#666", opacity: 0.6}}
  ```

  - `dashes: true` for INFERRED edges
  - `dashes: [4,8]` for AMBIGUOUS edges

- Writing the full HTML file:

```html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Wiki Knowledge Graph</title>
<script src="https://unpkg.com/vis-network/standalone/umd/vis-network.min.js"></script>
<style>
  * { box-sizing: border-box; margin: 0; padding: 0; }
  body { background: #0f0f1a; color: #e0e0e0; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", sans-serif; display: flex; height: 100vh; }
  #graph { flex: 1; }
  #sidebar { width: 260px; background: #1a1a2e; border-left: 1px solid #2a2a4e; padding: 14px; overflow-y: auto; font-size: 13px; }
  #sidebar h3 { color: #aaa; font-size: 11px; text-transform: uppercase; letter-spacing: 0.05em; margin: 0 0 10px; }
  #info { margin-bottom: 16px; line-height: 1.6; color: #ccc; }
  .legend-item { display: flex; align-items: center; gap: 8px; padding: 3px 0; font-size: 12px; }
  .dot { width: 10px; height: 10px; border-radius: 50%; flex-shrink: 0; }
  #stats { margin-top: 16px; color: #555; font-size: 11px; }
</style>
</head>
<body>
<div id="graph"></div>
<div id="sidebar">
  <h3>Wiki Knowledge Graph</h3>
  <div id="info">Click a node to see details.</div>
  <h3 style="margin-top:12px">Communities</h3>
  <div id="legend"><!-- populated by JS --></div>
  <div id="stats"><!-- populated by JS --></div>
</div>
<script>
const NODES_DATA = /* NODES_JSON */;
const EDGES_DATA = /* EDGES_JSON */;
const COMMUNITY_COLORS = ["#4E79A7","#F28E2B","#E15759","#76B7B2","#59A14F","#EDC948","#B07AA1","#FF9DA7","#9C755F","#BAB0AC"];
const nodes = new vis.DataSet(NODES_DATA);
const edges = new vis.DataSet(EDGES_DATA);
const network = new vis.Network(document.getElementById('graph'), {nodes, edges}, {
  physics: { solver: 'forceAtlas2Based', forceAtlas2Based: { gravitationalConstant: -60, springLength: 120 }, stabilization: { iterations: 200 } },
  interaction: { hover: true, tooltipDelay: 100 },
  nodes: { shape: 'dot', borderWidth: 1.5 },
  edges: { smooth: { type: 'continuous' }, arrows: { to: { enabled: true, scaleFactor: 0.4 } } }
});
// Freeze the layout once the initial stabilization has finished
network.once('stabilizationIterationsDone', () => network.setOptions({ physics: { enabled: false } }));
// Show details for the clicked node in the sidebar
network.on('click', ({nodes: sel}) => {
  if (!sel.length) return;
  const n = NODES_DATA.find(x => x.id === sel[0]);
  if (!n) return;
  // Assumed markup: node label followed by its tooltip text
  document.getElementById('info').innerHTML = `<b>${n.label}</b><br>${n.title || ''}`;
});
// Build legend
const communities = {};
NODES_DATA.forEach(n => { if (n.community != null) communities[n.community] = (communities[n.community]||0)+1; });
const leg = document.getElementById('legend');
Object.entries(communities).sort((a,b)=>b[1]-a[1]).forEach(([cid, count]) => {
  const color = COMMUNITY_COLORS[cid % COMMUNITY_COLORS.length];
  // Assumed markup, built from the .legend-item and .dot classes defined above
  leg.innerHTML += `<div class="legend-item"><span class="dot" style="background:${color}"></span>Community ${cid} (${count})</div>`;
});
document.getElementById('stats').textContent = `${NODES_DATA.length} pages · ${EDGES_DATA.length} links`;
</script>
</body>
</html>
```

Replace `/* NODES_JSON */` and `/* EDGES_JSON */` with the vis.js node and edge JSON arrays generated in the two steps above.
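
The vis.js arrays and the placeholder substitution could be produced along these lines. This is a sketch under stated assumptions: the helper names are hypothetical, the gray used for pages with a `null` community is an arbitrary choice the source does not specify, and escaping `</` in the embedded JSON is an added precaution so page text cannot end the `<script>` element early.

```python
import json
from collections import Counter

COMMUNITY_COLORS = ["#4E79A7", "#F28E2B", "#E15759", "#76B7B2", "#59A14F",
                    "#EDC948", "#B07AA1", "#FF9DA7", "#9C755F", "#BAB0AC"]
DASHES = {"EXTRACTED": False, "INFERRED": True, "AMBIGUOUS": [4, 8]}
UNTAGGED_COLOR = "#888888"  # assumption: color for pages with no community


def vis_nodes(nodes: list[dict], edges: list[dict]) -> list[dict]:
    """Convert graph nodes to vis.js node objects (color, size, tooltip)."""
    degree = Counter()
    for e in edges:  # incoming + outgoing links both count
        degree[e["source"]] += 1
        degree[e["target"]] += 1
    out = []
    for n in nodes:
        cid = n.get("community")
        color = (UNTAGGED_COLOR if cid is None
                 else COMMUNITY_COLORS[cid % len(COMMUNITY_COLORS)])
        parts = [n["category"], " ".join(f"#{t}" for t in n["tags"]),
                 n.get("summary", "")]
        out.append({"id": n["id"], "label": n["label"],
                    "color": {"background": color},
                    "size": min(degree[n["id"]] * 3 + 8, 60),
                    "title": " | ".join(p for p in parts if p),
                    "community": cid})
    return out


def vis_edges(edges: list[dict]) -> list[dict]:
    """Convert graph edges to vis.js edge objects; dash style shows confidence."""
    return [{"from": e["source"], "to": e["target"],
             "dashes": DASHES.get(e["confidence"], False), "width": 1,
             "color": {"color": "#666", "opacity": 0.6}} for e in edges]


def fill_template(template: str, nodes_json: list, edges_json: list) -> str:
    """Substitute both placeholders in the HTML template."""
    def embed(data) -> str:
        return json.dumps(data, ensure_ascii=False).replace("</", "<\\/")
    return (template.replace("/* NODES_JSON */", embed(nodes_json))
                    .replace("/* EDGES_JSON */", embed(edges_json)))
```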

Step 4: Print Summary

```
Wiki export complete → wiki-export/
graph.json — N nodes, M edges (NetworkX node_link format)
graph.graphml — N nodes, M edges (Gephi / yEd / Cytoscape)
cypher.txt — N MERGE nodes + M MERGE relationships (Neo4j)
graph.html — interactive browser visualization (open in any browser)
```

Notes
- Re-running is safe — all output files are overwritten on each run
- Broken wikilinks are skipped — only edges to pages that exist in the vault are exported
- The `wiki-export/` directory should be gitignored if the vault is version-controlled; these are derived artifacts
- `graph.json` is the primary format; the others are derived from it. If a future tool supports graph queries natively, point it at `graph.json`