wiki-export

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Wiki Export — Knowledge Graph Export

Wiki 导出 — 知识图谱导出

You are exporting the wiki's wikilink graph to structured formats so it can be used in external tools (Gephi, Neo4j, custom scripts, browser visualization).
你正在将wiki的wikilink图谱导出为结构化格式,以便在外部工具(Gephi、Neo4j、自定义脚本、浏览器可视化工具)中使用。

Before You Start

开始之前

  1. Read
    .env
    to get
    OBSIDIAN_VAULT_PATH
  2. Confirm the vault has pages to export — if fewer than 5 pages exist, warn the user and stop
  1. 读取
    .env
    文件获取
    OBSIDIAN_VAULT_PATH
  2. 确认库中有可导出的页面——如果页面数量少于5个,警告用户并终止操作

Step 1: Build the Node and Edge Lists

步骤1:构建节点和边列表

Glob all
.md
files in the vault (excluding
_archives/
,
_raw/
,
.obsidian/
,
index.md
,
log.md
,
_insights.md
).
For each page, extract from frontmatter:
  • id
    — relative path from vault root, without
    .md
    extension (e.g.
    concepts/transformers
    )
  • label
    title
    field from frontmatter, or filename if missing
  • category
    — directory prefix (
    concepts
    ,
    entities
    ,
    skills
    ,
    references
    ,
    synthesis
    ,
    projects
    , or
    journal
    )
  • tags
    — array from frontmatter tags field
  • summary
    — frontmatter
    summary
    field if present
This is your node list.
For each page, Grep the body for
\[\[.*?\]\]
to extract all wikilinks:
  • Parse each
    [[target]]
    or
    [[target|display]]
    — use the target part only
  • Resolve the target to a node id (normalize: lowercase, spaces→hyphens, strip
    .md
    )
  • Skip links that point outside the node list (broken links)
  • Each resolved link becomes an edge:
    {source: page_id, target: linked_id, relation: "wikilink", confidence: "EXTRACTED"}
  • If the linking sentence ends with
    ^[inferred]
    or
    ^[ambiguous]
    , override
    confidence
    accordingly
This is your edge list.
遍历库中所有
.md
文件(排除
_archives/
_raw/
.obsidian/
index.md
log.md
_insights.md
)。
对每个页面,从frontmatter中提取:
  • id
    — 相对于库根目录的路径,不带
    .md
    后缀(例如
    concepts/transformers
  • label
    — frontmatter中的
    title
    字段,若缺失则使用文件名
  • category
    — 目录前缀(
    concepts
    entities
    skills
    references
    synthesis
    projects
    journal
  • tags
    — frontmatter中tags字段对应的数组
  • summary
    — frontmatter中的
    summary
    字段(若存在)
以上内容构成你的节点列表
对每个页面,在正文里用正则匹配
\[\[.*?\]\]
提取所有wikilink:
  • 解析每个
    [[target]]
    [[target|display]]
    格式的链接,仅使用target部分
  • 将target解析为节点id(标准化规则:转小写、空格替换为连字符、去除
    .md
    后缀)
  • 跳过指向节点列表外的链接(即无效链接)
  • 每个解析成功的链接生成一条边:
    {source: page_id, target: linked_id, relation: "wikilink", confidence: "EXTRACTED"}
  • 如果链接所在句子以
    ^[inferred]
    ^[ambiguous]
    结尾,对应覆盖
    confidence
    的值
以上内容构成你的边列表

Step 2: Assign Community IDs

步骤2:分配社区ID

Group pages into communities by tag clustering:
  • Pages sharing the same dominant tag belong to the same community
  • Dominant tag = the first tag in the page's frontmatter tags array
  • Pages with no tags get community id
    null
  • Number communities starting from 0, ordered by size descending (largest community = 0)
This enables community-based coloring in the HTML visualization and tools like Gephi.
通过标签聚类将页面分组到不同社区:
  • 共享相同主导标签的页面属于同一个社区
  • 主导标签 = 页面frontmatter的tags数组中的第一个标签
  • 无标签的页面社区id为
    null
  • 社区编号从0开始,按社区大小降序排列(最大的社区编号为0)
这一功能支持在HTML可视化文件和Gephi等工具中按社区着色。

Step 3: Write the Output Files

步骤3:写入输出文件

Create
wiki-export/
at the vault root if it doesn't exist. Write all four files:

如果库根目录下不存在
wiki-export/
文件夹则创建该目录,写入以下四个文件:

3a.
graph.json

3a.
graph.json

NetworkX node_link format — standard for graph tools and scripts:
json
{
  "directed": false,
  "multigraph": false,
  "graph": {
    "exported_at": "<ISO timestamp>",
    "vault": "<OBSIDIAN_VAULT_PATH>",
    "total_nodes": N,
    "total_edges": M
  },
  "nodes": [
    {
      "id": "concepts/transformers",
      "label": "Transformer Architecture",
      "category": "concepts",
      "tags": ["ml", "architecture"],
      "summary": "The attention-based architecture introduced in Attention Is All You Need.",
      "community": 0
    }
  ],
  "links": [
    {
      "source": "concepts/transformers",
      "target": "entities/vaswani",
      "relation": "wikilink",
      "confidence": "EXTRACTED"
    }
  ]
}

NetworkX node_link格式——是图工具和脚本的标准格式:
json
{
  "directed": false,
  "multigraph": false,
  "graph": {
    "exported_at": "<ISO timestamp>",
    "vault": "<OBSIDIAN_VAULT_PATH>",
    "total_nodes": N,
    "total_edges": M
  },
  "nodes": [
    {
      "id": "concepts/transformers",
      "label": "Transformer Architecture",
      "category": "concepts",
      "tags": ["ml", "architecture"],
      "summary": "The attention-based architecture introduced in Attention Is All You Need.",
      "community": 0
    }
  ],
  "links": [
    {
      "source": "concepts/transformers",
      "target": "entities/vaswani",
      "relation": "wikilink",
      "confidence": "EXTRACTED"
    }
  ]
}

3b.
graph.graphml

3b.
graph.graphml

GraphML XML format — loadable in Gephi, yEd, and Cytoscape:
xml
<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/graphml">
  <key id="label" for="node" attr.name="label" attr.type="string"/>
  <key id="category" for="node" attr.name="category" attr.type="string"/>
  <key id="tags" for="node" attr.name="tags" attr.type="string"/>
  <key id="community" for="node" attr.name="community" attr.type="int"/>
  <key id="relation" for="edge" attr.name="relation" attr.type="string"/>
  <key id="confidence" for="edge" attr.name="confidence" attr.type="string"/>
  <graph id="wiki" edgedefault="undirected">
    <node id="concepts/transformers">
      <data key="label">Transformer Architecture</data>
      <data key="category">concepts</data>
      <data key="tags">ml, architecture</data>
      <data key="community">0</data>
    </node>
    <edge source="concepts/transformers" target="entities/vaswani">
      <data key="relation">wikilink</data>
      <data key="confidence">EXTRACTED</data>
    </edge>
  </graph>
</graphml>
Write one
<node>
per page and one
<edge>
per wikilink.

GraphML XML格式——可导入到Gephi、yEd和Cytoscape中使用:
xml
<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/graphml">
  <key id="label" for="node" attr.name="label" attr.type="string"/>
  <key id="category" for="node" attr.name="category" attr.type="string"/>
  <key id="tags" for="node" attr.name="tags" attr.type="string"/>
  <key id="community" for="node" attr.name="community" attr.type="int"/>
  <key id="relation" for="edge" attr.name="relation" attr.type="string"/>
  <key id="confidence" for="edge" attr.name="confidence" attr.type="string"/>
  <graph id="wiki" edgedefault="undirected">
    <node id="concepts/transformers">
      <data key="label">Transformer Architecture</data>
      <data key="category">concepts</data>
      <data key="tags">ml, architecture</data>
      <data key="community">0</data>
    </node>
    <edge source="concepts/transformers" target="entities/vaswani">
      <data key="relation">wikilink</data>
      <data key="confidence">EXTRACTED</data>
    </edge>
  </graph>
</graphml>
每个页面对应写入一个
<node>
标签,每个wikilink对应写入一个
<edge>
标签。

3c.
cypher.txt

3c.
cypher.txt

Neo4j Cypher
MERGE
statements — paste into Neo4j Browser or run with
cypher-shell
:
cypher
// Wiki knowledge graph export — <TIMESTAMP>
// Load with: cypher-shell -u neo4j -p password < cypher.txt

// Nodes
MERGE (n:Page {id: "concepts/transformers"}) SET n.label = "Transformer Architecture", n.category = "concepts", n.tags = ["ml","architecture"], n.community = 0;
MERGE (n:Page {id: "entities/vaswani"}) SET n.label = "Ashish Vaswani", n.category = "entities", n.tags = ["person","ml"], n.community = 0;

// Relationships
MATCH (a:Page {id: "concepts/transformers"}), (b:Page {id: "entities/vaswani"}) MERGE (a)-[:WIKILINK {relation: "wikilink", confidence: "EXTRACTED"}]->(b);
Write one
MERGE
node statement per page, then one
MATCH
/
MERGE
relationship statement per edge.

Neo4j Cypher
MERGE
语句——可粘贴到Neo4j Browser中,或通过
cypher-shell
运行:
cypher
// Wiki knowledge graph export — <TIMESTAMP>
// Load with: cypher-shell -u neo4j -p password < cypher.txt

// Nodes
MERGE (n:Page {id: "concepts/transformers"}) SET n.label = "Transformer Architecture", n.category = "concepts", n.tags = ["ml","architecture"], n.community = 0;
MERGE (n:Page {id: "entities/vaswani"}) SET n.label = "Ashish Vaswani", n.category = "entities", n.tags = ["person","ml"], n.community = 0;

// Relationships
MATCH (a:Page {id: "concepts/transformers"}), (b:Page {id: "entities/vaswani"}) MERGE (a)-[:WIKILINK {relation: "wikilink", confidence: "EXTRACTED"}]->(b);
每个页面对应写入一条
MERGE
节点语句,每条边对应写入一条
MATCH
/
MERGE
关系语句。

3d.
graph.html

3d.
graph.html

A self-contained interactive visualization using the vis.js CDN (no local dependencies). The user opens this file in any browser — no server needed.
Build the HTML file by:
  1. Generating a JSON array of node objects for vis.js:
js
{id: "concepts/transformers", label: "Transformer Architecture", color: {background: "#4E79A7"}, size: <degree * 3 + 8>, title: "concepts | #ml #architecture", community: 0}
  • Color by community (cycle through:
    #4E79A7
    ,
    #F28E2B
    ,
    #E15759
    ,
    #76B7B2
    ,
    #59A14F
    ,
    #EDC948
    ,
    #B07AA1
    ,
    #FF9DA7
    ,
    #9C755F
    ,
    #BAB0AC
    )
  • Size by degree (incoming + outgoing link count):
    size = degree * 3 + 8
    , capped at 60
  • title
    = tooltip text shown on hover: category, tags, summary (if available)
  1. Generating a JSON array of edge objects for vis.js:
js
{from: "concepts/transformers", to: "entities/vaswani", dashes: false, width: 1, color: {color: "#666", opacity: 0.6}}
  • dashes: true
    for INFERRED edges
  • dashes: [4,8]
    for AMBIGUOUS edges
  1. Writing the full HTML file:
html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Wiki Knowledge Graph</title>
<script src="https://unpkg.com/vis-network/standalone/umd/vis-network.min.js"></script>
<style>
  * { box-sizing: border-box; margin: 0; padding: 0; }
  body { background: #0f0f1a; color: #e0e0e0; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", sans-serif; display: flex; height: 100vh; }
  #graph { flex: 1; }
  #sidebar { width: 260px; background: #1a1a2e; border-left: 1px solid #2a2a4e; padding: 14px; overflow-y: auto; font-size: 13px; }
  #sidebar h3 { color: #aaa; font-size: 11px; text-transform: uppercase; letter-spacing: 0.05em; margin: 0 0 10px; }
  #info { margin-bottom: 16px; line-height: 1.6; color: #ccc; }
  .legend-item { display: flex; align-items: center; gap: 8px; padding: 3px 0; font-size: 12px; }
  .dot { width: 10px; height: 10px; border-radius: 50%; flex-shrink: 0; }
  #stats { margin-top: 16px; color: #555; font-size: 11px; }
</style>
</head>
<body>
<div id="graph"></div>
<div id="sidebar">
  <h3>Wiki Knowledge Graph</h3>
  <div id="info">Click a node to see details.</div>
  <h3 style="margin-top:12px">Communities</h3>
  <div id="legend"><!-- populated by JS --></div>
  <div id="stats"><!-- populated by JS --></div>
</div>
<script>
const NODES_DATA = /* NODES_JSON */;
const EDGES_DATA = /* EDGES_JSON */;
const COMMUNITY_COLORS = ["#4E79A7","#F28E2B","#E15759","#76B7B2","#59A14F","#EDC948","#B07AA1","#FF9DA7","#9C755F","#BAB0AC"];

const nodes = new vis.DataSet(NODES_DATA);
const edges = new vis.DataSet(EDGES_DATA);
const network = new vis.Network(document.getElementById('graph'), {nodes, edges}, {
  physics: { solver: 'forceAtlas2Based', forceAtlas2Based: { gravitationalConstant: -60, springLength: 120 }, stabilization: { iterations: 200 } },
  interaction: { hover: true, tooltipDelay: 100 },
  nodes: { shape: 'dot', borderWidth: 1.5 },
  edges: { smooth: { type: 'continuous' }, arrows: { to: { enabled: true, scaleFactor: 0.4 } } }
});
network.once('stabilizationIterationsDone', () => network.setOptions({ physics: { enabled: false } }));

network.on('click', ({nodes: sel}) => {
  if (!sel.length) return;
  const n = NODES_DATA.find(x => x.id === sel[0]);
  if (!n) return;
  document.getElementById('info').innerHTML = `<b>${n.label}</b><br>Category: ${n.category||'—'}<br>Tags: ${n.tags||'—'}<br>${n.summary ? '<br>'+n.summary : ''}`;
});

// Build legend
const communities = {};
NODES_DATA.forEach(n => { if (n.community != null) communities[n.community] = (communities[n.community]||0)+1; });
const leg = document.getElementById('legend');
Object.entries(communities).sort((a,b)=>b[1]-a[1]).forEach(([cid, count]) => {
  const color = COMMUNITY_COLORS[cid % COMMUNITY_COLORS.length];
  leg.innerHTML += `<div class="legend-item"><div class="dot" style="background:${color}"></div>Community ${cid} (${count})</div>`;
});
document.getElementById('stats').textContent = `${NODES_DATA.length} pages · ${EDGES_DATA.length} links`;
</script>
</body>
</html>
Replace
/* NODES_JSON */
and
/* EDGES_JSON */
with the actual JSON arrays you generated in step 1.

使用vis.js CDN实现的独立交互式可视化文件(无本地依赖)。用户可在任意浏览器中打开该文件,无需服务器支持。
按以下步骤构建HTML文件:
  1. 生成供vis.js使用的节点对象JSON数组:
js
{id: "concepts/transformers", label: "Transformer Architecture", color: {background: "#4E79A7"}, size: <degree * 3 + 8>, title: "concepts | #ml #architecture", community: 0}
  • 按社区着色(循环使用以下颜色:
    #4E79A7
    #F28E2B
    #E15759
    #76B7B2
    #59A14F
    #EDC948
    #B07AA1
    #FF9DA7
    #9C755F
    #BAB0AC
  • 按度数(入站+出站链接数)调整大小:
    size = 度数 * 3 + 8
    ,最大不超过60
  • title
    = 鼠标悬停时显示的提示文本:分类、标签、摘要(若有)
  1. 生成供vis.js使用的边对象JSON数组:
js
{from: "concepts/transformers", to: "entities/vaswani", dashes: false, width: 1, color: {color: "#666", opacity: 0.6}}
  • INFERRED类型的边设置
    dashes: true
  • AMBIGUOUS类型的边设置
    dashes: [4,8]
  1. 写入完整的HTML文件:
html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Wiki Knowledge Graph</title>
<script src="https://unpkg.com/vis-network/standalone/umd/vis-network.min.js"></script>
<style>
  * { box-sizing: border-box; margin: 0; padding: 0; }
  body { background: #0f0f1a; color: #e0e0e0; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", sans-serif; display: flex; height: 100vh; }
  #graph { flex: 1; }
  #sidebar { width: 260px; background: #1a1a2e; border-left: 1px solid #2a2a4e; padding: 14px; overflow-y: auto; font-size: 13px; }
  #sidebar h3 { color: #aaa; font-size: 11px; text-transform: uppercase; letter-spacing: 0.05em; margin: 0 0 10px; }
  #info { margin-bottom: 16px; line-height: 1.6; color: #ccc; }
  .legend-item { display: flex; align-items: center; gap: 8px; padding: 3px 0; font-size: 12px; }
  .dot { width: 10px; height: 10px; border-radius: 50%; flex-shrink: 0; }
  #stats { margin-top: 16px; color: #555; font-size: 11px; }
</style>
</head>
<body>
<div id="graph"></div>
<div id="sidebar">
  <h3>Wiki Knowledge Graph</h3>
  <div id="info">Click a node to see details.</div>
  <h3 style="margin-top:12px">Communities</h3>
  <div id="legend"><!-- populated by JS --></div>
  <div id="stats"><!-- populated by JS --></div>
</div>
<script>
const NODES_DATA = /* NODES_JSON */;
const EDGES_DATA = /* EDGES_JSON */;
const COMMUNITY_COLORS = ["#4E79A7","#F28E2B","#E15759","#76B7B2","#59A14F","#EDC948","#B07AA1","#FF9DA7","#9C755F","#BAB0AC"];

const nodes = new vis.DataSet(NODES_DATA);
const edges = new vis.DataSet(EDGES_DATA);
const network = new vis.Network(document.getElementById('graph'), {nodes, edges}, {
  physics: { solver: 'forceAtlas2Based', forceAtlas2Based: { gravitationalConstant: -60, springLength: 120 }, stabilization: { iterations: 200 } },
  interaction: { hover: true, tooltipDelay: 100 },
  nodes: { shape: 'dot', borderWidth: 1.5 },
  edges: { smooth: { type: 'continuous' }, arrows: { to: { enabled: true, scaleFactor: 0.4 } } }
});
network.once('stabilizationIterationsDone', () => network.setOptions({ physics: { enabled: false } }));

network.on('click', ({nodes: sel}) => {
  if (!sel.length) return;
  const n = NODES_DATA.find(x => x.id === sel[0]);
  if (!n) return;
  document.getElementById('info').innerHTML = `<b>${n.label}</b><br>Category: ${n.category||'—'}<br>Tags: ${n.tags||'—'}<br>${n.summary ? '<br>'+n.summary : ''}`;
});

// Build legend
const communities = {};
NODES_DATA.forEach(n => { if (n.community != null) communities[n.community] = (communities[n.community]||0)+1; });
const leg = document.getElementById('legend');
Object.entries(communities).sort((a,b)=>b[1]-a[1]).forEach(([cid, count]) => {
  const color = COMMUNITY_COLORS[cid % COMMUNITY_COLORS.length];
  leg.innerHTML += `<div class="legend-item"><div class="dot" style="background:${color}"></div>Community ${cid} (${count})</div>`;
});
document.getElementById('stats').textContent = `${NODES_DATA.length} pages · ${EDGES_DATA.length} links`;
</script>
</body>
</html>
/* NODES_JSON */
/* EDGES_JSON */
替换为你在步骤1中生成的实际JSON数组。

Step 4: Print Summary

步骤4:打印总结

Wiki export complete → wiki-export/
  graph.json    — N nodes, M edges (NetworkX node_link format)
  graph.graphml — N nodes, M edges (Gephi / yEd / Cytoscape)
  cypher.txt    — N MERGE nodes + M MERGE relationships (Neo4j)
  graph.html    — interactive browser visualization (open in any browser)
Wiki export complete → wiki-export/
  graph.json    — N nodes, M edges (NetworkX node_link format)
  graph.graphml — N nodes, M edges (Gephi / yEd / Cytoscape)
  cypher.txt    — N MERGE nodes + M MERGE relationships (Neo4j)
  graph.html    — interactive browser visualization (open in any browser)

Notes

注意事项

  • Re-running is safe — all output files are overwritten on each run
  • Broken wikilinks are skipped — only edges to pages that exist in the vault are exported
  • The
    wiki-export/
    directory should be gitignored
    if the vault is version-controlled — these are derived artifacts
  • graph.json
    is the primary format
    — the others are derived from it. If a future tool supports graph queries natively, point it at
    graph.json
  • 重新运行是安全的——每次运行会覆盖所有输出文件
  • 无效wikilink会被跳过——仅导出指向库中存在页面的边
  • 如果库使用版本控制,
    wiki-export/
    目录应加入gitignore
    ——这些是派生产物
  • graph.json
    是核心格式
    ——其他格式都基于它生成。如果未来有工具原生支持图查询,可直接使用
    graph.json