neo4j-gds-skill

When to Use


Running GDS algorithms on self-managed Neo4j or Aura Pro (embedded plugin)
Projecting named in-memory graphs, running centrality/community/similarity/path/embedding algorithms
Chaining algorithms via `mutate` mode; building FastRP → KNN pipelines
Memory estimation before large graph operations
GDS Python client (`graphdatascience`) workflows

When NOT to Use


Aura BC / VDC / Free: GDS plugin unavailable → `neo4j-aura-graph-analytics-skill`
Cypher query authoring → `neo4j-cypher-skill`
Driver/connection setup → `neo4j-driver-python-skill`
GraphRAG retrieval → `neo4j-graphrag-skill`

Deployment	Use
Aura Free	Upgrade to Pro or use `neo4j-aura-graph-analytics-skill`
Aura Pro	This skill
Aura BC / VDC	`neo4j-aura-graph-analytics-skill`
Self-managed (Community or Enterprise)	This skill (install GDS plugin)


Pre-flight


cypher

RETURN gds.version() AS gds_version

Fails with `Unknown function 'gds.version'` → GDS not installed or wrong tier. Stop and inform the user.

bash

pip install graphdatascience              # Python client
pip install "graphdatascience[rust_ext]"  # 3–10× faster serialization; quoted so shells such as zsh do not expand the brackets

Compatibility: graphdatascience v1.21 — GDS >= 2.6, Python >= 3.10, Neo4j Driver >= 4.4.12
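
The minimums in the compatibility line can be checked before connecting. The following is a minimal sketch, assuming versions arrive as dotted strings; `parse_version`, `meets_minimum`, and `check_environment` are hypothetical helper names and not part of `graphdatascience`.

```python
import re

def parse_version(text):
    """Turn '2.6.1' or '2.13.0-alpha01' into a comparable tuple of ints."""
    release = text.split("-")[0].split("+")[0]
    return tuple(int(n) for n in re.findall(r"\d+", release))

def meets_minimum(found, minimum):
    """True when `found` is at least `minimum`, compared component by component."""
    f, m = parse_version(found), parse_version(minimum)
    width = max(len(f), len(m))
    f += (0,) * (width - len(f))   # pad so '2.6' compares like '2.6.0'
    m += (0,) * (width - len(m))
    return f >= m

# Minimums quoted in the compatibility line above.
MINIMUMS = {"gds": "2.6", "python": "3.10", "driver": "4.4.12"}

def check_environment(found):
    """Return the names of components that fall below their minimum."""
    return [name for name, minimum in MINIMUMS.items()
            if not meets_minimum(found[name], minimum)]
```

Comparing integer tuples rather than raw strings matters here because `"3.9"` sorts after `"3.10"` as text.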

python

from graphdatascience import GraphDataScience

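# Self-managed Neo4j
gds = GraphDataScience("bolt://localhost:7687", auth=("neo4j", "password"))
# AuraDS (use instead of the line above; URI and credentials are placeholders)
gds = GraphDataScience("neo4j+s://xxx.databases.neo4j.io", auth=("neo4j", "pw"), aura_ds=True)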
print(gds.server_version())


Graph Catalog Operations

Native Projection

cypher

CALL gds.graph.project(
  'myGraph',
  ['Person', 'City'],
  { KNOWS: { orientation: 'UNDIRECTED' }, LIVES_IN: {} }
)
YIELD graphName, nodeCount, relationshipCount

python

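# Shorthand: one label, one relationship type
G, result = gds.graph.project("myGraph", "Person", "KNOWS")

# Alternative with node/relationship properties and orientation.
# Graph names must be unique, so drop the projection above before running this one.
G, result = gds.graph.project(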
    "myGraph",
    {"Person": {"properties": ["age", "score"]}, "City": {}},
    {"KNOWS": {"orientation": "UNDIRECTED"}, "LIVES_IN": {"properties": ["since"]}}
)


Cypher Projection (use when native can't express filter/transform)


python

G, result = gds.graph.cypher.project(
    """
    MATCH (source:Person)-[r:KNOWS]->(target:Person)
    WHERE source.active = true
    RETURN gds.graph.project($graph_name, source, target,
        { sourceNodeProperties: source { .score }, relationshipType: 'KNOWS' })
    """,
    database="neo4j", graph_name="activeGraph"
)

Prefer native projection over Cypher projection whenever possible: it is 5–10× faster on large graphs.


Weighted Projection (Cypher projection syntax)


cypher

MATCH (source:User)-[r:RATED]->(target:Movie)
WITH gds.graph.project(
  'user-movie-weighted',
  source, target,
  { relationshipProperties: r { .rating } },
  { undirectedRelationshipTypes: ['*'] }
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount


Relationship Aggregation (collapse parallel relationships into a weighted edge)


cypher

MATCH (source:Actor)-[r:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(target:Actor)
WITH source, target, count(r) AS collabCount
WITH gds.graph.project(
  'actor-network',
  source, target,
  { relationshipProperties: { collabCount: collabCount } },
  { undirectedRelationshipTypes: ['*'] }
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount

Use `count(r)` to aggregate multiple parallel relationships into a single weighted edge. This reduces graph size and enables weight-based algorithms.
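
To make the effect of the aggregation concrete, here is a small pure-Python sketch of the same idea outside the database. The pair data is invented for illustration and `aggregate_parallel` is a hypothetical helper, not a GDS function.

```python
from collections import Counter

def aggregate_parallel(edges):
    """Collapse repeated (source, target) pairs into one weighted edge each."""
    counts = Counter(edges)
    return [(s, t, w) for (s, t), w in sorted(counts.items())]

# Hypothetical co-acting pairs: one tuple per shared movie.
pairs = [
    ("Keanu", "Carrie-Anne"),
    ("Keanu", "Carrie-Anne"),
    ("Keanu", "Laurence"),
    ("Keanu", "Carrie-Anne"),
]
weighted = aggregate_parallel(pairs)
```

Four input relationships become two edges, with the count carried as the weight that algorithms can later read through `relationshipWeightProperty`.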


Undirected Projection (native syntax)


Pass `orientation: 'UNDIRECTED'` per relationship type, or use `undirectedRelationshipTypes: ['*']` in Cypher projection (second config map).

Leiden requires undirected relationships. Community detection and similarity algorithms generally work better on undirected graphs.

Inspect and Drop


python

G.node_count()            # 12_043
G.relationship_count()    # 87_211
G.node_properties("Person")  # lists projected + mutated properties
G.memory_usage()          # "45 MiB"
G.exists()
G.drop()                  # always drop after use — frees JVM heap

G = gds.graph.get("myGraph")          # re-attach to existing projection

with gds.graph.project("tmp", "Person", "KNOWS")[0] as G:
    results = gds.pageRank.stream(G)
# G is dropped automatically when the with block exits

Memory Estimation — always run before large projections and algorithms


cypher

CALL gds.graph.project.estimate(['Person'], 'KNOWS')
YIELD requiredMemory, bytesMin, bytesMax, nodeCount, relationshipCount

python

est = gds.graph.project.estimate("Person", "KNOWS")
print(est["requiredMemory"])    # e.g. "1234 MiB"


Algorithm estimation:

python

est = gds.pageRank.estimate(G, dampingFactor=0.85)
print(est["requiredMemory"])
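
An estimate is only useful if it is compared against the heap that is actually available. The sketch below assumes `requiredMemory` is a human-readable string such as `"1234 MiB"` or a bracketed range, and that the heap size is known in bytes; `parse_required_memory`, `fits_in_heap`, and the 0.8 headroom factor are illustrative assumptions, not GDS behaviour. Where the numeric `bytesMax` field is available, using it directly avoids the parsing step.

```python
import re

UNITS = {"Bytes": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

def parse_required_memory(text):
    """Return the upper bound in bytes of '1234 MiB' or '[10 MiB ... 20 MiB]'."""
    matches = re.findall(r"(\d+(?:\.\d+)?)\s*(Bytes|KiB|MiB|GiB|TiB)", text)
    if not matches:
        raise ValueError(f"unrecognised memory string: {text!r}")
    return max(int(float(value) * UNITS[unit]) for value, unit in matches)

def fits_in_heap(required, heap_bytes, headroom=0.8):
    """True when the estimate stays within `headroom` of the heap size."""
    return parse_required_memory(required) <= heap_bytes * headroom
```

For example, `fits_in_heap(est["requiredMemory"], 4 * 1024**3)` would gate a projection on a 4 GiB heap.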

Execution Modes


Mode	Side effect	Returns	Use when
`stream`	None	Row per node/pair	Inspect results; top-N
`stats`	None	Single aggregate row	Summary/convergence check
`mutate`	Adds property to in-memory graph only	Stats row	Chain algorithms
`write`	Persists property to Neo4j DB	Stats row	Final step — make queryable

Pattern: `stream` to verify → `mutate` to chain → `write` to persist. `mutateProperty` must not already exist in the in-memory graph. After `write`, re-project to use written properties in subsequent GDS calls (the in-memory graph does not see DB writes).
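
The side effects of each mode can be modelled with a toy class. This is a sketch of the semantics described above, not of how GDS is implemented; `ToyProjection` and its method signatures are hypothetical.

```python
class ToyProjection:
    """In-memory stand-in for a projected graph (illustration only)."""

    def __init__(self, node_ids):
        self.node_ids = list(node_ids)
        self.properties = {}  # property name -> {node id: value}

    def stream(self, scores):
        # No side effect: return one row per node.
        return [(n, scores[n]) for n in self.node_ids]

    def mutate(self, name, scores):
        # Changes the projection only; reusing a property name is an error.
        if name in self.properties:
            raise ValueError(f"property {name!r} already exists")
        self.properties[name] = dict(scores)
        return {"nodePropertiesWritten": len(scores)}

    def write(self, database, name, scores):
        # Changes the database only; the projection stays as it was.
        for n, value in scores.items():
            database.setdefault(n, {})[name] = value
        return {"nodePropertiesWritten": len(scores)}
```

In this model a property written to `database` never appears in `properties`, which mirrors why a re-projection is needed after `write`.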


gds.util.asNode() — Enrich Stream Results


`stream` mode yields `nodeId` (internal GDS integer). `gds.util.asNode(nodeId)` translates it back to the DB node so you can access properties.

cypher

// Single property
CALL gds.pageRank.stream('myGraph', {})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC LIMIT 10

// Multiple properties — convert once with WITH
CALL gds.pageRank.stream('myGraph', {})
YIELD nodeId, score
WITH gds.util.asNode(nodeId) AS node, score
RETURN node.name AS name, node.born AS born, score
ORDER BY score DESC LIMIT 10

Not needed for `write`, `mutate`, or `stats` modes, which don't return per-node rows.

Core Algorithms

PageRank (centrality)

cypher

CALL gds.pageRank.stream('myGraph', { dampingFactor: 0.85, maxIterations: 20 })
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score ORDER BY score DESC LIMIT 10
// score: relative influence, not absolute. Compare within the same run only.
// didConverge (yielded by stats/mutate/write, not stream): true means scores stabilized; if false, increase maxIterations.

CALL gds.pageRank.write('myGraph', { writeProperty: 'pagerank', dampingFactor: 0.85 })
YIELD nodePropertiesWritten, ranIterations, didConverge

python

pr_df = gds.pageRank.stream(G, dampingFactor=0.85)
gds.pageRank.mutate(G, mutateProperty="pagerank", dampingFactor=0.85)
gds.pageRank.write(G, writeProperty="pagerank", dampingFactor=0.85)
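
To show what `dampingFactor`, `maxIterations`, and `didConverge` control, here is a simplified power-iteration sketch in plain Python. It is an illustration of the technique under stated assumptions (unweighted edges, scores not normalized to sum to 1, a `tolerance` of 1e-7), not the GDS implementation; the function name and return shape are hypothetical.

```python
def pagerank(edges, damping_factor=0.85, max_iterations=20, tolerance=1e-7):
    """Power iteration over directed (source, target) pairs.

    Returns (scores, ran_iterations, did_converge).
    """
    nodes = sorted({n for edge in edges for n in edge})
    out_degree = {n: 0 for n in nodes}
    incoming = {n: [] for n in nodes}
    for source, target in edges:
        out_degree[source] += 1
        incoming[target].append(source)

    scores = {n: 1.0 - damping_factor for n in nodes}
    for iteration in range(1, max_iterations + 1):
        # Each node receives an equal share of every in-neighbour's score.
        updated = {
            n: (1.0 - damping_factor) + damping_factor * sum(
                scores[s] / out_degree[s] for s in incoming[n])
            for n in nodes
        }
        delta = max((abs(updated[n] - scores[n]) for n in nodes), default=0.0)
        scores = updated
        if delta < tolerance:
            return scores, iteration, True
    return scores, max_iterations, False
```

If the third return value is false, the iteration budget ran out before the scores settled, which is the situation the `didConverge` comment above refers to.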


Louvain (community detection)


cypher

CALL gds.louvain.stream('myGraph', { relationshipWeightProperty: 'weight' })
YIELD nodeId, communityId

CALL gds.louvain.write('myGraph', { writeProperty: 'community' })
YIELD communityCount, modularity

python

louvain_df = gds.louvain.stream(G)
gds.louvain.write(G, writeProperty="community")

Leiden is a refinement of Louvain that avoids poorly connected communities; use it when community quality matters more than raw speed. `modularity` in the stats result ranges from -0.5 to 1.0; values > 0.3 indicate meaningful community structure, and values > 0.7 indicate strong structure. Leiden requires undirected relationships in the projection.
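
The modularity figure can be reproduced by hand on a small graph, which helps when judging the thresholds above. This sketch computes the standard Newman modularity, Q = sum over communities of (L_c / m) - (d_c / 2m)^2, for an undirected, unweighted graph; the function name and input shapes are assumptions for illustration.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping node -> community id.
    """
    m = len(edges)
    internal = defaultdict(int)  # edges with both ends inside the community
    degree = defaultdict(int)    # summed node degrees per community
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)
```

Two triangles joined by a single edge, split into one community per triangle, score 5/14 (about 0.36), while putting every node in one community scores 0.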


WCC — Weakly Connected Components


Run WCC first to understand graph structure; partition disconnected graphs before expensive algorithms.

cypher

CALL gds.wcc.stream('myGraph', { minComponentSize: 10 })
YIELD nodeId, componentId

CALL gds.wcc.write('myGraph', { writeProperty: 'componentId' })
YIELD nodePropertiesWritten, componentCount

python

wcc_df = gds.wcc.stream(G)
gds.wcc.write(G, writeProperty="componentId")
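
For intuition about what WCC returns, the following union-find sketch labels components in plain Python and applies a size filter similar in spirit to `minComponentSize`. It is a minimal illustration, not the GDS algorithm; both helper names are hypothetical.

```python
def weakly_connected_components(nodes, edges):
    """Label each node with a component id, ignoring edge direction."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # The smaller id becomes the representative of the merged set.
            parent[max(root_u, root_v)] = min(root_u, root_v)
    return {n: find(n) for n in nodes}

def filter_small(components, min_component_size):
    """Keep only nodes whose component has at least the given size."""
    sizes = {}
    for root in components.values():
        sizes[root] = sizes.get(root, 0) + 1
    return {n: c for n, c in components.items() if sizes[c] >= min_component_size}
```

Isolated nodes end up as components of size one, which is why a size filter is often applied before running expensive algorithms.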


Betweenness Centrality


python

gds.betweenness.stream(G)          # identifies bottleneck/bridge nodes
gds.betweenness.write(G, writeProperty="betweenness")


Node Similarity


Jaccard similarity from common neighbors — no node properties required.

python

gds.nodeSimilarity.stream(G, similarityCutoff=0.1, topK=10)
gds.nodeSimilarity.write(G, writeRelationshipType="SIMILAR", writeProperty="score",
                          similarityCutoff=0.1, topK=10)
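
The Jaccard computation behind these calls can be sketched in a few lines. The example below is an illustration with invented data, assuming each node's neighbours are available as a set; `jaccard` and `node_similarity` are hypothetical helpers whose parameters mirror `similarityCutoff` and `topK`.

```python
def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def node_similarity(neighbors, similarity_cutoff=0.1, top_k=10):
    """neighbors: dict node -> set of neighbour ids.

    Returns (node1, node2, similarity) rows, best matches first per node.
    """
    rows = []
    for n1 in sorted(neighbors):
        scored = [(jaccard(neighbors[n1], neighbors[n2]), n2)
                  for n2 in neighbors if n2 != n1]
        scored = [p for p in scored if p[0] >= similarity_cutoff]
        scored.sort(key=lambda p: (-p[0], p[1]))  # highest score, then name
        rows.extend((n1, n2, s) for s, n2 in scored[:top_k])
    return rows

# Hypothetical data: people and the instruments they play.
plays = {
    "alice": {"guitar", "piano"},
    "bob": {"guitar", "drums"},
    "carol": {"piano", "guitar"},
    "dave": {"flute"},
}
rows = node_similarity(plays, similarity_cutoff=0.1, top_k=1)
```

Here `dave` produces no rows because nothing he is connected to overlaps with anyone else, so every pair falls below the cutoff.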


FastRP (node embeddings)


Fast and scalable; suited to production ML pipelines. Set `randomSeed` for reproducibility.

cypher

CALL gds.fastRP.mutate('myGraph', {
  embeddingDimension: 256,
  iterationWeights: [0.0, 1.0, 1.0],
  featureProperties: ['score'],
  propertyRatio: 0.5,
  normalizationStrength: -0.5,
  randomSeed: 42,
  mutateProperty: 'embedding'
})
YIELD nodePropertiesWritten

python

gds.fastRP.mutate(G, embeddingDimension=256, iterationWeights=[0.0, 1.0, 1.0],
                  randomSeed=42, mutateProperty="embedding")
gds.fastRP.write(G, embeddingDimension=256, writeProperty="embedding", randomSeed=42)
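
The role of `randomSeed` and `iterationWeights` is easier to see in a stripped-down version of the idea: start from sparse random vectors, repeatedly average neighbours, and sum the passes with weights. The sketch below is a heavily simplified FastRP-style illustration (no degree normalization, no `featureProperties`), not the GDS algorithm; `fastrp_sketch` and its input shape are hypothetical.

```python
import math
import random

def fastrp_sketch(neighbors, embedding_dimension=8,
                  iteration_weights=(0.0, 1.0, 1.0), random_seed=42):
    """Simplified FastRP-style embedding for an undirected graph.

    neighbors: dict node -> list of neighbour ids.
    """
    rng = random.Random(random_seed)
    nodes = sorted(neighbors)

    def normalise(vec):
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec] if norm else vec

    # Very sparse random projection: entries are +1, 0 or -1.
    current = {n: [rng.choice((1.0, 0.0, 0.0, 0.0, 0.0, -1.0))
                   for _ in range(embedding_dimension)] for n in nodes}
    result = {n: [0.0] * embedding_dimension for n in nodes}

    for weight in iteration_weights:
        # Each pass averages the neighbours' vectors from the previous pass.
        propagated = {}
        for n in nodes:
            total = [0.0] * embedding_dimension
            for m in neighbors[n]:
                total = [t + x for t, x in zip(total, current[m])]
            degree = max(len(neighbors[n]), 1)
            propagated[n] = normalise([t / degree for t in total])
        current = propagated
        for n in nodes:
            result[n] = [r + weight * x for r, x in zip(result[n], current[n])]
    return result
```

With a fixed seed the output is identical across runs, and nodes with identical neighbourhoods receive identical vectors, which is the property KNN later exploits.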


KNN — K-Nearest Neighbors


Finds k most similar nodes per node based on node properties (typically embeddings).

cypher

CALL gds.knn.stream('myGraph', {
  nodeProperties: ['embedding'], topK: 10,
  sampleRate: 0.5, similarityCutoff: 0.7
})
YIELD node1, node2, similarity

CALL gds.knn.write('myGraph', {
  nodeProperties: ['embedding'], topK: 10,
  writeRelationshipType: 'SIMILAR', writeProperty: 'score'
})
YIELD relationshipsWritten

python

knn_df = gds.knn.stream(G, nodeProperties=["embedding"], topK=10)
gds.knn.write(G, nodeProperties=["embedding"], topK=10,
              writeRelationshipType="SIMILAR", writeProperty="score")
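
What `topK` and `similarityCutoff` do to the output can be shown with a brute-force version over a handful of vectors. This is a sketch only: it compares every pair with plain cosine similarity, whereas GDS uses an approximate search and may scale similarity values differently. The helper names and data are invented.

```python
import math

def cosine(a, b):
    """Plain cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def knn(embeddings, top_k=10, similarity_cutoff=0.0):
    """Return (node1, node2, similarity) rows, at most top_k per node."""
    rows = []
    for n1 in sorted(embeddings):
        scored = [(cosine(embeddings[n1], vec), n2)
                  for n2, vec in embeddings.items() if n2 != n1]
        scored = [p for p in scored if p[0] >= similarity_cutoff]
        scored.sort(key=lambda p: (-p[0], p[1]))  # highest score, then name
        rows.extend((n1, n2, s) for s, n2 in scored[:top_k])
    return rows

# Hypothetical two-dimensional embeddings.
vectors = {"a": [1.0, 0.0], "b": [2.0, 0.0], "c": [0.0, 1.0]}
neighbours = knn(vectors, top_k=1, similarity_cutoff=0.5)
```

Node `c` is orthogonal to the others, so the cutoff removes all of its candidates and it contributes no rows.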


FastRP → KNN Pipeline (recommendation)


python

# 1. Project
G, _ = gds.graph.project("myGraph", "Product", {"BOUGHT_TOGETHER": {"orientation": "UNDIRECTED"}})

# 2. Estimate memory
print(gds.fastRP.estimate(G, embeddingDimension=128)["requiredMemory"])

# 3. Embed
gds.fastRP.mutate(G, embeddingDimension=128, randomSeed=42, mutateProperty="emb")

# 4. Similarity
gds.knn.write(G, nodeProperties=["emb"], topK=10, writeRelationshipType="SIMILAR", writeProperty="score")

# 5. Cleanup (always)
G.drop()

Algorithm Selection


Goal	Algorithm
Influence via network links	PageRank / ArticleRank
Bottleneck / bridge nodes	Betweenness Centrality
Direct connections	Degree Centrality
Community (general, fast)	Louvain
Community (higher quality)	Leiden
Is graph connected?	WCC (run first)
Similarity from embeddings	KNN
Similarity from neighbors	Node Similarity
Shortest path (positive weights)	Dijkstra / A*
k alternative paths	Yen's
Fast scalable embeddings	FastRP
Feature-rich nodes	GraphSAGE (Beta)

Full algorithm catalog → references/algorithms.md


Common Errors


Error	Cause	Fix
`Unknown function 'gds.version'`	GDS not installed / wrong tier	Install plugin; on Aura BC/VDC use `neo4j-aura-graph-analytics-skill`
`Insufficient heap memory` / OOM	Graph too large for available JVM heap	Run `gds.graph.project.estimate` first; increase `server.memory.heap.max_size` (`dbms.memory.heap.max_size` on Neo4j 4.x)
`Procedure not found: gds.leiden`	Algorithm not licensed / older GDS	Check `CALL gds.list()` for available procedures; upgrade GDS or use Louvain
`Node property 'X' not found` after mutate	Property not projected or wrong graph name	Verify `G.node_properties("Label")` includes the property; check `mutateProperty` spelling
`Graph 'myGraph' already exists`	Leftover projection from failed run	`CALL gds.graph.drop('myGraph')` or `G.drop()`
`mutateProperty already exists`	Re-running algorithm on same projection	Drop and re-project, or use different `mutateProperty` name
`No algorithm results`	Source/target node not in projection	Verify node labels/rel types match projection; check `G.node_count()`


Full Workflow


python

# 0. Verify
print(gds.server_version())

# 1. Estimate
est = gds.graph.project.estimate("Person", "KNOWS")
print(est["requiredMemory"])

# 2. Project
G, _ = gds.graph.project("myGraph", "Person", {"KNOWS": {"orientation": "UNDIRECTED"}})
print(G.node_count(), G.relationship_count())

# 3. Stream to verify
df = gds.pageRank.stream(G)
print(df.sort_values("score", ascending=False).head(10))

# 4. Write when satisfied
gds.pageRank.write(G, writeProperty="pagerank", dampingFactor=0.85)

# 5. Drop (frees JVM heap)
G.drop()

Built-in test datasets: `gds.graph.load_cora()`, `gds.graph.load_karate_club()`, `gds.graph.load_imdb()`

MCP Tool Mapping


Operation	MCP tool
`RETURN gds.version()`	`read-cypher`
`gds.pageRank.stream(...)`	`read-cypher`
`gds.pageRank.write(...)`	`write-cypher`
`gds.graph.drop(...)`	`write-cypher`
List available procedures	`read-cypher` → `CALL gds.list()`


References


references/algorithms.md — full algorithm catalog: all procedures, parameters, tiers, Cypher + Python examples
references/graph-projection.md — projection deep-dive: filtering, heterogeneous graphs, relationship orientation, property types
GDS Manual
Python Client Docs


Checklist


`gds.version()` confirmed: GDS installed and licensed
Memory estimated before large projections and expensive algorithms
Named graph dropped after use (`G.drop()` or context manager)
Execution mode chosen: `stream` (inspect) → `mutate` (chain) → `write` (persist)
`writeProperty` / `mutateProperty` checked for collision with existing properties
`randomSeed` set for reproducible embeddings
WCC run first on graphs that may be disconnected
Native projection used over Cypher projection unless filtering/transformation required