neo4j-gds-skill

When to Use

  • Running GDS algorithms on self-managed Neo4j or Aura Pro (embedded plugin)
  • Projecting named in-memory graphs, running centrality/community/similarity/path/embedding algorithms
  • Chaining algorithms via `mutate` mode; building FastRP → KNN pipelines
  • Memory estimation before large graph operations
  • GDS Python client (`graphdatascience`) workflows

When NOT to Use

  • Aura BC / VDC / Free: GDS plugin unavailable → `neo4j-aura-graph-analytics-skill`
  • Cypher query authoring → `neo4j-cypher-skill`
  • Driver/connection setup → `neo4j-driver-python-skill`
  • GraphRAG retrieval → `neo4j-graphrag-skill`

| Deployment | Use |
|---|---|
| Aura Free | Upgrade to Pro or use `neo4j-aura-graph-analytics-skill` |
| Aura Pro | This skill |
| Aura BC / VDC | `neo4j-aura-graph-analytics-skill` |
| Self-managed (Community or Enterprise) | This skill (install GDS plugin) |

Pre-flight

cypher
RETURN gds.version() AS gds_version
If this fails with `Unknown function 'gds.version'`, GDS is not installed or the tier is wrong. Stop and inform the user.
bash
pip install graphdatascience              # Python client
pip install graphdatascience[rust_ext]    # 3–10× faster serialization
Compatibility: graphdatascience v1.21 — GDS >= 2.6, Python >= 3.10, Neo4j Driver >= 4.4.12
python
from graphdatascience import GraphDataScience

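# Self-managed instance
gds = GraphDataScience("bolt://localhost:7687", auth=("neo4j", "password"))
# Aura instance (use one of the two connections; replace the placeholder URI and credentials)
gds = GraphDataScience("neo4j+s://xxx.databases.neo4j.io", auth=("neo4j", "pw"), aura_ds=True)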
print(gds.server_version())
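
The compatibility requirement above can be checked mechanically. The following is a minimal sketch, assuming the string returned by `RETURN gds.version()` starts with dotted integers such as `2.6.0`; the helper name `meets_minimum` and its default minimum are illustrative and not part of the GDS API.

```python
import re


def meets_minimum(version: str, minimum: tuple = (2, 6)) -> bool:
    """Return True if a dotted version string is at least `minimum`."""
    # Only the leading major.minor components are compared; any suffix is ignored.
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups()) >= minimum


print(meets_minimum("2.6.0"))   # True
print(meets_minimum("2.5.9"))   # False
```

Comparing integer tuples rather than raw strings avoids the classic mistake of ranking "2.13" below "2.6".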

Graph Catalog Operations

Native Projection

cypher
CALL gds.graph.project(
  'myGraph',
  ['Person', 'City'],
  { KNOWS: { orientation: 'UNDIRECTED' }, LIVES_IN: {} }
)
YIELD graphName, nodeCount, relationshipCount
python
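# Shorthand: one label, one relationship type
G, result = gds.graph.project("myGraph", "Person", "KNOWS")

# Full form with node and relationship properties.
# This is an alternative to the call above: a graph name can only be projected once.
G, result = gds.graph.project(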
    "myGraph",
    {"Person": {"properties": ["age", "score"]}, "City": {}},
    {"KNOWS": {"orientation": "UNDIRECTED"}, "LIVES_IN": {"properties": ["since"]}}
)

Cypher Projection (use when native can't express filter/transform)

python
G, result = gds.graph.cypher.project(
    """
    MATCH (source:Person)-[r:KNOWS]->(target:Person)
    WHERE source.active = true
    RETURN gds.graph.project($graph_name, source, target,
        { sourceNodeProperties: source { .score }, relationshipType: 'KNOWS' })
    """,
    database="neo4j", graph_name="activeGraph"
)
Prefer native projection over Cypher projection whenever possible: it is 5–10× faster on large graphs.

Weighted Projection (Cypher projection syntax)

cypher
MATCH (source:User)-[r:RATED]->(target:Movie)
WITH gds.graph.project(
  'user-movie-weighted',
  source, target,
  { relationshipProperties: r { .rating } },
  { undirectedRelationshipTypes: ['*'] }
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount

Relationship Aggregation (collapse parallel relationships into a weighted edge)

cypher
MATCH (source:Actor)-[r:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(target:Actor)
WITH source, target, count(r) AS collabCount
WITH gds.graph.project(
  'actor-network',
  source, target,
  { relationshipProperties: { collabCount: collabCount } },
  { undirectedRelationshipTypes: ['*'] }
) AS g
RETURN g.graphName, g.nodeCount, g.relationshipCount
Use `count(r)` to aggregate multiple parallel relationships into a single weighted edge. This reduces graph size and enables weight-based algorithms.
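
To make the aggregation concrete, here is a small in-memory sketch of the result the query above computes: one weight per ordered actor pair, equal to the number of shared movies. The function name `collaboration_weights` and the sample data are hypothetical; this only mirrors the counting logic and does not call GDS.

```python
from collections import Counter
from itertools import permutations


def collaboration_weights(casts):
    """Map each ordered (source, target) actor pair to its number of shared movies."""
    weights = Counter()
    for actors in casts.values():
        # Every ordered pair of distinct co-stars matches the pattern once per movie.
        for source, target in permutations(sorted(set(actors)), 2):
            weights[(source, target)] += 1
    return weights


casts = {"M1": ["Ann", "Bob"], "M2": ["Ann", "Bob", "Cy"]}
weights = collaboration_weights(casts)
print(weights[("Ann", "Bob")])  # 2
print(weights[("Ann", "Cy")])   # 1
```

Each pair appears in both orders, as it does in the Cypher match; projecting with `undirectedRelationshipTypes` then treats the edges as undirected.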

Undirected Projection (native syntax)

Pass `orientation: 'UNDIRECTED'` per relationship type, or use `undirectedRelationshipTypes: ['*']` in Cypher projection (second config map).
Leiden requires undirected relationships. Community detection and similarity algorithms generally work better on undirected graphs.

Inspect and Drop

python
G.node_count()            # 12_043
G.relationship_count()    # 87_211
G.node_properties("Person")  # lists projected + mutated properties
G.memory_usage()          # "45 MiB"
G.exists()
G.drop()                  # always drop after use — frees JVM heap

G = gds.graph.get("myGraph")          # re-attach to existing projection

# Context manager: the projection is dropped automatically on exit
with gds.graph.project("tmp", "Person", "KNOWS")[0] as G:
    results = gds.pageRank.stream(G)

Memory Estimation — always run before large projections and algorithms

cypher
CALL gds.graph.project.estimate(['Person'], 'KNOWS')
YIELD requiredMemory, bytesMin, bytesMax, nodeCount, relationshipCount
python
est = gds.graph.project.estimate("Person", "KNOWS")
print(est["requiredMemory"])    # e.g. "1234 MiB"

Algorithm estimation:

python
# Estimation is requested per execution mode
est = gds.pageRank.stream.estimate(G, dampingFactor=0.85)
print(est["requiredMemory"])
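
One way to act on an estimate is to compare its `bytesMax` value with the configured JVM heap before projecting. The sketch below assumes you already know the heap size; the function name `fits_in_heap` and the 0.8 headroom factor are illustrative choices, not GDS recommendations.

```python
def fits_in_heap(bytes_max: int, heap_bytes: int, headroom: float = 0.8) -> bool:
    """Return True if the worst-case estimate fits within a fraction of the heap."""
    return bytes_max <= heap_bytes * headroom


GIB = 1024 ** 3
print(fits_in_heap(bytes_max=3 * GIB, heap_bytes=8 * GIB))  # True
print(fits_in_heap(bytes_max=7 * GIB, heap_bytes=8 * GIB))  # False
```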

---

Execution Modes

| Mode | Side effect | Returns | Use when |
|---|---|---|---|
| `stream` | None | Row per node/pair | Inspect results; top-N |
| `stats` | None | Single aggregate row | Summary/convergence check |
| `mutate` | Adds property to in-memory graph only | Stats row | Chain algorithms |
| `write` | Persists property to Neo4j DB | Stats row | Final step: make queryable |

Pattern: `stream` to verify → `mutate` to chain → `write` to persist.
`mutateProperty` must not already exist in the in-memory graph. After `write`, re-project to use written properties in subsequent GDS calls (the in-memory graph does not see DB writes).
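
The two rules above (no duplicate `mutateProperty`, and re-project after `write`) follow from the projection being a snapshot. The toy model below illustrates that idea only; `ToyProjection` is a hypothetical stand-in and says nothing about how GDS is implemented internally.

```python
class ToyProjection:
    """Toy stand-in for an in-memory projection: a snapshot of DB properties."""

    def __init__(self, db):
        self.db = db
        # Copy properties at projection time; later DB changes are not seen.
        self.props = {node: dict(values) for node, values in db.items()}

    def mutate(self, name, values):
        if any(name in node_props for node_props in self.props.values()):
            raise ValueError(f"mutateProperty '{name}' already exists")
        for node, value in values.items():
            self.props[node][name] = value

    def write(self, name, values):
        for node, value in values.items():
            self.db[node][name] = value  # persisted; the snapshot is untouched


db = {1: {"name": "Ann"}, 2: {"name": "Bob"}}
G = ToyProjection(db)
G.mutate("rank", {1: 0.6, 2: 0.4})     # visible to later in-memory steps
G.write("pagerank", {1: 0.6, 2: 0.4})  # visible in the DB only
print("pagerank" in G.props[1])                  # False
print("pagerank" in ToyProjection(db).props[1])  # True
```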

gds.util.asNode() — Enrich Stream Results

`stream` mode yields `nodeId` (internal GDS integer). `gds.util.asNode(nodeId)` translates it back to the DB node so you can access properties.
cypher
// Single property
CALL gds.pageRank.stream('myGraph', {})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC LIMIT 10

// Multiple properties — convert once with WITH
CALL gds.pageRank.stream('myGraph', {})
YIELD nodeId, score
WITH gds.util.asNode(nodeId) AS node, score
RETURN node.name AS name, node.born AS born, score
ORDER BY score DESC LIMIT 10
Not needed for `write`, `mutate`, or `stats` modes; those don't return per-node data.

Core Algorithms

PageRank (centrality)

cypher
CALL gds.pageRank.stream('myGraph', { dampingFactor: 0.85, maxIterations: 20 })
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score ORDER BY score DESC LIMIT 10
// score: relative influence — not absolute. Compare within same run only.
// didConverge: true means score stabilized; if false, increase maxIterations.

CALL gds.pageRank.write('myGraph', { writeProperty: 'pagerank', dampingFactor: 0.85 })
YIELD nodePropertiesWritten, ranIterations, didConverge
python
pr_df = gds.pageRank.stream(G, dampingFactor=0.85)
gds.pageRank.mutate(G, mutateProperty="pagerank", dampingFactor=0.85)
gds.pageRank.write(G, writeProperty="pagerank", dampingFactor=0.85)
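
To show how `dampingFactor`, `maxIterations`, and `didConverge` relate, here is a textbook power-iteration sketch in plain Python. It is not GDS's implementation, and GDS scales its scores differently; the function name, the `tolerance` default, and the sample edges are assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, max_iterations=20, tolerance=1e-7):
    """Return (scores, iterations_run, did_converge) for a directed edge list."""
    nodes = sorted({node for edge in edges for node in edge})
    out = {node: [] for node in nodes}
    for source, target in edges:
        out[source].append(target)
    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for iteration in range(1, max_iterations + 1):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(score[v] for v in nodes if not out[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for source in nodes:
            for target in out[source]:
                new[target] += damping * score[source] / len(out[source])
        delta = max(abs(new[v] - score[v]) for v in nodes)
        score = new
        if delta < tolerance:
            return score, iteration, True
    return score, max_iterations, False


edges = [("a", "c"), ("b", "c"), ("c", "a")]
print(pagerank(edges, max_iterations=20)[2])    # False: raise max_iterations
scores, _, converged = pagerank(edges, max_iterations=200)
print(converged, max(scores, key=scores.get))   # True c
```

The scores are only meaningful relative to each other within one run, which is why the top node rather than its numeric score is printed.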

Louvain (community detection)

cypher
CALL gds.louvain.stream('myGraph', { relationshipWeightProperty: 'weight' })
YIELD nodeId, communityId

CALL gds.louvain.write('myGraph', { writeProperty: 'community' })
YIELD communityCount, modularity
python
louvain_df = gds.louvain.stream(G)
gds.louvain.write(G, writeProperty="community")
Leiden is a refinement of Louvain that avoids poorly connected communities; use it when community quality matters more than raw speed.
`modularity` in the stats result ranges from -0.5 to 1.0; values > 0.3 indicate meaningful community structure, and values > 0.7 indicate strong structure. Leiden requires undirected relationships in the projection.
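
For intuition about the `modularity` value, the sketch below computes the standard modularity Q of a given partition on a small undirected, unweighted graph. It is a plain-Python illustration with a hypothetical function name and sample graph; it does not run Louvain or Leiden.

```python
def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = {}  # community -> edges with both endpoints inside it
    degree = {}    # community -> sum of its nodes' degrees
    for source, target in edges:
        cs, ct = community_of[source], community_of[target]
        degree[cs] = degree.get(cs, 0) + 1
        degree[ct] = degree.get(ct, 0) + 1
        if cs == ct:
            internal[cs] = internal.get(cs, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


# Two triangles joined by a single bridge edge (2, 3)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
print(round(modularity(edges, two_groups), 3))             # 0.357
print(round(modularity(edges, {n: "A" for n in range(6)}), 3))  # 0.0
```

Splitting along the bridge scores above the 0.3 guideline, while lumping every node into one community scores zero.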

WCC — Weakly Connected Components

Run WCC first to understand graph structure; partition disconnected graphs before expensive algorithms.
cypher
CALL gds.wcc.stream('myGraph', { minComponentSize: 10 })
YIELD nodeId, componentId

CALL gds.wcc.write('myGraph', { writeProperty: 'componentId' })
YIELD nodePropertiesWritten, componentCount
python
wcc_df = gds.wcc.stream(G)
gds.wcc.write(G, writeProperty="componentId")
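
What WCC computes can be sketched with a union-find structure: edge direction is ignored and every node ends up labeled with one component. This is an illustrative plain-Python version with hypothetical names, not the GDS implementation; the final filter mimics the effect of `minComponentSize`.

```python
def weakly_connected_components(nodes, edges):
    """Group nodes into components, ignoring edge direction (union-find)."""
    parent = {node: node for node in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for source, target in edges:
        root_s, root_t = find(source), find(target)
        if root_s != root_t:
            # Keep the smaller id as the representative of the merged component.
            parent[max(root_s, root_t)] = min(root_s, root_t)

    components = {}
    for node in nodes:
        components.setdefault(find(node), []).append(node)
    return components


components = weakly_connected_components([1, 2, 3, 4, 5, 6], [(1, 2), (3, 2), (4, 5)])
large = {root: members for root, members in components.items() if len(members) >= 2}
print(large)  # {1: [1, 2, 3], 4: [4, 5]}
```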

Betweenness Centrality

python
gds.betweenness.stream(G)          # identifies bottleneck/bridge nodes
gds.betweenness.write(G, writeProperty="betweenness")

Node Similarity

Jaccard similarity from common neighbors — no node properties required.
python
gds.nodeSimilarity.stream(G, similarityCutoff=0.1, topK=10)
gds.nodeSimilarity.write(G, writeRelationshipType="SIMILAR", writeProperty="score",
                          similarityCutoff=0.1, topK=10)
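
The Jaccard score behind Node Similarity is the size of the shared neighbor set divided by the size of the combined neighbor set. The sketch below shows that calculation with a cutoff in plain Python; the function name and sample data are hypothetical, and `topK` is left out for brevity.

```python
from itertools import combinations


def jaccard_pairs(neighbors, similarity_cutoff=0.1):
    """Return (a, b, score) for node pairs whose Jaccard score meets the cutoff."""
    results = []
    for a, b in combinations(sorted(neighbors), 2):
        union = neighbors[a] | neighbors[b]
        if not union:
            continue  # two nodes without neighbors have no defined score here
        score = len(neighbors[a] & neighbors[b]) / len(union)
        if score >= similarity_cutoff:
            results.append((a, b, score))
    return sorted(results, key=lambda row: row[2], reverse=True)


likes = {
    "Alice": {"Guitar", "Piano", "Drums"},
    "Bob": {"Piano", "Drums", "Flute"},
    "Carol": {"Violin"},
}
print(jaccard_pairs(likes))  # [('Alice', 'Bob', 0.5)]
```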

FastRP (node embeddings)

Fast and scalable; suited to production ML pipelines. Set `randomSeed` for reproducibility.
cypher
CALL gds.fastRP.mutate('myGraph', {
  embeddingDimension: 256,
  iterationWeights: [0.0, 1.0, 1.0],
  featureProperties: ['score'],
  propertyRatio: 0.5,
  normalizationStrength: -0.5,
  randomSeed: 42,
  mutateProperty: 'embedding'
})
YIELD nodePropertiesWritten
python
gds.fastRP.mutate(G, embeddingDimension=256, iterationWeights=[0.0, 1.0, 1.0],
                  randomSeed=42, mutateProperty="embedding")
gds.fastRP.write(G, embeddingDimension=256, writeProperty="embedding", randomSeed=42)

KNN — K-Nearest Neighbors

Finds k most similar nodes per node based on node properties (typically embeddings).
cypher
CALL gds.knn.stream('myGraph', {
  nodeProperties: ['embedding'], topK: 10,
  sampleRate: 0.5, similarityCutoff: 0.7
})
YIELD node1, node2, similarity

CALL gds.knn.write('myGraph', {
  nodeProperties: ['embedding'], topK: 10,
  writeRelationshipType: 'SIMILAR', writeProperty: 'score'
})
YIELD relationshipsWritten
python
knn_df = gds.knn.stream(G, nodeProperties=["embedding"], topK=10)
gds.knn.write(G, nodeProperties=["embedding"], topK=10,
              writeRelationshipType="SIMILAR", writeProperty="score")
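
As a rough picture of what `topK` and `similarityCutoff` do, the sketch below runs an exact, brute-force nearest-neighbor search using plain cosine similarity. This is an assumption-laden illustration: GDS KNN is an approximate algorithm, and its similarity metric and score normalization may differ from the raw cosine used here. All names are hypothetical.

```python
import math


def cosine(u, v):
    """Plain cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_neighbors(embeddings, k=10, similarity_cutoff=0.0):
    """For each node, return up to k (other, score) pairs at or above the cutoff."""
    result = {}
    for node, vector in embeddings.items():
        scored = [(other, cosine(vector, other_vector))
                  for other, other_vector in embeddings.items() if other != node]
        scored = [(other, s) for other, s in scored if s >= similarity_cutoff]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        result[node] = scored[:k]
    return result


embeddings = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
nearest = top_k_neighbors(embeddings, k=1)
print({node: [other for other, _ in pairs] for node, pairs in nearest.items()})
# {'a': ['b'], 'b': ['a'], 'c': ['b']}
```

Brute force compares every pair, so it is only practical for small inputs; that cost is the reason KNN implementations sample instead.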

FastRP → KNN Pipeline (recommendation)

python
# 1. Project
G, _ = gds.graph.project("myGraph", "Product", {"BOUGHT_TOGETHER": {"orientation": "UNDIRECTED"}})

# 2. Estimate memory (estimation is requested per execution mode)
print(gds.fastRP.stream.estimate(G, embeddingDimension=128)["requiredMemory"])

# 3. Embed
gds.fastRP.mutate(G, embeddingDimension=128, randomSeed=42, mutateProperty="emb")

# 4. Similarity
gds.knn.write(G, nodeProperties=["emb"], topK=10, writeRelationshipType="SIMILAR", writeProperty="score")

# 5. Cleanup: always
G.drop()
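
Because the cleanup step should run even when an algorithm call fails, the pipeline can be wrapped in `try`/`finally`. The sketch below is one possible arrangement; the wrapper name `run_fastrp_knn` and its parameters are hypothetical, and it only uses the client calls already shown in this section on a `gds` object passed in by the caller.

```python
def run_fastrp_knn(gds, graph_name, label, rel_type, dim=128, top_k=10):
    """Project, embed, write SIMILAR relationships, and always drop the projection."""
    G, _ = gds.graph.project(graph_name, label, {rel_type: {"orientation": "UNDIRECTED"}})
    try:
        gds.fastRP.mutate(G, embeddingDimension=dim, randomSeed=42, mutateProperty="emb")
        return gds.knn.write(G, nodeProperties=["emb"], topK=top_k,
                             writeRelationshipType="SIMILAR", writeProperty="score")
    finally:
        G.drop()  # runs even if an algorithm call raises
```

A call such as `run_fastrp_knn(gds, "myGraph", "Product", "BOUGHT_TOGETHER")` then leaves no projection behind on failure, which avoids the "already exists" error on the next run.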

---

Algorithm Selection

| Goal | Algorithm |
|---|---|
| Influence via network links | PageRank / ArticleRank |
| Bottleneck / bridge nodes | Betweenness Centrality |
| Direct connections | Degree Centrality |
| Community (general, fast) | Louvain |
| Community (higher quality) | Leiden |
| Is graph connected? | WCC (run first) |
| Similarity from embeddings | KNN |
| Similarity from neighbors | Node Similarity |
| Shortest path (positive weights) | Dijkstra / A* |
| k alternative paths | Yen's |
| Fast scalable embeddings | FastRP |
| Feature-rich nodes | GraphSAGE (Beta) |

Full algorithm catalog → references/algorithms.md

Common Errors

| Error | Cause | Fix |
|---|---|---|
| `Unknown function 'gds.version'` | GDS not installed / wrong tier | Install plugin; on Aura BC/VDC use `neo4j-aura-graph-analytics-skill` |
| `Insufficient heap memory` / OOM | Graph too large for available JVM heap | Run `gds.graph.project.estimate` first; increase `dbms.memory.heap.max_size` (`server.memory.heap.max_size` in Neo4j 5+) |
| `Procedure not found: gds.leiden` | Algorithm not licensed / older GDS | Check `CALL gds.list()` for available procedures; upgrade GDS or use Louvain |
| `Node property 'X' not found` after mutate | Property not projected or wrong graph name | Verify `G.node_properties("Label")` includes the property; check `mutateProperty` spelling |
| `Graph 'myGraph' already exists` | Leftover projection from failed run | `CALL gds.graph.drop('myGraph')` or `G.drop()` |
| `mutateProperty already exists` | Re-running algorithm on same projection | Drop and re-project, or use a different `mutateProperty` name |
| `No algorithm results` | Source/target node not in projection | Verify node labels/rel types match projection; check `G.node_count()` |

Full Workflow

python
# 0. Verify
print(gds.server_version())

# 1. Estimate
est = gds.graph.project.estimate("Person", "KNOWS")
print(est["requiredMemory"])

# 2. Project
G, _ = gds.graph.project("myGraph", "Person", {"KNOWS": {"orientation": "UNDIRECTED"}})
print(G.node_count(), G.relationship_count())

# 3. Stream to verify
df = gds.pageRank.stream(G)
print(df.sort_values("score", ascending=False).head(10))

# 4. Write when satisfied
gds.pageRank.write(G, writeProperty="pagerank", dampingFactor=0.85)

# 5. Drop: frees JVM heap
G.drop()

Built-in test datasets: `gds.graph.load_cora()`, `gds.graph.load_karate_club()`, `gds.graph.load_imdb()`

---

MCP Tool Mapping

| Operation | MCP tool |
|---|---|
| `RETURN gds.version()` | `read-cypher` |
| `gds.pageRank.stream(...)` | `read-cypher` |
| `gds.pageRank.write(...)` | `write-cypher` |
| `gds.graph.drop(...)` | `write-cypher` |
| List available procedures | `read-cypher` (`CALL gds.list()`) |

References

  • references/algorithms.md — full algorithm catalog: all procedures, parameters, tiers, Cypher + Python examples
  • references/graph-projection.md — projection deep-dive: filtering, heterogeneous graphs, relationship orientation, property types
  • GDS Manual
  • Python Client Docs

Checklist

  • `gds.version()` confirmed: GDS installed and licensed
  • Memory estimated before large projections and expensive algorithms
  • Named graph dropped after use (`G.drop()` or context manager)
  • Execution mode chosen: `stream` (inspect) → `mutate` (chain) → `write` (persist)
  • `writeProperty` / `mutateProperty` checked for collision with existing properties
  • `randomSeed` set for reproducible embeddings
  • WCC run first on graphs that may be disconnected
  • Native projection used over Cypher projection unless filtering/transformation required