knowledge-graph-builder

Knowledge Graph Builder

Overview

Knowledge graphs make implicit relationships explicit, enabling AI systems to reason about connections, verify facts, and reduce hallucinations. They combine structured entity-relationship modeling with semantic search for powerful knowledge retrieval.
When to use: Complex entity relationships central to the domain, verifying AI-generated facts against structured knowledge, semantic search combined with relationship traversal, recommendation systems, fraud detection, or pattern recognition.
When NOT to use: Simple tabular data (use a relational database), purely document-based search with no relationships (use the rag-implementer skill), read-heavy workloads with no traversal needs, or when the team lacks graph modeling expertise. For KB architecture selection and governance, use the knowledge-base-manager skill.

Quick Reference

| Pattern | Approach | Key Points |
| --- | --- | --- |
| Ontology first | Define entity types, relationships, properties before ingesting data | Changing schema later is expensive; validate with domain experts |
| Entity resolution | Deduplicate aggressively during extraction | "Apple Inc" = "Apple" = "Apple Computer" must resolve to one entity |
| Confidence scoring | Attach 0.0-1.0 score + source to every relationship | Enables filtering by reliability, critical for AI grounding |
| Hybrid architecture | Graph traversal (structured) + vector search (semantic) | Vector finds candidates, graph expands context via relationships |
| Incremental build | Core entities first, validate against target queries, then expand | Avoid building the full graph before testing with real queries |
| Database selection | Neo4j (general), Neptune (AWS managed), ArangoDB (multi-model), TigerGraph (massive scale) | Match database to scale, infrastructure, and query complexity |
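
The entity resolution and confidence scoring rows above can be sketched in a few lines. This is a minimal illustration, not the skill's own implementation: the `ALIASES` table, `resolve_entity`, `Relationship`, `build_relationship`, and `filter_reliable` are hypothetical names, and a real pipeline would likely add fuzzy or embedding-based matching on top of a curated alias list.

```python
from dataclasses import dataclass

# Hypothetical alias table mapping normalized mentions to one canonical name.
ALIASES = {
    "apple": "Apple Inc",
    "apple inc": "Apple Inc",
    "apple computer": "Apple Inc",
}


def resolve_entity(mention: str) -> str:
    """Return the canonical entity name, or the trimmed mention if unknown."""
    key = mention.strip().lower().rstrip(".")
    return ALIASES.get(key, mention.strip())


@dataclass(frozen=True)
class Relationship:
    subject: str
    predicate: str
    obj: str
    confidence: float  # reliability score in the range 0.0-1.0
    source: str        # where the fact was extracted from

    def __post_init__(self):
        # Reject scores outside the documented 0.0-1.0 range.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


def build_relationship(subject, predicate, obj, confidence, source):
    """Resolve both endpoints before storing, so duplicates collapse."""
    return Relationship(
        resolve_entity(subject), predicate, resolve_entity(obj), confidence, source
    )


def filter_reliable(relationships, threshold=0.8):
    """Keep only relationships at or above the confidence threshold."""
    return [r for r in relationships if r.confidence >= threshold]
```

With this shape, two extractions that mention "Apple" and "Apple Computer" both attach to the single entity "Apple Inc", and a grounding step can call `filter_reliable` to ignore low-confidence edges. The 0.8 default threshold is an arbitrary placeholder.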

Common Mistakes

| Mistake | Correct Pattern |
| --- | --- |
| Ingesting entities before designing the ontology | Define and validate the ontology with domain experts first; changing later is expensive |
| Skipping entity resolution and deduplication | Deduplicate aggressively so "Apple Inc", "Apple", and "Apple Computer" resolve to one entity |
| Omitting confidence scores on relationships | Attach a 0.0-1.0 confidence score and source to every relationship |
| Using only graph traversal without vector search | Implement hybrid architecture combining graph traversal with semantic vector search |
| Building the full graph before validating with real queries | Start with core entities, test against target queries, then expand incrementally |
| Choosing a database before understanding scale requirements | Evaluate query patterns, data volume, and infrastructure constraints before selecting |
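
To make the hybrid pattern concrete, here is a small in-memory sketch in which vector similarity picks candidate entities and graph traversal expands their context. Everything in it is assumed for illustration: the `EMBEDDINGS` and `EDGES` toy data, the three-dimensional vectors, and the function names. A production system would use a vector index and a graph database rather than dictionaries.

```python
import math

# Toy data, hypothetical: entity embeddings and outgoing edges.
EMBEDDINGS = {
    "Apple Inc": [0.9, 0.1, 0.0],
    "iPhone": [0.8, 0.3, 0.1],
    "Banana": [0.0, 0.2, 0.9],
}
EDGES = {
    "Apple Inc": [("MAKES", "iPhone"), ("FOUNDED_BY", "Steve Jobs")],
    "iPhone": [("RUNS", "iOS")],
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_candidates(query_vec, k=2):
    """Semantic step: return the k entities most similar to the query."""
    ranked = sorted(
        EMBEDDINGS,
        key=lambda name: cosine(query_vec, EMBEDDINGS[name]),
        reverse=True,
    )
    return ranked[:k]


def expand(entities, hops=1):
    """Structured step: collect triples reachable within `hops` edges."""
    triples = []
    seen = set(entities)
    frontier = list(entities)
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for predicate, neighbor in EDGES.get(node, []):
                triples.append((node, predicate, neighbor))
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return triples


def hybrid_search(query_vec, k=2, hops=1):
    """Vector search finds candidates; the graph adds relationship context."""
    candidates = vector_candidates(query_vec, k)
    return candidates, expand(candidates, hops)
```

Calling `hybrid_search([1.0, 0.1, 0.0], k=1)` selects "Apple Inc" as the only candidate and returns its two outgoing triples as context. Raising `hops` trades a larger context for more noise, so it is worth tuning against the target queries.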

Delegation

  • Extract entities and relationships from unstructured text: Use Task agent to run NER pipelines and build relationship triples
  • Evaluate graph database options for project requirements: Use Explore agent to compare Neo4j, Neptune, ArangoDB, and TigerGraph against scale and query needs
  • Design ontology and hybrid architecture for a new domain: Use Plan agent to define entity types, relationship schemas, and graph-vector integration strategy
  • For hybrid KG+RAG systems, delegate to the rag-implementer skill
  • For knowledge-graph-powered agent workflows, delegate to the agent-patterns skill

References

  • Ontology Design — Entity types, relationships, properties, RDF schema, validation
  • Database Selection — Neo4j, Neptune, ArangoDB, TigerGraph comparison and setup
  • Entity Extraction — NER pipeline, relationship extraction, LLM-based extraction
  • Hybrid Architecture — Graph + vector integration, hybrid search implementation
  • Query Patterns — Cypher queries, API design, common traversal patterns
  • AI Integration — KG-RAG, hallucination detection, grounded response generation