knowledge-graph-builder
Knowledge Graph Builder
Overview
Knowledge graphs make implicit relationships explicit, enabling AI systems to reason about connections, verify facts, and reduce hallucinations. They combine structured entity-relationship modeling with semantic search for powerful knowledge retrieval.
When to use: Complex entity relationships central to the domain, verifying AI-generated facts against structured knowledge, semantic search combined with relationship traversal, recommendation systems, fraud detection, or pattern recognition.
When NOT to use: Simple tabular data (use a relational database), purely document-based search with no relationships (use the rag-implementer skill), read-heavy workloads with no traversal needs, or when the team lacks graph modeling expertise. For KB architecture selection and governance, use the knowledge-base-manager skill.
Quick Reference
| Pattern | Approach | Key Points |
|---|---|---|
| Ontology first | Define entity types, relationships, properties before ingesting data | Changing schema later is expensive; validate with domain experts |
| Entity resolution | Deduplicate aggressively during extraction | "Apple Inc" = "Apple" = "Apple Computer" must resolve to one entity |
| Confidence scoring | Attach 0.0-1.0 score + source to every relationship | Enables filtering by reliability, critical for AI grounding |
| Hybrid architecture | Graph traversal (structured) + vector search (semantic) | Vector finds candidates, graph expands context via relationships |
| Incremental build | Core entities first, validate against target queries, then expand | Avoid building the full graph before testing with real queries |
| Database selection | Neo4j (general), Neptune (AWS managed), ArangoDB (multi-model), TigerGraph (massive scale) | Match database to scale, infrastructure, and query complexity |
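
The first three patterns in the table (ontology first, entity resolution, confidence scoring) can be sketched together in a few lines. This is only an illustrative sketch: the `ONTOLOGY`, `ALIASES`, and `ENTITY_TYPES` tables, the `Relationship` class, and the helper names are all hypothetical, and a real pipeline would likely use fuzzy matching or learned resolution rather than a hand-written alias table.

```python
from dataclasses import dataclass

# Hypothetical ontology: the only (subject type, relation, object type)
# combinations the graph accepts. Defined before any data is ingested.
ONTOLOGY = {
    ("Company", "ACQUIRED", "Company"),
    ("Person", "WORKS_AT", "Company"),
}

# Hypothetical alias table: every known surface form maps to one canonical ID.
ALIASES = {
    "apple inc": "apple",
    "apple": "apple",
    "apple computer": "apple",
}

# Hypothetical type registry for canonical entity IDs.
ENTITY_TYPES = {"apple": "Company", "beats": "Company"}


@dataclass(frozen=True)
class Relationship:
    subject: str
    relation: str
    obj: str
    confidence: float  # reliability score in the range 0.0-1.0
    source: str        # where the relationship was extracted from


def resolve(name: str) -> str:
    """Normalize a surface form and map it to its canonical entity ID."""
    key = " ".join(name.lower().replace(".", "").split())
    return ALIASES.get(key, key)


def make_relationship(subject, relation, obj, confidence, source):
    """Build a relationship only if it passes score and ontology checks."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    s, o = resolve(subject), resolve(obj)
    signature = (ENTITY_TYPES.get(s), relation, ENTITY_TYPES.get(o))
    if signature not in ONTOLOGY:
        raise ValueError(f"not allowed by ontology: {signature}")
    return Relationship(s, relation, o, confidence, source)


def filter_reliable(relationships, threshold):
    """Keep only relationships at or above the confidence threshold."""
    return [r for r in relationships if r.confidence >= threshold]
```

For example, `make_relationship("Apple Inc.", "ACQUIRED", "Beats", 0.9, "example-source")` stores the subject as the canonical `apple`, while a relation type missing from `ONTOLOGY` is rejected before it reaches the graph. Filtering with `filter_reliable` is what lets downstream AI grounding ignore low-confidence edges.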
Common Mistakes
| Mistake | Correct Pattern |
|---|---|
| Ingesting entities before designing the ontology | Define and validate the ontology with domain experts first; changing later is expensive |
| Skipping entity resolution and deduplication | Deduplicate aggressively so "Apple Inc", "Apple", and "Apple Computer" resolve to one entity |
| Omitting confidence scores on relationships | Attach a 0.0-1.0 confidence score and source to every relationship |
| Using only graph traversal without vector search | Implement hybrid architecture combining graph traversal with semantic vector search |
| Building the full graph before validating with real queries | Start with core entities, test against target queries, then expand incrementally |
| Choosing a database before understanding scale requirements | Evaluate query patterns, data volume, and infrastructure constraints before selecting |
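
The hybrid pattern recommended above (vector search finds candidates, graph traversal expands context) might look roughly like the following. It is a toy sketch under stated assumptions: the `EMBEDDINGS` and `EDGES` data, the entity names, and every function name are hypothetical, and the in-memory dictionaries stand in for a real vector index and graph database.

```python
import math

# Hypothetical toy data: entity ID -> embedding, and a directed adjacency list.
EMBEDDINGS = {
    "graph_db": [1.0, 0.0, 0.0],
    "vector_index": [0.0, 1.0, 0.0],
    "fraud_ring": [0.0, 0.0, 1.0],
}
EDGES = {
    "graph_db": [("SUPPORTS", "fraud_ring")],
    "vector_index": [("FEEDS", "graph_db")],
    "fraud_ring": [],
}


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_candidates(query_vec, k):
    """Semantic step: the k entities most similar to the query vector."""
    ranked = sorted(
        EMBEDDINGS,
        key=lambda entity: cosine(query_vec, EMBEDDINGS[entity]),
        reverse=True,
    )
    return ranked[:k]


def expand(seeds, hops):
    """Structured step: breadth-first expansion along outgoing edges."""
    seen = set(seeds)
    frontier = list(seeds)
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for _relation, neighbor in EDGES.get(node, []):
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return seen


def hybrid_search(query_vec, k=1, hops=1):
    """Vector search picks seed entities; graph traversal adds their context."""
    return expand(vector_candidates(query_vec, k), hops)
```

Keeping `k` and `hops` as separate parameters lets the semantic breadth and the traversal depth be tuned independently against target queries, which fits the incremental-build advice in the table.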
Delegation
- Extract entities and relationships from unstructured text: Use the Task agent to run NER pipelines and build relationship triples
- Evaluate graph database options for project requirements: Use the Explore agent to compare Neo4j, Neptune, ArangoDB, and TigerGraph against scale and query needs
- Design ontology and hybrid architecture for a new domain: Use the Plan agent to define entity types, relationship schemas, and graph-vector integration strategy
- For hybrid KG+RAG systems, delegate to the rag-implementer skill
- For knowledge-graph-powered agent workflows, delegate to the agent-patterns skill
References
- Ontology Design — Entity types, relationships, properties, RDF schema, validation
- Database Selection — Neo4j, Neptune, ArangoDB, TigerGraph comparison and setup
- Entity Extraction — NER pipeline, relationship extraction, LLM-based extraction
- Hybrid Architecture — Graph + vector integration, hybrid search implementation
- Query Patterns — Cypher queries, API design, common traversal patterns
- AI Integration — KG-RAG, hallucination detection, grounded response generation