cognee

Compare original and translation side by side

🇺🇸 Original (English) | 🇨🇳 Translation (Chinese)

Overview

概述

Cognee v0.5.5 — open-source Python AI memory engine that converts raw data into searchable knowledge graphs combining vector search with graph databases.
  • Source: topoteretes/cognee @ main
  • Language: Python (async/await throughout)
  • Forge tier: Deep (AST + gh + QMD)
  • Public API exports: 22 | Total extracted: 837 (302 functions, 535 classes)
  • Confidence: 837 T1 (AST-verified), 14 T2 (QMD-enriched), T3 (docs supplemental)
Cognee v0.5.5 — 开源Python AI内存引擎,可将原始数据转换为可搜索的知识图谱,结合向量搜索与图数据库。
  • 源码地址: topoteretes/cognee @ main分支
  • 开发语言: Python(全程使用async/await)
  • 深度级别: Deep(AST + gh + QMD)
  • 公开API导出数量: 22 | 提取总数量: 837(302个函数,535个类)
  • 可信度: 837个T1(AST验证),14个T2(QMD增强),T3(文档补充)

Quick Start

快速开始

python
import cognee
import asyncio

async def main():
    # Reset state
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # 1. Ingest data
    await cognee.add("Cognee turns documents into AI memory.")  # [AST:cognee/api/v1/add/add.py:L22]

    # 2. Build knowledge graph
    await cognee.cognify()  # [AST:cognee/api/v1/cognify/cognify.py:L47]

    # 3. Search
    results = await cognee.search(  # [AST:cognee/api/v1/search/search.py:L26]
        query_text="What does Cognee do?"
    )
    for r in results:
        print(r)

asyncio.run(main())
All core functions are async — you must `await` them inside an async context.
[EXT:docs.cognee.ai/getting-started/quickstart]
<!-- [MANUAL:additional-notes] -->
python
import cognee
import asyncio

async def main():
    # 重置状态
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # 1. 导入数据
    await cognee.add("Cognee turns documents into AI memory.")  # [AST:cognee/api/v1/add/add.py:L22]

    # 2. 构建知识图谱
    await cognee.cognify()  # [AST:cognee/api/v1/cognify/cognify.py:L47]

    # 3. 搜索
    results = await cognee.search(  # [AST:cognee/api/v1/search/search.py:L26]
        query_text="What does Cognee do?"
    )
    for r in results:
        print(r)

asyncio.run(main())
所有核心函数均为异步——必须在异步上下文内使用 `await`。
[EXT:docs.cognee.ai/getting-started/quickstart]
<!-- [MANUAL:additional-notes] -->

Setup Requirements

安装要求

Python: >=3.10, <3.14. Recommended installer: `uv`.
bash
uv pip install -e "."                    # minimal (SQLite + LanceDB + Kuzu)
uv pip install -e ".[postgres,neo4j]"    # with PostgreSQL + Neo4j
Key installation extras: `postgres`/`postgres-binary`, `neo4j`, `neptune`, `chromadb`, `qdrant`, `redis`, `ollama`, `anthropic`, `gemini`, `mistral`, `groq`, `huggingface`, `llama-cpp`, `aws` (S3), `langchain`, `llama-index`, `graphiti`, `baml`, `dlt`, `docling`, `codegraph`, `scraping`, `docs`, `monitoring` (Sentry+Langfuse), `distributed` (Modal), `dev`, `debug`.
Minimal .env:
bash
LLM_API_KEY="your_openai_api_key"
LLM_MODEL="openai/gpt-4o-mini"
Defaults (no extra setup): SQLite (relational), LanceDB (vector), Kuzu (graph). All stored in `.venv` by default — override with `DATA_ROOT_DIRECTORY` and `SYSTEM_ROOT_DIRECTORY`.
Important: If you configure only LLM or only embeddings, the other defaults to OpenAI. Always configure both, or ensure a valid OpenAI API key.
<!-- [/MANUAL:additional-notes] -->
Python版本: >=3.10,<3.14。推荐安装工具: `uv`。
bash
uv pip install -e "."                    # 最小安装(SQLite + LanceDB + Kuzu)
uv pip install -e ".[postgres,neo4j]"    # 包含PostgreSQL + Neo4j的安装
关键扩展安装选项: `postgres`/`postgres-binary`, `neo4j`, `neptune`, `chromadb`, `qdrant`, `redis`, `ollama`, `anthropic`, `gemini`, `mistral`, `groq`, `huggingface`, `llama-cpp`, `aws`(S3), `langchain`, `llama-index`, `graphiti`, `baml`, `dlt`, `docling`, `codegraph`, `scraping`, `docs`, `monitoring`(Sentry+Langfuse), `distributed`(Modal), `dev`, `debug`。
最小化.env配置:
bash
LLM_API_KEY="your_openai_api_key"
LLM_MODEL="openai/gpt-4o-mini"
默认配置(无需额外设置):SQLite(关系型)、LanceDB(向量)、Kuzu(图)。所有数据默认存储在 `.venv` 目录下——可通过 `DATA_ROOT_DIRECTORY` 和 `SYSTEM_ROOT_DIRECTORY` 覆盖默认路径。
重要提示: 如果你仅配置了LLM或仅配置了嵌入模型,另一项将默认使用OpenAI。请始终同时配置两者,或确保拥有有效的OpenAI API密钥。
<!-- [/MANUAL:additional-notes] -->

Common Workflows

常见工作流

Add and process data:
await cognee.add(data) → await cognee.cognify() → await cognee.search(query_text)
Multi-format ingestion:
await cognee.add(["/path/to/file.pdf", "raw text", open("doc.txt","rb")], dataset_name="my_data")
Ontology-grounded cognify:
config = {"ontology_config": {"ontology_resolver": RDFLibOntologyResolver(ontology_file=path)}}
await cognee.cognify(config=config)
[EXT:docs.cognee.ai/guides/ontology-support]
Session-aware search:
await cognee.search(query_text="Q1", session_id="conv_1")
await cognee.search(query_text="Follow-up", session_id="conv_1")
[EXT:docs.cognee.ai/guides/sessions]
Custom data models:
class MyEntity(DataPoint): name: str; metadata = {"index_fields": ["name"]}
await add_data_points([entity])
[EXT:docs.cognee.ai/guides/custom-data-models]
添加并处理数据:
await cognee.add(data) → await cognee.cognify() → await cognee.search(query_text)
多格式数据导入:
await cognee.add(["/path/to/file.pdf", "raw text", open("doc.txt","rb")], dataset_name="my_data")
基于本体的Cognify处理:
config = {"ontology_config": {"ontology_resolver": RDFLibOntologyResolver(ontology_file=path)}}
await cognee.cognify(config=config)
[EXT:docs.cognee.ai/guides/ontology-support]
会话感知搜索:
await cognee.search(query_text="Q1", session_id="conv_1")
await cognee.search(query_text="Follow-up", session_id="conv_1")
[EXT:docs.cognee.ai/guides/sessions]
自定义数据模型:
class MyEntity(DataPoint): name: str; metadata = {"index_fields": ["name"]}
await add_data_points([entity])
[EXT:docs.cognee.ai/guides/custom-data-models]
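The multi-format `add()` call above mixes file paths, raw strings, and open binary handles in one list. As an illustration of how such mixed inputs can be told apart before ingestion, here is a minimal stdlib sketch; `classify_input` is a hypothetical helper, not cognee's internal dispatch, which performs its own (different) resolution:

```python
import io
import os
import tempfile
from typing import BinaryIO, Union

def classify_input(item: Union[str, BinaryIO]) -> str:
    """Classify a mixed ingestion input as 'file', 'text', or 'stream'."""
    if hasattr(item, "read"):  # open file handle / binary stream
        return "stream"
    if isinstance(item, str) and os.path.exists(item):
        return "file"          # an existing path on disk
    return "text"              # fall back to treating it as raw text

# Build one input of each kind, mirroring the add() example above.
with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as f:
    path = f.name

inputs = [path, "raw text", io.BytesIO(b"doc bytes")]
kinds = [classify_input(i) for i in inputs]
print(kinds)  # → ['file', 'text', 'stream']

os.unlink(path)
```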

Key API Summary

核心API汇总

| Function | Purpose | Key Params |
| --- | --- | --- |
| `add()` | Ingest text, files, binary data | `data`, `dataset_name`, `user` |
| `cognify()` | Build knowledge graph from ingested data | `datasets`, `graph_model`, `chunker`, `temporal_cognify` |
| `search()` | Query knowledge graph | `query_text`, `query_type`, `top_k`, `session_id` |
| `memify()` | Enrich existing graph with custom tasks | `extraction_tasks`, `enrichment_tasks`, `data` |
| `config.*` | Runtime configuration (LLM, DB, vectors) | static methods |
| `datasets.*` | List, inspect, delete datasets | static methods |
| `prune.*` | Clean up data and system resources | `prune_data()`, `prune_system()` |
| `update()` | Update existing data items | `data_id`, `data`, `dataset_id` |
| `session.*` | Session history and feedback | `get_session()`, `add_feedback()` |
| `run_custom_pipeline()` | Execute custom task pipelines | `tasks`, `data`, `dataset` |
| `SearchType` | Enum of 14 search modes | `GRAPH_COMPLETION` (default) |
| `visualize_graph()` | Render knowledge graph to HTML | `destination_file_path` |
| `enable_tracing()` | Enable OpenTelemetry tracing | `console_output` |
| `run_migrations()` | Run Alembic database migrations | |
| `start_ui()` | Launch local Cognee UI (frontend + backend + MCP servers) | `pid_callback`, `port`, `start_backend`, `start_mcp` |
| `cognee_network_visualization()` | Render knowledge graph to interactive HTML | `graph_data`, `destination_file_path` |
| `pipelines` | Module re-export: `Task`, `run_tasks`, `run_tasks_parallel`, `run_pipeline` | |
<!-- [MANUAL:additional-notes] -->
| 函数 | 用途 | 关键参数 |
| --- | --- | --- |
| `add()` | 导入文本、文件、二进制数据 | `data`, `dataset_name`, `user` |
| `cognify()` | 从导入的数据构建知识图谱 | `datasets`, `graph_model`, `chunker`, `temporal_cognify` |
| `search()` | 查询知识图谱 | `query_text`, `query_type`, `top_k`, `session_id` |
| `memify()` | 用自定义任务增强现有图谱 | `extraction_tasks`, `enrichment_tasks`, `data` |
| `config.*` | 运行时配置(LLM、数据库、向量存储) | 静态方法 |
| `datasets.*` | 列出、查看、删除数据集 | 静态方法 |
| `prune.*` | 清理数据和系统资源 | `prune_data()`, `prune_system()` |
| `update()` | 更新现有数据项 | `data_id`, `data`, `dataset_id` |
| `session.*` | 会话历史与反馈 | `get_session()`, `add_feedback()` |
| `run_custom_pipeline()` | 执行自定义任务流水线 | `tasks`, `data`, `dataset` |
| `SearchType` | 包含14种搜索模式的枚举类 | `GRAPH_COMPLETION`(默认) |
| `visualize_graph()` | 将知识图谱渲染为HTML | `destination_file_path` |
| `enable_tracing()` | 启用OpenTelemetry追踪 | `console_output` |
| `run_migrations()` | 运行Alembic数据库迁移 | |
| `start_ui()` | 启动本地Cognee UI(前端 + 后端 + MCP服务器) | `pid_callback`, `port`, `start_backend`, `start_mcp` |
| `cognee_network_visualization()` | 将知识图谱渲染为交互式HTML | `graph_data`, `destination_file_path` |
| `pipelines` | 模块重导出: `Task`, `run_tasks`, `run_tasks_parallel`, `run_pipeline` | |
<!-- [MANUAL:additional-notes] -->

LLM Provider Configuration

LLM提供商配置

Configure via `.env` — provider-specific examples:
可通过 `.env` 文件配置——以下是不同提供商的示例:

Azure OpenAI

Azure OpenAI

LLM_PROVIDER="azure"
LLM_MODEL="azure/gpt-4o-mini"
LLM_ENDPOINT="https://YOUR-RESOURCE.openai.azure.com/openai/deployments/gpt-4o-mini"
LLM_API_KEY="your_key"
LLM_API_VERSION="2024-12-01-preview"

Anthropic (requires: pip install cognee[anthropic])

Anthropic(需安装:pip install cognee[anthropic])

LLM_PROVIDER="anthropic"
LLM_MODEL="claude-3-5-sonnet-20241022"
LLM_API_KEY="your_key"

Ollama (requires: pip install cognee[ollama])

Ollama(需安装:pip install cognee[ollama])

LLM_PROVIDER="ollama"
LLM_MODEL="llama3.1:8b"
LLM_ENDPOINT="http://localhost:11434/v1"
LLM_API_KEY="ollama"
EMBEDDING_PROVIDER="ollama"
EMBEDDING_MODEL="nomic-embed-text:latest"
EMBEDDING_ENDPOINT="http://localhost:11434/api/embed"
HUGGINGFACE_TOKENIZER="nomic-ai/nomic-embed-text-v1.5"

AWS Bedrock (requires: pip install cognee[aws])

AWS Bedrock(需安装:pip install cognee[aws])

LLM_PROVIDER="bedrock"
LLM_MODEL="anthropic.claude-3-sonnet-20240229-v1:0"
AWS_REGION="us-east-1"

Custom / OpenRouter / vLLM

自定义 / OpenRouter / vLLM

LLM_PROVIDER="custom"
LLM_MODEL="openrouter/google/gemini-2.0-flash-lite-preview-02-05:free"
LLM_ENDPOINT="https://openrouter.ai/api/v1"

**Rate limiting:** `LLM_RATE_LIMIT_ENABLED=true`, `LLM_RATE_LIMIT_REQUESTS=60`, `LLM_RATE_LIMIT_INTERVAL=60`

**Structured output:** `STRUCTURED_OUTPUT_FRAMEWORK="instructor"` (default) or `"baml"` (requires `cognee[baml]`). Override instructor mode: `LLM_INSTRUCTOR_MODE="json_schema_mode"`.
LLM_PROVIDER="custom"
LLM_MODEL="openrouter/google/gemini-2.0-flash-lite-preview-02-05:free"
LLM_ENDPOINT="https://openrouter.ai/api/v1"

**速率限制:** `LLM_RATE_LIMIT_ENABLED=true`, `LLM_RATE_LIMIT_REQUESTS=60`, `LLM_RATE_LIMIT_INTERVAL=60`

**结构化输出:** `STRUCTURED_OUTPUT_FRAMEWORK="instructor"`(默认)或`"baml"`(需安装`cognee[baml]`)。覆盖instructor模式:`LLM_INSTRUCTOR_MODE="json_schema_mode"`。

Database Switching

数据库切换


PostgreSQL (requires: pip install cognee[postgres])

PostgreSQL(需安装:pip install cognee[postgres])

DB_PROVIDER=postgres
DB_HOST=localhost
DB_PORT=5432
DB_USERNAME=cognee
DB_PASSWORD=cognee
DB_NAME=cognee_db

PGVector (requires: pip install cognee[postgres])

PGVector(需安装:pip install cognee[postgres])

VECTOR_DB_PROVIDER=pgvector
VECTOR_DB_URL=postgresql://cognee:cognee@localhost:5432/cognee_db

Neo4j (requires: pip install cognee[neo4j])

Neo4j(需安装:pip install cognee[neo4j])

GRAPH_DATABASE_PROVIDER=neo4j
GRAPH_DATABASE_URL=bolt://localhost:7687
GRAPH_DATABASE_USERNAME=neo4j
GRAPH_DATABASE_PASSWORD=yourpassword

S3 storage (requires: pip install cognee[aws])

S3存储(需安装:pip install cognee[aws])

STORAGE_BACKEND="s3"
STORAGE_BUCKET_NAME="your-bucket"
DATA_ROOT_DIRECTORY="s3://your-bucket/cognee/data"

Security Environment Variables

安全环境变量

bash
ACCEPT_LOCAL_FILE_PATH=True     # Allow local file paths in add()
ALLOW_HTTP_REQUESTS=True        # Allow HTTP fetches
ALLOW_CYPHER_QUERY=True         # Allow raw Cypher in SearchType.CYPHER
REQUIRE_AUTHENTICATION=False    # Enable API auth
ENABLE_BACKEND_ACCESS_CONTROL=True  # Multi-tenant dataset isolation
bash
ACCEPT_LOCAL_FILE_PATH=True     # 允许在add()中使用本地文件路径
ALLOW_HTTP_REQUESTS=True        # 允许HTTP请求
ALLOW_CYPHER_QUERY=True         # 允许在SearchType.CYPHER中使用原始Cypher语句
REQUIRE_AUTHENTICATION=False    # 启用API认证
ENABLE_BACKEND_ACCESS_CONTROL=True  # 多租户数据集隔离

Extension Patterns

扩展模式

Custom pipeline task:
python
from cognee.modules.pipelines.tasks.Task import Task

async def my_task(data):
    return process(data)

task = Task(my_task)
Direct database access:
python
from cognee.infrastructure.databases.graph import get_graph_engine
from cognee.infrastructure.databases.vector import get_vector_engine

graph_engine = await get_graph_engine()
vector_engine = await get_vector_engine()
LLM Gateway (structured output):
python
from cognee.infrastructure.llm.get_llm_client import get_llm_client

llm_client = get_llm_client()
response = await llm_client.acreate_structured_output(
    text_input="prompt", system_prompt="instructions", response_model=YourPydanticModel
)
自定义流水线任务:
python
from cognee.modules.pipelines.tasks.Task import Task

async def my_task(data):
    return process(data)

task = Task(my_task)
直接数据库访问:
python
from cognee.infrastructure.databases.graph import get_graph_engine
from cognee.infrastructure.databases.vector import get_vector_engine

graph_engine = await get_graph_engine()
vector_engine = await get_vector_engine()
LLM网关(结构化输出):
python
from cognee.infrastructure.llm.get_llm_client import get_llm_client

llm_client = get_llm_client()
response = await llm_client.acreate_structured_output(
    text_input="prompt", system_prompt="instructions", response_model=YourPydanticModel
)
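The `acreate_structured_output` call above returns an instance of the supplied Pydantic model. The contract it enforces, parsing the model's JSON reply and validating it against a schema, can be sketched with the stdlib alone; `parse_structured_output` and `Entity` are hypothetical stand-ins, and the real gateway delegates this to instructor or BAML:

```python
import json
from dataclasses import dataclass, fields

@dataclass
class Entity:
    """Hypothetical response model, playing the role of YourPydanticModel."""
    name: str
    kind: str

def parse_structured_output(raw: str, model=Entity):
    """Validate a raw LLM JSON reply against a dataclass 'schema'."""
    data = json.loads(raw)
    expected = {f.name for f in fields(model)}
    missing = expected - data.keys()
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return model(**{k: data[k] for k in expected})

reply = '{"name": "Cognee", "kind": "library"}'
entity = parse_structured_output(reply)
print(entity.name, entity.kind)  # → Cognee library
```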

MCP Server Transport Modes

MCP服务器传输模式

bash
python src/server.py                    # stdio (default)
python src/server.py --transport sse    # SSE
python src/server.py --transport http --host 127.0.0.1 --port 8000 --path /mcp
bash
python src/server.py                    # 标准输入输出(默认)
python src/server.py --transport sse    # SSE
python src/server.py --transport http --host 127.0.0.1 --port 8000 --path /mcp

API mode (connect to running Cognee API):

API模式(连接到运行中的Cognee API):

python src/server.py --transport sse --api-url http://localhost:8000 --api-token TOKEN

Docker: `docker run -e TRANSPORT_MODE=sse --env-file .env -p 8000:8000 cognee/cognee-mcp:main`
python src/server.py --transport sse --api-url http://localhost:8000 --api-token TOKEN

Docker命令:`docker run -e TRANSPORT_MODE=sse --env-file .env -p 8000:8000 cognee/cognee-mcp:main`

Common Troubleshooting

常见问题排查

  • Ollama + OpenAI embeddings NoDataError — configure both LLM and embedding for the same provider, or set `HUGGINGFACE_TOKENIZER`
  • LM Studio structured output — set `LLM_INSTRUCTOR_MODE="json_schema_mode"`
  • Default provider fallback — configuring only LLM or only embeddings defaults the other to OpenAI
  • Permission denied on search — returns an empty list (not an error) to prevent info leakage; check dataset permissions
  • Docker DB connections — use `DB_HOST=host.docker.internal` for local databases
  • Debug logging — `LITELLM_LOG="DEBUG"`, `ENV="development"`, `TELEMETRY_DISABLED=1`
<!-- [/MANUAL:additional-notes] -->
  • Ollama + OpenAI嵌入出现NoDataError — 将LLM和嵌入模型配置为同一提供商,或设置 `HUGGINGFACE_TOKENIZER`
  • LM Studio结构化输出问题 — 设置 `LLM_INSTRUCTOR_MODE="json_schema_mode"`
  • 默认提供商回退 — 仅配置LLM或仅配置嵌入模型时,另一项将默认使用OpenAI
  • 搜索权限被拒绝 — 返回空列表(而非错误)以防止信息泄露;检查数据集权限
  • Docker数据库连接问题 — 对于本地数据库,使用 `DB_HOST=host.docker.internal`
  • 调试日志 — `LITELLM_LOG="DEBUG"`, `ENV="development"`, `TELEMETRY_DISABLED=1`
<!-- [/MANUAL:additional-notes] -->

Migration & Deprecation Warnings

迁移与弃用警告

  • `delete()` — deprecated since v0.3.9; use `datasets.delete_data()` instead [SRC:cognee/api/v1/delete/__init__.py:L13]
  • `memify()` — default pipeline changed: coding rules replaced with triplet embedding (Mar 2026) [QMD:cognee-temporal:prs.md]
  • `update()` — bug: PATCH updates timestamps but a raw GET may return stale data [QMD:cognee-temporal:issues.md]
  • `visualize_graph()` — frontend: open bug #2442, missing component in UI mode [QMD:cognee-temporal:issues.md]
  • `start_ui()` — v0.5.5 bug: pip-installed frontend fails with 500 errors due to missing npm deps (react-markdown, ngraph.graph) [QMD:cognee-temporal:issues.md]
See Full API Reference for migration details.
  • `delete()` — 自v0.3.9起弃用;请使用 `datasets.delete_data()` 替代 [SRC:cognee/api/v1/delete/__init__.py:L13]
  • `memify()` — 默认流水线变更:编码规则已被三元组嵌入替代(2026年3月)[QMD:cognee-temporal:prs.md]
  • `update()` — bug:PATCH更新会修改时间戳,但GET原始数据可能返回旧数据 [QMD:cognee-temporal:issues.md]
  • `visualize_graph()` — 前端问题:开放bug #2442,UI模式下缺少组件 [QMD:cognee-temporal:issues.md]
  • `start_ui()` — v0.5.5 bug:通过pip安装的前端因缺少npm依赖(react-markdown、ngraph.graph)而出现500错误 [QMD:cognee-temporal:issues.md]
有关迁移详情,请参阅完整API参考文档。

Key Types

核心类型定义

SearchType (14 modes) [AST:cognee/modules/search/types/SearchType.py:L4]: `GRAPH_COMPLETION` (default), `RAG_COMPLETION`, `CHUNKS`, `SUMMARIES`, `TRIPLET_COMPLETION`, `GRAPH_SUMMARY_COMPLETION`, `CYPHER`, `NATURAL_LANGUAGE`, `GRAPH_COMPLETION_COT`, `GRAPH_COMPLETION_CONTEXT_EXTENSION`, `FEELING_LUCKY`, `TEMPORAL`, `CODING_RULES`, `CHUNKS_LEXICAL`
Task — wraps any callable (async/sync/generator) for pipeline execution [AST:cognee/modules/pipelines/tasks/task.py]
DataPoint — base class for custom graph nodes; inherit and add typed fields [EXT:docs.cognee.ai/guides/custom-data-models]
ChunkStrategy — `PARAGRAPH`, `SENTENCE`, `LANGCHAIN_CHARACTER` [AST:cognee/shared/data_models.py:L83]
SearchType(14种模式)[AST:cognee/modules/search/types/SearchType.py:L4]: `GRAPH_COMPLETION`(默认), `RAG_COMPLETION`, `CHUNKS`, `SUMMARIES`, `TRIPLET_COMPLETION`, `GRAPH_SUMMARY_COMPLETION`, `CYPHER`, `NATURAL_LANGUAGE`, `GRAPH_COMPLETION_COT`, `GRAPH_COMPLETION_CONTEXT_EXTENSION`, `FEELING_LUCKY`, `TEMPORAL`, `CODING_RULES`, `CHUNKS_LEXICAL`
Task — 为流水线执行包装任何可调用对象(异步/同步/生成器)[AST:cognee/modules/pipelines/tasks/task.py]
DataPoint — 自定义图节点的基类;继承并添加类型化字段 [EXT:docs.cognee.ai/guides/custom-data-models]
ChunkStrategy — `PARAGRAPH`, `SENTENCE`, `LANGCHAIN_CHARACTER` [AST:cognee/shared/data_models.py:L83]

Architecture at a Glance

架构概览

  • Storage: Relational (SQLite/Postgres) + Vector (LanceDB/PGVector/Qdrant/Redis/ChromaDB/FalkorDB) + Graph (Kuzu/Neo4j/Neptune/Memgraph)
  • LLM Providers: OpenAI, Azure OpenAI, Gemini, Anthropic, Ollama, custom (vLLM)
  • Embedding: OpenAI, Azure, Gemini, Mistral, Ollama, Fastembed
  • Pipeline: Tasks → Pipelines → run_pipeline (orchestration with async generators)
  • Observability: OpenTelemetry tracing via `enable_tracing()` / `disable_tracing()`
  • MCP Server: 7 tools (cognify, search, list_data, delete, prune, cognify_status, save_interaction)
    [AST:cognee-mcp/src/server.py]
  • 存储层: 关系型(SQLite/Postgres) + 向量(LanceDB/PGVector/Qdrant/Redis/ChromaDB/FalkorDB) + 图(Kuzu/Neo4j/Neptune/Memgraph)
  • LLM提供商: OpenAI、Azure OpenAI、Gemini、Anthropic、Ollama、自定义(vLLM)
  • 嵌入模型: OpenAI、Azure、Gemini、Mistral、Ollama、Fastembed
  • 流水线: 任务 → 流水线 → run_pipeline(使用异步生成器进行编排)
  • 可观测性: 通过 `enable_tracing()` / `disable_tracing()` 实现OpenTelemetry追踪
  • MCP服务器: 7个工具(cognify、search、list_data、delete、prune、cognify_status、save_interaction)
    [AST:cognee-mcp/src/server.py]

CLI

命令行工具(CLI)

bash
cognee --add "data"          # Ingest data
cognee --cognify             # Build knowledge graph
cognee --search "query"      # Search
cognee --debug               # Enable debug logging
cognee --ui                  # Launch local UI
[AST:cognee/cli/_cognee.py:L32]
bash
cognee --add "data"          # 导入数据
cognee --cognify             # 构建知识图谱
cognee --search "query"      # 搜索
cognee --debug               # 启用调试日志
cognee --ui                  # 启动本地UI
[AST:cognee/cli/_cognee.py:L32]

Full API Reference

完整API参考

See references/full-api-reference.md for complete signatures with parameters, return types, and T2 annotations for all 22 exports.
请参阅references/full-api-reference.md获取包含参数、返回类型及所有22个导出项的T2注释的完整签名。

Full Type Definitions

完整类型定义

SearchType
[AST:cognee/modules/search/types/SearchType.py:L4]

SearchType
[AST:cognee/modules/search/types/SearchType.py:L4]

python
class SearchType(str, Enum):
    SUMMARIES = "SUMMARIES"               # Vector similarity on TextSummary nodes
    CHUNKS = "CHUNKS"                     # Vector similarity on DocumentChunk nodes
    RAG_COMPLETION = "RAG_COMPLETION"     # LLM-backed with chunk context
    TRIPLET_COMPLETION = "TRIPLET_COMPLETION"  # Graph triplet-based retrieval
    GRAPH_COMPLETION = "GRAPH_COMPLETION" # Default — LLM + graph traversal
    GRAPH_SUMMARY_COMPLETION = "GRAPH_SUMMARY_COMPLETION"
    CYPHER = "CYPHER"                     # Raw Cypher query
    NATURAL_LANGUAGE = "NATURAL_LANGUAGE" # NL → Cypher translation
    GRAPH_COMPLETION_COT = "GRAPH_COMPLETION_COT"  # Chain-of-thought graph
    GRAPH_COMPLETION_CONTEXT_EXTENSION = "GRAPH_COMPLETION_CONTEXT_EXTENSION"
    FEELING_LUCKY = "FEELING_LUCKY"       # Single best result
    TEMPORAL = "TEMPORAL"                 # Time-aware search
    CODING_RULES = "CODING_RULES"         # Code rule retrieval
    CHUNKS_LEXICAL = "CHUNKS_LEXICAL"     # BM25 keyword search on chunks
python
class SearchType(str, Enum):
    SUMMARIES = "SUMMARIES"               # 针对TextSummary节点的向量相似度
    CHUNKS = "CHUNKS"                     # 针对DocumentChunk节点的向量相似度
    RAG_COMPLETION = "RAG_COMPLETION"     # 基于LLM并结合文本块上下文
    TRIPLET_COMPLETION = "TRIPLET_COMPLETION"  # 基于图三元组的检索
    GRAPH_COMPLETION = "GRAPH_COMPLETION" # 默认——LLM + 图遍历
    GRAPH_SUMMARY_COMPLETION = "GRAPH_SUMMARY_COMPLETION"
    CYPHER = "CYPHER"                     # 原始Cypher查询
    NATURAL_LANGUAGE = "NATURAL_LANGUAGE" # 自然语言转Cypher翻译
    GRAPH_COMPLETION_COT = "GRAPH_COMPLETION_COT"  # 思维链图检索
    GRAPH_COMPLETION_CONTEXT_EXTENSION = "GRAPH_COMPLETION_CONTEXT_EXTENSION"
    FEELING_LUCKY = "FEELING_LUCKY"       # 返回单个最佳结果
    TEMPORAL = "TEMPORAL"                 # 时间感知搜索
    CODING_RULES = "CODING_RULES"         # 编码规则检索
    CHUNKS_LEXICAL = "CHUNKS_LEXICAL"     # 针对文本块的BM25关键词搜索
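Because `SearchType` mixes in `str`, its members compare equal to their plain string values, so a `query_type` argument can be passed as either the enum member or the bare string. A minimal two-member copy of the enum above (abbreviated here for illustration) shows the behavior:

```python
from enum import Enum

class SearchType(str, Enum):
    """Abbreviated sketch of the 14-member enum above."""
    GRAPH_COMPLETION = "GRAPH_COMPLETION"
    CHUNKS = "CHUNKS"

# str mixin: members compare equal to plain strings...
print(SearchType.CHUNKS == "CHUNKS")  # → True
# ...and the enum can be constructed back from a string value.
print(SearchType("GRAPH_COMPLETION") is SearchType.GRAPH_COMPLETION)  # → True
```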

DataPoint
[EXT:docs.cognee.ai/guides/custom-data-models]

DataPoint
[EXT:docs.cognee.ai/guides/custom-data-models]

Base class for all graph nodes; inherits from Pydantic BaseModel. Set `metadata = {"index_fields": ["field"]}` for vector indexing. Use `Edge(weight, relationship_type)` for weighted relationships.
Notable subclasses: `DocumentChunk`, `TextSummary`, `CodeSummary`, `DatabaseSchema`, `SchemaTable`, `TranslatedContent`, `GraphitiNode`, `WebPage`.
所有图节点的基类,继承自Pydantic BaseModel。设置 `metadata = {"index_fields": ["field"]}` 以进行向量索引。使用 `Edge(weight, relationship_type)` 定义带权重的关系。
主要子类: `DocumentChunk`, `TextSummary`, `CodeSummary`, `DatabaseSchema`, `SchemaTable`, `TranslatedContent`, `GraphitiNode`, `WebPage`。
Task
[AST:cognee/modules/pipelines/tasks/task.py]

Task
[AST:cognee/modules/pipelines/tasks/task.py]

python
class Task:
    def __init__(self, executable, *args, task_config=None, **kwargs)
Wraps any callable (async/sync function, generator, async generator). Use `task_config={"batch_size": N}` for parallel processing. Decorate with `@task_summary("Processed {n} items")` for pipeline reporting.
python
class Task:
    def __init__(self, executable, *args, task_config=None, **kwargs)
包装任何可调用对象(异步/同步函数、生成器、异步生成器)以用于流水线执行。使用 `task_config={"batch_size": N}` 进行并行处理。使用 `@task_summary("Processed {n} items")` 装饰器进行流水线报告。
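The key trick in wrapping both sync and async callables is normalizing their results so a pipeline can always `await` the wrapped executable. A minimal stand-in (hypothetical `TaskSketch`, not cognee's `Task` class) makes the idea concrete:

```python
import asyncio
import inspect

class TaskSketch:
    """Minimal Task stand-in: normalizes sync and async callables so a
    pipeline can always `await` the wrapped executable."""

    def __init__(self, executable, *args, **kwargs):
        self.executable = executable
        self.args = args
        self.kwargs = kwargs

    async def run(self, data):
        result = self.executable(data, *self.args, **self.kwargs)
        if inspect.isawaitable(result):  # an async def returned a coroutine
            result = await result
        return result

def double(x):          # plain sync function
    return x * 2

async def add_one(x):   # coroutine function
    return x + 1

async def main():
    # Both task kinds run through the same awaited interface.
    pipeline = [TaskSketch(double), TaskSketch(add_one)]
    value = 3
    for task in pipeline:
        value = await task.run(value)
    return value

print(asyncio.run(main()))  # → 7
```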

Pipeline Exports
[SRC:cognee/modules/pipelines/__init__.py:L1]

流水线导出项
[SRC:cognee/modules/pipelines/__init__.py:L1]

`Task`, `run_tasks`, `run_tasks_parallel`, `run_pipeline`
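The document describes pipeline orchestration as built on async generators: each stage lazily consumes the previous stage's stream. A stdlib sketch of that chaining pattern (hypothetical stage names, not the real `run_tasks` implementation):

```python
import asyncio
from typing import AsyncIterator

async def source() -> AsyncIterator[int]:
    """First stage: emit raw items."""
    for i in range(3):
        yield i

async def double(items: AsyncIterator[int]) -> AsyncIterator[int]:
    """Middle stage: transform items as they stream through."""
    async for item in items:
        yield item * 2

async def collect(items: AsyncIterator[int]) -> list[int]:
    """Final stage: drain the stream into a list."""
    return [item async for item in items]

# Chain stages the way a run_tasks-style orchestrator streams data:
# nothing runs until the final consumer pulls on the chain.
result = asyncio.run(collect(double(source())))
print(result)  # → [0, 2, 4]
```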

Full Integration Patterns

完整集成模式

Co-import Patterns

协同导入模式

  • pydantic — `BaseModel` for graph models, `BaseSettings` for config
  • sqlalchemy — relational storage layer (async sessions)
  • fastapi — HTTP API server for deployment
  • uuid — dataset and data item identifiers
  • asyncio — all core operations are async
  • pydantic — `BaseModel` 用于图模型,`BaseSettings` 用于配置
  • sqlalchemy — 关系型存储层(异步会话)
  • fastapi — 用于部署的HTTP API服务器
  • uuid — 数据集和数据项标识符
  • asyncio — 所有核心操作均为异步

MCP Server Integration
[AST:cognee-mcp/src/server.py]

MCP服务器集成
[AST:cognee-mcp/src/server.py]

7 MCP tools: `cognify(data, graph_model_file, graph_model_name, custom_prompt)`, `search(search_query, search_type, top_k)`, `save_interaction(...)`, `list_data(dataset_id)`, `delete(data_id, dataset_id, mode)`, `prune()`, `cognify_status()`.
Runs via `FastMCP("Cognee")`. Supports SSE and streamable HTTP transports with CORS.
7个MCP工具: `cognify(data, graph_model_file, graph_model_name, custom_prompt)`, `search(search_query, search_type, top_k)`, `save_interaction(...)`, `list_data(dataset_id)`, `delete(data_id, dataset_id, mode)`, `prune()`, `cognify_status()`。
通过 `FastMCP("Cognee")` 运行。支持SSE和可流式HTTP传输,并支持CORS。

Provider Configuration
[EXT:docs.cognee.ai/setup-configuration/overview]

提供商配置
[EXT:docs.cognee.ai/setup-configuration/overview]

Configure via `.env` or `cognee.config.*` methods:
  • LLM: `LLM_API_KEY`, `LLM_MODEL`, `LLM_PROVIDER` (openai/azure/gemini/anthropic/ollama/custom)
  • Embedding: `EMBEDDING_PROVIDER`, `EMBEDDING_MODEL`
  • Vector: `VECTOR_DB_PROVIDER` (lancedb/pgvector/qdrant/redis/chromadb/falkordb)
  • Graph: `GRAPH_DB_PROVIDER` (kuzu/neo4j/neptune/memgraph)
  • Debug: `LOG_LEVEL=DEBUG`, `TELEMETRY_DISABLED=true`
通过 `.env` 或 `cognee.config.*` 方法配置:
  • LLM: `LLM_API_KEY`, `LLM_MODEL`, `LLM_PROVIDER`(openai/azure/gemini/anthropic/ollama/custom)
  • 嵌入模型: `EMBEDDING_PROVIDER`, `EMBEDDING_MODEL`
  • 向量存储: `VECTOR_DB_PROVIDER`(lancedb/pgvector/qdrant/redis/chromadb/falkordb)
  • 图存储: `GRAPH_DB_PROVIDER`(kuzu/neo4j/neptune/memgraph)
  • 调试: `LOG_LEVEL=DEBUG`, `TELEMETRY_DISABLED=true`