cognee

Compare original and translation side by side

🇺🇸 Original (English) | 🇨🇳 Translation (Chinese)

Overview

概述

Cognee v0.5.5 — open-source Python AI memory engine that converts raw data into searchable knowledge graphs combining vector search with graph databases.
  • Source: topoteretes/cognee @ main
  • Language: Python (async/await throughout)
  • Forge tier: Deep (AST + gh + QMD)
  • Public API exports: 22 | Total extracted: 837 (302 functions, 535 classes)
  • Confidence: 837 T1 (AST-verified), 14 T2 (QMD-enriched), T3 (docs supplemental)
Cognee v0.5.5 — 开源Python AI内存引擎,可将原始数据转换为可搜索的知识图谱,结合向量搜索与图数据库。
  • 源码地址: topoteretes/cognee @ main分支
  • 开发语言: Python(全程使用async/await)
  • 深度级别: Deep(AST + gh + QMD)
  • 公开API导出数量: 22 | 提取总数量: 837(302个函数,535个类)
  • 可信度: 837个T1(AST验证),14个T2(QMD增强),T3(文档补充)

Quick Start

快速开始

python
import cognee
import asyncio

async def main():
    # Reset state
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # 1. Ingest data
    await cognee.add("Cognee turns documents into AI memory.")  # [AST:cognee/api/v1/add/add.py:L22]

    # 2. Build knowledge graph
    await cognee.cognify()  # [AST:cognee/api/v1/cognify/cognify.py:L47]

    # 3. Search
    results = await cognee.search(  # [AST:cognee/api/v1/search/search.py:L26]
        query_text="What does Cognee do?"
    )
    for r in results:
        print(r)

asyncio.run(main())
All core functions are async — you must `await` them inside an async context.
[EXT:docs.cognee.ai/getting-started/quickstart]
<!-- [MANUAL:additional-notes] -->
python
import cognee
import asyncio

async def main():
    # 重置状态
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # 1. 导入数据
    await cognee.add("Cognee turns documents into AI memory.")  # [AST:cognee/api/v1/add/add.py:L22]

    # 2. 构建知识图谱
    await cognee.cognify()  # [AST:cognee/api/v1/cognify/cognify.py:L47]

    # 3. 搜索
    results = await cognee.search(  # [AST:cognee/api/v1/search/search.py:L26]
        query_text="What does Cognee do?"
    )
    for r in results:
        print(r)

asyncio.run(main())
所有核心函数均为异步——必须在异步上下文内使用 `await`。
[EXT:docs.cognee.ai/getting-started/quickstart]
<!-- [MANUAL:additional-notes] -->

Setup Requirements

安装要求

Python: >=3.10, <3.14. Recommended installer: `uv`.
bash
uv pip install -e "."                    # minimal (SQLite + LanceDB + Kuzu)
uv pip install -e ".[postgres,neo4j]"    # with PostgreSQL + Neo4j
Key installation extras: `postgres`/`postgres-binary`, `neo4j`, `neptune`, `chromadb`, `qdrant`, `redis`, `ollama`, `anthropic`, `gemini`, `mistral`, `groq`, `huggingface`, `llama-cpp`, `aws` (S3), `langchain`, `llama-index`, `graphiti`, `baml`, `dlt`, `docling`, `codegraph`, `scraping`, `docs`, `monitoring` (Sentry+Langfuse), `distributed` (Modal), `dev`, `debug`.
Minimal .env:
bash
LLM_API_KEY="your_openai_api_key"
LLM_MODEL="openai/gpt-4o-mini"
Defaults (no extra setup): SQLite (relational), LanceDB (vector), Kuzu (graph). All stored in `.venv` by default — override with `DATA_ROOT_DIRECTORY` and `SYSTEM_ROOT_DIRECTORY`.
Important: If you configure only LLM or only embeddings, the other defaults to OpenAI. Always configure both, or ensure a valid OpenAI API key.
<!-- [/MANUAL:additional-notes] -->
Python版本: >=3.10,<3.14。推荐安装工具: `uv`。
bash
uv pip install -e "."                    # 最小安装(SQLite + LanceDB + Kuzu)
uv pip install -e ".[postgres,neo4j]"    # 包含PostgreSQL + Neo4j的安装
关键扩展安装选项: `postgres`/`postgres-binary`, `neo4j`, `neptune`, `chromadb`, `qdrant`, `redis`, `ollama`, `anthropic`, `gemini`, `mistral`, `groq`, `huggingface`, `llama-cpp`, `aws`(S3), `langchain`, `llama-index`, `graphiti`, `baml`, `dlt`, `docling`, `codegraph`, `scraping`, `docs`, `monitoring`(Sentry+Langfuse), `distributed`(Modal), `dev`, `debug`。
最小化.env配置:
bash
LLM_API_KEY="your_openai_api_key"
LLM_MODEL="openai/gpt-4o-mini"
默认配置(无需额外设置):SQLite(关系型)、LanceDB(向量)、Kuzu(图)。所有数据默认存储在 `.venv` 目录下——可通过 `DATA_ROOT_DIRECTORY` 和 `SYSTEM_ROOT_DIRECTORY` 覆盖默认路径。
重要提示: 如果你仅配置了LLM或仅配置了嵌入模型,另一项将默认使用OpenAI。请始终同时配置两者,或确保拥有有效的OpenAI API密钥。
<!-- [/MANUAL:additional-notes] -->

Common Workflows

常见工作流

Add and process data:
await cognee.add(data) → await cognee.cognify() → await cognee.search(query_text)
Multi-format ingestion:
await cognee.add(["/path/to/file.pdf", "raw text", open("doc.txt","rb")], dataset_name="my_data")
Ontology-grounded cognify:
config = {"ontology_config": {"ontology_resolver": RDFLibOntologyResolver(ontology_file=path)}}
await cognee.cognify(config=config)
[EXT:docs.cognee.ai/guides/ontology-support]
Session-aware search:
await cognee.search(query_text="Q1", session_id="conv_1")
await cognee.search(query_text="Follow-up", session_id="conv_1")
[EXT:docs.cognee.ai/guides/sessions]
Custom data models:
class MyEntity(DataPoint): name: str; metadata = {"index_fields": ["name"]}
await add_data_points([entity])
[EXT:docs.cognee.ai/guides/custom-data-models]
添加并处理数据:
await cognee.add(data) → await cognee.cognify() → await cognee.search(query_text)
多格式数据导入:
await cognee.add(["/path/to/file.pdf", "raw text", open("doc.txt","rb")], dataset_name="my_data")
基于本体的Cognify处理:
config = {"ontology_config": {"ontology_resolver": RDFLibOntologyResolver(ontology_file=path)}}
await cognee.cognify(config=config)
[EXT:docs.cognee.ai/guides/ontology-support]
会话感知搜索:
await cognee.search(query_text="Q1", session_id="conv_1")
await cognee.search(query_text="Follow-up", session_id="conv_1")
[EXT:docs.cognee.ai/guides/sessions]
自定义数据模型:
class MyEntity(DataPoint): name: str; metadata = {"index_fields": ["name"]}
await add_data_points([entity])
[EXT:docs.cognee.ai/guides/custom-data-models]
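The multi-format `add()` call above mixes file paths, raw strings, and open binary handles in one list. As an illustration of how such mixed inputs can be told apart before ingestion, here is a minimal stdlib sketch; `classify_input` is a hypothetical helper, not cognee's internal dispatch, which performs its own (different) resolution:

```python
import io
import os
import tempfile
from typing import BinaryIO, Union

def classify_input(item: Union[str, BinaryIO]) -> str:
    """Classify a mixed ingestion input as 'file', 'text', or 'stream'."""
    if hasattr(item, "read"):  # open file handle / binary stream
        return "stream"
    if isinstance(item, str) and os.path.exists(item):
        return "file"          # an existing path on disk
    return "text"              # fall back to treating it as raw text

# Build one input of each kind, mirroring the add() example above.
with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as f:
    path = f.name

inputs = [path, "raw text", io.BytesIO(b"doc bytes")]
kinds = [classify_input(i) for i in inputs]
print(kinds)  # → ['file', 'text', 'stream']

os.unlink(path)
```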

Key API Summary

核心API汇总

| Function | Purpose | Key Params |
| --- | --- | --- |
| `add()` | Ingest text, files, binary data | `data`, `dataset_name`, `user` |
| `cognify()` | Build knowledge graph from ingested data | `datasets`, `graph_model`, `chunker`, `temporal_cognify` |
| `search()` | Query knowledge graph | `query_text`, `query_type`, `top_k`, `session_id` |
| `memify()` | Enrich existing graph with custom tasks | `extraction_tasks`, `enrichment_tasks`, `data` |
| `config.*` | Runtime configuration (LLM, DB, vectors) | static methods |
| `datasets.*` | List, inspect, delete datasets | static methods |
| `prune.*` | Clean up data and system resources | `prune_data()`, `prune_system()` |
| `update()` | Update existing data items | `data_id`, `data`, `dataset_id` |
| `session.*` | Session history and feedback | `get_session()`, `add_feedback()` |
| `run_custom_pipeline()` | Execute custom task pipelines | `tasks`, `data`, `dataset` |
| `SearchType` | Enum of 14 search modes | `GRAPH_COMPLETION` (default) |
| `visualize_graph()` | Render knowledge graph to HTML | `destination_file_path` |
| `enable_tracing()` | Enable OpenTelemetry tracing | `console_output` |
| `run_migrations()` | Run Alembic database migrations | |
| `start_ui()` | Launch local Cognee UI (frontend + backend + MCP servers) | `pid_callback`, `port`, `start_backend`, `start_mcp` |
| `cognee_network_visualization()` | Render knowledge graph to interactive HTML | `graph_data`, `destination_file_path` |
| `pipelines` | Module re-export: `Task`, `run_tasks`, `run_tasks_parallel`, `run_pipeline` | |
<!-- [MANUAL:additional-notes] -->
| 函数 | 用途 | 关键参数 |
| --- | --- | --- |
| `add()` | 导入文本、文件、二进制数据 | `data`, `dataset_name`, `user` |
| `cognify()` | 从导入的数据构建知识图谱 | `datasets`, `graph_model`, `chunker`, `temporal_cognify` |
| `search()` | 查询知识图谱 | `query_text`, `query_type`, `top_k`, `session_id` |
| `memify()` | 用自定义任务增强现有图谱 | `extraction_tasks`, `enrichment_tasks`, `data` |
| `config.*` | 运行时配置(LLM、数据库、向量存储) | 静态方法 |
| `datasets.*` | 列出、查看、删除数据集 | 静态方法 |
| `prune.*` | 清理数据和系统资源 | `prune_data()`, `prune_system()` |
| `update()` | 更新现有数据项 | `data_id`, `data`, `dataset_id` |
| `session.*` | 会话历史与反馈 | `get_session()`, `add_feedback()` |
| `run_custom_pipeline()` | 执行自定义任务流水线 | `tasks`, `data`, `dataset` |
| `SearchType` | 包含14种搜索模式的枚举类 | `GRAPH_COMPLETION`(默认) |
| `visualize_graph()` | 将知识图谱渲染为HTML | `destination_file_path` |
| `enable_tracing()` | 启用OpenTelemetry追踪 | `console_output` |
| `run_migrations()` | 运行Alembic数据库迁移 | |
| `start_ui()` | 启动本地Cognee UI(前端 + 后端 + MCP服务器) | `pid_callback`, `port`, `start_backend`, `start_mcp` |
| `cognee_network_visualization()` | 将知识图谱渲染为交互式HTML | `graph_data`, `destination_file_path` |
| `pipelines` | 模块重导出: `Task`, `run_tasks`, `run_tasks_parallel`, `run_pipeline` | |
<!-- [MANUAL:additional-notes] -->

LLM Provider Configuration

LLM提供商配置

Configure via `.env` — provider-specific examples:
可通过 `.env` 文件配置——以下是不同提供商的示例:

Azure OpenAI

Azure OpenAI

LLM_PROVIDER="azure"
LLM_MODEL="azure/gpt-4o-mini"
LLM_ENDPOINT="https://YOUR-RESOURCE.openai.azure.com/openai/deployments/gpt-4o-mini"
LLM_API_KEY="your_key"
LLM_API_VERSION="2024-12-01-preview"

Anthropic (requires: pip install cognee[anthropic])

Anthropic(需安装:pip install cognee[anthropic])

LLM_PROVIDER="anthropic"
LLM_MODEL="claude-3-5-sonnet-20241022"
LLM_API_KEY="your_key"

Ollama (requires: pip install cognee[ollama])

Ollama(需安装:pip install cognee[ollama])

LLM_PROVIDER="ollama"
LLM_MODEL="llama3.1:8b"
LLM_ENDPOINT="http://localhost:11434/v1"
LLM_API_KEY="ollama"
EMBEDDING_PROVIDER="ollama"
EMBEDDING_MODEL="nomic-embed-text:latest"
EMBEDDING_ENDPOINT="http://localhost:11434/api/embed"
HUGGINGFACE_TOKENIZER="nomic-ai/nomic-embed-text-v1.5"

AWS Bedrock (requires: pip install cognee[aws])

AWS Bedrock(需安装:pip install cognee[aws])

LLM_PROVIDER="bedrock"
LLM_MODEL="anthropic.claude-3-sonnet-20240229-v1:0"
AWS_REGION="us-east-1"

Custom / OpenRouter / vLLM

自定义 / OpenRouter / vLLM

LLM_PROVIDER="custom"
LLM_MODEL="openrouter/google/gemini-2.0-flash-lite-preview-02-05:free"
LLM_ENDPOINT="https://openrouter.ai/api/v1"

**Rate limiting:** `LLM_RATE_LIMIT_ENABLED=true`, `LLM_RATE_LIMIT_REQUESTS=60`, `LLM_RATE_LIMIT_INTERVAL=60`

**Structured output:** `STRUCTURED_OUTPUT_FRAMEWORK="instructor"` (default) or `"baml"` (requires `cognee[baml]`). Override instructor mode: `LLM_INSTRUCTOR_MODE="json_schema_mode"`.
LLM_PROVIDER="custom"
LLM_MODEL="openrouter/google/gemini-2.0-flash-lite-preview-02-05:free"
LLM_ENDPOINT="https://openrouter.ai/api/v1"

**速率限制:** `LLM_RATE_LIMIT_ENABLED=true`, `LLM_RATE_LIMIT_REQUESTS=60`, `LLM_RATE_LIMIT_INTERVAL=60`

**结构化输出:** `STRUCTURED_OUTPUT_FRAMEWORK="instructor"`(默认)或`"baml"`(需安装`cognee[baml]`)。覆盖instructor模式:`LLM_INSTRUCTOR_MODE="json_schema_mode"`。

Database Switching

数据库切换


PostgreSQL (requires: pip install cognee[postgres])

PostgreSQL(需安装:pip install cognee[postgres])

DB_PROVIDER=postgres
DB_HOST=localhost
DB_PORT=5432
DB_USERNAME=cognee
DB_PASSWORD=cognee
DB_NAME=cognee_db

PGVector (requires: pip install cognee[postgres])

PGVector(需安装:pip install cognee[postgres])

VECTOR_DB_PROVIDER=pgvector
VECTOR_DB_URL=postgresql://cognee:cognee@localhost:5432/cognee_db

Neo4j (requires: pip install cognee[neo4j])

Neo4j(需安装:pip install cognee[neo4j])

GRAPH_DATABASE_PROVIDER=neo4j
GRAPH_DATABASE_URL=bolt://localhost:7687
GRAPH_DATABASE_USERNAME=neo4j
GRAPH_DATABASE_PASSWORD=yourpassword

S3 storage (requires: pip install cognee[aws])

S3存储(需安装:pip install cognee[aws])

STORAGE_BACKEND="s3"
STORAGE_BUCKET_NAME="your-bucket"
DATA_ROOT_DIRECTORY="s3://your-bucket/cognee/data"

Security Environment Variables

安全环境变量

bash
ACCEPT_LOCAL_FILE_PATH=True     # Allow local file paths in add()
ALLOW_HTTP_REQUESTS=True        # Allow HTTP fetches
ALLOW_CYPHER_QUERY=True         # Allow raw Cypher in SearchType.CYPHER
REQUIRE_AUTHENTICATION=False    # Enable API auth
ENABLE_BACKEND_ACCESS_CONTROL=True  # Multi-tenant dataset isolation
bash
ACCEPT_LOCAL_FILE_PATH=True     # 允许在add()中使用本地文件路径
ALLOW_HTTP_REQUESTS=True        # 允许HTTP请求
ALLOW_CYPHER_QUERY=True         # 允许在SearchType.CYPHER中使用原始Cypher语句
REQUIRE_AUTHENTICATION=False    # 启用API认证
ENABLE_BACKEND_ACCESS_CONTROL=True  # 多租户数据集隔离

Extension Patterns

扩展模式

Custom pipeline task:
python
from cognee.modules.pipelines.tasks.Task import Task

async def my_task(data):
    return process(data)

task = Task(my_task)
Direct database access:
python
from cognee.infrastructure.databases.graph import get_graph_engine
from cognee.infrastructure.databases.vector import get_vector_engine

graph_engine = await get_graph_engine()
vector_engine = await get_vector_engine()
LLM Gateway (structured output):
python
from cognee.infrastructure.llm.get_llm_client import get_llm_client

llm_client = get_llm_client()
response = await llm_client.acreate_structured_output(
    text_input="prompt", system_prompt="instructions", response_model=YourPydanticModel
)
自定义流水线任务:
python
from cognee.modules.pipelines.tasks.Task import Task

async def my_task(data):
    return process(data)

task = Task(my_task)
直接数据库访问:
python
from cognee.infrastructure.databases.graph import get_graph_engine
from cognee.infrastructure.databases.vector import get_vector_engine

graph_engine = await get_graph_engine()
vector_engine = await get_vector_engine()
LLM网关(结构化输出):
python
from cognee.infrastructure.llm.get_llm_client import get_llm_client

llm_client = get_llm_client()
response = await llm_client.acreate_structured_output(
    text_input="prompt", system_prompt="instructions", response_model=YourPydanticModel
)
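The `acreate_structured_output` call above returns an instance of the supplied Pydantic model. The contract it enforces, parsing the model's JSON reply and validating it against a schema, can be sketched with the stdlib alone; `parse_structured_output` and `Entity` are hypothetical stand-ins, and the real gateway delegates this to instructor or BAML:

```python
import json
from dataclasses import dataclass, fields

@dataclass
class Entity:
    """Hypothetical response model, playing the role of YourPydanticModel."""
    name: str
    kind: str

def parse_structured_output(raw: str, model=Entity):
    """Validate a raw LLM JSON reply against a dataclass 'schema'."""
    data = json.loads(raw)
    expected = {f.name for f in fields(model)}
    missing = expected - data.keys()
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return model(**{k: data[k] for k in expected})

reply = '{"name": "Cognee", "kind": "library"}'
entity = parse_structured_output(reply)
print(entity.name, entity.kind)  # → Cognee library
```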

MCP Server Transport Modes

MCP服务器传输模式

bash
python src/server.py                    # stdio (default)
python src/server.py --transport sse    # SSE
python src/server.py --transport http --host 127.0.0.1 --port 8000 --path /mcp
bash
python src/server.py                    # 标准输入输出(默认)
python src/server.py --transport sse    # SSE
python src/server.py --transport http --host 127.0.0.1 --port 8000 --path /mcp

API mode (connect to running Cognee API):

API模式(连接到运行中的Cognee API):

python src/server.py --transport sse --api-url http://localhost:8000 --api-token TOKEN

Docker: `docker run -e TRANSPORT_MODE=sse --env-file .env -p 8000:8000 cognee/cognee-mcp:main`
python src/server.py --transport sse --api-url http://localhost:8000 --api-token TOKEN

Docker命令:`docker run -e TRANSPORT_MODE=sse --env-file .env -p 8000:8000 cognee/cognee-mcp:main`

Common Troubleshooting

常见问题排查

  • Ollama + OpenAI embeddings NoDataError — configure both LLM and embedding for the same provider, or set `HUGGINGFACE_TOKENIZER`
  • LM Studio structured output — set `LLM_INSTRUCTOR_MODE="json_schema_mode"`
  • Default provider fallback — configuring only LLM or only embeddings defaults the other to OpenAI
  • Permission denied on search — returns an empty list (not an error) to prevent info leakage; check dataset permissions
  • Docker DB connections — use `DB_HOST=host.docker.internal` for local databases
  • Debug logging — `LITELLM_LOG="DEBUG"`, `ENV="development"`, `TELEMETRY_DISABLED=1`
<!-- [/MANUAL:additional-notes] -->
  • Ollama + OpenAI嵌入出现NoDataError — 将LLM和嵌入模型配置为同一提供商,或设置 `HUGGINGFACE_TOKENIZER`
  • LM Studio结构化输出问题 — 设置 `LLM_INSTRUCTOR_MODE="json_schema_mode"`
  • 默认提供商回退 — 仅配置LLM或仅配置嵌入模型时,另一项将默认使用OpenAI
  • 搜索权限被拒绝 — 返回空列表(而非错误)以防止信息泄露;检查数据集权限
  • Docker数据库连接问题 — 对于本地数据库,使用 `DB_HOST=host.docker.internal`
  • 调试日志 — `LITELLM_LOG="DEBUG"`, `ENV="development"`, `TELEMETRY_DISABLED=1`
<!-- [/MANUAL:additional-notes] -->

Migration & Deprecation Warnings

迁移与弃用警告

  • `delete()` — deprecated since v0.3.9; use `datasets.delete_data()` instead [SRC:cognee/api/v1/delete/__init__.py:L13]
  • `memify()` — default pipeline changed: coding rules replaced with triplet embedding (Mar 2026) [QMD:cognee-temporal:prs.md]
  • `update()` — bug: PATCH updates timestamps but a raw GET may return stale data [QMD:cognee-temporal:issues.md]
  • `visualize_graph()` — frontend: open bug #2442, missing component in UI mode [QMD:cognee-temporal:issues.md]
  • `start_ui()` — v0.5.5 bug: pip-installed frontend fails with 500 errors due to missing npm deps (react-markdown, ngraph.graph) [QMD:cognee-temporal:issues.md]
See Full API Reference for migration details.
  • `delete()` — 自v0.3.9起弃用;请使用 `datasets.delete_data()` 替代 [SRC:cognee/api/v1/delete/__init__.py:L13]
  • `memify()` — 默认流水线变更:编码规则已被三元组嵌入替代(2026年3月)[QMD:cognee-temporal:prs.md]
  • `update()` — bug:PATCH更新会修改时间戳,但GET原始数据可能返回旧数据 [QMD:cognee-temporal:issues.md]
  • `visualize_graph()` — 前端问题:开放bug #2442,UI模式下缺少组件 [QMD:cognee-temporal:issues.md]
  • `start_ui()` — v0.5.5 bug:通过pip安装的前端因缺少npm依赖(react-markdown、ngraph.graph)而出现500错误 [QMD:cognee-temporal:issues.md]
有关迁移详情,请参阅完整API参考文档。

Key Types

核心类型定义

SearchType (14 modes) [AST:cognee/modules/search/types/SearchType.py:L4]: `GRAPH_COMPLETION` (default), `RAG_COMPLETION`, `CHUNKS`, `SUMMARIES`, `TRIPLET_COMPLETION`, `GRAPH_SUMMARY_COMPLETION`, `CYPHER`, `NATURAL_LANGUAGE`, `GRAPH_COMPLETION_COT`, `GRAPH_COMPLETION_CONTEXT_EXTENSION`, `FEELING_LUCKY`, `TEMPORAL`, `CODING_RULES`, `CHUNKS_LEXICAL`
Task — wraps any callable (async/sync/generator) for pipeline execution [AST:cognee/modules/pipelines/tasks/task.py]
DataPoint — base class for custom graph nodes; inherit and add typed fields [EXT:docs.cognee.ai/guides/custom-data-models]
ChunkStrategy — `PARAGRAPH`, `SENTENCE`, `LANGCHAIN_CHARACTER` [AST:cognee/shared/data_models.py:L83]
SearchType(14种模式)[AST:cognee/modules/search/types/SearchType.py:L4]: `GRAPH_COMPLETION`(默认), `RAG_COMPLETION`, `CHUNKS`, `SUMMARIES`, `TRIPLET_COMPLETION`, `GRAPH_SUMMARY_COMPLETION`, `CYPHER`, `NATURAL_LANGUAGE`, `GRAPH_COMPLETION_COT`, `GRAPH_COMPLETION_CONTEXT_EXTENSION`, `FEELING_LUCKY`, `TEMPORAL`, `CODING_RULES`, `CHUNKS_LEXICAL`
Task — 为流水线执行包装任何可调用对象(异步/同步/生成器)[AST:cognee/modules/pipelines/tasks/task.py]
DataPoint — 自定义图节点的基类;继承并添加类型化字段 [EXT:docs.cognee.ai/guides/custom-data-models]
ChunkStrategy — `PARAGRAPH`, `SENTENCE`, `LANGCHAIN_CHARACTER` [AST:cognee/shared/data_models.py:L83]

Architecture at a Glance

架构概览

  • Storage: Relational (SQLite/Postgres) + Vector (LanceDB/PGVector/Qdrant/Redis/ChromaDB/FalkorDB) + Graph (Kuzu/Neo4j/Neptune/Memgraph)
  • LLM Providers: OpenAI, Azure OpenAI, Gemini, Anthropic, Ollama, custom (vLLM)
  • Embedding: OpenAI, Azure, Gemini, Mistral, Ollama, Fastembed
  • Pipeline: Tasks → Pipelines → run_pipeline (orchestration with async generators)
  • Observability: OpenTelemetry tracing via `enable_tracing()` / `disable_tracing()`
  • MCP Server: 7 tools (cognify, search, list_data, delete, prune, cognify_status, save_interaction)
    [AST:cognee-mcp/src/server.py]
  • 存储层: 关系型(SQLite/Postgres) + 向量(LanceDB/PGVector/Qdrant/Redis/ChromaDB/FalkorDB) + 图(Kuzu/Neo4j/Neptune/Memgraph)
  • LLM提供商: OpenAI、Azure OpenAI、Gemini、Anthropic、Ollama、自定义(vLLM)
  • 嵌入模型: OpenAI、Azure、Gemini、Mistral、Ollama、Fastembed
  • 流水线: 任务 → 流水线 → run_pipeline(使用异步生成器进行编排)
  • 可观测性: 通过 `enable_tracing()` / `disable_tracing()` 实现OpenTelemetry追踪
  • MCP服务器: 7个工具(cognify、search、list_data、delete、prune、cognify_status、save_interaction)
    [AST:cognee-mcp/src/server.py]

CLI

命令行工具(CLI)

bash
cognee --add "data"          # Ingest data
cognee --cognify             # Build knowledge graph
cognee --search "query"      # Search
cognee --debug               # Enable debug logging
cognee --ui                  # Launch local UI
[AST:cognee/cli/_cognee.py:L32]
bash
cognee --add "data"          # 导入数据
cognee --cognify             # 构建知识图谱
cognee --search "query"      # 搜索
cognee --debug               # 启用调试日志
cognee --ui                  # 启动本地UI
[AST:cognee/cli/_cognee.py:L32]

Full API Reference

完整API参考

See references/full-api-reference.md for complete signatures with parameters, return types, and T2 annotations for all 22 exports.
请参阅references/full-api-reference.md获取包含参数、返回类型及所有22个导出项的T2注释的完整签名。

Full Type Definitions

完整类型定义

SearchType
[AST:cognee/modules/search/types/SearchType.py:L4]

SearchType
[AST:cognee/modules/search/types/SearchType.py:L4]

python
class SearchType(str, Enum):
    SUMMARIES = "SUMMARIES"               # Vector similarity on TextSummary nodes
    CHUNKS = "CHUNKS"                     # Vector similarity on DocumentChunk nodes
    RAG_COMPLETION = "RAG_COMPLETION"     # LLM-backed with chunk context
    TRIPLET_COMPLETION = "TRIPLET_COMPLETION"  # Graph triplet-based retrieval
    GRAPH_COMPLETION = "GRAPH_COMPLETION" # Default — LLM + graph traversal
    GRAPH_SUMMARY_COMPLETION = "GRAPH_SUMMARY_COMPLETION"
    CYPHER = "CYPHER"                     # Raw Cypher query
    NATURAL_LANGUAGE = "NATURAL_LANGUAGE" # NL → Cypher translation
    GRAPH_COMPLETION_COT = "GRAPH_COMPLETION_COT"  # Chain-of-thought graph
    GRAPH_COMPLETION_CONTEXT_EXTENSION = "GRAPH_COMPLETION_CONTEXT_EXTENSION"
    FEELING_LUCKY = "FEELING_LUCKY"       # Single best result
    TEMPORAL = "TEMPORAL"                 # Time-aware search
    CODING_RULES = "CODING_RULES"         # Code rule retrieval
    CHUNKS_LEXICAL = "CHUNKS_LEXICAL"     # BM25 keyword search on chunks
python
class SearchType(str, Enum):
    SUMMARIES = "SUMMARIES"               # 针对TextSummary节点的向量相似度
    CHUNKS = "CHUNKS"                     # 针对DocumentChunk节点的向量相似度
    RAG_COMPLETION = "RAG_COMPLETION"     # 基于LLM并结合文本块上下文
    TRIPLET_COMPLETION = "TRIPLET_COMPLETION"  # 基于图三元组的检索
    GRAPH_COMPLETION = "GRAPH_COMPLETION" # 默认——LLM + 图遍历
    GRAPH_SUMMARY_COMPLETION = "GRAPH_SUMMARY_COMPLETION"
    CYPHER = "CYPHER"                     # 原始Cypher查询
    NATURAL_LANGUAGE = "NATURAL_LANGUAGE" # 自然语言转Cypher翻译
    GRAPH_COMPLETION_COT = "GRAPH_COMPLETION_COT"  # 思维链图检索
    GRAPH_COMPLETION_CONTEXT_EXTENSION = "GRAPH_COMPLETION_CONTEXT_EXTENSION"
    FEELING_LUCKY = "FEELING_LUCKY"       # 返回单个最佳结果
    TEMPORAL = "TEMPORAL"                 # 时间感知搜索
    CODING_RULES = "CODING_RULES"         # 编码规则检索
    CHUNKS_LEXICAL = "CHUNKS_LEXICAL"     # 针对文本块的BM25关键词搜索
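Because `SearchType` mixes in `str`, its members compare equal to their plain string values, so a `query_type` argument can be passed as either the enum member or the bare string. A minimal two-member copy of the enum above (abbreviated here for illustration) shows the behavior:

```python
from enum import Enum

class SearchType(str, Enum):
    """Abbreviated sketch of the 14-member enum above."""
    GRAPH_COMPLETION = "GRAPH_COMPLETION"
    CHUNKS = "CHUNKS"

# str mixin: members compare equal to plain strings...
print(SearchType.CHUNKS == "CHUNKS")  # → True
# ...and the enum can be constructed back from a string value.
print(SearchType("GRAPH_COMPLETION") is SearchType.GRAPH_COMPLETION)  # → True
```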

DataPoint
[EXT:docs.cognee.ai/guides/custom-data-models]

DataPoint
[EXT:docs.cognee.ai/guides/custom-data-models]

Base class for all graph nodes; inherits from Pydantic BaseModel. Set `metadata = {"index_fields": ["field"]}` for vector indexing. Use `Edge(weight, relationship_type)` for weighted relationships.
Notable subclasses: `DocumentChunk`, `TextSummary`, `CodeSummary`, `DatabaseSchema`, `SchemaTable`, `TranslatedContent`, `GraphitiNode`, `WebPage`.
所有图节点的基类,继承自Pydantic BaseModel。设置 `metadata = {"index_fields": ["field"]}` 以进行向量索引。使用 `Edge(weight, relationship_type)` 定义带权重的关系。
主要子类: `DocumentChunk`, `TextSummary`, `CodeSummary`, `DatabaseSchema`, `SchemaTable`, `TranslatedContent`, `GraphitiNode`, `WebPage`。
Task
[AST:cognee/modules/pipelines/tasks/task.py]

Task
[AST:cognee/modules/pipelines/tasks/task.py]

python
class Task:
    def __init__(self, executable, *args, task_config=None, **kwargs)
Wraps any callable (async/sync function, generator, async generator). Use `task_config={"batch_size": N}` for parallel processing. Decorate with `@task_summary("Processed {n} items")` for pipeline reporting.
python
class Task:
    def __init__(self, executable, *args, task_config=None, **kwargs)
包装任何可调用对象(异步/同步函数、生成器、异步生成器)以用于流水线执行。使用 `task_config={"batch_size": N}` 进行并行处理。使用 `@task_summary("Processed {n} items")` 装饰器进行流水线报告。
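The key trick in wrapping both sync and async callables is normalizing their results so a pipeline can always `await` the wrapped executable. A minimal stand-in (hypothetical `TaskSketch`, not cognee's `Task` class) makes the idea concrete:

```python
import asyncio
import inspect

class TaskSketch:
    """Minimal Task stand-in: normalizes sync and async callables so a
    pipeline can always `await` the wrapped executable."""

    def __init__(self, executable, *args, **kwargs):
        self.executable = executable
        self.args = args
        self.kwargs = kwargs

    async def run(self, data):
        result = self.executable(data, *self.args, **self.kwargs)
        if inspect.isawaitable(result):  # an async def returned a coroutine
            result = await result
        return result

def double(x):          # plain sync function
    return x * 2

async def add_one(x):   # coroutine function
    return x + 1

async def main():
    # Both task kinds run through the same awaited interface.
    pipeline = [TaskSketch(double), TaskSketch(add_one)]
    value = 3
    for task in pipeline:
        value = await task.run(value)
    return value

print(asyncio.run(main()))  # → 7
```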

Pipeline Exports
[SRC:cognee/modules/pipelines/__init__.py:L1]

流水线导出项
[SRC:cognee/modules/pipelines/__init__.py:L1]

`Task`, `run_tasks`, `run_tasks_parallel`, `run_pipeline`
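The document describes pipeline orchestration as built on async generators: each stage lazily consumes the previous stage's stream. A stdlib sketch of that chaining pattern (hypothetical stage names, not the real `run_tasks` implementation):

```python
import asyncio
from typing import AsyncIterator

async def source() -> AsyncIterator[int]:
    """First stage: emit raw items."""
    for i in range(3):
        yield i

async def double(items: AsyncIterator[int]) -> AsyncIterator[int]:
    """Middle stage: transform items as they stream through."""
    async for item in items:
        yield item * 2

async def collect(items: AsyncIterator[int]) -> list[int]:
    """Final stage: drain the stream into a list."""
    return [item async for item in items]

# Chain stages the way a run_tasks-style orchestrator streams data:
# nothing runs until the final consumer pulls on the chain.
result = asyncio.run(collect(double(source())))
print(result)  # → [0, 2, 4]
```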

Full Integration Patterns

完整集成模式

Co-import Patterns

协同导入模式

  • pydantic — `BaseModel` for graph models, `BaseSettings` for config
  • sqlalchemy — relational storage layer (async sessions)
  • fastapi — HTTP API server for deployment
  • uuid — dataset and data item identifiers
  • asyncio — all core operations are async
  • pydantic — `BaseModel` 用于图模型,`BaseSettings` 用于配置
  • sqlalchemy — 关系型存储层(异步会话)
  • fastapi — 用于部署的HTTP API服务器
  • uuid — 数据集和数据项标识符
  • asyncio — 所有核心操作均为异步

MCP Server Integration
[AST:cognee-mcp/src/server.py]

MCP服务器集成
[AST:cognee-mcp/src/server.py]

7 MCP tools: `cognify(data, graph_model_file, graph_model_name, custom_prompt)`, `search(search_query, search_type, top_k)`, `save_interaction(...)`, `list_data(dataset_id)`, `delete(data_id, dataset_id, mode)`, `prune()`, `cognify_status()`.
Runs via `FastMCP("Cognee")`. Supports SSE and streamable HTTP transports with CORS.
7个MCP工具: `cognify(data, graph_model_file, graph_model_name, custom_prompt)`, `search(search_query, search_type, top_k)`, `save_interaction(...)`, `list_data(dataset_id)`, `delete(data_id, dataset_id, mode)`, `prune()`, `cognify_status()`。
通过 `FastMCP("Cognee")` 运行。支持SSE和可流式HTTP传输,并支持CORS。

Provider Configuration
[EXT:docs.cognee.ai/setup-configuration/overview]

提供商配置
[EXT:docs.cognee.ai/setup-configuration/overview]

Configure via `.env` or `cognee.config.*` methods:
  • LLM: `LLM_API_KEY`, `LLM_MODEL`, `LLM_PROVIDER` (openai/azure/gemini/anthropic/ollama/custom)
  • Embedding: `EMBEDDING_PROVIDER`, `EMBEDDING_MODEL`
  • Vector: `VECTOR_DB_PROVIDER` (lancedb/pgvector/qdrant/redis/chromadb/falkordb)
  • Graph: `GRAPH_DB_PROVIDER` (kuzu/neo4j/neptune/memgraph)
  • Debug: `LOG_LEVEL=DEBUG`, `TELEMETRY_DISABLED=true`
通过 `.env` 或 `cognee.config.*` 方法配置:
  • LLM: `LLM_API_KEY`, `LLM_MODEL`, `LLM_PROVIDER`(openai/azure/gemini/anthropic/ollama/custom)
  • 嵌入模型: `EMBEDDING_PROVIDER`, `EMBEDDING_MODEL`
  • 向量存储: `VECTOR_DB_PROVIDER`(lancedb/pgvector/qdrant/redis/chromadb/falkordb)
  • 图存储: `GRAPH_DB_PROVIDER`(kuzu/neo4j/neptune/memgraph)
  • 调试: `LOG_LEVEL=DEBUG`, `TELEMETRY_DISABLED=true`