GPT Researcher Development Skill

GPT Researcher is an LLM-based autonomous agent using a planner-executor-publisher pattern with parallelized agent work for speed and reliability.

Quick Start

Basic Python Usage

python
from gpt_researcher import GPTResearcher
import asyncio

async def main():
    researcher = GPTResearcher(
        query="What are the latest AI developments?",
        report_type="research_report",  # or detailed_report, deep, outline_report
        report_source="web",            # or local, hybrid
    )
    await researcher.conduct_research()
    report = await researcher.write_report()
    print(report)

asyncio.run(main())

Run Servers

bash
# Backend
python -m uvicorn backend.server.server:app --reload --port 8000

# Frontend
cd frontend/nextjs && npm install && npm run dev

Key File Locations

| Need | Primary File | Key Classes |
|------|--------------|-------------|
| Main orchestrator | `gpt_researcher/agent.py` | `GPTResearcher` |
| Research logic | `gpt_researcher/skills/researcher.py` | `ResearchConductor` |
| Report writing | `gpt_researcher/skills/writer.py` | `ReportGenerator` |
| All prompts | `gpt_researcher/prompts.py` | `PromptFamily` |
| Configuration | `gpt_researcher/config/config.py` | `Config` |
| Config defaults | `gpt_researcher/config/variables/default.py` | `DEFAULT_CONFIG` |
| API server | `backend/server/app.py` | FastAPI `app` |
| Search engines | `gpt_researcher/retrievers/` | Various retrievers |

Architecture Overview

User Query → GPTResearcher.__init__()
         choose_agent() → (agent_type, role_prompt)
         ResearchConductor.conduct_research()
           ├── plan_research() → sub_queries
           ├── For each sub_query:
           │     └── _process_sub_query() → context
           └── Aggregate contexts
         [Optional] ImageGenerator.plan_and_generate_images()
         ReportGenerator.write_report() → Markdown report
For detailed architecture diagrams: See references/architecture.md
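
The flow above can be approximated with plain asyncio. The sketch below is not GPT Researcher's implementation: `plan_research`, `process_sub_query`, and `conduct_research` here are hypothetical stand-ins that only mirror the shape of the pipeline (plan sub-queries, process them concurrently, aggregate the contexts).

```python
import asyncio


async def plan_research(query: str) -> list[str]:
    # Hypothetical planner: a real one would ask an LLM for sub-queries.
    return [f"{query} overview", f"{query} recent news", f"{query} open problems"]


async def process_sub_query(sub_query: str) -> str:
    # Hypothetical executor: a real one would search, scrape, and summarize.
    await asyncio.sleep(0)  # yield control, as network I/O would
    return f"context for: {sub_query}"


async def conduct_research(query: str) -> list[str]:
    sub_queries = await plan_research(query)
    # Run all sub-queries concurrently; gather returns results in input order.
    contexts = await asyncio.gather(*(process_sub_query(q) for q in sub_queries))
    return list(contexts)


contexts = asyncio.run(conduct_research("quantum computing"))
print(len(contexts))
```

Running the sub-queries through `asyncio.gather` is one simple way to get the parallelism the overview describes; the aggregated list would then be handed to the report-writing step.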

Core Patterns

Adding a New Feature (8-Step Pattern)

  1. Config → Add to `gpt_researcher/config/variables/default.py`
  2. Provider → Create in `gpt_researcher/llm_provider/my_feature/`
  3. Skill → Create in `gpt_researcher/skills/my_feature.py`
  4. Agent → Integrate in `gpt_researcher/agent.py`
  5. Prompts → Update `gpt_researcher/prompts.py`
  6. WebSocket → Events via `stream_output()`
  7. Frontend → Handle events in `useWebSocket.ts`
  8. Docs → Create `docs/docs/gpt-researcher/gptr/my_feature.md`

For the complete feature addition guide, with an Image Generation case study: See references/adding-features.md

Adding a New Retriever

python
# 1. Create: gpt_researcher/retrievers/my_retriever/my_retriever.py
class MyRetriever:
    def __init__(self, query: str, headers: dict = None):
        self.query = query

    async def search(self, max_results: int = 10) -> list[dict]:
        # Return: [{"title": str, "href": str, "body": str}]
        pass

# 2. Register in gpt_researcher/actions/retriever.py
#    (add this case inside the existing match statement)
case "my_retriever":
    from gpt_researcher.retrievers.my_retriever import MyRetriever
    return MyRetriever

# 3. Export in gpt_researcher/retrievers/__init__.py

For complete retriever documentation: See references/retrievers.md
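
To make the expected result shape concrete, here is a minimal sketch of a retriever that searches a fixed in-memory list instead of the web. `InMemoryRetriever`, its `DOCS` list, and the example.com URLs are hypothetical and exist only for illustration; the constructor and `search()` signatures follow the template above.

```python
import asyncio
from typing import Optional


class InMemoryRetriever:
    """Hypothetical retriever that matches the query against canned documents."""

    DOCS = [
        {"title": "Intro to agents", "href": "https://example.com/agents",
         "body": "Agents plan and act."},
        {"title": "Async Python", "href": "https://example.com/async",
         "body": "asyncio runs coroutines."},
    ]

    def __init__(self, query: str, headers: Optional[dict] = None):
        self.query = query
        self.headers = headers or {}

    async def search(self, max_results: int = 10) -> list[dict]:
        needle = self.query.lower()
        # Keep documents whose title or body mentions the query text.
        hits = [
            doc for doc in self.DOCS
            if needle in doc["title"].lower() or needle in doc["body"].lower()
        ]
        return hits[:max_results]


results = asyncio.run(InMemoryRetriever("agents").search(max_results=5))
print(results)
```

Each result is a dict with exactly the `title`, `href`, and `body` keys, which is the contract stated in the template's comment.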

Configuration

Config keys are lowercased when accessed:
python
# In default.py: "SMART_LLM": "gpt-4o"
# Access as: self.cfg.smart_llm  # lowercase!

Priority (highest first): Environment Variables → JSON Config File → Default Values

For complete configuration reference: See references/config-reference.md
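
The lowercasing and priority rules can be illustrated with a small stand-alone class. This is a sketch under stated assumptions, not the library's `Config`: `SketchConfig`, the `DEFAULTS` dict, and the `MY_FEATURE_ENABLED` key are hypothetical, and no type conversion is applied to environment values.

```python
import json
import os

# Hypothetical defaults; the real ones live in default.py.
DEFAULTS = {"SMART_LLM": "gpt-4o", "MY_FEATURE_ENABLED": False}


class SketchConfig:
    def __init__(self, json_path=None, env=None):
        env = os.environ if env is None else env
        file_values = {}
        if json_path:
            with open(json_path, encoding="utf-8") as fh:
                file_values = json.load(fh)
        for key, default in DEFAULTS.items():
            # Environment beats the JSON file, which beats the default.
            value = env.get(key, file_values.get(key, default))
            # Keys are declared in UPPER_CASE but exposed as lowercase attributes.
            setattr(self, key.lower(), value)


cfg = SketchConfig(env={"SMART_LLM": "my-model"})
print(cfg.smart_llm)
print(cfg.my_feature_enabled)
```

Here `smart_llm` comes from the (simulated) environment while `my_feature_enabled` falls through to its default, and neither is reachable under its upper-case name.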

Common Integration Points

WebSocket Streaming

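python
from gpt_researcher import GPTResearcher

class WebSocketHandler:
    # Minimal stand-in for a websocket: prints each streamed event
    async def send_json(self, data):
        print(f"[{data['type']}] {data.get('output', '')}")

researcher = GPTResearcher(query="...", websocket=WebSocketHandler())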
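
For tests or batch jobs it can be handy to collect events instead of printing them. The sketch below assumes only what the handler above shows, namely an async `send_json(data)` method receiving dicts with `type` and `output` keys. `CollectingHandler` and `emit_progress` are hypothetical; `emit_progress` merely stands in for whatever code streams events.

```python
import asyncio


class CollectingHandler:
    """Duck-typed websocket stand-in that stores every message it receives."""

    def __init__(self):
        self.messages = []

    async def send_json(self, data):
        self.messages.append(data)


async def emit_progress(websocket, steps):
    # Hypothetical producer of streamed events.
    for step in steps:
        if websocket:  # tolerate websocket=None
            await websocket.send_json({"type": "logs", "output": step})


handler = CollectingHandler()
asyncio.run(emit_progress(handler, ["planning", "searching", "writing"]))
asyncio.run(emit_progress(None, ["ignored"]))  # no handler, nothing is sent
print([m["output"] for m in handler.messages])
```

The second call shows the `if websocket:` guard at work: passing `None` is a no-op rather than an `AttributeError`.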

MCP Data Sources

python
import os

from gpt_researcher import GPTResearcher

researcher = GPTResearcher(
    query="Open source AI projects",
    mcp_configs=[{
        "name": "github",
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-github"],
        "env": {"GITHUB_TOKEN": os.getenv("GITHUB_TOKEN")}  # None if the variable is unset
    }],
    mcp_strategy="deep",  # or "fast", "disabled"
)

For MCP integration details: See references/mcp.md

Deep Research Mode

python
researcher = GPTResearcher(
    query="Comprehensive analysis of quantum computing",
    report_type="deep",  # Triggers recursive tree-like exploration
)
For deep research configuration: See references/deep-research.md

Error Handling

Always use graceful degradation in skills:
python
async def execute(self, *args, **kwargs):
    if not self.is_enabled():
        return []  # Don't crash

    try:
        result = await self.provider.execute(*args, **kwargs)
        return result
    except Exception as e:
        # Report the failure to the client, then fall back to an empty result
        await stream_output("logs", "error", f"⚠️ {e}", self.websocket)
        return []  # Graceful degradation

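
The pattern above can be exercised without any GPT Researcher imports. In this sketch `FlakyProvider` and `SketchSkill` are hypothetical, and the `errors` list stands in for streaming the message to a client via `stream_output()`.

```python
import asyncio


class FlakyProvider:
    """Hypothetical provider that always fails."""

    async def execute(self, prompt):
        raise RuntimeError("provider unavailable")


class SketchSkill:
    def __init__(self, provider, enabled=True):
        self.provider = provider
        self.enabled = enabled
        self.errors = []  # stand-in for streaming errors to a client

    def is_enabled(self):
        return self.enabled

    async def execute(self, prompt):
        if not self.is_enabled():
            return []  # feature switched off: skip quietly
        try:
            return await self.provider.execute(prompt)
        except Exception as exc:
            self.errors.append(str(exc))
            return []  # degrade to "no results" instead of raising


skill = SketchSkill(FlakyProvider())
result = asyncio.run(skill.execute("draw a chart"))
print(result, skill.errors)
```

The caller always gets a list back, so a failing optional feature cannot abort the surrounding research run; the recorded error is the only trace of the failure.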

Critical Gotchas

| ❌ Mistake | ✅ Correct |
|-----------|-----------|
| `config.MY_VAR` | `config.my_var` (lowercased) |
| Editing pip-installed package | `pip install -e .` |
| Forgetting async/await | All research methods are async |
| `websocket.send_json()` on None | Check `if websocket:` first |
| Not registering retriever | Add to `retriever.py` match statement |

Reference Documentation

| Topic | File |
|-------|------|
| System architecture & diagrams | references/architecture.md |
| Core components & signatures | references/components.md |
| Research flow & data flow | references/flows.md |
| Prompt system | references/prompts.md |
| Retriever system | references/retrievers.md |
| MCP integration | references/mcp.md |
| Deep research mode | references/deep-research.md |
| Multi-agent system | references/multi-agents.md |
| Adding features guide | references/adding-features.md |
| Advanced patterns | references/advanced-patterns.md |
| REST & WebSocket API | references/api-reference.md |
| Configuration variables | references/config-reference.md |