gpt-researcher
GPT Researcher Development Skill
GPT Researcher is an LLM-based autonomous agent using a planner-executor-publisher pattern with parallelized agent work for speed and reliability.
Quick Start
Basic Python Usage
```python
import asyncio

from gpt_researcher import GPTResearcher


async def main():
    researcher = GPTResearcher(
        query="What are the latest AI developments?",
        report_type="research_report",  # or detailed_report, deep, outline_report
        report_source="web",  # or local, hybrid
    )
    # Gather sources and context first, then write the report from them.
    await researcher.conduct_research()
    report = await researcher.write_report()
    print(report)


asyncio.run(main())
```
Run Servers
```bash
# Backend
python -m uvicorn backend.server.server:app --reload --port 8000

# Frontend
cd frontend/nextjs && npm install && npm run dev
```
---
Key File Locations
| Need | Primary File | Key Classes |
|---|---|---|
| Main orchestrator | | |
| Research logic | | |
| Report writing | | |
| All prompts | | |
| Configuration | | |
| Config defaults | | |
| API server | | FastAPI |
| Search engines | | Various retrievers |
Architecture Overview
```
User Query → GPTResearcher.__init__()
         │
         ▼
choose_agent() → (agent_type, role_prompt)
         │
         ▼
ResearchConductor.conduct_research()
    ├── plan_research() → sub_queries
    ├── For each sub_query:
    │       └── _process_sub_query() → context
    └── Aggregate contexts
         │
         ▼
[Optional] ImageGenerator.plan_and_generate_images()
         │
         ▼
ReportGenerator.write_report() → Markdown report
```
**For detailed architecture diagrams**: See [references/architecture.md](references/architecture.md)
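The research stage in the diagram fans out over sub-queries and then aggregates their contexts. A minimal sketch of that fan-out, assuming the sub-queries run concurrently via `asyncio.gather`; `plan_research` and `process_sub_query` below are hypothetical stand-ins, not the library's implementations:

```python
import asyncio


async def plan_research(query: str) -> list[str]:
    # Hypothetical planner: a real one would ask an LLM for sub-queries.
    return [f"{query} - background", f"{query} - recent work"]


async def process_sub_query(sub_query: str) -> str:
    # Hypothetical worker: a real one would search, scrape, and summarize.
    await asyncio.sleep(0)  # stand-in for network I/O
    return f"context for: {sub_query}"


async def conduct_research(query: str) -> list[str]:
    sub_queries = await plan_research(query)
    # Run all sub-queries concurrently; gather returns results in input order.
    contexts = await asyncio.gather(*(process_sub_query(q) for q in sub_queries))
    return list(contexts)


contexts = asyncio.run(conduct_research("quantum computing"))
print(contexts)
```

Because `asyncio.gather` keeps results in input order, the aggregated contexts line up with the planned sub-queries even when the workers finish out of order.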
Core Patterns
Adding a New Feature (8-Step Pattern)
1. Config → Add to `gpt_researcher/config/variables/default.py`
2. Provider → Create in `gpt_researcher/llm_provider/my_feature/`
3. Skill → Create in `gpt_researcher/skills/my_feature.py`
4. Agent → Integrate in `gpt_researcher/agent.py`
5. Prompts → Update `gpt_researcher/prompts.py`
6. WebSocket → Events via `stream_output()`
7. Frontend → Handle events in `useWebSocket.ts`
8. Docs → Create `docs/docs/gpt-researcher/gptr/my_feature.md`

**For the complete feature addition guide, with an Image Generation case study**: See [references/adding-features.md](references/adding-features.md)
Adding a New Retriever
```python
# 1. Create: gpt_researcher/retrievers/my_retriever/my_retriever.py
class MyRetriever:
    def __init__(self, query: str, headers: dict = None):
        self.query = query

    async def search(self, max_results: int = 10) -> list[dict]:
        # Return: [{"title": str, "href": str, "body": str}]
        pass

# 2. Register in gpt_researcher/actions/retriever.py
#    (fragment: add this case to the existing match statement)
case "my_retriever":
    from gpt_researcher.retrievers.my_retriever import MyRetriever
    return MyRetriever

# 3. Export in gpt_researcher/retrievers/__init__.py
```
**For complete retriever documentation**: See [references/retrievers.md](references/retrievers.md)
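
To make the return contract concrete, here is a minimal, self-contained sketch of a retriever that filters an in-memory list instead of calling a real search API. `StaticRetriever` and its `_DOCS` data are hypothetical; only the constructor shape and the `title`/`href`/`body` result keys come from the template above.

```python
import asyncio
from typing import Optional


class StaticRetriever:
    """Hypothetical retriever that filters a fixed in-memory corpus."""

    _DOCS = [
        {"title": "Intro to agents", "href": "https://example.com/agents",
         "body": "LLM agents plan and act."},
        {"title": "Vector search", "href": "https://example.com/vectors",
         "body": "Embeddings enable similarity search."},
    ]

    def __init__(self, query: str, headers: Optional[dict] = None):
        self.query = query
        self.headers = headers or {}

    async def search(self, max_results: int = 10) -> list[dict]:
        # Case-insensitive substring match on title or body.
        needle = self.query.lower()
        hits = [
            doc for doc in self._DOCS
            if needle in doc["title"].lower() or needle in doc["body"].lower()
        ]
        return hits[:max_results]


results = asyncio.run(StaticRetriever("agents").search(max_results=5))
print(results)
```

A stub like this can help exercise the registration and export steps before wiring in a real search backend.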
---
Configuration
Config keys are lowercased when accessed:
```python
# In default.py: "SMART_LLM": "gpt-4o"
# Access as: self.cfg.smart_llm  # lowercase!
```
Priority: Environment Variables → JSON Config File → Default Values
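
The priority order can be pictured as a layered lookup. This is an illustration under stated assumptions, not the library's config class: `DEFAULTS`, `load_config`, the `EXAMPLE_LIMIT` key, and the use of `SimpleNamespace` are all hypothetical, and the JSON layer is passed in as a dict rather than read from a file.

```python
from types import SimpleNamespace

# Hypothetical defaults; "SMART_LLM" mirrors the entry shown above.
DEFAULTS = {"SMART_LLM": "gpt-4o", "EXAMPLE_LIMIT": 3}


def load_config(json_config: dict, env: dict) -> SimpleNamespace:
    merged = {}
    for key, default in DEFAULTS.items():
        if key in env:               # 1. environment variable wins
            value = env[key]
        elif key in json_config:     # 2. then the JSON config file
            value = json_config[key]
        else:                        # 3. finally the default value
            value = default
        merged[key.lower()] = value  # keys are exposed lowercased
    return SimpleNamespace(**merged)


cfg = load_config(
    json_config={"SMART_LLM": "from-json", "EXAMPLE_LIMIT": 5},
    env={"SMART_LLM": "from-env"},
)
print(cfg.smart_llm, cfg.example_limit)
```

In this sketch, passing `dict(os.environ)` as `env` would apply real environment variables; note that such values arrive as strings, which the sketch does not convert.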
**For complete configuration reference**: See [references/config-reference.md](references/config-reference.md)
---
Common Integration Points
WebSocket Streaming
```python
from gpt_researcher import GPTResearcher

class WebSocketHandler:
    async def send_json(self, data):
        # Print each streamed event as "[type] output".
        print(f"[{data['type']}] {data.get('output', '')}")

researcher = GPTResearcher(query="...", websocket=WebSocketHandler())
```
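
The snippet above suggests a handler only needs an async `send_json` method, so a handler that collects messages can be handy in tests. A minimal sketch, assuming messages are dicts with a `type` key and an optional `output` key; `CollectingHandler` is a hypothetical name, and the events are sent by hand here rather than by GPT Researcher:

```python
import asyncio


class CollectingHandler:
    """Hypothetical handler that records every streamed message."""

    def __init__(self):
        self.messages = []

    async def send_json(self, data: dict) -> None:
        self.messages.append(data)

    def outputs(self, message_type: str) -> list[str]:
        # Return the "output" text of all messages of the given type.
        return [
            m.get("output", "")
            for m in self.messages
            if m.get("type") == message_type
        ]


async def demo() -> CollectingHandler:
    handler = CollectingHandler()
    # Simulated events; a real run would receive these from the researcher.
    await handler.send_json({"type": "logs", "output": "planning"})
    await handler.send_json({"type": "report", "output": "# Title"})
    await handler.send_json({"type": "logs"})
    return handler


handler = asyncio.run(demo())
print(handler.outputs("logs"))
```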
MCP Data Sources
```python
import os

from gpt_researcher import GPTResearcher

researcher = GPTResearcher(
    query="Open source AI projects",
    mcp_configs=[{
        "name": "github",
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-github"],
        # Token is read from the environment; None if GITHUB_TOKEN is unset.
        "env": {"GITHUB_TOKEN": os.getenv("GITHUB_TOKEN")}
    }],
    mcp_strategy="deep",  # or "fast", "disabled"
)
```
**For MCP integration details**: See [references/mcp.md](references/mcp.md)
Deep Research Mode
```python
from gpt_researcher import GPTResearcher

researcher = GPTResearcher(
    query="Comprehensive analysis of quantum computing",
    report_type="deep",  # Triggers recursive tree-like exploration
)
```
**For deep research configuration**: See [references/deep-research.md](references/deep-research.md)
Error Handling
Always use graceful degradation in skills:
```python
async def execute(self, *args, **kwargs):
    # Skill method: a disabled feature returns an empty result.
    if not self.is_enabled():
        return []  # Don't crash
    try:
        result = await self.provider.execute(*args, **kwargs)
        return result
    except Exception as e:
        # Report the failure to the client, then fall back to an empty result.
        await stream_output("logs", "error", f"⚠️ {e}", self.websocket)
        return []  # Graceful degradation
```
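
A runnable sketch of the same pattern, with the library pieces replaced by stand-ins: `FailingProvider`, `ExampleSkill`, and the local `stream_output` stub are hypothetical and exist only so the snippet runs without GPT Researcher installed.

```python
import asyncio

logged = []


async def stream_output(kind, name, message, websocket=None):
    # Local stub standing in for the library's stream_output helper.
    logged.append((kind, name, message))


class FailingProvider:
    async def execute(self, prompt: str) -> list[str]:
        raise RuntimeError("provider unavailable")


class ExampleSkill:
    def __init__(self, provider, enabled: bool = True, websocket=None):
        self.provider = provider
        self.enabled = enabled
        self.websocket = websocket

    def is_enabled(self) -> bool:
        return self.enabled

    async def execute(self, prompt: str) -> list[str]:
        if not self.is_enabled():
            return []  # feature switched off: nothing to do
        try:
            return await self.provider.execute(prompt)
        except Exception as e:
            await stream_output("logs", "error", f"⚠️ {e}", self.websocket)
            return []  # degrade to an empty result instead of raising


result = asyncio.run(ExampleSkill(FailingProvider()).execute("draw a chart"))
print(result, logged)
```

The caller always receives a list, so a failing optional feature does not abort the surrounding research run.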
Critical Gotchas
| ❌ Mistake | ✅ Correct |
|---|---|
| | |
| Editing pip-installed package | |
| Forgetting async/await | All research methods are async |
| Calling … on None | Check … first |
| Not registering retriever | Add to |
Reference Documentation
| Topic | File |
|---|---|
| System architecture & diagrams | references/architecture.md |
| Core components & signatures | references/components.md |
| Research flow & data flow | references/flows.md |
| Prompt system | references/prompts.md |
| Retriever system | references/retrievers.md |
| MCP integration | references/mcp.md |
| Deep research mode | references/deep-research.md |
| Multi-agent system | references/multi-agents.md |
| Adding features guide | references/adding-features.md |
| Advanced patterns | references/advanced-patterns.md |
| REST & WebSocket API | references/api-reference.md |
| Configuration variables | references/config-reference.md |