agentic-data-scientist
Agentic Data Scientist
Skill by ara.so — AI Agent Skills collection.
Agentic Data Scientist is an adaptive multi-agent framework that automates complex data science tasks using a sophisticated workflow with planning, execution, validation, and self-correction. Built on Google's Agent Development Kit (ADK) and Claude Agent SDK, it separates planning from execution and continuously validates work against success criteria.
What It Does
- Orchestrated Mode: Full multi-agent workflow with planning, iterative execution, validation, and adaptive replanning
- Simple Mode: Direct coding without planning overhead for quick tasks
- Multi-Agent Architecture: Specialized agents for planning, coding, reviewing, validation, and summarization
- Continuous Validation: Tracks progress against success criteria at every stage
- Self-Correcting: Adapts plans based on discoveries during execution
- MCP Integration: Access to tools via Model Context Protocol servers
- Claude Scientific Skills: 380+ advanced scientific computing skills available to the coding agent
Installation
```bash
# Install globally with uv
uv tool install agentic-data-scientist

# Or use directly with uvx (no installation)
uvx agentic-data-scientist --mode simple "your query"
```
Prerequisites
Required:
- Claude Code CLI (for the coding agent):
```bash
npm install -g @anthropic-ai/claude-code
```
- API keys (set as environment variables):
```bash
export OPENROUTER_API_KEY="your_openrouter_key"  # For planning/review agents
export ANTHROPIC_API_KEY="your_anthropic_key"    # For coding agent
```
Get keys from:
- OpenRouter: https://openrouter.ai/keys
- Anthropic: https://console.anthropic.com/
Optional:
```bash
# Disable network access (web search, URL fetching)
export DISABLE_NETWORK_ACCESS=true
```
Configuration
Create a `.env` file in your project directory:
```bash
# Required
OPENROUTER_API_KEY=your_openrouter_key
ANTHROPIC_API_KEY=your_anthropic_key

# Optional
DISABLE_NETWORK_ACCESS=false  # Set to true to disable web tools
```
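To make the shape of that file concrete, here is a minimal sketch of reading `KEY=VALUE` lines with comments. How the framework itself loads `.env` is not stated here, so `parse_env_text` is a hypothetical helper for illustration only, and its handling of inline comments is an assumption.

```python
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and full-line comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Assumption: " #" starts an inline comment, as in the example file
        value = value.split(" #", 1)[0].strip().strip('"').strip("'")
        values[key.strip()] = value
    return values


sample = """# Required
OPENROUTER_API_KEY=your_openrouter_key
ANTHROPIC_API_KEY=your_anthropic_key

# Optional
DISABLE_NETWORK_ACCESS=false  # Set to true to disable web tools
"""
settings = parse_env_text(sample)
print(settings["DISABLE_NETWORK_ACCESS"])
```

If your own `.env` loader does not strip inline comments, keep comments on their own lines.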
Key Commands
Basic Usage
You must specify `--mode` for every command:
```bash
# Orchestrated mode: Full multi-agent workflow
agentic-data-scientist "Perform differential expression analysis" \
  --mode orchestrated \
  --files data.csv

# Simple mode: Direct coding, no planning
agentic-data-scientist "Write a CSV parser" \
  --mode simple
```
File Handling
```bash
# Single file
agentic-data-scientist "Analyze dataset" \
  --mode orchestrated \
  --files data.csv

# Multiple files
agentic-data-scientist "Compare datasets" \
  --mode orchestrated \
  -f data1.csv -f data2.csv -f metadata.json

# Directory upload (recursive)
agentic-data-scientist "Analyze all CSVs in folder" \
  --mode orchestrated \
  --files ./data_folder/
```
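When the command is assembled from a script, the flags above can be built as an argument list. The following is a sketch under stated assumptions: `build_command` is a hypothetical helper (not part of the package), and it uses only the `--mode` and `-f` flags shown in this section, passing one `-f` per file or directory.

```python
import shlex
from typing import Iterable, List


def build_command(query: str, mode: str, files: Iterable[str] = ()) -> List[str]:
    """Build an argv list from a query, a mode, and optional input paths."""
    if mode not in ("orchestrated", "simple"):
        raise ValueError(f"unknown mode: {mode!r}")
    argv = ["agentic-data-scientist", query, "--mode", mode]
    for path in files:
        # One -f per path, matching the multi-file example above
        argv += ["-f", str(path)]
    return argv


cmd = build_command("Compare datasets", "orchestrated", ["data1.csv", "data2.csv"])
# shlex.join quotes the query so the line can be pasted into a shell
print(shlex.join(cmd))
```

Passing the list form to `subprocess.run(cmd)` avoids shell quoting issues with long, multi-line queries.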
Working Directory Options
```bash
# Default: ./agentic_output/ (preserved after completion)
agentic-data-scientist "Analyze data" \
  --mode orchestrated \
  --files data.csv

# Custom working directory
agentic-data-scientist "Generate report" \
  --mode orchestrated \
  --files data.csv \
  --working-dir ./my_analysis

# Temporary directory (auto-cleanup)
agentic-data-scientist "Quick exploration" \
  --mode simple \
  --files data.csv \
  --temp-dir

# Force keep files (override temp-dir cleanup)
agentic-data-scientist "Analysis" \
  --mode orchestrated \
  --files data.csv \
  --temp-dir \
  --keep-files
```
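The cleanup rules in the comments above can be summarized as a small truth function. This is only a model of the documented behavior, not the tool's own code; `output_is_kept` is a hypothetical name, and combining `--working-dir` with `--temp-dir` is not covered because the text does not describe it.

```python
def output_is_kept(temp_dir: bool = False, keep_files: bool = False) -> bool:
    """Return True when output files should survive after the run."""
    if not temp_dir:
        # The default ./agentic_output/ and a custom --working-dir are preserved
        return True
    # --temp-dir cleans up automatically unless --keep-files overrides it
    return keep_files


for temp_dir in (False, True):
    for keep_files in (False, True):
        kept = output_is_kept(temp_dir, keep_files)
        print(f"temp_dir={temp_dir} keep_files={keep_files} kept={kept}")
```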
Logging and Debugging
```bash
# Custom log file location
agentic-data-scientist "Analyze" \
  --mode orchestrated \
  --files data.csv \
  --log-file ./analysis.log

# Verbose logging
agentic-data-scientist "Debug issue" \
  --mode simple \
  --verbose
```
Real-World Examples
Example 1: Complex Data Analysis (Orchestrated Mode)
```bash
# Comprehensive analysis with multiple stages
agentic-data-scientist \
  "Perform exploratory data analysis on sales data,
   identify trends, create visualizations,
   and build a predictive model for future sales" \
  --mode orchestrated \
  --files sales_2024.csv \
  --working-dir ./sales_analysis \
  --log-file analysis.log
```
**What happens:**
1. **Planning Phase**: Creates a detailed plan with stages (EDA, visualization, modeling)
2. **Execution Phase**: Implements each stage iteratively with validation
3. **Validation**: Checks success criteria after each stage
4. **Adaptation**: Adjusts the plan based on discoveries (e.g., data quality issues)
5. **Summary**: Generates a comprehensive report with all findings
Example 2: Quick Scripting (Simple Mode)
```bash
# Fast coding without planning overhead
agentic-data-scientist \
  "Write a Python script that reads multiple CSV files,
   merges them on a common ID column,
   and exports to Excel with formatting" \
  --mode simple \
  -f data1.csv -f data2.csv -f data3.csv \
  --temp-dir
```
**What happens:**
- Direct execution with the coding agent (no planning phase)
- Quick turnaround for straightforward tasks
- Temporary directory is cleaned up automatically
Example 3: Multi-File Statistical Analysis
```bash
# Compare multiple datasets
agentic-data-scientist \
  "Compare the distribution of features across treatment groups,
   perform statistical tests (t-test, ANOVA),
   and generate publication-ready plots" \
  --mode orchestrated \
  -f control.csv \
  -f treatment_a.csv \
  -f treatment_b.csv \
  --working-dir ./stats_analysis
```
Example 4: Directory-Based Analysis
```bash
# Process all files in a directory
agentic-data-scientist \
  "Analyze all patient data files in the folder,
   aggregate results, and create summary statistics" \
  --mode orchestrated \
  --files ./patient_data/ \
  --working-dir ./patient_analysis
```
Python API Usage
For programmatic access, use the Python API:
```python
from agentic_data_scientist.cli import main
import sys

# Prepare arguments
sys.argv = [
    'agentic-data-scientist',
    'Perform clustering analysis on customer data',
    '--mode', 'orchestrated',
    '--files', 'customers.csv',
    '--working-dir', './clustering_output'
]

# Run
main()
```
Or use the workflow directly:
```python
import asyncio
from pathlib import Path
from agentic_data_scientist.workflow import create_workflow

async def run_analysis():
    # Create workflow
    workflow = create_workflow(
        query="Analyze customer segments",
        mode="orchestrated",
        files=[Path("customers.csv")],
        working_dir=Path("./output"),
        disable_network=False
    )
    # Execute
    result = await workflow.execute()
    print(result)

asyncio.run(run_analysis())
```
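Assigning to `sys.argv` changes global state for the rest of the process. One way to contain that, sketched here as an assumption rather than anything the package provides, is a context manager that restores the original value afterwards. `patched_argv` is a hypothetical helper, and `fake_main` stands in for `agentic_data_scientist.cli.main` so the example runs without the package installed.

```python
import sys
from contextlib import contextmanager
from typing import Iterator, List


@contextmanager
def patched_argv(argv: List[str]) -> Iterator[None]:
    """Temporarily replace sys.argv, restoring it even if the call raises."""
    saved = sys.argv
    sys.argv = list(argv)
    try:
        yield
    finally:
        sys.argv = saved


def fake_main() -> List[str]:
    # Stand-in for the real CLI entry point; it only reads sys.argv
    return sys.argv[1:]


original_argv = sys.argv
with patched_argv(["agentic-data-scientist", "Analyze", "--mode", "simple"]):
    seen = fake_main()
print(seen)
```

Because the restore happens in `finally`, it also runs if the entry point exits by raising `SystemExit`, which many CLI entry points do.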
Common Patterns
Pattern 1: Iterative Data Exploration
```bash
# Start with simple mode for quick exploration
agentic-data-scientist \
  "Load dataset and show basic statistics" \
  --mode simple \
  --files data.csv

# Then use orchestrated mode for deep analysis
agentic-data-scientist \
  "Perform full statistical analysis including outlier detection,
   correlation analysis, and clustering" \
  --mode orchestrated \
  --files data.csv \
  --working-dir ./deep_analysis
```
Pattern 2: Pipeline Development
```bash
# Use orchestrated mode to develop a complete pipeline
agentic-data-scientist \
  "Create a data processing pipeline that: \
   - Cleans and normalizes raw data \
   - Engineers new features \
   - Splits into train/test \
   - Trains multiple models \
   - Evaluates and selects best model \
   - Exports model and metrics" \
  --mode orchestrated \
  --files raw_data.csv \
  --working-dir ./ml_pipeline
```
Pattern 3: Report Generation
```bash
# Generate comprehensive reports
agentic-data-scientist \
  "Analyze quarterly sales data and create an executive report
   with visualizations, key metrics, and recommendations" \
  --mode orchestrated \
  -f q1_sales.csv -f q2_sales.csv -f q3_sales.csv -f q4_sales.csv \
  --working-dir ./quarterly_report
```
Pattern 4: Debugging with Verbose Logs
```bash
# Enable verbose logging for troubleshooting
agentic-data-scientist \
  "Complex analysis task" \
  --mode orchestrated \
  --files data.csv \
  --verbose \
  --log-file debug.log \
  --keep-files
```
Multi-Agent Workflow Details
Agent Roles
- Plan Maker: Creates comprehensive plans with stages and success criteria
- Plan Reviewer: Validates plans are complete before execution
- Plan Parser: Converts plans to structured executable stages
- Stage Orchestrator: Manages execution cycle and adaptation
- Coding Agent: Implements stages (powered by Claude Code with 380+ scientific skills)
- Review Agent: Validates implementations against requirements
- Criteria Checker: Tracks progress against success criteria
- Stage Reflector: Adapts remaining stages based on learnings
- Summary Agent: Synthesizes work into final report
Workflow Phases
Planning Phase:
User Query → Plan Maker → Plan Reviewer → Plan Parser → Structured Plan
Execution Phase (per stage):
Stage → Coding Agent → Review Agent → Criteria Checker → Stage Reflector
Summary Phase:
All Completed Stages → Summary Agent → Final Report
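The three phases can be read as a loop over stages. The sketch below illustrates that control flow with plain callables standing in for the agents. Every name in it (`Stage`, `run_workflow`, `max_attempts`) is hypothetical, and the retry-on-failed-review behavior is an assumption, not a description of the framework's internals.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    name: str
    done: bool = False


def run_workflow(
    query: str,
    make_plan: Callable[[str], List[Stage]],
    code: Callable[[Stage], str],
    review: Callable[[Stage, str], bool],
    reflect: Callable[[List[Stage]], List[Stage]],
    max_attempts: int = 3,
) -> List[str]:
    """Plan once, then code and review each stage, reflect, and summarize."""
    log: List[str] = []
    remaining = make_plan(query)  # Planning phase: query -> structured stages
    while remaining:
        stage = remaining.pop(0)
        for _ in range(max_attempts):
            output = code(stage)  # Coding Agent implements the stage
            if review(stage, output):  # Review Agent accepts or rejects it
                stage.done = True
                break
        # Criteria Checker: record whether the stage met its criteria
        log.append(f"{stage.name}: {'ok' if stage.done else 'failed'}")
        # Stage Reflector: may rewrite the stages that are still pending
        remaining = reflect(remaining)
    log.append(f"summary: {len(log)} stages")  # Summary phase
    return log


report = run_workflow(
    "Analyze sales",
    make_plan=lambda q: [Stage("eda"), Stage("model")],
    code=lambda s: s.name.upper(),
    review=lambda s, out: out == s.name.upper(),
    reflect=lambda rest: rest,
)
print(report)
```

Keeping planning outside the loop and reflection inside it mirrors the separation of planning from execution described earlier.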
Troubleshooting
API Key Errors
```bash
# Verify keys are set
echo $OPENROUTER_API_KEY
echo $ANTHROPIC_API_KEY

# Set them if missing
export OPENROUTER_API_KEY="your_key"
export ANTHROPIC_API_KEY="your_key"
```
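The same check can be scripted without printing the secrets to the terminal. This is a minimal sketch: `missing_keys` is a hypothetical helper, and it relies only on the two variable names listed under Prerequisites.

```python
import os
from typing import List, Mapping

REQUIRED_KEYS = ("OPENROUTER_API_KEY", "ANTHROPIC_API_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        raise SystemExit("Missing: " + ", ".join(missing))
    print("All required API keys are set")
```

Reporting only the variable names keeps key values out of shell history and log files.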
Claude Code Not Found
```bash
# Install Claude Code CLI
npm install -g @anthropic-ai/claude-code

# Verify installation (the package installs the `claude` command)
claude --version
```
Network Access Issues
```bash
# Disable network tools if causing problems
export DISABLE_NETWORK_ACCESS=true

# Or in .env file
echo "DISABLE_NETWORK_ACCESS=true" >> .env
```
File Upload Failures
```bash
# Verify file exists
ls -la data.csv

# Use absolute paths
agentic-data-scientist "Analyze" \
  --mode orchestrated \
  --files /absolute/path/to/data.csv

# Check directory permissions for recursive upload
ls -la ./data_folder/
```
Working Directory Issues
```bash
# Ensure directory is writable
mkdir -p ./output
chmod 755 ./output

# Use temp directory if permission issues
agentic-data-scientist "Analyze" \
  --mode orchestrated \
  --files data.csv \
  --temp-dir
```
Execution Hanging
```bash
# Use verbose mode to see what's happening
agentic-data-scientist "Query" \
  --mode orchestrated \
  --files data.csv \
  --verbose

# Try simple mode to isolate planning vs execution issues
agentic-data-scientist "Query" \
  --mode simple \
  --files data.csv
```
Output Not Preserved
```bash
# Default behavior preserves files in ./agentic_output/
ls -la ./agentic_output/

# Explicitly set working directory
agentic-data-scientist "Analyze" \
  --mode orchestrated \
  --files data.csv \
  --working-dir ./my_output

# Use --keep-files to override temp-dir cleanup
agentic-data-scientist "Analyze" \
  --mode orchestrated \
  --files data.csv \
  --temp-dir \
  --keep-files
```
Mode Selection Guide
Use Orchestrated Mode when:
- The task is complex, with multiple stages
- You need thorough planning and validation
- Quality and completeness are critical
- The task requires iterative refinement
- You want a comprehensive final report
Use Simple Mode when:
- You need a quick script or a one-off task
- You are asking a simple question
- You are prototyping or exploring
- You want fast turnaround
- You don't need a multi-stage workflow
Advanced Configuration
Custom Prompts
Extend the framework by customizing agent prompts:
```python
from agentic_data_scientist.prompts import PLAN_MAKER_PROMPT

# Modify prompts for domain-specific needs
custom_prompt = PLAN_MAKER_PROMPT + """
Additional domain context:
- Focus on genomics data
- Use bioinformatics best practices
"""
```
MCP Server Integration
The framework supports Model Context Protocol for custom tools:
```python
# Configure MCP servers in your workflow
# Agents automatically gain access to tools
```
Access to Claude Scientific Skills
The coding agent has access to 380+ scientific computing skills, including:
- Statistical analysis
- Machine learning
- Data visualization
- Bioinformatics
- Scientific computing libraries
These are automatically available during the execution phase.