# /ar:setup — Create New Experiment
Set up a new autoresearch experiment with all required configuration.
## Usage

```bash
/ar:setup                    # Interactive mode
/ar:setup engineering api-speed src/api.py "pytest bench.py" p50_ms lower
/ar:setup --list             # Show existing experiments
/ar:setup --list-evaluators  # Show available evaluators
```

## What It Does
### If arguments provided
Pass them directly to the setup script:

```bash
python {skill_path}/scripts/setup_experiment.py \
  --domain {domain} --name {name} \
  --target {target} --eval "{eval_cmd}" \
  --metric {metric} --direction {direction} \
  [--evaluator {evaluator}] [--scope {scope}]
```
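A minimal sketch of how the argument pass-through can be assembled — the flag names follow the invocation above, but this helper and the example `skill_path` are illustrative assumptions, not code from `setup_experiment.py`:

```python
import shlex


def build_setup_cmd(skill_path, domain, name, target, eval_cmd,
                    metric, direction, evaluator=None, scope=None):
    """Assemble the setup invocation; optional flags are added only if given."""
    cmd = [
        "python", f"{skill_path}/scripts/setup_experiment.py",
        "--domain", domain, "--name", name,
        "--target", target, "--eval", eval_cmd,
        "--metric", metric, "--direction", direction,
    ]
    if evaluator:
        cmd += ["--evaluator", evaluator]
    if scope:
        cmd += ["--scope", scope]
    return cmd


# shlex.join re-quotes multi-word values such as the eval command
print(shlex.join(build_setup_cmd(
    "~/.claude/skills/autoresearch", "engineering", "api-speed",
    "src/api.py", "pytest bench.py", "p50_ms", "lower")))
```

Passing the arguments as a list (rather than one shell string) keeps the quoted eval command intact.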
### If no arguments (interactive mode)
Collect each parameter one at a time:
- Domain — Ask: "What domain? (engineering, marketing, content, prompts, custom)"
- Name — Ask: "Experiment name? (e.g., api-speed, blog-titles)"
- Target file — Ask: "Which file to optimize?" Verify it exists.
- Eval command — Ask: "How to measure it? (e.g., pytest bench.py, python evaluate.py)"
- Metric — Ask: "What metric does the eval output? (e.g., p50_ms, ctr_score)"
- Direction — Ask: "Is lower or higher better?"
- Evaluator (optional) — Show built-in evaluators. Ask: "Use a built-in evaluator, or your own?"
- Scope — Ask: "Store in project (.autoresearch/) or user (~/.autoresearch/)?"
Then run with the collected parameters.
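The question-and-validate flow above can be sketched as follows. This is a hypothetical illustration of the interactive mode, not the real internals of `setup_experiment.py`; the prompt strings mirror the list, and the validation shown (target exists, direction is `lower`/`higher`) is an assumption:

```python
from pathlib import Path

QUESTIONS = [
    ("domain", "What domain? (engineering, marketing, content, prompts, custom)"),
    ("name", "Experiment name? (e.g., api-speed, blog-titles)"),
    ("target", "Which file to optimize?"),
    ("eval_cmd", "How to measure it? (e.g., pytest bench.py, python evaluate.py)"),
    ("metric", "What metric does the eval output? (e.g., p50_ms, ctr_score)"),
    ("direction", "Is lower or higher better?"),
    ("scope", "Store in project (.autoresearch/) or user (~/.autoresearch/)?"),
]


def collect_params(ask):
    """Collect parameters one at a time; `ask(prompt)` returns the user's answer."""
    params = {key: ask(prompt).strip() for key, prompt in QUESTIONS}
    if not Path(params["target"]).exists():
        raise FileNotFoundError(f"target file not found: {params['target']}")
    if params["direction"] not in ("lower", "higher"):
        raise ValueError("direction must be 'lower' or 'higher'")
    return params
```

Validating the target file before running the script avoids setting up an experiment against a path that does not exist.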
## Listing
Show existing experiments:

```bash
python {skill_path}/scripts/setup_experiment.py --list
```

Show available evaluators:

```bash
python {skill_path}/scripts/setup_experiment.py --list-evaluators
```

## Built-in Evaluators
| Name | Metric | Use Case |
|---|---|---|
| | | Function/API execution time |
| | | File, bundle, Docker image size |
| | | Test suite pass percentage |
| | | Build/compile/Docker build time |
| | | Peak memory during execution |
| | | Headlines, titles, descriptions |
| | | System prompts, agent instructions |
| | | Social posts, ad copy, emails |
## After Setup
Report to the user:
- Experiment path and branch name
- Whether the eval command worked and the baseline metric
- Suggest: "Run `/ar:loop {domain}/{name}` to start iterating, or `/ar:run {domain}/{name}` for autonomous mode."
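The baseline check reported above can be sketched as follows. This is an assumption about the mechanism, not the real script: it supposes the eval command prints a JSON object (e.g. `{"p50_ms": 12.5}`) on its last stdout line, which may differ from how `setup_experiment.py` actually parses results:

```python
import json
import subprocess


def measure_baseline(eval_cmd, metric):
    """Run the eval command once and extract the named metric.

    Assumes the command's last stdout line is a JSON object keyed by metric name.
    """
    result = subprocess.run(eval_cmd, shell=True, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"eval command failed: {result.stderr.strip()}")
    last_line = result.stdout.strip().splitlines()[-1]
    return json.loads(last_line)[metric]
```

Running the eval once at setup time both confirms the command works and records the baseline the experiment will try to beat.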