run
/ar:run - Single Experiment Iteration
Run exactly ONE experiment iteration: review history, decide a change, edit, commit, evaluate.
Usage
/ar:run engineering/api-speed    # Run one iteration
/ar:run                          # List experiments, let user pick
What It Does
Step 1: Resolve experiment
If no experiment is specified, run the following command and ask the user to pick:
python {skill_path}/scripts/setup_experiment.py --list
Step 2: Load context
# Read experiment config
cat .autoresearch/{domain}/{name}/config.cfg

# Read strategy and constraints
cat .autoresearch/{domain}/{name}/program.md

# Read experiment history
cat .autoresearch/{domain}/{name}/results.tsv

# Check out the experiment branch
git checkout autoresearch/{domain}/{name}
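The three files and the branch in this step all derive from the same {domain}/{name} pair. A minimal Python sketch of that mapping, assuming the directory layout used in the commands of this step; `ExperimentContext` and `resolve_context` are hypothetical names and not part of the skill's scripts:

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class ExperimentContext:
    """Locations read in Step 2 (hypothetical helper, not part of the skill)."""
    config: PurePosixPath
    program: PurePosixPath
    results: PurePosixPath
    branch: str


def resolve_context(experiment: str) -> ExperimentContext:
    """Map a 'domain/name' string to the files and branch used in Step 2."""
    domain, sep, name = experiment.partition("/")
    if not sep or not domain or not name or "/" in name:
        raise ValueError(f"expected 'domain/name', got {experiment!r}")
    root = PurePosixPath(".autoresearch") / domain / name
    return ExperimentContext(
        config=root / "config.cfg",
        program=root / "program.md",
        results=root / "results.tsv",
        branch=f"autoresearch/{domain}/{name}",
    )
```

For example, `resolve_context("engineering/api-speed").branch` gives `autoresearch/engineering/api-speed`.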
Step 3: Decide what to try
Review results.tsv:
- What changes were kept? What pattern do they share?
- What was discarded? Avoid repeating those approaches.
- What crashed? Understand why.
- How many runs so far? (Escalate strategy accordingly)
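The review questions above amount to tallying past outcomes. The sketch below shows one way to do that in Python. It assumes results.tsv is tab-separated with a header row and a column named `status` holding values such as keep, discard, or crash; the real column layout is not documented here, so treat the column name as a placeholder:

```python
import csv
import io
from collections import Counter


def summarize_history(tsv_text: str, status_column: str = "status") -> Counter:
    """Count runs per outcome in a tab-separated results log.

    Assumes a header row and a status column; both are guesses about
    the real results.tsv layout.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return Counter(
        (row.get(status_column) or "unknown").strip().lower() for row in reader
    )
```

The total number of runs so far is then `sum(counts.values())`, where `counts` is the returned counter.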
Strategy escalation:
- Runs 1-5: Low-hanging fruit (obvious improvements)
- Runs 6-15: Systematic exploration (vary one parameter)
- Runs 16-30: Structural changes (algorithm swaps)
- Runs 31+: Radical experiments (completely different approaches)
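The escalation tiers can be read as a lookup on the number of the upcoming run. A small Python sketch of that lookup follows; the function name is hypothetical, and run 30 is counted as the last structural run here, which is an assumption about where the boundary falls:

```python
def strategy_for_run(run_number: int) -> str:
    """Return the escalation tier for the upcoming run (1-based)."""
    if run_number < 1:
        raise ValueError("run_number starts at 1")
    if run_number <= 5:
        return "low-hanging fruit"
    if run_number <= 15:
        return "systematic exploration"
    if run_number <= 30:
        # Assumption: run 30 still belongs to the structural tier.
        return "structural changes"
    return "radical experiments"
```

With 12 rows already in the history, the next run is number 13, so `strategy_for_run(13)` falls in the systematic exploration tier.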
Step 4: Make ONE change
Edit only the target file specified in config.cfg. Change one thing. Keep it simple.
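One way to confirm that only the target file was touched before committing is to compare git's list of modified paths against the target. The Python sketch below assumes it runs inside the experiment's git repository; both function names are hypothetical, and `git diff --name-only HEAD` reports tracked files only, so newly created untracked files would not appear:

```python
import subprocess


def changed_files() -> list[str]:
    """Tracked paths that differ from HEAD, as reported by git."""
    out = subprocess.run(
        ["git", "diff", "--name-only", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def only_target_changed(changed: list[str], target: str) -> bool:
    """True when the target file is the single modified path."""
    return changed == [target]
```

A caller would run `only_target_changed(changed_files(), target)` and stop before committing if it returns False.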
Step 5: Commit and evaluate
git add {target}
git commit -m "experiment: {short description of what changed}"
python {skill_path}/scripts/run_experiment.py \
    --experiment {domain}/{name} --single
Step 6: Report result
Read the script output. Tell the user:
- KEEP: "Improvement! {metric}: {value} ({delta} from previous best)"
- DISCARD: "No improvement. {metric}: {value} vs best {best}. Reverted."
- CRASH: "Evaluation failed: {reason}. Reverted."
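The three message templates can be filled in mechanically once the outcome is known. The following Python sketch is illustrative only: how run_experiment.py actually reports the status and values is not specified here, so the function name, its parameters, and the signed formatting of the delta are assumptions:

```python
def format_report(status, metric, value=None, delta=None, best=None, reason=None):
    """Build the user-facing message for one iteration's outcome."""
    status = status.upper()
    if status == "KEEP":
        # "+" forces an explicit sign on the delta, e.g. -7 or +0.3.
        return f"Improvement! {metric}: {value} ({delta:+} from previous best)"
    if status == "DISCARD":
        return f"No improvement. {metric}: {value} vs best {best}. Reverted."
    if status == "CRASH":
        return f"Evaluation failed: {reason}. Reverted."
    raise ValueError(f"unknown status: {status!r}")
```

Here `latency_ms` in a call such as `format_report("KEEP", "latency_ms", value=118, delta=-7)` would be a made-up metric name.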
Step 7: Self-improvement check
After every 10th experiment (check results.tsv line count), update the Strategy section of program.md with patterns learned.
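The line-count check can be sketched in Python as below. Whether results.tsv starts with a header row is an assumption, so it is exposed as a parameter; both function names are hypothetical:

```python
def experiments_logged(tsv_text: str, has_header: bool = True) -> int:
    """Count non-empty data rows in the contents of results.tsv."""
    rows = [line for line in tsv_text.splitlines() if line.strip()]
    return max(len(rows) - (1 if has_header else 0), 0)


def strategy_update_due(tsv_text: str, has_header: bool = True) -> bool:
    """True right after the 10th, 20th, 30th, ... logged experiment."""
    count = experiments_logged(tsv_text, has_header)
    return count > 0 and count % 10 == 0
```

When `strategy_update_due` returns True, the Strategy section of program.md is due for an update.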
Rules
- ONE change per iteration. Don't change 5 things at once.
- NEVER modify the evaluator (evaluate.py). It's ground truth.
- Simplicity wins. Equal performance with simpler code is an improvement.
- No new dependencies.