
/ar:run — Single Experiment Iteration

Run exactly ONE experiment iteration: review history, decide on a change, edit, commit, evaluate.

Usage

```bash
/ar:run engineering/api-speed              # Run one iteration
/ar:run                                    # List experiments, let user pick
```

What It Does

Step 1: Resolve experiment

If no experiment is specified, run:

```bash
python {skill_path}/scripts/setup_experiment.py --list
```

and ask the user to pick one.

Step 2: Load context

```bash
# Read experiment config
cat .autoresearch/{domain}/{name}/config.cfg

# Read strategy and constraints
cat .autoresearch/{domain}/{name}/program.md

# Read experiment history
cat .autoresearch/{domain}/{name}/results.tsv

# Checkout the experiment branch
git checkout autoresearch/{domain}/{name}
```

Step 3: Decide what to try

Review results.tsv:
  • What changes were kept? What pattern do they share?
  • What was discarded? Avoid repeating those approaches.
  • What crashed? Understand why.
  • How many runs so far? (Escalate strategy accordingly.)

Strategy escalation:
  • Runs 1-5: Low-hanging fruit (obvious improvements)
  • Runs 6-15: Systematic exploration (vary one parameter)
  • Runs 16-30: Structural changes (algorithm swaps)
  • Runs 30+: Radical experiments (completely different approaches)
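The escalation schedule above can be sketched as a small helper; the function name and phase labels are illustrative, but the thresholds mirror the list:

```python
def strategy_phase(run_count: int) -> str:
    """Map the number of completed runs to an exploration phase.

    Thresholds mirror the escalation schedule above.
    """
    if run_count <= 5:
        return "low-hanging fruit"
    if run_count <= 15:
        return "systematic exploration"
    if run_count <= 30:
        return "structural changes"
    return "radical experiments"
```

For example, `strategy_phase(7)` suggests varying one parameter systematically rather than reaching for an algorithm swap.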

Step 4: Make ONE change

Edit only the target file specified in config.cfg. Change one thing. Keep it simple.

Step 5: Commit and evaluate

```bash
git add {target}
git commit -m "experiment: {short description of what changed}"

python {skill_path}/scripts/run_experiment.py \
  --experiment {domain}/{name} --single
```

Step 6: Report result

Read the script output. Tell the user:
  • KEEP: "Improvement! {metric}: {value} ({delta} from previous best)"
  • DISCARD: "No improvement. {metric}: {value} vs best {best}. Reverted."
  • CRASH: "Evaluation failed: {reason}. Reverted."
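The three report lines above can be produced by a small formatter. This is a sketch: the status values and fields (`metric`, `value`, `best`, `reason`) are assumptions about what run_experiment.py emits, not its documented interface.

```python
def report_line(status: str, metric: str, value: float,
                best: float, reason: str = "") -> str:
    """Render one iteration's outcome as a user-facing message.

    `status` is assumed to be KEEP, DISCARD, or CRASH.
    """
    if status == "KEEP":
        # Delta is signed: negative means the metric went down.
        return f"Improvement! {metric}: {value} ({value - best:+g} from previous best)"
    if status == "DISCARD":
        return f"No improvement. {metric}: {value} vs best {best}. Reverted."
    return f"Evaluation failed: {reason}. Reverted."
```

For example, `report_line("KEEP", "latency_ms", 41.0, 45.0)` reports a -4 delta against the previous best.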

Step 7: Self-improvement check

After every 10th experiment (check the results.tsv line count), update the Strategy section of program.md with patterns learned.
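The every-10th check can be sketched as follows, assuming results.tsv holds one header line plus one row per completed run (the function name is illustrative):

```python
from pathlib import Path


def due_for_strategy_update(results_tsv: Path) -> bool:
    """Return True on every 10th completed run.

    Assumes one header line followed by one row per run.
    """
    lines = results_tsv.read_text().splitlines()
    runs = max(len(lines) - 1, 0)  # exclude the header
    return runs > 0 and runs % 10 == 0
```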

Rules

  • ONE change per iteration. Don't change 5 things at once.
  • NEVER modify the evaluator (evaluate.py). It's ground truth.
  • Simplicity wins. Equal performance with simpler code is an improvement.
  • No new dependencies.