autoresearch
Claude Autoresearch: Autonomous Goal-directed Iteration
Inspired by Karpathy's autoresearch. Applies constraint-driven autonomous iteration to ANY work — not just ML research.
Core idea: You are an autonomous agent. Modify → Verify → Keep/Discard → Repeat.
Subcommands
| Subcommand | Purpose |
|---|---|
| `/autoresearch` | Run the autonomous loop (default) |
| `/autoresearch:plan` | Interactive wizard to build Scope, Metric, Direction & Verify from a Goal |
/autoresearch:plan — Goal → Configuration Wizard
Converts a plain-language goal into a validated, ready-to-execute autoresearch configuration.
Load: references/plan-workflow.md for full protocol.
Quick summary:
- Capture Goal — ask what the user wants to improve (or accept inline text)
- Analyze Context — scan codebase for tooling, test runners, build scripts
- Define Scope — suggest file globs, validate they resolve to real files
- Define Metric — suggest mechanical metrics, validate they output a number
- Define Direction — higher or lower is better
- Define Verify — construct the shell command, dry-run it, confirm it works
- Confirm & Launch — present the complete config, offer to launch immediately
Critical gates:
- Metric MUST be mechanical (outputs a parseable number, not subjective)
- Verify command MUST pass a dry run on the current codebase before accepting
- Scope MUST resolve to ≥1 file
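The three gates above could be checked mechanically along the following lines. This is a minimal sketch, not the wizard's actual implementation: the helper names (`resolve_scope`, `parse_metric`, `dry_run`, `check_gates`) are hypothetical, and it assumes the verify command prints its metric as the last number on standard output.

```python
import re
import subprocess
from pathlib import Path

# Matches integers and decimals such as "42", "-3", or "87.5".
NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")


def resolve_scope(patterns, root="."):
    """Return the files matched by the scope globs, relative to root."""
    base = Path(root)
    files = set()
    for pattern in patterns:
        for path in base.glob(pattern):
            if path.is_file():
                files.add(path.relative_to(base).as_posix())
    return sorted(files)


def parse_metric(output):
    """Return the last number in the output, or None if there is none.

    Assumption: the verify command prints the metric last.
    """
    found = NUMBER.findall(output)
    return float(found[-1]) if found else None


def dry_run(command, cwd=".", timeout=300):
    """Run the verify command once; return its metric, or None on failure."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True,
        cwd=cwd, timeout=timeout,
    )
    if result.returncode != 0:
        return None
    return parse_metric(result.stdout)


def check_gates(patterns, command, root="."):
    """Return the list of failed gates; an empty list means the config passes."""
    failures = []
    if not resolve_scope(patterns, root):
        failures.append("scope resolves to no files")
    if dry_run(command, cwd=root) is None:
        failures.append("verify command failed or printed no number")
    return failures
```

Treating a non-zero exit code as a failed dry run keeps the mechanical-metric and dry-run gates in a single check. A verify command that prints several numbers would need a stricter parser than the last-number convention assumed here.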
Usage:
/autoresearch:plan
Goal: Make the API respond faster
/autoresearch:plan Increase test coverage to 95%
/autoresearch:plan Reduce bundle size below 200KB
After the wizard completes, the user gets a ready-to-paste /autoresearch invocation, or can launch it directly.
When to Activate
- User invokes /autoresearch or /ug:autoresearch → run the loop
- User invokes /autoresearch:plan → run the planning wizard
- User says "help me set up autoresearch", "plan an autoresearch run" → run the planning wizard
- User says "work autonomously", "iterate until done", "keep improving", "run overnight" → run the loop
- Any task requiring repeated iteration cycles with measurable outcomes → run the loop
Optional: Controlled Loop Count
By default, autoresearch loops forever until manually interrupted. Users can optionally cap the number of iterations with Claude Code's built-in /loop command.
Requires: Claude Code v1.0.32+ (the /loop command was introduced in this version)
Usage
Unlimited (default):
/autoresearch
Goal: Increase test coverage to 90%
Bounded (N iterations):
/loop 25 /autoresearch
Goal: Increase test coverage to 90%
This chains /autoresearch with /loop 25, running exactly 25 iteration cycles. After 25 iterations, Claude stops and prints a final summary.
When to Use Bounded Loops
| Scenario | Recommendation |
|---|---|
| Run overnight, review in morning | Unlimited (default) |
| Quick 30-min improvement session | |
| Targeted fix with known scope | |
| Exploratory — see if approach works | |
| CI/CD pipeline integration | |
Behavior with Loop Count
When a loop count is specified:
- Claude runs exactly N iterations through the autoresearch loop
- After iteration N, Claude prints a final summary with baseline → current best, keeps/discards/crashes
- If the goal is achieved before N iterations, Claude prints early completion and stops
- All other rules (atomic changes, mechanical verification, auto-rollback) still apply
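The bounded behavior listed above can be pictured as a small driver around a single-iteration function. The sketch below is illustrative only: `run_bounded`, `step`, and `goal_reached` are hypothetical names, and the summary keys are an assumption about what a final summary might contain, based on the fields the list mentions.

```python
from collections import Counter


def run_bounded(n, step, baseline, goal_reached):
    """Run at most n iterations and return a final summary.

    step(best) performs one iteration and returns (status, new_best), where
    status is "keep", "discard", or "crash". goal_reached(best) returns True
    once the goal is met. Both callables are supplied by the caller.
    """
    best = baseline
    tally = Counter()
    completed = 0
    for _ in range(n):
        status, best = step(best)
        tally[status] += 1
        completed += 1
        if goal_reached(best):
            break  # goal achieved: stop without using the remaining budget
    return {
        "iterations": completed,
        "early_completion": completed < n,
        "baseline": baseline,
        "best": best,
        "keeps": tally["keep"],
        "discards": tally["discard"],
        "crashes": tally["crash"],
    }
```

For example, with a `step` that improves the metric by one each time and a goal of 3, `run_bounded(10, step, 0, lambda best: best >= 3)` stops after three iterations and reports an early completion.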
Setup Phase (Do Once)
- Read all in-scope files for full context before any modification
- Define the goal: What does "better" mean? Extract or ask for a mechanical metric:
  - Code: tests pass, build succeeds, performance benchmark improves
  - Content: word count target hit, SEO score improves, readability score improves
  - Design: Lighthouse score, accessibility audit passes
  - If no metric exists → define one with the user, or use the simplest proxy (e.g. "compiles without errors")
- Define scope constraints: Which files can you modify? Which are read-only?
- Create a results log: Track every iteration (see references/results-logging.md)
- Establish baseline: Run verification on current state. Record as iteration #0
- Confirm and go: Show the user the setup, get confirmation, then BEGIN THE LOOP
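As an illustration of the results log and the iteration #0 baseline, here is a minimal sketch that appends rows to a tab-separated file. The actual log format is defined in references/results-logging.md and is not reproduced here; the column names and the helpers `append_result` and `read_results` are assumptions made for this example.

```python
import csv
from pathlib import Path

# Hypothetical columns; the real format lives in references/results-logging.md.
FIELDS = ["iteration", "status", "metric", "description"]


def append_result(log_path, iteration, status, metric, description):
    """Append one row to the log, writing the header if the file is new."""
    path = Path(log_path)
    is_new = not path.exists()
    with path.open("a", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        if is_new:
            writer.writerow(FIELDS)
        writer.writerow([iteration, status, metric, description])


def read_results(log_path):
    """Return all logged rows as dictionaries (values are strings)."""
    with Path(log_path).open(newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

Recording the baseline is then a single call such as `append_result("results.tsv", 0, "baseline", 71.4, "initial state")`, made before the first modification. An append-only file keeps earlier rows intact even if a later iteration crashes.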
The Loop
Read references/autonomous-loop-protocol.md for full protocol details.
LOOP (FOREVER or N times):
  1. Review: Read current state + git history + results log
  2. Ideate: Pick next change based on goal, past results, what hasn't been tried
  3. Modify: Make ONE focused change to in-scope files
  4. Commit: Git commit the change (before verification)
  5. Verify: Run the mechanical metric (tests, build, benchmark, etc.)
  6. Decide:
     - IMPROVED → Keep commit, log "keep", advance
     - SAME/WORSE → Git revert, log "discard"
     - CRASHED → Try to fix (max 3 attempts), else log "crash" and move on
  7. Log: Record result in results log
  8. Repeat: Go to step 1.
     - If unbounded: NEVER STOP. NEVER ASK "should I continue?"
     - If bounded (N): Stop after N iterations, print final summary
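Steps 3 to 7 of the loop can be sketched as one function. This is a hedged illustration rather than the protocol itself: `run_iteration`, `is_improvement`, and the `modify`, `verify`, and `log` callables are hypothetical names. It also assumes that `git add -A` is an acceptable way to stage the change, that a discard is performed with `git revert --no-edit HEAD`, and that a crash is reverted immediately; the up-to-three fix attempts are left out.

```python
import subprocess


def git(*args):
    """Run a git command in the current repository, raising if it fails."""
    subprocess.run(["git", *args], check=True)


def is_improvement(best, value, direction):
    """Only a strict gain counts; SAME is treated like WORSE."""
    return value > best if direction == "higher" else value < best


def run_iteration(best, direction, modify, verify, log, vcs=git):
    """Run steps 3 to 7 once and return the (possibly updated) best metric.

    modify() applies one focused change and returns a commit message.
    verify() returns the metric as a number, or raises if the run crashed.
    log(status, message, value) records the outcome.
    """
    message = modify()                      # step 3: one focused change
    vcs("add", "-A")                        # assumption: stage everything
    vcs("commit", "-m", message)            # step 4: commit before verifying
    try:
        value = verify()                    # step 5: mechanical metric
    except Exception:
        # The protocol allows up to 3 fix attempts here; this sketch skips them.
        vcs("revert", "--no-edit", "HEAD")
        log("crash", message, None)
        return best
    if is_improvement(best, value, direction):
        log("keep", message, value)         # step 6: IMPROVED
        return value
    vcs("revert", "--no-edit", "HEAD")      # step 6: SAME/WORSE
    log("discard", message, value)
    return best
```

Using `git revert` instead of resetting the branch leaves discarded attempts visible in the history, which fits the idea that the agent reads git history to avoid repeating failed changes. Passing `vcs` as a parameter makes it possible to try the function without touching a real repository.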
Critical Rules
- Loop until done — Unbounded: loop until interrupted. Bounded: loop N times then summarize.
- Read before write — Always understand full context before modifying
- One change per iteration — Atomic changes. If it breaks, you know exactly why
- Mechanical verification only — No subjective "looks good". Use metrics
- Automatic rollback — Failed changes revert instantly. No debates
- Simplicity wins — Equal results + less code = KEEP. Tiny improvement + ugly complexity = DISCARD
- Git is memory — Every kept change committed. Agent reads history to learn patterns
- When stuck, think harder — Re-read files, re-read goal, combine near-misses, try radical changes. Don't ask for help unless truly blocked by missing access/permissions
Principles Reference
See references/core-principles.md for the 7 generalizable principles from autoresearch.
Adapting to Different Domains
| Domain | Metric | Scope | Verify Command |
|---|---|---|---|
| Backend code | Tests pass + coverage % | | |
| Frontend UI | Lighthouse score | | |
| ML training | val_bpb / loss | | |
| Blog/content | Word count + readability | | Custom script |
| Performance | Benchmark time (ms) | Target files | |
| Refactoring | Tests pass + LOC reduced | Target module | |
Adapt the loop to your domain. The PRINCIPLES are universal; the METRICS are domain-specific.