autobrowse
AutoBrowse — Self-Improving Browser Skill
Build reliable browser automation skills through iterative experimentation. An inner agent browses the site (`evaluate.ts`). You, the outer agent, read what happened and improve the instructions (`strategy.md`). Repeat until it passes consistently.
Entry Points
Invocation is flexible — both explicit flags and free-form natural language work:
/autobrowse --task google-flights
/autobrowse --task google-flights --iterations 10 --env remote
/autobrowse --tasks google-flights,amazon-add-to-cart
/autobrowse --all
Also fine, parse freely:
/autobrowse https://flights.google.com/
/autobrowse book a flight on delta.com
/autobrowse fix the existing google-flights skill
When the user drops a URL or free-form instruction instead of `--task <name>`:
- If an existing task in `${WORKSPACE}/tasks/` clearly matches the site/intent, use it.
- Otherwise, pick a short kebab-case name, create `${WORKSPACE}/tasks/<name>/task.md` from `${CLAUDE_SKILL_DIR}/references/example-task.md`, fill in the URL/goal based on what the user said, and proceed. Tell the user the chosen name in one line.
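As a starting point for picking a name, the normalization can be done mechanically. The sketch below is an illustration only: `task_name_from_input` is a hypothetical helper, not part of the skill, and its raw output (for example `flights-google-com`) would usually still be shortened by hand to something like `google-flights`.

```bash
# Hypothetical helper: turn a URL or free-form instruction into a kebab-case slug.
task_name_from_input() {
  # Lowercase, drop the scheme and a leading "www.", collapse every run of
  # non-alphanumeric characters into one "-", then trim "-" from both ends.
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's#^https?://##; s#^www\.##; s#[^a-z0-9]+#-#g; s#^-+##; s#-+$##'
}
```

Usage: `task_name_from_input "https://flights.google.com/"` gives a candidate name that can then be trimmed before creating `${WORKSPACE}/tasks/<name>/task.md`.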
---
How to run
Step 1 — Parse arguments and orient
Check what was passed:
- `--task <name>` → single task mode
- `--tasks a,b,c` or `--all` → multi-task mode (spawn sub-agents)
- `--iterations N` → how many evaluate → improve cycles (default: 5)
- `--env local|remote` → browser environment (default: local; use remote for bot-protected sites)
If the user passed free-form text instead, map it to one of the above before continuing.
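One way to picture the mapping in Step 1 is a small flag parser. This is a sketch under stated assumptions: the function name and the `MODE`, `TASKS`, `ITERATIONS`, and `BROWSER_ENV` variables are invented for illustration, the `all` marker for `--all` is arbitrary, and the skill itself leaves the parsing to the agent.

```bash
# Illustrative parser: sets MODE, TASKS, ITERATIONS and BROWSER_ENV as globals.
parse_autobrowse_args() {
  MODE="" TASKS="" ITERATIONS=5 BROWSER_ENV="local"
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --task|--tasks|--iterations|--env)
        # Each of these flags needs a value after it.
        [ "$#" -ge 2 ] || { echo "missing value for $1" >&2; return 2; }
        case "$1" in
          --task)       MODE="single"; TASKS="$2" ;;
          --tasks)      MODE="multi";  TASKS="$2" ;;
          --iterations) ITERATIONS="$2" ;;
          --env)        BROWSER_ENV="$2" ;;
        esac
        shift 2 ;;
      --all)
        MODE="multi"; TASKS="all"; shift ;;
      *)
        # Anything else is free-form text (a URL or an instruction) that the
        # agent still has to map to a task name.
        [ -n "$MODE" ] || MODE="freeform"
        shift ;;
    esac
  done
}
```

The defaults mirror the ones listed above: 5 iterations and the local environment.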
Step 2 — Set up the workspace
All training artifacts (task definitions, strategy iterations, traces, reports) live in a workspace directory in the current working directory, NOT inside `~/.claude/skills/`. This keeps the inner agent's file writes out of Claude's home dir and away from permission friction.
Default workspace: `${CWD}/autobrowse/`
```bash
mkdir -p ./autobrowse/tasks ./autobrowse/traces ./autobrowse/reports
```
If the task file (`./autobrowse/tasks/<task>/task.md`) doesn't exist yet, scaffold it:
```bash
mkdir -p ./autobrowse/tasks/<task>
cp ${CLAUDE_SKILL_DIR}/references/example-task.md ./autobrowse/tasks/<task>/task.md
```
Then edit task.md to describe the URL, inputs, steps, and expected JSON output.
The skill source at `${CLAUDE_SKILL_DIR}` stays read-only — only `./autobrowse/` in CWD gets written to during training. Graduation (final step) writes a single file to `~/.claude/skills/<task>/SKILL.md`.
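The scaffold commands in this step overwrite `task.md` if they are run a second time. A guarded version might look like the sketch below; `scaffold_task` and its three positional arguments are assumptions made for illustration, not something the skill ships.

```bash
# Illustrative helper: create the workspace layout and copy the template only
# when task.md is missing, so re-running never clobbers an edited task.
scaffold_task() {
  local workspace="$1" task="$2" template="$3"
  local task_file="$workspace/tasks/$task/task.md"
  mkdir -p "$workspace/tasks/$task" "$workspace/traces" "$workspace/reports"
  if [ ! -f "$task_file" ]; then
    cp "$template" "$task_file"
    echo "created $task_file"
  else
    echo "kept existing $task_file"
  fi
}
```

Usage: `scaffold_task ./autobrowse google-flights "${CLAUDE_SKILL_DIR}/references/example-task.md"`.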
List available tasks:
```bash
ls ./autobrowse/tasks/
```
Step 3 — Multi-task: spawn parallel sub-agents
If running multiple tasks, use the Agent tool to spawn one sub-agent per task simultaneously. Each sub-agent receives a self-contained prompt to run the full autobrowse loop for its task:
"You are running the autobrowse skill for task `<name>`. Workspace: `<absolute-path-to-workspace>` (e.g. `/path/to/project/autobrowse`). Run `<N>` iterations of: evaluate → read trace → improve strategy.md → repeat. Use `--env <env>`. Pass `--workspace <workspace>` to every evaluate.mjs invocation. Follow the autobrowse loop instructions exactly.
When graduating, install the skill to `~/.claude/skills/<task-name>/SKILL.md` with proper agentskills frontmatter (name + description). Do not just copy strategy.md; write a self-contained skill.
At the end, output a structured summary with: task name, pass/fail on final run, total cumulative cost, iterations completed, per-iteration table (iter number, turns, cost, status, hypothesis tested), and 2-3 bullet key learnings."
Spawn all sub-agents in parallel, wait for all to complete, then collect their summaries and write the session report.
For single task, skip this step and run the loop directly below.
The Loop (run this for each task)
Iteration start
Check that `./autobrowse/tasks/<task>/task.md` exists (scaffold it from the template if not; see Step 2). `strategy.md` is auto-created empty by the harness on first run.
Requirements
- `ANTHROPIC_API_KEY` must be in the environment (or in a `.env` file in CWD; `evaluate.mjs` auto-loads it). If missing, the harness prints a clear error and exits; don't hunt for keys in other paths.
Run the inner agent
```bash
node ${CLAUDE_SKILL_DIR}/scripts/evaluate.mjs --task <task-name> --workspace ./autobrowse
```
or for bot-protected sites:
```bash
node ${CLAUDE_SKILL_DIR}/scripts/evaluate.mjs --task <task-name> --workspace ./autobrowse --env remote
```
This runs the browser session and writes a full trace to `./autobrowse/traces/<task>/latest/`.
Read the trace
```bash
cat ./autobrowse/traces/<task-name>/latest/summary.md
```
The summary has duration, cost, turns, the decision log, and the final JSON output.
If the agent failed or got stuck, look deeper:
- Read `./autobrowse/traces/<task-name>/latest/trace.json` and search for the failure turn
- Read screenshots around the failure point with the Read tool
Form one hypothesis
Find the exact turn where things went wrong. What single heuristic would have prevented it?
Examples:
- "After clicking the dropdown, wait 1s: options animate in before they're clickable"
- "Navigate directly to `/pay-invoice/` and skip the landing page entirely"
- "Use `browse fill #field_3 value`, not `browse type`: this field clears on focus"
- "The page shows a spinner at turn 8: add `browse wait timeout 2000` before snapshot"
Update strategy.md
Edit `./autobrowse/tasks/<task-name>/strategy.md`. Keep everything that worked. Fix the specific failure. Add a concrete heuristic.
Good strategies have:
- Fast path: direct URL or shortcuts to skip exploration
- Step-by-step workflow: exact sequence with timing notes
- Site-specific knowledge: selector IDs, form field names, success indicators
- Failure recovery: what to do when X goes wrong
Judge the result
Read the new summary. Did it pass? Make clear progress?
- Pass or progress → keep, next iteration
- No progress or regression → revert strategy.md to the previous version and try a different hypothesis
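Reverting is only possible if the previous version was saved before editing. A minimal sketch of that bookkeeping follows; the `.prev` suffix and both function names are illustrative conventions, not something the harness defines, and the snapshot file stays inside the workspace.

```bash
# Illustrative helpers: keep one previous copy of strategy.md next to it.
snapshot_strategy() {
  local strategy="$1"
  # Call this before editing strategy.md for a new hypothesis.
  cp "$strategy" "$strategy.prev"
}

revert_strategy() {
  local strategy="$1"
  # Restore the snapshot after a run with no progress or a regression.
  [ -f "$strategy.prev" ] || { echo "no snapshot for $strategy" >&2; return 1; }
  cp "$strategy.prev" "$strategy"
}
```

Usage: run `snapshot_strategy ./autobrowse/tasks/<task-name>/strategy.md` before each edit, and `revert_strategy` with the same path when the new summary shows no progress.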
After all iterations — publish if ready
If the task passed on 2+ of the last 3 iterations or has reached the max iteration limit, install it as a Claude Code skill. Do not just copy strategy.md — the skill must be self-contained and useful to someone who has never seen this codebase. If graduating at max iterations without a clean pass, note the known failure point but still document everything learned.
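The graduation condition (2 or more passes in the last 3 iterations, or the iteration limit reached) can be written as a small check. This is a sketch only: `should_graduate` is a hypothetical helper, and it assumes each iteration's outcome has been recorded as the word `pass` or `fail`.

```bash
# Illustrative check. Arguments: the iteration limit, then one "pass" or "fail"
# per completed iteration, oldest first. Returns 0 when graduation is due.
should_graduate() {
  local max_iterations="$1"; shift
  local completed="$#" passes=0 index=0 status
  for status in "$@"; do
    index=$(( index + 1 ))
    # Count passes only among the last three results.
    if [ "$index" -gt $(( completed - 3 )) ] && [ "$status" = "pass" ]; then
      passes=$(( passes + 1 ))
    fi
  done
  [ "$passes" -ge 2 ] || [ "$completed" -ge "$max_iterations" ]
}
```

Usage: `should_graduate 5 fail pass pass` succeeds because two of the last three runs passed, even though the limit of 5 has not been reached.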
Install by writing to `~/.claude/skills/<task-name>/SKILL.md`:
```bash
mkdir -p ~/.claude/skills/<task-name>
```
Use this structure for the SKILL.md:
```markdown
---
name: <task-name>
description: <1-2 sentences describing what this skill does and when to use it. Include trigger keywords.>
---
```
<Task Title> — Browser Skill
Purpose
<1-2 sentences: what this automates and why it exists.>
When to Use
<When should someone reach for this skill.>
Browse CLI Reference
The inner agent uses the `browse` CLI. Key commands for this task:
- `browse stop`: kill existing session (always run before switching to remote)
- `browse open <url> --remote`: start a fresh Browserbase cloud session and navigate
- `browse open <url> --local`: start a clean local browser and navigate
- `browse tab new <url>`: open URL in a new tab
- `browse wait load`: wait for page to finish loading
- `browse wait timeout <ms>`: wait a fixed amount of time for spinners or animations
- `browse wait selector "<selector>"`: wait for an element to become visible
- `browse get title`: verify you're on the right page
- `browse get text body`: extract all visible text (preferred for content extraction)
- `browse snapshot`: get accessibility tree; each node has a ref in `[X-Y]` format (e.g. `[0-5]`, `[2-147]`)
- `browse click [X-Y]`: click element by ref from the latest snapshot (include the brackets)
Never use `--session <name>` flags in SKILL.md. Named sessions are a parallel-run workaround; they contaminate skills with infrastructure concerns. Skills must work in isolation with the default session.
Workflow
Step 1 — Start session
<exact browse commands in order>
Step 2 — Navigate
<exact URL and verification steps>
Step 3 — Extract
<exact extraction commands>
Step 4 — Output
<what JSON to emit, referencing the schema below>
Site-Specific Gotchas
<Bullet list of every hard-won heuristic from the iterations. This is the core value of the skill.>
Failure Recovery
<What to do when navigation fails, session is contaminated, or extraction returns garbage>
Expected Output
```json
<paste the exact expected output schema from task.md>
```
After writing the SKILL.md, confirm it's installed:
```bash
ls ~/.claude/skills/<task-name>/SKILL.md
```
The skill is now available as `/<task-name>` in Claude Code.
Final report (multi-task mode)
After all sub-agents complete, print a markdown table:
| Task | Iterations | Final Status | Graduated | Cost |
|---|---|---|---|---|
| google-flights | 5 | ✅ pass | yes | $0.42 |
| amazon-add-to-cart | 5 | ❌ fail | no | $1.20 |
Then write a persistent session report to `./autobrowse/reports/` so there's a durable record of the run inside the workspace:
```bash
mkdir -p ./autobrowse/reports
```
Write the file `./autobrowse/reports/YYYY-MM-DD-HH-MM-<tasks>.md` with:
AutoBrowse Session Report
Date: <ISO date>
Tasks: <comma-separated list>
Environment: remote|local
Total cost: $X.XX
Results
| Task | Iterations | Pass Rate | Final Status | Graduated | Cost |
|---|---|---|---|---|---|
| ... | ... | X/5 | ✅/❌ | yes/no | $X.XX |
Per-Task Learnings
<task-name>
- Key insight 1: <what the agent learned>
- Key insight 2: <another heuristic>
- Failure mode fixed: <what was failing and how it was resolved>
Iteration Log
<task-name>
| Iter | Turns | Cost | Status | Hypothesis tested |
|---|---|---|---|---|
| 1 | 79 | $18.75 | ❌ fail | baseline |
| 2 | 9 | $0.26 | ✅ pass | session contamination fix |
| ... | ... | ... | ... | ... |
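The report file name pattern `YYYY-MM-DD-HH-MM-<tasks>.md` can be assembled from the current time and the task list. The helper below is a sketch: `report_path` is a hypothetical name, and joining the task names with `_` is an assumption, since the pattern does not say how multiple tasks are separated.

```bash
# Illustrative helper: print the report path for a workspace and task list.
# An optional third argument overrides the timestamp.
report_path() {
  local workspace="$1" tasks="$2"
  local stamp="${3:-$(date +%Y-%m-%d-%H-%M)}"
  # Assumption: commas in the task list become "_" in the file name.
  local joined
  joined="$(printf '%s' "$tasks" | tr ',' '_')"
  printf '%s/reports/%s-%s.md\n' "$workspace" "$stamp" "$joined"
}
```

Usage: `report_path ./autobrowse google-flights,amazon-add-to-cart` prints a path under `./autobrowse/reports/` stamped with the current minute.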
---
Rules
- Only edit `strategy.md`: never touch `task.md` (unless creating it from the template) or `evaluate.mjs`
- Stay in the workspace: all training writes go to `./autobrowse/`, never to `~/.claude/skills/autobrowse/`. The skill source is read-only.
- One hypothesis per iteration: test one change at a time
- Build on wins: keep what worked, add to it
- Trust the trace: the inner agent shows exactly what it saw and did
- Graduate to `~/.claude/skills/`: the only file you write there is the final graduated `SKILL.md`