Browser Automation Skill


Web browser automation using agent-browser with AI-optimized snapshots. Snapshots reduce context by 93% by using element refs (@e1, @e2) instead of the full DOM.

Core Workflow


bash
# 1. Navigate to page
agent-browser open <url>

# 2. Get accessibility tree with element refs
agent-browser snapshot -i  # -i = interactive elements only

# 3. Interact using refs from snapshot
agent-browser click @e2
agent-browser fill @e3 "text"

# 4. Re-snapshot after page changes
agent-browser snapshot -i
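Steps 3 and 4 of the workflow tend to travel together, so they can be folded into one helper. The following is a minimal sketch, not part of agent-browser itself: the `ab` wrapper, the `AGENT_BROWSER` variable, and the `act_and_snapshot` name are all hypothetical, introduced only for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: set AGENT_BROWSER=echo to print commands (dry run)
# instead of driving a real browser.
ab() { "${AGENT_BROWSER:-agent-browser}" "$@"; }

# Run one interaction (click, fill, ...) and, only if it succeeds, take a
# fresh interactive snapshot so the next step works from current refs.
act_and_snapshot() {
  ab "$@" && ab snapshot -i
}
```

For example, `act_and_snapshot click @e2` would click the element and then print the refreshed list of interactive elements.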

Quick Reference


Navigation


| Command | Description |
| --- | --- |
| `open <url>` | Navigate to URL |
| `back` | Go back |
| `forward` | Go forward |
| `reload` | Reload page |
| `close` | Close browser |

Snapshots (AI-Optimized)


| Command | Description |
| --- | --- |
| `snapshot` | Full accessibility tree |
| `snapshot -i` | Interactive elements only (buttons, links, inputs) |
| `snapshot -c` | Compact (remove empty elements) |
| `snapshot -d 3` | Limit depth to 3 levels |
| `screenshot [path]` | Capture screenshot (base64 if no path) |

Interaction


| Command | Description |
| --- | --- |
| `click <sel>` | Click element |
| `fill <sel> <text>` | Clear and fill input |
| `type <sel> <text>` | Type with key events |
| `press <key>` | Press key (Enter, Tab, etc.) |
| `hover <sel>` | Hover element |
| `select <sel> <val>` | Select dropdown option |
| `check/uncheck <sel>` | Toggle checkbox |
| `scroll <dir> [px]` | Scroll page |

Get Info


| Command | Description |
| --- | --- |
| `get text <sel>` | Get text content |
| `get html <sel>` | Get innerHTML |
| `get value <sel>` | Get input value |
| `get attr <sel> <attr>` | Get attribute |
| `get title` | Get page title |
| `get url` | Get current URL |

Wait


| Command | Description |
| --- | --- |
| `wait <selector>` | Wait for element |
| `wait <ms>` | Wait milliseconds |
| `wait --text "text"` | Wait for text |
| `wait --url "pattern"` | Wait for URL |
| `wait --load networkidle` | Wait for load state |

Sessions


| Command | Description |
| --- | --- |
| `--session <name>` | Use isolated session |
| `session list` | List active sessions |

Selectors


Element Refs (Recommended)


bash
# Get refs from snapshot
agent-browser snapshot -i
# Output: button "Submit" [ref=e2]

# Use ref to interact
agent-browser click @e2
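When a script rather than a person reads the snapshot, the ref has to be pulled out of the text. The sketch below assumes every snapshot line looks like the sample output above (`button "Submit" [ref=e2]`); the real output may carry extra fields or indentation, and `ref_for` is a hypothetical helper, not an agent-browser command.

```bash
#!/usr/bin/env bash
# ref_for ROLE NAME
# Reads snapshot text on stdin and prints the @ref of the first line that
# contains: ROLE "NAME". Prints nothing if no line matches.
# Assumes the line format shown in the sample output: button "Submit" [ref=e2]
ref_for() {
  local role="$1" name="$2"
  grep -F -- "$role \"$name\"" \
    | sed -n 's/.*\[ref=\([A-Za-z0-9]*\)\].*/@\1/p' \
    | head -n 1
}
```

A possible use is `agent-browser click "$(agent-browser snapshot -i | ref_for button "Submit")"`, which avoids hard-coding a ref that may change between snapshots.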

CSS Selectors


bash
agent-browser click "#submit"
agent-browser fill ".email-input" "test@test.com"

Semantic Locators


bash
agent-browser find role button click --name "Submit"
agent-browser find label "Email" fill "test@test.com"
agent-browser find testid "login-btn" click

Examples


Login Flow


bash
agent-browser open https://example.com/login
agent-browser snapshot -i
agent-browser fill @e2 "user@example.com"
agent-browser fill @e3 "password123"
agent-browser click @e4
agent-browser wait --url "**/dashboard"
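The login flow above hard-codes its credentials. One way to keep them out of the script is to read them from the environment, sketched here under stated assumptions: `LOGIN_USER`, `LOGIN_PASS`, `AGENT_BROWSER`, `ab`, and `submit_login` are hypothetical names, and the page is assumed to be already open and snapshotted so that the refs passed in are current.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: set AGENT_BROWSER=echo for a dry run
# (note that a dry run prints the password along with the other arguments).
ab() { "${AGENT_BROWSER:-agent-browser}" "$@"; }

# submit_login USER_REF PASS_REF SUBMIT_REF [URL_PATTERN]
# Fills the form from LOGIN_USER / LOGIN_PASS, submits it, then waits for
# the post-login URL (defaults to the pattern used in the example above).
submit_login() {
  local user_ref="$1" pass_ref="$2" submit_ref="$3"
  local url_pattern="${4:-**/dashboard}"
  if [ -z "${LOGIN_USER:-}" ] || [ -z "${LOGIN_PASS:-}" ]; then
    echo "LOGIN_USER and LOGIN_PASS must be set" >&2
    return 1
  fi
  ab fill "$user_ref" "$LOGIN_USER" &&
    ab fill "$pass_ref" "$LOGIN_PASS" &&
    ab click "$submit_ref" &&
    ab wait --url "$url_pattern"
}
```

After `agent-browser open` and `agent-browser snapshot -i`, a call such as `submit_login @e2 @e3 @e4` would reproduce the flow above without a literal password in the script.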

Form Submission


bash
agent-browser open https://example.com/contact
agent-browser snapshot -i
agent-browser fill @e1 "John Doe"
agent-browser fill @e2 "john@example.com"
agent-browser fill @e3 "Hello, this is my message"
agent-browser click @e4
agent-browser wait --text "Thank you"

Data Extraction


bash
agent-browser open https://example.com/products
agent-browser snapshot -i

# Iterate through product refs
agent-browser get text @e1  # Product name
agent-browser get text @e2  # Price
agent-browser get attr @e3 href  # Link
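To turn those three lookups into one record per product, the calls can be wrapped in a small function. This is a sketch only: `ab`, `AGENT_BROWSER`, and `product_row` are hypothetical names, and which refs belong to which product depends on the snapshot of the actual page.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: set AGENT_BROWSER=echo for a dry run.
ab() { "${AGENT_BROWSER:-agent-browser}" "$@"; }

# product_row NAME_REF PRICE_REF LINK_REF
# Prints one tab-separated line: name, price, link target.
# Stops with a non-zero status if any lookup fails.
product_row() {
  local name price link
  name="$(ab get text "$1")" || return 1
  price="$(ab get text "$2")" || return 1
  link="$(ab get attr "$3" href)" || return 1
  printf '%s\t%s\t%s\n' "$name" "$price" "$link"
}
```

Calling `product_row @e1 @e2 @e3 >> products.tsv` once per product would build up a tab-separated file; the file name here is only an example.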

Multi-Session (Swarm)


bash
# Session 1: Navigator
agent-browser --session nav open https://example.com
agent-browser --session nav state save auth.json

# Session 2: Scraper (uses same auth)
agent-browser --session scrape state load auth.json
agent-browser --session scrape open https://example.com/data
agent-browser --session scrape snapshot -i
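Because each named session is isolated, several of them can be driven at once with ordinary shell job control. The sketch below is an assumption-laden illustration, not a documented agent-browser feature: `ab`, `AGENT_BROWSER`, and `open_in_sessions` are hypothetical names, and only the `--session <name>` flag and the `open` command come from the reference above.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: set AGENT_BROWSER=echo for a dry run.
ab() { "${AGENT_BROWSER:-agent-browser}" "$@"; }

# open_in_sessions URL SESSION...
# Opens URL in every named session concurrently, waits for all of them,
# and returns non-zero if any session failed.
open_in_sessions() {
  local url="$1" s pid status=0
  shift
  local pids=()
  for s in "$@"; do
    ab --session "$s" open "$url" &   # one background job per session
    pids+=("$!")
  done
  for pid in "${pids[@]}"; do
    wait "$pid" || status=1           # shell builtin wait, not "ab wait"
  done
  return "$status"
}
```

For instance, `open_in_sessions https://example.com/data scrape1 scrape2` would start both sessions in parallel; the order in which they finish is not guaranteed.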

Integration with Claude Flow


MCP Tools


All browser operations are available as MCP tools with the `browser/` prefix:
  • browser/open
  • browser/snapshot
  • browser/click
  • browser/fill
  • browser/screenshot
  • etc.

Memory Integration


bash
# Store successful patterns
npx @claude-flow/cli memory store --namespace browser-patterns --key "login-flow" --value "snapshot->fill->click->wait"

# Retrieve before similar task
npx @claude-flow/cli memory search --query "login automation"

Hooks


bash
# Pre-browse hook (get context)
npx @claude-flow/cli hooks pre-edit --file "browser-task.ts"

# Post-browse hook (record success)
npx @claude-flow/cli hooks post-task --task-id "browse-1" --success true

Tips


  1. Always use snapshots - They're optimized for AI with refs
  2. Prefer `-i` flag - Gets only interactive elements, smaller output
  3. Use refs, not selectors - More reliable, deterministic
  4. Re-snapshot after navigation - Page state changes
  5. Use sessions for parallel work - Each session is isolated