browser
Browser Automation Skill
Web browser automation using agent-browser with AI-optimized snapshots. Reduces context by 93% using element refs (@e1, @e2) instead of full DOM.
Core Workflow
```bash
# 1. Navigate to page
agent-browser open <url>

# 2. Get accessibility tree with element refs
agent-browser snapshot -i  # -i = interactive elements only

# 3. Interact using refs from snapshot
agent-browser click @e2
agent-browser fill @e3 "text"

# 4. Re-snapshot after page changes
agent-browser snapshot -i
```
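
Step 4 is easy to forget in scripts. One possible way to make it automatic, as a minimal sketch: wrap each interaction so that a fresh snapshot always follows it. The `act` function and the `AB` override variable are hypothetical names introduced here, not part of agent-browser, and the sketch assumes agent-browser exits with a non-zero status when a command fails.

```bash
# Command to invoke. AB is a hypothetical override so the sketch can be
# dry-run without a browser (for example AB=echo).
AB=${AB:-agent-browser}

# Hypothetical wrapper: run one interaction, then take a fresh snapshot so
# the next step works from current refs. The snapshot is skipped if the
# interaction itself fails.
act() {
  "$AB" "$@" && "$AB" snapshot -i
}

# Dry run: echo stands in for agent-browser and just prints each command
AB=echo act click @e2
```

With a real browser session, `act click @e2` followed by `act fill @e3 "text"` would run the same commands as steps 3 and 4 above.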
Quick Reference
Navigation
| Command | Description |
|---|---|
| `agent-browser open <url>` | Navigate to URL |
| | Go back |
| | Go forward |
| | Reload page |
| | Close browser |
Snapshots (AI-Optimized)
| Command | Description |
|---|---|
| | Full accessibility tree |
| `agent-browser snapshot -i` | Interactive elements only (buttons, links, inputs) |
| | Compact (remove empty elements) |
| | Limit depth to 3 levels |
| | Capture screenshot (base64 if no path) |
Interaction
| Command | Description |
|---|---|
| `agent-browser click @e2` | Click element |
| `agent-browser fill @e3 "text"` | Clear and fill input |
| | Type with key events |
| | Press key (Enter, Tab, etc.) |
| | Hover element |
| | Select dropdown option |
| | Toggle checkbox |
| | Scroll page |
Get Info
| Command | Description |
|---|---|
| `agent-browser get text @e1` | Get text content |
| | Get innerHTML |
| | Get input value |
| `agent-browser get attr @e3 href` | Get attribute |
| | Get page title |
| | Get current URL |
Wait
| Command | Description |
|---|---|
| | Wait for element |
| | Wait milliseconds |
| `agent-browser wait --text "Thank you"` | Wait for text |
| `agent-browser wait --url "**/dashboard"` | Wait for URL |
| | Wait for load state |
Sessions
| Command | Description |
|---|---|
| `agent-browser --session nav open https://example.com` | Use isolated session |
| | List active sessions |
Selectors
Element Refs (Recommended)
```bash
# Get refs from snapshot
agent-browser snapshot -i
# Output: button "Submit" [ref=e2]

# Use ref to interact
agent-browser click @e2
```
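
When a script needs to pick a ref out of snapshot text rather than read it by eye, a small filter can help. The sketch below assumes every element line carries a `[ref=eN]` marker in the same form as the sample output above; real snapshot output may include indentation or extra attributes. `list_refs`, `find_ref`, and the sample text are hypothetical and introduced only for illustration.

```bash
# Hypothetical helper: print every ref found on stdin as @eN, one per line.
list_refs() {
  grep -oE '\[ref=e[0-9]+\]' | sed -E 's/^\[ref=(e[0-9]+)\]$/@\1/'
}

# Hypothetical helper: print the @ref of the first line that contains the
# given role followed by the quoted name, e.g. button "Submit".
find_ref() {
  local role=$1 name=$2
  grep -F -- "$role \"$name\"" | list_refs | head -n 1
}

# Sample text standing in for real `agent-browser snapshot -i` output
sample='textbox "Email" [ref=e1]
button "Submit" [ref=e2]'

printf '%s\n' "$sample" | find_ref button Submit   # prints @e2
```

In a live session the same filter could be fed from the CLI, for example `agent-browser snapshot -i | find_ref button Submit`, with the result passed on to `agent-browser click`.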
CSS Selectors
```bash
agent-browser click "#submit"
agent-browser fill ".email-input" "test@test.com"
```

Semantic Locators
```bash
agent-browser find role button click --name "Submit"
agent-browser find label "Email" fill "test@test.com"
agent-browser find testid "login-btn" click
```

Examples
Login Flow
```bash
agent-browser open https://example.com/login
agent-browser snapshot -i
agent-browser fill @e2 "user@example.com"
agent-browser fill @e3 "password123"
agent-browser click @e4
agent-browser wait --url "**/dashboard"
```

Form Submission
```bash
agent-browser open https://example.com/contact
agent-browser snapshot -i
agent-browser fill @e1 "John Doe"
agent-browser fill @e2 "john@example.com"
agent-browser fill @e3 "Hello, this is my message"
agent-browser click @e4
agent-browser wait --text "Thank you"
```

Data Extraction
```bash
agent-browser open https://example.com/products
agent-browser snapshot -i

# Iterate through product refs
agent-browser get text @e1        # Product name
agent-browser get text @e2        # Price
agent-browser get attr @e3 href   # Link
```
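
For more than a handful of refs, the repeated `get text` calls can be folded into a loop. This is a rough sketch, not part of the skill itself: `extract_texts` and the `AB` override are hypothetical names, and the refs passed in are assumed to come from a snapshot taken just before.

```bash
# Command to invoke. AB is a hypothetical override for dry runs.
AB=${AB:-agent-browser}

# Hypothetical helper: for each ref given, print "<ref><TAB><text>".
extract_texts() {
  local ref
  for ref in "$@"; do
    printf '%s\t%s\n' "$ref" "$("$AB" get text "$ref")"
  done
}

# Dry run that needs no browser: echo stands in for agent-browser, so the
# second column shows the command that would have been issued.
AB=echo extract_texts @e1 @e2
```

Tab-separated output keeps each ref next to its value, which makes the result easy to pipe into `cut` or `awk`.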
Multi-Session (Swarm)
```bash
# Session 1: Navigator
agent-browser --session nav open https://example.com
agent-browser --session nav state save auth.json

# Session 2: Scraper (uses same auth)
agent-browser --session scrape state load auth.json
agent-browser --session scrape open https://example.com/data
agent-browser --session scrape snapshot -i
```
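
Because each session is isolated, several sessions can in principle run side by side as background shell jobs. The sketch below is one way that might look. `snapshot_in_parallel`, the `worker<N>` session names, and the `AB` override are all hypothetical, and the sketch assumes that concurrent `--session` invocations do not interfere with each other.

```bash
# Command to invoke. AB is a hypothetical override for dry runs.
AB=${AB:-agent-browser}

# Hypothetical helper: open each URL in its own named session and snapshot
# it, running all sessions in parallel.
snapshot_in_parallel() {
  local i=0 url
  for url in "$@"; do
    i=$((i + 1))
    (
      # Each subshell drives exactly one session
      "$AB" --session "worker$i" open "$url" &&
      "$AB" --session "worker$i" snapshot -i
    ) &
  done
  wait   # block until every background session has finished
}

# Dry run: prints the commands each worker would issue (order may vary)
AB=echo snapshot_in_parallel https://example.com/a https://example.com/b
```

Output from parallel workers can interleave, so in practice each worker would likely redirect its snapshot to its own file.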
Integration with Claude Flow
MCP Tools
All browser operations are available as MCP tools with the `browser/` prefix:
- `browser/open`
- `browser/snapshot`
- `browser/click`
- `browser/fill`
- `browser/screenshot`
- etc.
Memory Integration
```bash
# Store successful patterns
npx @claude-flow/cli memory store --namespace browser-patterns --key "login-flow" --value "snapshot->fill->click->wait"

# Retrieve before similar task
npx @claude-flow/cli memory search --query "login automation"
```
Hooks
```bash
# Pre-browse hook (get context)
npx @claude-flow/cli hooks pre-edit --file "browser-task.ts"

# Post-browse hook (record success)
npx @claude-flow/cli hooks post-task --task-id "browse-1" --success true
```
Tips
- Always use snapshots - They're optimized for AI with refs
- Prefer the `-i` flag - Gets only interactive elements, smaller output
- Use refs, not selectors - More reliable, deterministic
- Re-snapshot after navigation - Page state changes
- Use sessions for parallel work - Each session is isolated