agent-browser
Agent Browser Skill
Fast browser automation using accessibility tree snapshots with refs for deterministic element selection.
Why Use This Over Built-in Browser Tool
Use agent-browser when:
- Automating multi-step workflows
- Need deterministic element selection
- Performance is critical
- Working with complex SPAs
- Need session isolation
Use built-in browser tool when:
- Need screenshots/PDFs for analysis
- Visual inspection required
- Browser extension integration needed
Core Workflow
bash
# 1. Navigate and snapshot
agent-browser open https://example.com
agent-browser snapshot -i --json
# 2. Parse refs from JSON, then interact
agent-browser click @e2
agent-browser fill @e3 "text"
# 3. Re-snapshot after page changes
agent-browser snapshot -i --json
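Step 2 ("parse refs from JSON") can be sketched as a small filter. This is a minimal sketch, not part of agent-browser: it assumes `jq` is installed and that the JSON has the `data.refs` shape shown under Snapshot Output Format in this document. The function name `list_refs` is hypothetical.

```shell
# list_refs: read snapshot JSON on stdin and print one line per ref as
# "@ref<TAB>role<TAB>name". Assumes jq and the data.refs shape from this document.
list_refs() {
  jq -r '.data.refs
         | to_entries[]
         | [("@" + .key), .value.role, .value.name]
         | @tsv'
}

# Usage (requires agent-browser on PATH):
#   agent-browser snapshot -i --json | list_refs
```

Printing refs as tab-separated lines makes it easy to pick one with standard tools such as `grep` or `awk` before passing it to `click` or `fill`.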
Key Commands
Navigation
bash
agent-browser open <url>
agent-browser back # Or: forward, reload, close (one subcommand per call)
Snapshot (Always use -i --json)
bash
agent-browser snapshot -i --json # Interactive elements, JSON output
agent-browser snapshot -i -c -d 5 --json # + compact, depth limit
agent-browser snapshot -s "#main" -i # Scope to selector
Interactions (Ref-based)
bash
agent-browser click @e2
agent-browser fill @e3 "text"
agent-browser type @e3 "text"
agent-browser hover @e4
agent-browser check @e5 # Or: agent-browser uncheck @e5
agent-browser select @e6 "value"
agent-browser press "Enter"
agent-browser scroll down 500
agent-browser drag @e7 @e8
Get Information
bash
agent-browser get text @e1 --json
agent-browser get html @e2 --json
agent-browser get value @e3 --json
agent-browser get attr @e4 "href" --json
agent-browser get title --json
agent-browser get url --json
agent-browser get count ".item" --json
Check State
bash
agent-browser is visible @e2 --json
agent-browser is enabled @e3 --json
agent-browser is checked @e4 --json
Wait
bash
agent-browser wait @e2 # Wait for element
agent-browser wait 1000 # Wait ms
agent-browser wait --text "Welcome" # Wait for text
agent-browser wait --url "**/dashboard" # Wait for URL
agent-browser wait --load networkidle # Wait for network
agent-browser wait --fn "window.ready === true" # Wait for JS condition
Sessions (Isolated Browsers)
bash
agent-browser --session admin open site.com
agent-browser --session user open site.com
agent-browser session list
# Or via env: AGENT_BROWSER_SESSION=admin agent-browser ...
State Persistence
bash
agent-browser state save auth.json # Save cookies/storage
agent-browser state load auth.json # Load (skip login)
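One way to combine the two commands is a "log in once, reuse afterwards" helper. This is a hedged sketch: `with_saved_auth` and the caller-supplied login function are hypothetical names, and the login steps themselves are site-specific. Only `state save` and `state load` come from this document.

```shell
# with_saved_auth FILE LOGIN_FN
# If FILE exists, load the saved cookies/storage. Otherwise run LOGIN_FN
# (your own function that performs the login steps) and save the state.
with_saved_auth() {
  if [ -f "$1" ]; then
    agent-browser state load "$1"
  else
    "$2" && agent-browser state save "$1"
  fi
}

# Usage (hypothetical login function, refs taken from your own snapshot):
#   do_login() { agent-browser fill @e3 "user@example.com" && agent-browser click @e2; }
#   with_saved_auth auth.json do_login
```

State is saved only when the login function succeeds, so a failed login does not leave a stale file behind.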
Screenshots & PDFs
bash
agent-browser screenshot page.png
agent-browser screenshot --full page.png
agent-browser pdf page.pdf
Network Control
bash
agent-browser network route "**/ads/*" --abort # Block
agent-browser network route "**/api/*" --body '{"x":1}' # Mock
agent-browser network requests --filter api # View
Cookies & Storage
bash
agent-browser cookies # Get all
agent-browser cookies set name value
agent-browser storage local key # Get localStorage
agent-browser storage local set key val
Tabs & Frames
bash
agent-browser tab new https://example.com
agent-browser tab 2 # Switch to tab
agent-browser frame @e5 # Switch to iframe
agent-browser frame main # Back to main
Snapshot Output Format
json
{
  "success": true,
  "data": {
    "snapshot": "...",
    "refs": {
      "e1": {"role": "heading", "name": "Example Domain"},
      "e2": {"role": "button", "name": "Submit"},
      "e3": {"role": "textbox", "name": "Email"}
    }
  }
}
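Given this shape, a ref can be looked up by role and accessible name instead of being hard-coded. The helper below is a sketch under assumptions: it relies on `jq` being installed, and `find_ref` is a hypothetical name, not an agent-browser command.

```shell
# find_ref ROLE NAME: read snapshot JSON on stdin and print the first ref whose
# role and name both match, formatted as "@eN". Prints nothing if there is no match.
find_ref() {
  jq -r --arg role "$1" --arg name "$2" '
    first(
      .data.refs
      | to_entries[]
      | select(.value.role == $role and .value.name == $name)
      | "@" + .key
    )'
}

# Usage (requires agent-browser on PATH):
#   ref=$(agent-browser snapshot -i --json | find_ref button "Submit")
#   [ -n "$ref" ] && agent-browser click "$ref"
```

Looking refs up after every snapshot fits the re-snapshot step of the core workflow, since ref numbers may differ once the page changes.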
Best Practices
- Always use the -i flag: focus on interactive elements
- Always use --json: easier to parse
- Wait for stability: agent-browser wait --load networkidle
- Save auth state: skip login flows with state save/load
- Use sessions: isolate different browser contexts
- Use --headed for debugging: see what's happening
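The first three practices can be folded into one helper so they are applied the same way every time. This is a minimal sketch; `stable_snapshot` is a hypothetical wrapper name, and it uses only the `wait` and `snapshot` invocations listed in this document.

```shell
# stable_snapshot: wait for the network to go idle, then print an
# interactive-only snapshot as JSON. Fails without snapshotting if the wait fails.
stable_snapshot() {
  agent-browser wait --load networkidle || return 1
  agent-browser snapshot -i --json
}

# Usage (requires agent-browser on PATH):
#   agent-browser click @e2
#   stable_snapshot > after-click.json
```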
Example: Search and Extract
bash
agent-browser open https://www.google.com
agent-browser snapshot -i --json
# AI identifies search box @e1
agent-browser fill @e1 "AI agents"
agent-browser press Enter
agent-browser wait --load networkidle
agent-browser snapshot -i --json
# AI identifies result refs
agent-browser get text @e3 --json
agent-browser get attr @e4 "href" --json
Example: Multi-Session Testing
bash
# Admin session
agent-browser --session admin open app.com
agent-browser --session admin state load admin-auth.json
agent-browser --session admin snapshot -i --json
# User session (simultaneous)
agent-browser --session user open app.com
agent-browser --session user state load user-auth.json
agent-browser --session user snapshot -i --json
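Because both sessions run the same three commands, the pattern can be sketched as a function parameterized by role. `open_as` is a hypothetical name, and the `ROLE-auth.json` naming simply mirrors the file names used above.

```shell
# open_as ROLE URL: open URL in an isolated session named ROLE, load that
# role's saved auth state, and print an interactive JSON snapshot.
open_as() {
  agent-browser --session "$1" open "$2" &&
    agent-browser --session "$1" state load "$1-auth.json" &&
    agent-browser --session "$1" snapshot -i --json
}

# Usage (requires agent-browser on PATH):
#   for role in admin user; do
#     open_as "$role" app.com > "$role-snapshot.json"
#   done
```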
Installation
bash
npm install -g agent-browser
agent-browser install # Download Chromium
agent-browser install --with-deps # Linux: + system deps
Credits
Skill created by Yossi Elkrief (@MaTriXy)
agent-browser CLI by Vercel Labs