core-agent-browser
Browser Automation with agent-browser
Priority Note
For fetching Rust/crate information, use this priority order:
- rust-learner skill - Orchestrates actionbook + browser-fetcher
- actionbook MCP - Pre-computed selectors for known sites
- agent-browser CLI - Direct browser automation (last resort)
Use agent-browser directly only when:
- actionbook has no pre-computed selectors for the target site
- You need interactive browser testing/automation
- You need screenshots or form filling
Quick start
bash
agent-browser open <url> # Navigate to page
agent-browser snapshot -i # Get interactive elements with refs
agent-browser click @e1 # Click element by ref
agent-browser fill @e2 "text" # Fill input by ref
agent-browser close # Close browser
Core workflow
- Navigate: agent-browser open <url>
- Snapshot: agent-browser snapshot -i (returns elements with refs like @e1, @e2)
- Interact using refs from the snapshot
- Re-snapshot after navigation or significant DOM changes
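The steps above can be sketched as two small shell helpers. This is only an illustration: the function names and the AGENT_BROWSER override variable are hypothetical conveniences, not part of agent-browser; the subcommands themselves (open, snapshot -i, click) are the ones listed in this document.

```bash
# Command to run. The override variable is an assumption of this sketch,
# added so the helpers can be dry-run against a stub.
AGENT_BROWSER="${AGENT_BROWSER:-agent-browser}"

# open_and_snapshot URL
# Steps 1 and 2: navigate, then list interactive elements with their refs.
open_and_snapshot() {
  "$AGENT_BROWSER" open "$1" || return 1
  "$AGENT_BROWSER" snapshot -i
}

# click_and_resnapshot REF
# Steps 3 and 4: interact using a ref from the latest snapshot, then take a
# fresh snapshot because the click may have navigated or changed the DOM.
click_and_resnapshot() {
  "$AGENT_BROWSER" click "$1" || return 1
  "$AGENT_BROWSER" snapshot -i
}
```

A typical call sequence would be open_and_snapshot https://example.com/form followed by click_and_resnapshot @e3, reading the new refs from the second snapshot rather than reusing the old ones.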
Commands
Navigation
bash
agent-browser open <url> # Navigate to URL
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser reload # Reload page
agent-browser close # Close browser
Snapshot (page analysis)
bash
agent-browser snapshot # Full accessibility tree
agent-browser snapshot -i # Interactive elements only (recommended)
agent-browser snapshot -c # Compact output
agent-browser snapshot -d 3 # Limit depth to 3
Interactions (use @refs from snapshot)
bash
agent-browser click @e1 # Click
agent-browser dblclick @e1 # Double-click
agent-browser fill @e2 "text" # Clear and type
agent-browser type @e2 "text" # Type without clearing
agent-browser press Enter # Press key
agent-browser press Control+a # Key combination
agent-browser hover @e1 # Hover
agent-browser check @e1 # Check checkbox
agent-browser uncheck @e1 # Uncheck checkbox
agent-browser select @e1 "value" # Select dropdown
agent-browser scroll down 500 # Scroll page
agent-browser scrollintoview @e1 # Scroll element into view
Get information
bash
agent-browser get text @e1 # Get element text
agent-browser get value @e1 # Get input value
agent-browser get title # Get page title
agent-browser get url # Get current URL
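As a rough sketch of how these getters might be combined in a script, the helper below captures the page title and URL and prints them on one tab-separated line. It assumes each get command prints its value to stdout; page_summary and the AGENT_BROWSER override variable are hypothetical names introduced for illustration.

```bash
# Command to run. The override variable is an assumption of this sketch.
AGENT_BROWSER="${AGENT_BROWSER:-agent-browser}"

# page_summary
# Print "<title><TAB><url>" for the current page.
# Assumes `get title` and `get url` write their value to stdout.
page_summary() {
  title="$("$AGENT_BROWSER" get title)" || return 1
  url="$("$AGENT_BROWSER" get url)" || return 1
  printf '%s\t%s\n' "$title" "$url"
}
```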
Screenshots
bash
agent-browser screenshot # Screenshot to stdout
agent-browser screenshot path.png # Save to file
agent-browser screenshot --full # Full page
Wait
bash
agent-browser wait @e1 # Wait for element
agent-browser wait 2000 # Wait milliseconds
agent-browser wait --text "Success" # Wait for text
agent-browser wait --load networkidle # Wait for network idle
Semantic locators (alternative to refs)
bash
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
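Because semantic locators do not depend on a prior snapshot, a whole sign-in step can be written without refs. The sketch below reuses only the find forms shown above; the function name, the AGENT_BROWSER override variable, and the assumption that the page has fields labelled "Email" and "Password" plus a "Submit" button are all illustrative.

```bash
# Command to run. The override variable is an assumption of this sketch.
AGENT_BROWSER="${AGENT_BROWSER:-agent-browser}"

# sign_in_by_label EMAIL PASSWORD
# Fill two labelled inputs and click the submit button by role and name.
# The label and button names are assumptions about the target page.
sign_in_by_label() {
  "$AGENT_BROWSER" find label "Email" fill "$1" || return 1
  "$AGENT_BROWSER" find label "Password" fill "$2" || return 1
  "$AGENT_BROWSER" find role button click --name "Submit"
}
```

The trade-off is that locators tie the script to visible labels, while refs tie it to one particular snapshot; which is more stable depends on the page.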
Example: Form submission
bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output shows: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i # Check result
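In a script, the ref lookup done by eye in this example could be automated. The helper below is a minimal sketch that assumes snapshot entries look like the sample output above (role "Name" [ref=eN]), separated by commas or newlines; real snapshot output may be formatted differently, and ref_for is a hypothetical name.

```bash
# ref_for ROLE NAME
# Read snapshot text on stdin and print the first matching ref as @eN.
# Assumes entries of the form: role "Name" [ref=eN]
# NAME is used inside a regular expression, so keep it to plain text.
ref_for() {
  tr ',' '\n' |
    sed -n "s/.*$1 \"$2\" \[ref=\([^]]*\)].*/@\1/p" |
    head -n 1
}

# Possible usage (not executed here):
#   snapshot="$(agent-browser snapshot -i)"
#   agent-browser click "$(printf '%s\n' "$snapshot" | ref_for button Submit)"
```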