agent-browser

Browser Automation with agent-browser

Core Workflow

Every browser automation follows this pattern:
  1. Navigate: `agent-browser open <url>`
  2. Snapshot: `agent-browser snapshot -i` (get element refs like `@e1`, `@e2`)
  3. Interact: Use refs to click, fill, select
  4. Re-snapshot: After navigation or DOM changes, get fresh refs
bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"

agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i  # Check result
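
The refs shown in the sample snapshot output can be pulled out of snapshot text with a small shell helper. This is a minimal sketch and not part of agent-browser: `extract_refs` is a hypothetical name, and it assumes refs always look like `@e` followed by digits and that the snapshot is written to standard output.

```bash
# Hypothetical helper: print each element ref (@e1, @e2, ...) found in
# snapshot text read from stdin, one per line.
# Assumption: refs have the form "@e<digits>", as in the sample output.
extract_refs() {
  grep -oE '@e[0-9]+'
}
```

Under those assumptions, `agent-browser snapshot -i | extract_refs` would list the refs available for the next interaction step.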

Essential Commands

bash
# Navigation
agent-browser open <url>      # Navigate (aliases: goto, navigate)
agent-browser close           # Close browser

# Snapshot
agent-browser snapshot -i             # Interactive elements with refs (recommended)
agent-browser snapshot -s "#selector" # Scope to CSS selector

# Interaction (use @refs from snapshot)
agent-browser click @e1           # Click element
agent-browser fill @e2 "text"     # Clear and type text
agent-browser type @e2 "text"     # Type without clearing
agent-browser select @e1 "option" # Select dropdown option
agent-browser check @e1           # Check checkbox
agent-browser press Enter         # Press key
agent-browser scroll down 500     # Scroll page

# Get information
agent-browser get text @e1        # Get element text
agent-browser get url             # Get current URL
agent-browser get title           # Get page title

# Wait
agent-browser wait @e1                 # Wait for element
agent-browser wait --load networkidle  # Wait for network idle
agent-browser wait --url "**/page"     # Wait for URL pattern
agent-browser wait 2000                # Wait milliseconds

# Capture
agent-browser screenshot          # Screenshot to temp dir
agent-browser screenshot --full   # Full page screenshot
agent-browser pdf output.pdf      # Save as PDF

Common Patterns

Form Submission

bash
agent-browser open https://example.com/signup
agent-browser snapshot -i
agent-browser fill @e1 "Jane Doe"
agent-browser fill @e2 "jane@example.com"
agent-browser select @e3 "California"
agent-browser check @e4
agent-browser click @e5
agent-browser wait --load networkidle

Authentication with State Persistence

bash
# Login once and save state
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "$USERNAME"
agent-browser fill @e2 "$PASSWORD"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json

# Reuse in future sessions
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
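
The login-once, reuse-later pattern can be wrapped in one function that decides which path to take. The following is a sketch under stated assumptions: `ensure_login` and the `AGENT_BROWSER` override variable are hypothetical names introduced here, the refs `@e1` to `@e3` are placeholders that must come from a real snapshot of the login page, and the CLI is assumed to exit non-zero on failure. It does not check whether a saved state has expired.

```bash
# Hypothetical override so the helper can be pointed at a stub in tests;
# defaults to the real CLI.
AGENT_BROWSER="${AGENT_BROWSER:-agent-browser}"

# ensure_login STATE_FILE LOGIN_URL
# Load saved state if STATE_FILE exists; otherwise log in and save it.
ensure_login() {
  state_file="$1"
  login_url="$2"

  if [ -f "$state_file" ]; then
    "$AGENT_BROWSER" state load "$state_file"
    return
  fi

  # Placeholder refs: replace @e1..@e3 with refs from the snapshot output.
  "$AGENT_BROWSER" open "$login_url" &&
  "$AGENT_BROWSER" snapshot -i &&
  "$AGENT_BROWSER" fill @e1 "$USERNAME" &&
  "$AGENT_BROWSER" fill @e2 "$PASSWORD" &&
  "$AGENT_BROWSER" click @e3 &&
  "$AGENT_BROWSER" wait --url "**/dashboard" &&
  "$AGENT_BROWSER" state save "$state_file"
}
```

A call such as `ensure_login auth.json https://app.example.com/login` would then run before opening the dashboard, with `USERNAME` and `PASSWORD` set in the environment.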

Data Extraction

bash
agent-browser open https://example.com/products
agent-browser snapshot -i
agent-browser get text @e5           # Get specific element text
agent-browser get text body > page.txt  # Get all page text

# JSON output for parsing
agent-browser snapshot -i --json
agent-browser get text @e1 --json

Parallel Sessions

bash
agent-browser --session site1 open https://site-a.com
agent-browser --session site2 open https://site-b.com

agent-browser --session site1 snapshot -i
agent-browser --session site2 snapshot -i

agent-browser session list
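
When the list of sites grows, the per-session commands can be generated in a loop. This is only a sketch: `open_in_sessions` and the `AGENT_BROWSER` override variable are hypothetical names, the session names `site1`, `site2`, ... simply follow the example, and the loop runs sessions one after another rather than concurrently.

```bash
# Hypothetical override so the helper can be pointed at a stub in tests;
# defaults to the real CLI.
AGENT_BROWSER="${AGENT_BROWSER:-agent-browser}"

# open_in_sessions URL...
# Open each URL in its own named session (site1, site2, ...) and snapshot it.
open_in_sessions() {
  i=0
  for url in "$@"; do
    i=$((i + 1))
    "$AGENT_BROWSER" --session "site$i" open "$url" || return
    "$AGENT_BROWSER" --session "site$i" snapshot -i || return
  done
}
```

For instance, `open_in_sessions https://site-a.com https://site-b.com` issues the same open and snapshot commands as the example, grouped per session.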

Visual Browser (Debugging)

bash
agent-browser --headed open https://example.com
agent-browser highlight @e1          # Highlight element
agent-browser record start demo.webm # Record session

iOS Simulator (Mobile Safari)

bash
# List available iOS simulators
agent-browser device list

# Launch Safari on a specific device
agent-browser -p ios --device "iPhone 16 Pro" open https://example.com

# Same workflow as desktop - snapshot, interact, re-snapshot
agent-browser -p ios snapshot -i
agent-browser -p ios tap @e1          # Tap (alias for click)
agent-browser -p ios fill @e2 "text"
agent-browser -p ios swipe up         # Mobile-specific gesture

# Take screenshot
agent-browser -p ios screenshot mobile.png

# Close session (shuts down simulator)
agent-browser -p ios close

**Requirements:** macOS with Xcode, Appium (`npm install -g appium && appium driver install xcuitest`)

**Real devices:** Works with physical iOS devices if pre-configured. Use `--device "<UDID>"` where UDID is from `xcrun xctrace list devices`.

Ref Lifecycle (Important)

Refs (`@e1`, `@e2`, etc.) are invalidated when the page changes. Always re-snapshot after:
  • Clicking links or buttons that navigate
  • Form submissions
  • Dynamic content loading (dropdowns, modals)
bash
agent-browser click @e5              # Navigates to new page
agent-browser snapshot -i            # MUST re-snapshot
agent-browser click @e1              # Use new refs
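
One way to make the re-snapshot rule hard to forget is to bundle the click, the wait, and the fresh snapshot into a single helper. This is a minimal sketch: `click_and_resnapshot` and the `AGENT_BROWSER` override variable are hypothetical names, and the `&&` chaining assumes the CLI exits non-zero when a step fails.

```bash
# Hypothetical override so the helper can be pointed at a stub in tests;
# defaults to the real CLI.
AGENT_BROWSER="${AGENT_BROWSER:-agent-browser}"

# click_and_resnapshot REF
# Click REF, wait for the network to go idle, then print a fresh snapshot
# so that the next command uses new refs rather than stale ones.
click_and_resnapshot() {
  "$AGENT_BROWSER" click "$1" &&
  "$AGENT_BROWSER" wait --load networkidle &&
  "$AGENT_BROWSER" snapshot -i
}
```

After `click_and_resnapshot @e5`, the refs to use are the ones in the snapshot it prints, not the ones from before the click.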

Semantic Locators (Alternative to Refs)

When refs are unavailable or unreliable, use semantic locators:
bash
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
agent-browser find role button click --name "Submit"
agent-browser find placeholder "Search" type "query"
agent-browser find testid "submit-btn" click

Deep-Dive Documentation

| Reference | When to Use |
|---|---|
| references/commands.md | Full command reference with all options |
| references/snapshot-refs.md | Ref lifecycle, invalidation rules, troubleshooting |
| references/session-management.md | Parallel sessions, state persistence, concurrent scraping |
| references/authentication.md | Login flows, OAuth, 2FA handling, state reuse |
| references/video-recording.md | Recording workflows for debugging and documentation |
| references/proxy-support.md | Proxy configuration, geo-testing, rotating proxies |

Ready-to-Use Templates

| Template | Description |
|---|---|
| templates/form-automation.sh | Form filling with validation |
| templates/authenticated-session.sh | Login once, reuse state |
| templates/capture-workflow.sh | Content extraction with screenshots |

bash
./templates/form-automation.sh https://example.com/form
./templates/authenticated-session.sh https://app.example.com/login
./templates/capture-workflow.sh https://example.com ./output