agent-browser

Browser Automation with agent-browser

Core Workflow

Every browser automation follows this pattern:
  1. Navigate: agent-browser open <url>
  2. Snapshot: agent-browser snapshot -i (get element refs like @e1, @e2)
  3. Interact: Use refs to click, fill, select
  4. Re-snapshot: After navigation or DOM changes, get fresh refs
bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"

agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i  # Check result
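
The navigate-then-snapshot steps above can be wrapped in a small helper. This is a minimal sketch, not part of agent-browser itself: the function name open_and_snapshot and the AB variable (which lets the binary be swapped for a stub) are hypothetical, and it assumes the CLI exits non-zero on failure, which the text above does not confirm.

```shell
# Hypothetical convenience: AB names the binary so it can be stubbed in tests.
AB="${AB:-agent-browser}"

# Navigate to a URL, then print a fresh interactive snapshot.
# Assumption: the CLI returns a non-zero exit status when a step fails.
open_and_snapshot() {
  url="$1"
  "$AB" open "$url" || return 1
  "$AB" snapshot -i
}
```

Calling open_and_snapshot https://example.com/form would print the snapshot, from which the refs for the fill and click steps are read.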

Essential Commands

bash
# Navigation
agent-browser open <url>                # Navigate (aliases: goto, navigate)
agent-browser close                     # Close browser

# Snapshot
agent-browser snapshot -i               # Interactive elements with refs (recommended)
agent-browser snapshot -s "#selector"   # Scope to CSS selector

# Interaction (use @refs from snapshot)
agent-browser click @e1                 # Click element
agent-browser fill @e2 "text"           # Clear and type text
agent-browser type @e2 "text"           # Type without clearing
agent-browser select @e1 "option"       # Select dropdown option
agent-browser check @e1                 # Check checkbox
agent-browser press Enter               # Press key
agent-browser scroll down 500           # Scroll page

# Get information
agent-browser get text @e1              # Get element text
agent-browser get url                   # Get current URL
agent-browser get title                 # Get page title

# Wait
agent-browser wait @e1                  # Wait for element
agent-browser wait --load networkidle   # Wait for network idle
agent-browser wait --url "**/page"      # Wait for URL pattern
agent-browser wait 2000                 # Wait milliseconds

# Capture
agent-browser screenshot                # Screenshot to temp dir
agent-browser screenshot --full         # Full page screenshot
agent-browser pdf output.pdf            # Save as PDF
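
The navigation, wait, and capture commands listed above compose naturally. The following is a rough sketch under stated assumptions: capture_page and the AB variable are hypothetical names, only commands shown in this document are used, and a non-zero exit status on failure is assumed rather than documented here.

```shell
# Hypothetical convenience: AB names the binary so it can be stubbed in tests.
AB="${AB:-agent-browser}"

# Open a page, wait for the network to settle, then capture it two ways.
# The full-page screenshot goes to the tool's temp dir; the PDF goes to $2.
capture_page() {
  url="$1"
  pdf_path="$2"
  "$AB" open "$url" || return 1
  "$AB" wait --load networkidle || return 1
  "$AB" screenshot --full || return 1
  "$AB" pdf "$pdf_path"
}
```

For example, capture_page https://example.com output.pdf would run the four commands in order and stop at the first one that fails.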

Common Patterns

Form Submission

bash
agent-browser open https://example.com/signup
agent-browser snapshot -i
agent-browser fill @e1 "Jane Doe"
agent-browser fill @e2 "jane@example.com"
agent-browser select @e3 "California"
agent-browser check @e4
agent-browser click @e5
agent-browser wait --load networkidle
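
After a form submission it can help to confirm that the page actually moved on. The sketch below is one possible check, not a documented agent-browser feature: submit_and_verify and AB are hypothetical names, and it assumes that get url prints the current URL to standard output and that commands exit non-zero on failure.

```shell
# Hypothetical convenience: AB names the binary so it can be stubbed in tests.
AB="${AB:-agent-browser}"

# Click the submit ref, wait for the page to settle, then check that the
# current URL contains the expected fragment (e.g. "/welcome").
submit_and_verify() {
  submit_ref="$1"
  expected="$2"
  "$AB" click "$submit_ref" || return 1
  "$AB" wait --load networkidle || return 1
  # Assumption: "get url" writes the URL to stdout.
  current=$("$AB" get url) || return 1
  case "$current" in
    *"$expected"*) return 0 ;;
    *) echo "unexpected URL: $current" >&2; return 1 ;;
  esac
}
```

In the signup example, submit_and_verify @e5 /welcome would replace the final click and wait, where /welcome is a made-up path standing in for whatever the site redirects to.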

Authentication with State Persistence

bash
# Login once and save state
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "$USERNAME"
agent-browser fill @e2 "$PASSWORD"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json

# Reuse in future sessions
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
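
The two halves of this pattern can be combined so a script logs in only when no saved state exists. This is a sketch under assumptions: ensure_logged_in and AB are hypothetical names, the refs @e1 to @e3 simply mirror the example above and depend on the actual page, and a non-zero exit status on failure is assumed.

```shell
# Hypothetical convenience: AB names the binary so it can be stubbed in tests.
AB="${AB:-agent-browser}"

# Reuse saved auth state when the file exists; otherwise log in and save it.
# Reads credentials from $USERNAME and $PASSWORD, as in the example above.
ensure_logged_in() {
  state_file="$1"
  login_url="$2"
  if [ -f "$state_file" ]; then
    "$AB" state load "$state_file"
    return
  fi
  "$AB" open "$login_url" || return 1
  "$AB" snapshot -i || return 1
  "$AB" fill @e1 "$USERNAME" || return 1   # ref is an assumption about the page
  "$AB" fill @e2 "$PASSWORD" || return 1   # ref is an assumption about the page
  "$AB" click @e3 || return 1              # ref is an assumption about the page
  "$AB" wait --url "**/dashboard" || return 1
  "$AB" state save "$state_file"
}
```

A typical call would be ensure_logged_in auth.json https://app.example.com/login. Note that this sketch only checks that the file exists; it does not detect a saved state whose session has expired.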

Data Extraction

bash
agent-browser open https://example.com/products
agent-browser snapshot -i
agent-browser get text @e5              # Get specific element text
agent-browser get text body > page.txt  # Get all page text

# JSON output for parsing
agent-browser snapshot -i --json
agent-browser get text @e1 --json
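
The extraction commands above can be collected into one step that leaves both the page text and a JSON snapshot on disk. This is an illustrative sketch: extract_page, AB, and the output file names are hypothetical, and the structure of the JSON is not described in this document, so the sketch stores it without parsing it.

```shell
# Hypothetical convenience: AB names the binary so it can be stubbed in tests.
AB="${AB:-agent-browser}"

# Open a page and save its text and a JSON snapshot under an output directory.
extract_page() {
  url="$1"
  out_dir="$2"
  mkdir -p "$out_dir" || return 1
  "$AB" open "$url" || return 1
  "$AB" get text body > "$out_dir/page.txt" || return 1
  "$AB" snapshot -i --json > "$out_dir/snapshot.json" || return 1
  # Succeed only if some page text was actually written.
  [ -s "$out_dir/page.txt" ]
}
```

For instance, extract_page https://example.com/products ./output would write ./output/page.txt and ./output/snapshot.json.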

Parallel Sessions

bash
agent-browser --session site1 open https://site-a.com
agent-browser --session site2 open https://site-b.com

agent-browser --session site1 snapshot -i
agent-browser --session site2 snapshot -i

agent-browser session list
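
With more than a couple of sites, the same pattern can be driven from a loop. The sketch below makes several assumptions that this document does not confirm: that separate sessions can safely be started concurrently as background jobs, and that failures can be ignored at this stage. The names open_sessions and AB, and the generated session names site1, site2, and so on, are hypothetical.

```shell
# Hypothetical convenience: AB names the binary so it can be stubbed in tests.
AB="${AB:-agent-browser}"

# Open each URL in its own named session in the background, wait for all
# of them, then snapshot the sessions one by one in order.
open_sessions() {
  i=0
  for url in "$@"; do
    i=$((i + 1))
    "$AB" --session "site$i" open "$url" &
  done
  wait   # blocks until every background open has finished
  n=1
  while [ "$n" -le "$i" ]; do
    "$AB" --session "site$n" snapshot -i
    n=$((n + 1))
  done
}
```

Running open_sessions https://site-a.com https://site-b.com reproduces the example above. If concurrent startup turns out to be unreliable, dropping the trailing & makes the opens sequential.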

Visual Browser (Debugging)

bash
agent-browser --headed open https://example.com
agent-browser highlight @e1          # Highlight element
agent-browser record start demo.webm # Record session

Ref Lifecycle (Important)

Refs (@e1, @e2, etc.) are invalidated when the page changes. Always re-snapshot after:
  • Clicking links or buttons that navigate
  • Form submissions
  • Dynamic content loading (dropdowns, modals)
bash
agent-browser click @e5              # Navigates to new page
agent-browser snapshot -i            # MUST re-snapshot
agent-browser click @e1              # Use new refs
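
One way to make the re-snapshot rule hard to forget is to bundle it with the click. This is a minimal sketch, assuming commands exit non-zero on failure; click_and_resnapshot and AB are hypothetical names, and the networkidle wait is borrowed from the earlier examples rather than required by the rule itself.

```shell
# Hypothetical convenience: AB names the binary so it can be stubbed in tests.
AB="${AB:-agent-browser}"

# Click a ref, let the page settle, then print a fresh snapshot so that
# every later command uses refs from the new page state.
click_and_resnapshot() {
  ref="$1"
  "$AB" click "$ref" || return 1
  "$AB" wait --load networkidle || return 1
  "$AB" snapshot -i
}
```

After click_and_resnapshot @e5, the refs printed by the new snapshot replace the old ones; @e1 on the new page is unrelated to @e1 on the previous page.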

Semantic Locators (Alternative to Refs)

When refs are unavailable or unreliable, use semantic locators:
bash
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
agent-browser find role button click --name "Submit"
agent-browser find placeholder "Search" type "query"
agent-browser find testid "submit-btn" click
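
Refs and semantic locators can also be combined, trying the ref first and falling back to visible text. The sketch below rests on an assumption this document does not state: that clicking a missing or stale ref fails with a non-zero exit status rather than silently clicking a different element. The names click_with_fallback and AB are hypothetical.

```shell
# Hypothetical convenience: AB names the binary so it can be stubbed in tests.
AB="${AB:-agent-browser}"

# Try to click by ref; if that command fails, fall back to locating the
# element by its visible text.
click_with_fallback() {
  ref="$1"
  label="$2"
  "$AB" click "$ref" || "$AB" find text "$label" click
}
```

For example, click_with_fallback @e3 "Sign In" would only run the find text form when the ref-based click reports failure.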

Deep-Dive Documentation

| Reference | When to Use |
| --- | --- |
| references/commands.md | Full command reference with all options |
| references/snapshot-refs.md | Ref lifecycle, invalidation rules, troubleshooting |
| references/session-management.md | Parallel sessions, state persistence, concurrent scraping |
| references/authentication.md | Login flows, OAuth, 2FA handling, state reuse |
| references/video-recording.md | Recording workflows for debugging and documentation |
| references/proxy-support.md | Proxy configuration, geo-testing, rotating proxies |

Ready-to-Use Templates

| Template | Description |
| --- | --- |
| templates/form-automation.sh | Form filling with validation |
| templates/authenticated-session.sh | Login once, reuse state |
| templates/capture-workflow.sh | Content extraction with screenshots |
bash
./templates/form-automation.sh https://example.com/form
./templates/authenticated-session.sh https://app.example.com/login
./templates/capture-workflow.sh https://example.com ./output