browser-automation

Browser Automation with agent-browser

Headless browser CLI by Vercel. Full upstream docs: github.com/vercel-labs/agent-browser

Installation

bash
npm install -g agent-browser
agent-browser install                # Download Chromium
agent-browser install --with-deps    # With system dependencies (Linux)

Optional: npx skills add vercel-labs/agent-browser

Quick Start

bash
agent-browser open <url>          # Navigate to page
agent-browser snapshot -i         # Get interactive elements with refs
agent-browser click @e1           # Click element by ref
agent-browser fill @e2 "text"     # Fill input by ref
agent-browser close               # Close browser

Core Concept: Snapshot + Refs

Run agent-browser snapshot -i to get interactive elements tagged @e1, @e2, etc. Use these refs for all subsequent interactions. Re-snapshot after navigation or significant DOM changes. This yields 93% less context than full-DOM approaches.
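
The re-snapshot rule can be sketched as a small helper that pairs every click with a fresh snapshot. This is a minimal sketch, not part of agent-browser itself: the AB variable and the click_and_resnapshot function are hypothetical names introduced here, and the refs are placeholders, since real refs come from the snapshot output.

```shell
# Sketch only: AB and click_and_resnapshot are hypothetical helpers.
# AB lets the binary be swapped out (for example with a stub for a dry run).
AB="${AB:-agent-browser}"

# Click an element by ref, then re-snapshot immediately.
# A click may navigate or change the DOM, so refs from the
# previous snapshot should be treated as stale afterwards.
click_and_resnapshot() {
  "$AB" click "$1"
  "$AB" snapshot -i
}

# Example usage (refs are placeholders taken from a prior snapshot):
#   agent-browser open <url>
#   agent-browser snapshot -i
#   click_and_resnapshot @e1
```

Only commands listed in this document are used; the refs printed by the second snapshot replace the earlier ones for any following fill or click.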

When to Use

  • Web scraping from JS-rendered / SPA pages
  • Form automation and multi-step workflows
  • Screenshot capture and visual verification
  • E2E test generation and debugging
  • Content capture from authenticated pages
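
For the first use case, a JS-rendered page usually needs to settle before its elements can be listed. The sketch below assumes that opening the page, waiting for network idle, and then snapshotting is a reasonable order; open_rendered and the AB variable are hypothetical names, and the refs and output path in the usage comments are placeholders.

```shell
# Sketch only: AB and open_rendered are hypothetical helpers.
AB="${AB:-agent-browser}"

# Open a page, wait for the SPA to finish loading, then list
# interactive elements so their refs can be used for extraction.
open_rendered() {
  "$AB" open "$1"
  "$AB" wait --load networkidle
  "$AB" snapshot -i
}

# Example usage (the ref and the path are placeholders):
#   open_rendered <url>
#   agent-browser get text @e1
#   agent-browser screenshot <path>
#   agent-browser close
```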

Key Commands

Command                    Purpose
open <url>                 Navigate to URL
snapshot -i                Interactive elements with refs
click @e1                  Click element
fill @e2 "text"            Clear + type into input
get text @e1               Extract element text
wait --load networkidle    Wait for SPA render
screenshot <path>          Save screenshot
state save <file>          Persist cookies/storage
state load <file>          Restore session
eval "js"                  Run JavaScript
record start <path>        Start video recording
record stop                Stop recording
--session <name>           Isolate parallel sessions
--headed                   Show browser window

Run agent-browser --help for the full 60+ command reference.
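
As an illustration of state save and state load, the sketch below reuses a saved session when a state file already exists. It rests on assumptions: that loading state before opening the page is the intended order, and that a JSON file is a suitable target. STATE_FILE, AB, open_with_saved_state, and save_state are hypothetical names introduced for this example.

```shell
# Sketch only: all names below are hypothetical helpers.
AB="${AB:-agent-browser}"
STATE_FILE="${STATE_FILE:-auth-state.json}"   # placeholder file name

# Restore cookies/storage if a state file exists, then open the page.
# Assumption: state is loaded before navigation.
open_with_saved_state() {
  if [ -f "$STATE_FILE" ]; then
    "$AB" state load "$STATE_FILE"
  fi
  "$AB" open "$1"
}

# Persist cookies/storage after a successful login.
save_state() {
  "$AB" state save "$STATE_FILE"
}

# Example usage:
#   open_with_saved_state <url>
#   ... log in with snapshot -i, fill, and click if needed ...
#   save_state
```

On the first run no file exists, so the helper only opens the page; later runs load the saved state first.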

OrchestKit Integration

Safety hook: agent-browser-safety.ts automatically blocks destructive patterns (credential exfiltration, recursive spawning) via a pretool hook.
Sessions: use --session <name> to run isolated parallel browsers within a single Claude Code session.
Environment variables:
bash
AGENT_BROWSER_SESSION="my-session"   # Default session name
AGENT_BROWSER_PROFILE="/path"        # Persistent browser profile
AGENT_BROWSER_PROVIDER="browserbase" # Cloud provider (browserbase | kernel | browseruse)
AGENT_BROWSER_HEADED=1               # Run headed
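
A minimal sketch of running two isolated browsers side by side follows. It assumes the --session flag is placed before the subcommand; in_session and AB are hypothetical helper names, and the session names are arbitrary examples.

```shell
# Sketch only: AB and in_session are hypothetical helpers.
AB="${AB:-agent-browser}"

# Run any agent-browser command inside a named session.
# Assumption: --session <name> precedes the subcommand.
in_session() {
  session_name="$1"
  shift
  "$AB" --session "$session_name" "$@"
}

# Example usage (session names are arbitrary):
#   in_session shop open <url>
#   in_session docs open <url>
#   in_session shop snapshot -i
#   in_session docs snapshot -i
#   in_session shop close
#   in_session docs close
```

Setting AGENT_BROWSER_SESSION instead would presumably give every command in the shell the same default session, which suits a single long-running flow better than parallel ones.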

Upstream Documentation

  • GitHub repository: vercel-labs/agent-browser
  • CLI help: agent-browser --help
  • Add the skill: npx skills add vercel-labs/agent-browser

Related Skills

  • browser-content-capture: Content extraction patterns using agent-browser
  • webapp-testing: E2E testing with Playwright test framework
  • e2e-testing: End-to-end testing patterns