open-browser-use
Open Browser Use
Overview
Open Browser Use connects an MV3 Chrome extension, a local native messaging host, a CLI, SDKs, and an optional stdio MCP server so agents can automate a real Chrome profile. It is not Codex.app-specific; adapt the commands, MCP config, and SDK examples to the agent runtime you are operating in.
Core Workflow
- Check setup with `open-browser-use ping` or `obu ping`. If it fails because setup is missing, read references/installation.md.
- Choose a unique browser session id for the current agent task before opening or claiming tabs. Prefer the surrounding runtime's conversation/session id when available; otherwise create a short unique id such as `obu-<task-slug>-<timestamp>`. Reuse that same id for every Open Browser Use command in this task.
- Name the current browser task group before opening or claiming tabs. Use a short task label followed by ` - OBU`; if no better task label is available, use `Task - OBU`.
- Use the CLI for simple inspection or one-shot actions: `info`, `tabs`, `user-tabs`, `history`, `open-tab`, `navigate`, `cdp`, and `call`.
- If the surrounding agent runtime supports local MCP servers, configure `obu mcp` and call the exposed browser tools directly. Read references/sdk-and-protocol.md.
- Use the JavaScript, Python, or Go SDK for multi-step workflows, event subscriptions, or when the surrounding agent runtime already runs code. Read references/sdk-and-protocol.md.
- Before ending browser work, release or keep session tabs with `open-browser-use finalize-tabs --session-id "$OBU_SESSION_ID" --keep '<json-array>'`, the MCP `finalize_tabs` tool, or the SDK `finalizeTabs`/`finalize_tabs`/`FinalizeTabs` method.
- If communication fails after setup, read references/troubleshooting.md.
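The session id and task group naming conventions above can be sketched as a few shell helpers. This is a minimal sketch, not part of the CLI: the function names `obu_slug`, `obu_new_session_id`, `obu_pick_session_id`, and `obu_group_name` are hypothetical, and reusing a runtime-provided id unchanged is an assumption about how a surrounding runtime might expose it.

```shell
#!/bin/sh
# Hypothetical helpers; none of these function names are part of the CLI.

# Turn a free-form task label into a lowercase, hyphen-separated slug.
obu_slug() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' \
    | sed -e 's/^-//' -e 's/-$//'
}

# Build an id shaped like obu-<task-slug>-<timestamp>.
obu_new_session_id() {
  printf 'obu-%s-%s\n' "$(obu_slug "$1")" "$(date +%Y%m%d%H%M%S)"
}

# Prefer a runtime-provided id ($1) when it is non-empty; otherwise
# generate one from the task label ($2).
obu_pick_session_id() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"
  else
    obu_new_session_id "$2"
  fi
}

# Build the task group name, falling back to "Task - OBU".
obu_group_name() {
  if [ -n "$1" ]; then
    printf '%s - OBU\n' "$1"
  else
    printf 'Task - OBU\n'
  fi
}
```

A task would call `obu_pick_session_id` once, export the result as `OBU_SESSION_ID`, and pass that same value to every later command.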
Operating Rules
- Treat the browser as the user's real Chrome profile. Do not inspect cookies, passwords, session stores, or unrelated browser data.
- Ask the user before installing the extension, opening Chrome for them, enabling extension permissions, uploading local files, reading/writing clipboard data, submitting forms, purchasing, deleting, sending, or making other externally visible changes.
- Do not assume Codex.app helpers, Node REPL globals, or a bundled plugin UI are available. Use the installed `open-browser-use`/`obu` CLI or the published SDKs.
- Do not guess tab ids. List tabs first, then use ids returned by `tabs`, `user-tabs`, `open-tab`, or SDK calls.
- Prefer `claim-tab`/`claimUserTab` for existing user tabs. Claiming should be based on the current `user-tabs` result and visible evidence such as URL, title, recency, or group.
- Use `--socket` only when the user or runtime provides an explicit socket. Otherwise let the CLI and SDKs discover the active socket registry.
- Do not rely on the CLI fallback session `obu-cli` for agent tasks. Always pass a task-unique `--session-id` to CLI and MCP commands, or set `sessionId`/`session_id`/`SessionID` in SDK clients. The fallback exists for quick manual use and can reuse stale task groups across unrelated agent sessions.
- Direct CLI subcommands and `open-browser-use run` can share the same browser session only when they use the same explicit `--session-id`. Finalize that same session before ending browser work.
- Use `call --method <method> --params '<json>'` only when no safer convenience command or SDK wrapper exists.
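One way to enforce the explicit session id rule is a thin wrapper that refuses to run when no task-unique id is set. The sketch below is an illustration only: `obu_run` is a hypothetical function, and `OBU_BIN` is a hypothetical override used here for dry runs, not a variable the CLI documents.

```shell
#!/bin/sh
# Hypothetical wrapper: always injects --session-id right after the
# subcommand and rejects the fallback "obu-cli" session.
obu_run() {
  if [ -z "${OBU_SESSION_ID:-}" ] || [ "$OBU_SESSION_ID" = "obu-cli" ]; then
    echo "refusing to run: set a task-unique OBU_SESSION_ID first" >&2
    return 1
  fi
  if [ "$#" -eq 0 ]; then
    echo "usage: obu_run <subcommand> [args...]" >&2
    return 2
  fi
  subcommand="$1"
  shift
  # OBU_BIN is an assumed override; it defaults to the installed CLI.
  "${OBU_BIN:-open-browser-use}" "$subcommand" --session-id "$OBU_SESSION_ID" "$@"
}

# Example (requires the CLI to be installed):
#   OBU_SESSION_ID="obu-docs-scan-20250101120000"
#   obu_run tabs
#   obu_run open-tab --url https://example.com
```

Because the wrapper places `--session-id` directly after the subcommand, it produces the same argument order as the CLI examples in this document.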
Common CLI Actions
```sh
export OBU_SESSION_ID="obu-docs-scan-$(date +%Y%m%d%H%M%S)"
open-browser-use ping --session-id "$OBU_SESSION_ID"
open-browser-use info --session-id "$OBU_SESSION_ID"
open-browser-use name-session --session-id "$OBU_SESSION_ID" --name "Task - OBU"
open-browser-use tabs --session-id "$OBU_SESSION_ID"
open-browser-use user-tabs --session-id "$OBU_SESSION_ID"
open-browser-use history --session-id "$OBU_SESSION_ID" --query "example" --limit 20
open-browser-use open-tab --session-id "$OBU_SESSION_ID" --url https://example.com
open-browser-use navigate --session-id "$OBU_SESSION_ID" --tab-id <tab-id> --url https://example.com
open-browser-use cdp --session-id "$OBU_SESSION_ID" --tab-id <tab-id> --method Runtime.evaluate --params '{"expression":"document.title"}'
open-browser-use finalize-tabs --session-id "$OBU_SESSION_ID" --keep '[]'
```

For CLI-level orchestration without writing SDK code, use a line-oriented action plan:

```sh
open-browser-use run --session-id "$OBU_SESSION_ID" -c '
name-session "Docs scan - OBU"
open-tab https://docs.browser-use.com
wait-load domcontentloaded
page-info
finalize-tabs []
'
```

Each action line shares one session/turn. `open-tab` and `claim-tab` set the default tab for later tab-scoped actions such as `wait-load`, `page-info`, `navigate`, `cdp`, `move-mouse`, and `wait-file-chooser`.

Use `obu` as the short alias when available.
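When the same plan shape is needed for several pages, the plan text can be generated rather than hand-written. The helper below is a hedged sketch: `obu_build_plan` is a hypothetical function, it emits only action names that appear in this document, and opening several tabs within one plan is an assumption rather than documented behavior.

```shell
#!/bin/sh
# Hypothetical helper: print a line-oriented action plan that names the
# task group, opens each URL in turn, waits for it, and inspects it.
# The plan ends by finalizing with an empty keep list.
obu_build_plan() {
  group_name="$1"
  shift
  printf 'name-session "%s"\n' "$group_name"
  for url in "$@"; do
    printf 'open-tab %s\n' "$url"
    printf 'wait-load domcontentloaded\n'
    printf 'page-info\n'
  done
  printf 'finalize-tabs []\n'
}

# Example (requires the CLI to be installed):
#   open-browser-use run --session-id "$OBU_SESSION_ID" \
#     -c "$(obu_build_plan "Docs scan - OBU" https://docs.browser-use.com)"
```

Printing the plan first also makes it easy to review the exact actions before handing them to `run`.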
MCP Usage
For runtimes that can launch local MCP servers over stdio, use:
```toml
[mcp_servers.open_browser_use]
command = "obu"
args = ["mcp", "--session-id", "obu-<task-or-conversation-id>"]
```

Use a fresh `--session-id` value per agent task or conversation. If the runtime has a stable conversation/session id, derive the MCP `--session-id` from it.

The MCP server exposes tools including `user_tabs`, `open_tab`, `claim_tab`, `navigate`, `wait_load`, `page_info`, `cdp`, `history`, `run_action_plan`, `finalize_tabs`, and unrestricted `call`.
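Deriving the MCP `--session-id` from a runtime conversation id can be scripted so the config is regenerated per task. The sketch below assumes the runtime can hand its conversation id to a shell script; `obu_mcp_config` is a hypothetical helper, and the character whitelist is a conservative choice made here to keep the TOML string valid, not a rule from the tool.

```shell
#!/bin/sh
# Hypothetical helper: print the MCP server config for one conversation.
# Ids with characters outside A-Za-z0-9._- are rejected so the value can
# be embedded in a TOML string without escaping.
obu_mcp_config() {
  case "$1" in
    ''|*[!A-Za-z0-9._-]*)
      echo "invalid or empty conversation id" >&2
      return 1
      ;;
  esac
  cat <<EOF
[mcp_servers.open_browser_use]
command = "obu"
args = ["mcp", "--session-id", "obu-$1"]
EOF
}
```

For example, `obu_mcp_config conv-42` prints a config whose session id is `obu-conv-42`.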
Tab Lifecycle
- Session tabs are tabs Open Browser Use has created or claimed for the current agent workflow.
- Use one unique session id per agent task or conversation. Do not share the fallback `obu-cli` session across unrelated tasks.
- Task session groups should be named from the task, using the pattern `<short task> - OBU`. Use `Task - OBU` as the fallback name.
- Keep no tabs by default: `open-browser-use finalize-tabs --session-id "$OBU_SESSION_ID" --keep '[]'`.
- Keep a tab only when the user needs that live page after the turn. Omit research, source, search, intermediate, duplicate, blank, error, and login/navigation tabs after extracting what you need.
- Keep a tab with `status: "deliverable"` when the tab itself is the user-facing output or requested open page, such as a created or edited document, dashboard, checkout/cart, submitted form result, or a page the user explicitly asked to inspect directly.
- Keep a tab with `status: "handoff"` only when the task is still in progress and the user or a later turn should continue from the current task group, such as a page waiting for user input, login, approval, payment, CAPTCHA, or an unfinished workflow.
- Handoff tabs stay in the task session group. Deliverable tabs move to the shared `✅ Open Browser Use` tab group.
- Run finalization as the last Open Browser Use browser action for the turn. Do not call Open Browser Use browser tools after finalizing; if more browser work is needed, do it first and finalize once with the final tab disposition.
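The keep/release rules above amount to a small decision table. The sketch below encodes it as a shell function for illustration; `obu_tab_disposition` and the role labels are hypothetical names an agent might assign to tabs while working, and only the `deliverable` and `handoff` status values come from this document. A result of `release` means the tab is simply left out of the `--keep` array.

```shell
#!/bin/sh
# Hypothetical helper: map an agent-assigned tab role to a disposition.
# Role labels are illustrative; "deliverable" and "handoff" are the
# status values described in this document.
obu_tab_disposition() {
  case "$1" in
    # The tab itself is the user-facing output or requested open page.
    document|dashboard|cart|form-result|requested-page)
      echo "deliverable" ;;
    # The task is unfinished and someone must continue from this page.
    awaiting-input|awaiting-login|approval|payment|captcha|unfinished)
      echo "handoff" ;;
    # Research, source, search, intermediate, duplicate, blank, error,
    # and navigation tabs are released by default.
    *)
      echo "release" ;;
  esac
}
```

Defaulting unknown roles to `release` matches the rule that no tabs are kept unless the user needs the live page.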
File Choosers, Downloads, And Clipboard
- File uploads use the intercepted file chooser flow: start waiting, trigger the chooser in the page, then set absolute local paths with `set-file-chooser-files` or the SDK equivalent.
- Downloads can be observed with SDK notification handlers or Browser Use methods such as `waitForDownload` and `downloadPath`.
- Clipboard helpers operate through the current controlled tab and should be treated as sensitive user actions.
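Because the file chooser step expects absolute local paths, it can help to normalize a user-supplied path before passing it on. The helper below is a minimal sketch: `obu_abs_path` is a hypothetical function, and it deliberately stops short of calling `set-file-chooser-files`, whose exact arguments are not shown in this document.

```shell
#!/bin/sh
# Hypothetical helper: resolve a possibly relative path to an absolute
# one, failing when the file does not exist.
obu_abs_path() {
  if [ ! -e "$1" ]; then
    echo "no such file: $1" >&2
    return 1
  fi
  dir="$(cd "$(dirname "$1")" && pwd -P)" || return 1
  case "$dir" in
    /) printf '/%s\n' "$(basename "$1")" ;;
    *) printf '%s/%s\n' "$dir" "$(basename "$1")" ;;
  esac
}
```

Failing early on a missing file avoids triggering the chooser in the page and then having nothing valid to set. Uploading local files still requires asking the user first.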
References
- references/installation.md: one-time CLI and browser extension setup, including cases where user cooperation is required.
- references/sdk-and-protocol.md: JavaScript, Python, Go, socket, and JSON-RPC usage details.
- references/troubleshooting.md: connection failures, stale sockets, extension/native host checks, and permission issues.