chrome-relay

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

Chrome Relay

Drives the user's real Chrome through a Chrome extension + local native host. Prefer it when logged-in browser state (auth cookies, sessions, installed extensions) matters.

通过Chrome扩展程序+本地原生主机驱动用户的真实Chrome浏览器。当需要用到已登录的浏览器状态（认证Cookie、会话、已安装的扩展程序）时，优先使用此工具。

Setup

安装配置

Chrome extension

CLI:

pnpm add -g chrome-relay
chrome-relay install
chrome-relay doctor

Verify CLI ≥ 0.5.20 — earlier versions have a silent click bug on Radix/React-Aria UIs:

chrome-relay --version

Chrome扩展程序

CLI：

pnpm add -g chrome-relay
chrome-relay install
chrome-relay doctor

请确认CLI版本≥0.5.20——早期版本在Radix/React-Aria界面上存在静默点击bug：

chrome-relay --version

Tool surface

工具功能列表

Command	What it does
`tabs`	List windows + tabs with their `tabId` s
`navigate <url>`	Open in current tab. `--new` opens in a background tab (default). `--active` brings it to foreground. `--tab <id>` retargets an existing tab.
`read --tab <id> -i`	Interaction map: visible interactive elements with selectors, text, role, bounds. Pipe to file.
`ax --tab <id>`	Accessibility tree — ~30× smaller than `read` , more semantic. Returns `backendDOMNodeId` s.
`click <selector> --tab <id>`	Trusted hover + press + release at element center (CDP `Input.dispatchMouseEvent` with `pointerType: "mouse"` ).
`click --x N --y N --tab <id>`	Coordinate-mode click. Selector optional.
`click-ax --node <backendDOMNodeId> --tab <id>`	Click an element resolved from a prior `ax` call.
`hover [selector \| --x --y] --tab <id>`	Pointer move only — fires `:hover` styles.
`fill <selector> <value> --tab <id>`	Atomic value write into `<input>` / `<textarea>` / `<select>` . Bypasses React's value tracker.
`type <text> --tab <id> [-s <selector>]`	CDP `Input.insertText` . Use for contenteditable / Draft.js / Lexical / ProseMirror. Appends at caret; clear the input first if it had a value.
`keys <chord> --tab <id>`	Single key or chord: `Enter` , `Tab` , `Escape` , `Cmd+K` , `Shift+ArrowDown` .
`js <code> --tab <id>`	`Runtime.evaluate` in MAIN world. Use `return` for the value. Top-level `await` works.
`screenshot --tab <id> -o <path>`	PNG. `--full` captures beyond viewport. `--max-edge N` resizes.
`screencast --tab <id> -o <path>`	Record a tab via CDP (paint-driven). Requires an active tab.
`network --tab <id>`	HTTP request/response ring buffer, last 200 per tab. `network read --request-id <id>` for bodies.
`console --tab <id>`	`console.log/warn/error` + page exceptions, last 200.
`viewport`	Emulate device viewport, DPR, mobile flag, touch, UA.
`workspace` / `group`	Manage named windows / tab-groups so multiple agents can drive separate windows.
`switch <tabId>` / `close <tabIds...>`	Activate or close tabs
`self-reload`	Restart the extension's service worker after a rebuild
`release-notes --since <ver>` / `update`	Queryable changelog; agent-readable JSON.
`call <tool> [json]`	Raw pass-through for any internal tool.

命令	功能说明
`tabs`	列出窗口及标签页，并显示它们的 `tabId`
`navigate <url>`	在当前标签页打开指定URL。 `--new` 参数会在后台打开新标签页（默认行为）。 `--active` 参数会将标签页置于前台。 `--tab <id>` 参数会指定已存在的标签页作为目标。
`read --tab <id> -i`	交互映射：显示可见的交互元素及其选择器、文本、角色、边界。可将结果输出到文件。
`ax --tab <id>`	无障碍树——比 `read` 输出小约30倍，语义性更强。返回 `backendDOMNodeId` 。
`click <selector> --tab <id>`	在元素中心执行可信的悬停+按下+释放操作（通过CDP的 `Input.dispatchMouseEvent` ， `pointerType: "mouse"` ）。
`click --x N --y N --tab <id>`	坐标模式点击。选择器为可选参数。
`click-ax --node <backendDOMNodeId> --tab <id>`	点击通过之前 `ax` 调用获取到的元素。
`hover [selector \| --x --y] --tab <id>`	仅移动指针——触发 `:hover` 样式。
`fill <selector> <value> --tab <id>`	向 `<input>` / `<textarea>` / `<select>` 原子性写入值。绕过React的值跟踪机制。
`type <text> --tab <id> [-s <selector>]`	CDP的 `Input.insertText` 。适用于contenteditable/Draft.js/Lexical/ProseMirror。在光标位置追加文本；如果输入框已有内容，请先清空。
`keys <chord> --tab <id>`	单个按键或组合键： `Enter` 、 `Tab` 、 `Escape` 、 `Cmd+K` 、 `Shift+ArrowDown` 。
`js <code> --tab <id>`	在MAIN环境中执行 `Runtime.evaluate` 。使用 `return` 返回结果。支持顶级 `await` 。
`screenshot --tab <id> -o <path>`	生成PNG截图。 `--full` 参数捕获超出视口的内容。 `--max-edge N` 参数调整尺寸。
`screencast --tab <id> -o <path>`	通过CDP录制标签页（基于绘制事件）。需要标签页处于激活状态。
`network --tab <id>`	HTTP请求/响应环形缓冲区，每个标签页保留最近200条记录。使用 `network read --request-id <id>` 查看请求/响应体。
`console --tab <id>`	`console.log/warn/error` 输出及页面异常，保留最近200条记录。
`viewport`	模拟设备视口、设备像素比（DPR）、移动端标识、触摸功能、用户代理（UA）。
`workspace` / `group`	管理命名窗口/标签页组，以便多个Agent可以驱动不同的窗口。
`switch <tabId>` / `close <tabIds...>`	激活或关闭标签页
`self-reload`	重建后重启扩展程序的服务工作线程
`release-notes --since <ver>` / `update`	可查询的更新日志；Agent可读的JSON格式。
`call <tool> [json]`	直接调用任意内部工具的原始接口。

Picking the right text tool

选择合适的文本操作工具

Target element	Tool
`<input>` , `<textarea>` , `<select>` (including React-controlled)	`fill`
`[contenteditable]` , `role="textbox"` , Draft.js / Lexical / ProseMirror, X compose, LinkedIn DM, new Reddit composer	`type`
Submit, navigate menus, modifier shortcuts	`keys`
Combobox / autocomplete option selection	`type` into filter → `keys ArrowDown` → `keys Enter` (why)
Shadow DOM, framework-internal pokes, scraping, custom widgets	`js`

目标元素	工具
`<input>` 、 `<textarea>` 、 `<select>` （包括React受控组件）	`fill`
`[contenteditable]` 、 `role="textbox"` 、Draft.js/Lexical/ProseMirror、X编辑器、LinkedIn私信、新版Reddit编辑器	`type`
提交表单、导航菜单、修饰键快捷键	`keys`
组合框/自动补全选项选择	在筛选框中使用 `type` 输入 → `keys ArrowDown` → `keys Enter` （原因）
Shadow DOM、框架内部操作、数据爬取、自定义组件	`js`

Workflow

使用流程

Find the tab —
```
chrome-relay tabs
```

Open the page —

chrome-relay navigate "https://example.com" --new

(background by default)

Read structure — pipe to a file, don't dump 100KB into context:
sh
```
chrome-relay read --tab 1234 -i > /tmp/page.json
jq '.elements[] | select(.text | test("Compose"; "i"))' /tmp/page.json
```
For dense apps (LinkedIn, Notion), prefer
```
ax
```
— way smaller payload.

Act on the selectors:

chrome-relay click "<selector>" --tab 1234
chrome-relay fill "<selector>" "value" --tab 1234
chrome-relay type "tweet body" --tab 1234 -s "[data-testid=tweetTextarea_0]"
chrome-relay keys "Enter" --tab 1234

Drop to

js

when the DOM doesn't expose what you need:

chrome-relay js --tab 1234 "return document.title"
chrome-relay js --tab 1234 "const r = await fetch('/api/me'); return await r.json()"

Capture proof —

chrome-relay screenshot --tab 1234 -o /tmp/evidence.png

找到目标标签页 ——
```
chrome-relay tabs
```

打开页面 ——

chrome-relay navigate "https://example.com" --new

（默认在后台打开）

读取页面结构 —— 将结果输出到文件，不要将100KB的内容直接传入上下文：
sh
```
chrome-relay read --tab 1234 -i > /tmp/page.json
jq '.elements[] | select(.text | test("Compose"; "i"))' /tmp/page.json
```
对于内容密集的应用（如LinkedIn、Notion），优先使用
```
ax
```
——输出内容小得多。

通过选择器执行操作：

chrome-relay click "<selector>" --tab 1234
chrome-relay fill "<selector>" "value" --tab 1234
chrome-relay type "tweet body" --tab 1234 -s "[data-testid=tweetTextarea_0]"
chrome-relay keys "Enter" --tab 1234

当DOM未暴露所需内容时，使用

js

命令：

chrome-relay js --tab 1234 "return document.title"
chrome-relay js --tab 1234 "const r = await fetch('/api/me'); return await r.json()"

捕获操作证据 ——

chrome-relay screenshot --tab 1234 -o /tmp/evidence.png

Top gotchas

常见陷阱

type
appends — it inserts at the caret. If the input had a value (autosaved draft, default text), clear it first via
```
js
```
or
```
keys
```
(Cmd+A then Backspace).
Coords go stale fast — read
```
getBoundingClientRect
```
, scroll/reflow, then click → you hit the wrong element. For autocomplete popups especially, use keyboard nav, not coord clicks.
Click "succeeded" but nothing happened — first diagnostic:
```
document.elementFromPoint(x, y)
```
. If it returns a wrapper or form background, your coords are wrong. If it returns the right element but state didn't change, you're likely on chrome-relay <0.5.20 — upgrade.

More recipes: references/patterns.md Failure modes: references/troubleshooting.md

type
命令会追加文本——它在光标位置插入文本。如果输入框已有内容（自动保存的草稿、默认文本），请先通过
```
js
```
或
```
keys
```
命令清空（Cmd+A然后按Backspace）。
坐标会快速失效——读取
```
getBoundingClientRect
```
后，页面滚动/重排，再点击就会命中错误元素。对于自动补全弹窗，尤其建议使用键盘导航，而非坐标点击。
显示点击"成功"但无实际效果——首先排查：
```
document.elementFromPoint(x, y)
```
。如果返回的是包装器或表单背景元素，说明坐标错误。如果返回正确元素但状态未改变，很可能是使用了chrome-relay <0.5.20版本——请升级。

更多使用示例：references/patterns.md 故障排查：references/troubleshooting.md

Operational guidance

操作指南

Don't give up early. A failing click is information, not a stop signal. Attach a document-level listener with

capture:true

and watch what fires:

chrome-relay js --tab 1234 "
  ['pointerdown','mousedown','click'].forEach(t =>
    document.addEventListener(t, e => console.log(t, e.target.tagName, e.target.className), {capture:true})
  );
  return 'listening'
"
# do the action, then:
chrome-relay console --tab 1234

Don't echo secrets. When extracting tokens / API keys via
```
js
```
, write the result directly to a file. Never
```
echo $TOKEN
```
or interpolate into shell strings — it ends up in scrollback, logs, and tool transcripts.
Capture before irreversible actions (form submit, send message, account change). Save the screenshot path.

不要过早放弃。点击失败是信息，而非终止信号。通过

capture:true

添加文档级监听器，观察触发的事件：

chrome-relay js --tab 1234 "
  ['pointerdown','mousedown','click'].forEach(t =>
    document.addEventListener(t, e => console.log(t, e.target.tagName, e.target.className), {capture:true})
  );
  return 'listening'
"
# 执行操作后，运行以下命令查看日志：
chrome-relay console --tab 1234

不要泄露敏感信息。通过
```
js
```
提取令牌/API密钥时，请直接将结果写入文件。切勿使用
```
echo $TOKEN
```
或将其插入shell字符串——这会导致敏感信息出现在回滚历史、日志和工具记录中。
在执行不可逆操作前捕获证据（表单提交、发送消息、账户变更）。保存截图路径。

Guardrails

注意事项

Pipe
```
read -i
```
to a file and grep/jq it. Don't paste the full element map into chat.
If a flag is unclear,
```
chrome-relay <command> --help
```
is authoritative — these docs lag.

将
```
read -i
```
的结果输出到文件，再通过grep/jq查询。不要将完整的元素映射粘贴到聊天窗口。
如果对参数有疑问，
```
chrome-relay <command> --help
```
是权威参考——本文档内容可能滞后。