chrome-relay
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseChrome Relay
Chrome Relay
Drives the user's real Chrome through a Chrome extension + local native host. Prefer it when logged-in browser state (auth cookies, sessions, installed extensions) matters.
通过Chrome扩展程序+本地原生主机驱动用户的真实Chrome浏览器。当需要用到已登录的浏览器状态(认证Cookie、会话、已安装的扩展程序)时,优先使用此工具。
Setup
安装配置
- Chrome extension
- CLI:
sh
pnpm add -g chrome-relay chrome-relay install chrome-relay doctor
Verify CLI ≥ 0.5.20 — earlier versions have a silent click bug on Radix/React-Aria UIs:
sh
chrome-relay --version- Chrome扩展程序
- CLI:
sh
pnpm add -g chrome-relay chrome-relay install chrome-relay doctor
请确认CLI版本≥0.5.20——早期版本在Radix/React-Aria界面上存在静默点击bug:
sh
chrome-relay --versionTool surface
工具功能列表
| Command | What it does |
|---|---|
| List windows + tabs with their |
| Open in current tab. |
| Interaction map: visible interactive elements with selectors, text, role, bounds. Pipe to file. |
| Accessibility tree — ~30× smaller than |
| Trusted hover + press + release at element center (CDP |
| Coordinate-mode click. Selector optional. |
| Click an element resolved from a prior |
| Pointer move only — fires |
| Atomic value write into |
| CDP |
| Single key or chord: |
| |
| PNG. |
| Record a tab via CDP (paint-driven). Requires an active tab. |
| HTTP request/response ring buffer, last 200 per tab. |
| |
| Emulate device viewport, DPR, mobile flag, touch, UA. |
| Manage named windows / tab-groups so multiple agents can drive separate windows. |
| Activate or close tabs |
| Restart the extension's service worker after a rebuild |
| Queryable changelog; agent-readable JSON. |
| Raw pass-through for any internal tool. |
| 命令 | 功能说明 |
|---|---|
| 列出窗口及标签页,并显示它们的 |
| 在当前标签页打开指定URL。 |
| 交互映射:显示可见的交互元素及其选择器、文本、角色、边界。可将结果输出到文件。 |
| 无障碍树——比 |
| 在元素中心执行可信的悬停+按下+释放操作(通过CDP的 |
| 坐标模式点击。选择器为可选参数。 |
| 点击通过之前 |
| 仅移动指针——触发 |
| 向 |
| CDP的 |
| 单个按键或组合键: |
| 在MAIN环境中执行 |
| 生成PNG截图。 |
| 通过CDP录制标签页(基于绘制事件)。需要标签页处于激活状态。 |
| HTTP请求/响应环形缓冲区,每个标签页保留最近200条记录。使用 |
| |
| 模拟设备视口、设备像素比(DPR)、移动端标识、触摸功能、用户代理(UA)。 |
| 管理命名窗口/标签页组,以便多个Agent可以驱动不同的窗口。 |
| 激活或关闭标签页 |
| 重建后重启扩展程序的服务工作线程 |
| 可查询的更新日志;Agent可读的JSON格式。 |
| 直接调用任意内部工具的原始接口。 |
Picking the right text tool
选择合适的文本操作工具
| Target element | Tool |
|---|---|
| |
| |
| Submit, navigate menus, modifier shortcuts | |
| Combobox / autocomplete option selection | |
| Shadow DOM, framework-internal pokes, scraping, custom widgets | |
| 目标元素 | 工具 |
|---|---|
| |
| |
| 提交表单、导航菜单、修饰键快捷键 | |
| 组合框/自动补全选项选择 | 在筛选框中使用 |
| Shadow DOM、框架内部操作、数据爬取、自定义组件 | |
Workflow
使用流程
- Find the tab —
chrome-relay tabs - Open the page — (background by default)
chrome-relay navigate "https://example.com" --new - Read structure — pipe to a file, don't dump 100KB into context:
For dense apps (LinkedIn, Notion), prefersh
chrome-relay read --tab 1234 -i > /tmp/page.json jq '.elements[] | select(.text | test("Compose"; "i"))' /tmp/page.json— way smaller payload.ax - Act on the selectors:
sh
chrome-relay click "<selector>" --tab 1234 chrome-relay fill "<selector>" "value" --tab 1234 chrome-relay type "tweet body" --tab 1234 -s "[data-testid=tweetTextarea_0]" chrome-relay keys "Enter" --tab 1234 - Drop to when the DOM doesn't expose what you need:
jsshchrome-relay js --tab 1234 "return document.title" chrome-relay js --tab 1234 "const r = await fetch('/api/me'); return await r.json()" - Capture proof —
chrome-relay screenshot --tab 1234 -o /tmp/evidence.png
- 找到目标标签页 ——
chrome-relay tabs - 打开页面 —— (默认在后台打开)
chrome-relay navigate "https://example.com" --new - 读取页面结构 —— 将结果输出到文件,不要将100KB的内容直接传入上下文:
对于内容密集的应用(如LinkedIn、Notion),优先使用sh
chrome-relay read --tab 1234 -i > /tmp/page.json jq '.elements[] | select(.text | test("Compose"; "i"))' /tmp/page.json——输出内容小得多。ax - 通过选择器执行操作:
sh
chrome-relay click "<selector>" --tab 1234 chrome-relay fill "<selector>" "value" --tab 1234 chrome-relay type "tweet body" --tab 1234 -s "[data-testid=tweetTextarea_0]" chrome-relay keys "Enter" --tab 1234 - 当DOM未暴露所需内容时,使用命令:
jsshchrome-relay js --tab 1234 "return document.title" chrome-relay js --tab 1234 "const r = await fetch('/api/me'); return await r.json()" - 捕获操作证据 ——
chrome-relay screenshot --tab 1234 -o /tmp/evidence.png
Top gotchas
常见陷阱
- appends — it inserts at the caret. If the input had a value (autosaved draft, default text), clear it first via
typeorjs(Cmd+A then Backspace).keys - Coords go stale fast — read , scroll/reflow, then click → you hit the wrong element. For autocomplete popups especially, use keyboard nav, not coord clicks.
getBoundingClientRect - Click "succeeded" but nothing happened — first diagnostic: . If it returns a wrapper or form background, your coords are wrong. If it returns the right element but state didn't change, you're likely on chrome-relay <0.5.20 — upgrade.
document.elementFromPoint(x, y)
More recipes: references/patterns.md
Failure modes: references/troubleshooting.md
- 命令会追加文本——它在光标位置插入文本。如果输入框已有内容(自动保存的草稿、默认文本),请先通过
type或js命令清空(Cmd+A然后按Backspace)。keys - 坐标会快速失效——读取后,页面滚动/重排,再点击就会命中错误元素。对于自动补全弹窗,尤其建议使用键盘导航,而非坐标点击。
getBoundingClientRect - 显示点击"成功"但无实际效果——首先排查:。如果返回的是包装器或表单背景元素,说明坐标错误。如果返回正确元素但状态未改变,很可能是使用了chrome-relay <0.5.20版本——请升级。
document.elementFromPoint(x, y)
更多使用示例:references/patterns.md
故障排查:references/troubleshooting.md
Operational guidance
操作指南
- Don't give up early. A failing click is information, not a stop signal. Attach a document-level listener with and watch what fires:
capture:trueshchrome-relay js --tab 1234 " ['pointerdown','mousedown','click'].forEach(t => document.addEventListener(t, e => console.log(t, e.target.tagName, e.target.className), {capture:true}) ); return 'listening' " # do the action, then: chrome-relay console --tab 1234 - Don't echo secrets. When extracting tokens / API keys via , write the result directly to a file. Never
jsor interpolate into shell strings — it ends up in scrollback, logs, and tool transcripts.echo $TOKEN - Capture before irreversible actions (form submit, send message, account change). Save the screenshot path.
- 不要过早放弃。点击失败是信息,而非终止信号。通过添加文档级监听器,观察触发的事件:
capture:trueshchrome-relay js --tab 1234 " ['pointerdown','mousedown','click'].forEach(t => document.addEventListener(t, e => console.log(t, e.target.tagName, e.target.className), {capture:true}) ); return 'listening' " # 执行操作后,运行以下命令查看日志: chrome-relay console --tab 1234 - 不要泄露敏感信息。通过提取令牌/API密钥时,请直接将结果写入文件。切勿使用
js或将其插入shell字符串——这会导致敏感信息出现在回滚历史、日志和工具记录中。echo $TOKEN - 在执行不可逆操作前捕获证据(表单提交、发送消息、账户变更)。保存截图路径。
Guardrails
注意事项
- Pipe to a file and grep/jq it. Don't paste the full element map into chat.
read -i - If a flag is unclear, is authoritative — these docs lag.
chrome-relay <command> --help
- 将的结果输出到文件,再通过grep/jq查询。不要将完整的元素映射粘贴到聊天窗口。
read -i - 如果对参数有疑问,是权威参考——本文档内容可能滞后。
chrome-relay <command> --help