browser-cdp

Browser CDP

Control a real browser through a Chrome DevTools Protocol proxy.

Overview

This skill provides browser automation via a lightweight HTTP proxy that wraps CDP. The proxy exposes REST endpoints for navigation, screenshots, JS evaluation, clicking, and more — no Playwright/Puppeteer dependency needed.

Prerequisites

Install the required Python dependency:

bash

pip install psutil

A CDP proxy must be running on `http://localhost:3456`. Start it from the repository root with:

bash

python3 skills/browser-cdp/scripts/cdp_proxy.py

This launches Chrome/Edge with remote debugging enabled and proxies CDP commands over HTTP.
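
Before issuing commands, it can help to confirm that something is answering on the proxy port. The sketch below is one minimal way to do that, assuming the proxy answers `GET /targets` with HTTP 200 as described in the API reference. The helper name `proxy_is_running` is illustrative and not part of the skill.

```python
import urllib.error
import urllib.request

BASE_URL = "http://localhost:3456"  # proxy address from the prerequisites


def proxy_is_running(base_url: str = BASE_URL, timeout: float = 2.0) -> bool:
    """Return True if GET /targets answers with HTTP 200 (hypothetical helper)."""
    # Ignore system-wide HTTP proxy settings: the CDP proxy is a local service.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"{base_url}/targets", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or malformed URL.
        return False
```

If the function returns `False`, the likely next step is to ask the user to start `cdp_proxy.py` rather than retrying blindly.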

When to Use

✅ USE this skill when:

"Open this URL and tell me what's on the page"
"Take a screenshot of the current page"
"Run this JavaScript on the page"
"Click the button that says..."
"Search for and install a Chrome extension"
"Log into this site and do something"
Any task requiring a real browser context

❌ DON'T use this skill when:

Simple HTTP API calls → use `curl` directly
Downloading files → use `curl -O`
Parsing HTML from a saved file → use `python3` with BeautifulSoup
No CDP proxy running → ask the user to start it first

API Reference

All endpoints are relative to `http://localhost:3456`.

GET /targets

List all open browser tabs.

bash

curl -s http://localhost:3456/targets | python3 -m json.tool

Response:

json

[
  { "id": "ABC123", "title": "Google", "url": "https://google.com" }
]
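
To act on a particular tab, the `id` field from this response is the piece that matters. The following sketch picks a tab by matching part of its URL. It assumes the response has exactly the shape shown above (a JSON array of objects with `id`, `title`, and `url`), and `find_target_id` is a hypothetical helper name.

```python
import json
from typing import Optional


def find_target_id(targets_json: str, url_fragment: str) -> Optional[str]:
    """Return the id of the first tab whose URL contains url_fragment, else None."""
    for tab in json.loads(targets_json):
        if url_fragment in tab.get("url", ""):
            return tab.get("id")
    return None


# Sample body copied from the response shown above.
sample = '[{ "id": "ABC123", "title": "Google", "url": "https://google.com" }]'
print(find_target_id(sample, "google.com"))   # ABC123
print(find_target_id(sample, "example.com"))  # None
```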

GET /navigate?url=<URL>

Navigate a tab to a URL. By default this uses the most recently created tab; add a `target=<targetId>` query parameter to pick a specific tab.

bash

curl -s "http://localhost:3456/navigate?url=https://example.com"
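
When the destination URL carries its own query string, it has to be encoded so that its `&` and `=` characters are not read as parameters of `/navigate`. Here is a small sketch of building the request URL with the standard library. It assumes the proxy decodes percent-encoded query values in the usual way, and `navigate_url` is an illustrative name.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://localhost:3456"


def navigate_url(page_url: str, target_id: Optional[str] = None) -> str:
    """Build a /navigate request URL, optionally aimed at a specific tab."""
    params = {"url": page_url}
    if target_id is not None:
        params["target"] = target_id
    return f"{BASE_URL}/navigate?{urlencode(params)}"


print(navigate_url("https://example.com/?a=1&b=2", target_id="ABC123"))
# http://localhost:3456/navigate?url=https%3A%2F%2Fexample.com%2F%3Fa%3D1%26b%3D2&target=ABC123
```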

GET /screenshot

Take a PNG screenshot of the current page.

bash

# Save to file
curl -s -o screenshot.png http://localhost:3456/screenshot
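
Because `curl -o` writes whatever the server returns, an error body would still end up in a file named `screenshot.png`. A quick sanity check is to compare the first bytes against the fixed 8-byte PNG signature. The helper name `looks_like_png` is illustrative.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def looks_like_png(data: bytes) -> bool:
    """Return True if data starts with the PNG file signature."""
    return data.startswith(PNG_SIGNATURE)
```

For a saved file, this could be called as `looks_like_png(pathlib.Path("screenshot.png").read_bytes())`.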

POST /eval

Execute JavaScript in the page. The request body is plain text (not JSON), sent as `Content-Type: text/plain`.

bash

curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "document.title"

For multi-line scripts, pipe from stdin or use a heredoc:

bash

curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  --data-binary @- <<'EOF'
JSON.stringify(
  Array.from(document.querySelectorAll('a'))
    .map(a => ({text: a.innerText, href: a.href}))
)
EOF
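
The same request can be prepared from Python, where a triple-quoted string avoids shell quoting for longer scripts. This is a sketch using only the standard library. `eval_request` is a hypothetical helper, and the request is only built here, not sent, so the snippet runs without a proxy.

```python
import urllib.request

BASE_URL = "http://localhost:3456"


def eval_request(script: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST /eval request carrying the script as a text/plain body."""
    return urllib.request.Request(
        f"{base_url}/eval",
        data=script.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


LINKS_SCRIPT = """
JSON.stringify(
  Array.from(document.querySelectorAll('a'))
    .map(a => ({text: a.innerText, href: a.href}))
)
"""

request = eval_request(LINKS_SCRIPT)
# To send it (requires a running proxy):
#   with urllib.request.urlopen(request) as resp:
#       print(resp.read().decode("utf-8"))
```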

GET /click?selector=<CSS>

Click an element matching a CSS selector.

bash

curl -s "http://localhost:3456/click?selector=%23submit-btn"
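
CSS selectors often contain characters such as `#`, `[`, `>`, and spaces that are not safe in a query string. In particular, an unencoded `#` starts the URL fragment, which clients do not send to the server. The sketch below encodes a selector with `urllib.parse.urlencode`. `click_url` is an illustrative helper name.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:3456"


def click_url(selector: str) -> str:
    """Build a /click request URL with the CSS selector percent-encoded."""
    return f"{BASE_URL}/click?{urlencode({'selector': selector})}"


print(click_url("#submit-btn"))
# http://localhost:3456/click?selector=%23submit-btn
print(click_url("#login-form > button"))
# http://localhost:3456/click?selector=%23login-form+%3E+button
```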

GET /new

Open a new browser tab and return its target ID.

bash

curl -s http://localhost:3456/new

Response:

json

{ "id": "NEW_TAB_ID", "title": "about:blank", "url": "about:blank" }
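
A common use of `/new` is to open a dedicated tab and then direct later calls at it through the `target` parameter. The sketch below chains the two calls. It assumes the response shape shown above and that `/navigate` accepts `target` alongside `url`. The names `http_get` and `open_in_new_tab` are illustrative, and the `fetch` argument exists only so the logic can be exercised without a live proxy.

```python
import json
import urllib.request
from typing import Callable
from urllib.parse import urlencode

BASE_URL = "http://localhost:3456"


def http_get(url: str) -> str:
    """Fetch a URL and return the body as text (needs a running proxy)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


def open_in_new_tab(page_url: str, fetch: Callable[[str], str] = http_get) -> str:
    """Open a tab via /new, navigate it to page_url, and return the tab id."""
    tab = json.loads(fetch(f"{BASE_URL}/new"))
    query = urlencode({"url": page_url, "target": tab["id"]})
    fetch(f"{BASE_URL}/navigate?{query}")
    return tab["id"]
```

With the proxy running, `open_in_new_tab("https://example.com")` would return the new tab's ID for use in subsequent calls.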

Common Workflows

Navigate and extract page content

bash

# Open a page
curl -s "http://localhost:3456/navigate?url=https://example.com"

# Extract all text content
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "document.body.innerText"

# Extract all links
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "JSON.stringify([...document.querySelectorAll('a')].map(a => ({text: a.textContent.trim(), href: a.href})))"

Take a screenshot

bash

curl -s "http://localhost:3456/navigate?url=https://example.com"
curl -s -o page.png http://localhost:3456/screenshot

Search and install a Chrome extension

bash

# Search the Chrome Web Store (no login required for search)
curl -s "http://localhost:3456/navigate?url=https://chromewebstore.google.com/search/example%20extension"

# Extract extension IDs from search results
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "JSON.stringify([...document.querySelectorAll('a[data-id]')].map(a => ({id: a.dataset.id, title: a.textContent.trim()})))"

# Install an extension (requires the extension ID)
curl -s "http://localhost:3456/navigate?url=https://chromewebstore.google.com/detail/<extension-id>"

# Then click the "Add to Chrome" button
curl -s "http://localhost:3456/click?selector=%5Bdata-id%3Dinstall-button%5D"

Fill a form and submit

bash

# Navigate to the form
curl -s "http://localhost:3456/navigate?url=https://example.com/login"

# Fill in fields
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "document.querySelector('#username').value = 'myuser'"
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "document.querySelector('#password').value = 'mypass'"

# Submit
curl -s "http://localhost:3456/click?selector=%23login-form+%3E+button"
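
Hand-writing the JavaScript for each field works for simple values, but breaks once a value contains a quote or a backslash. One way around this, sketched below, is to let `json.dumps` produce the string literals, since a JSON string is also a valid JavaScript string literal. `set_value_script` is a hypothetical helper. Its output is meant to be sent as the plain-text body of `POST /eval`.

```python
import json


def set_value_script(selector: str, value: str) -> str:
    """Build a JS statement that sets the value of the element matching selector."""
    # json.dumps yields a double-quoted literal with quotes and backslashes escaped.
    return f"document.querySelector({json.dumps(selector)}).value = {json.dumps(value)}"


print(set_value_script("#username", "myuser"))
# document.querySelector("#username").value = "myuser"
print(set_value_script("#password", "it's \"quoted\""))
# document.querySelector("#password").value = "it's \"quoted\""
```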

Notes

The CDP proxy must be running before using any commands
If the proxy is not running, ask the user to start it: `python3 skills/browser-cdp/scripts/cdp_proxy.py`
Use URL encoding for query parameters with special characters
The `/eval` endpoint returns the result of the last expression (like a REPL)
Screenshots are returned as PNG binary data
For complex multi-step interactions, chain `/eval` and `/click` calls
The proxy supports a `?target=<targetId>` parameter on most endpoints to target a specific tab
