browser-cdp

Browser CDP

Control a real browser through a Chrome DevTools Protocol proxy.

Overview

This skill provides browser automation through a lightweight HTTP proxy that wraps the Chrome DevTools Protocol (CDP). The proxy exposes REST endpoints for navigation, screenshots, JavaScript evaluation, clicking, and more, so no Playwright or Puppeteer dependency is needed.

Prerequisites

Install the required Python dependency:
```bash
pip install psutil
```
A CDP proxy must be running on `http://localhost:3456`. Start it from the repository root with:
```bash
python3 skills/browser-cdp/scripts/cdp_proxy.py
```
This launches Chrome/Edge with remote debugging enabled and proxies CDP commands over HTTP.
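
Before issuing commands it can help to confirm that the proxy is reachable. The sketch below is one possible check, not part of the skill itself: the helper name `proxy_is_running` is hypothetical, and it assumes that a healthy proxy answers `GET /targets` with HTTP 200.

```python
import http.client
import urllib.request

BASE_URL = "http://localhost:3456"  # proxy address given in this document


def proxy_is_running(base_url: str = BASE_URL, timeout: float = 2.0) -> bool:
    """Return True if GET /targets answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(f"{base_url}/targets", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, refused connections, and timeouts are OSError subclasses;
        # HTTPException covers malformed responses from a non-HTTP listener.
        return False
```

If `proxy_is_running()` returns `False`, ask the user to start the proxy with the command above rather than retrying.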

When to Use

USE this skill when:
  • "Open this URL and tell me what's on the page"
  • "Take a screenshot of the current page"
  • "Run this JavaScript on the page"
  • "Click the button that says..."
  • "Search for and install a Chrome extension"
  • "Log into this site and do something"
  • Any task requiring a real browser context
DON'T use this skill when:
  • Simple HTTP API calls → use `curl` directly
  • Downloading files → use `curl -O`
  • Parsing HTML from a saved file → use `python3` with BeautifulSoup
  • No CDP proxy running → ask the user to start it first

API Reference

All endpoints are relative to `http://localhost:3456`.

GET /targets

List all open browser tabs.
```bash
curl -s http://localhost:3456/targets | python3 -m json.tool
```
Response:
```json
[
  { "id": "ABC123", "title": "Google", "url": "https://google.com" }
]
```
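
To act on a particular tab, the `id` has to be picked out of this list. A minimal sketch of that lookup follows; the helper `find_target` is hypothetical, and the sample payload is copied from the response shown above rather than fetched from a live proxy.

```python
import json
from typing import Optional


def find_target(targets: list, url_part: str) -> Optional[dict]:
    """Return the first tab whose URL contains url_part, or None."""
    for tab in targets:
        if url_part in tab.get("url", ""):
            return tab
    return None


# Sample payload copied from the documented /targets response.
sample = '[{"id": "ABC123", "title": "Google", "url": "https://google.com"}]'
tabs = json.loads(sample)
match = find_target(tabs, "google.com")
print(match["id"] if match else "no matching tab")  # prints ABC123
```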

GET /navigate?url=<URL>

Navigate a tab to a URL. By default the most recently created tab is used; to pick a specific tab, add a `target=<targetId>` query parameter.
```bash
curl -s "http://localhost:3456/navigate?url=https://example.com"
```
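
When the destination URL carries its own query string, it needs percent-encoding so the proxy does not confuse its parameters with the request's own. The sketch below builds such a request URL with the standard library; `navigate_url` is a hypothetical helper, and it assumes the proxy decodes percent-encoded values in the usual way.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:3456"


def navigate_url(url: str, target: Optional[str] = None) -> str:
    """Build a /navigate request URL with percent-encoded query values."""
    params = {"url": url}
    if target is not None:
        params["target"] = target
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return f"{BASE_URL}/navigate?{urlencode(params, quote_via=quote)}"


print(navigate_url("https://example.com/search?q=a b", target="ABC123"))
# http://localhost:3456/navigate?url=https%3A%2F%2Fexample.com%2Fsearch%3Fq%3Da%20b&target=ABC123
```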

GET /screenshot

Take a PNG screenshot of the current page.
```bash
# Save to file
curl -s -o screenshot.png http://localhost:3456/screenshot
```
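
Because `curl -s -o` writes whatever the server returns, an error body can end up saved under a `.png` name. One way to catch that, sketched here with a hypothetical `is_png` helper, is to check the saved bytes for the fixed 8-byte PNG signature.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def is_png(data: bytes) -> bool:
    """Return True if data starts with the PNG file signature."""
    return data.startswith(PNG_SIGNATURE)


print(is_png(PNG_SIGNATURE + b"...image data..."))  # True
print(is_png(b'{"error": "no target"}'))            # False
```

In practice the bytes would come from `open("screenshot.png", "rb").read()` after the `curl` call.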

POST /eval

Execute JavaScript in the page. The request body is plain text (not JSON), sent as `Content-Type: text/plain`.
```bash
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "document.title"
```
For multi-line scripts, pipe from stdin or use a heredoc. `--data-binary @-` makes curl read the body from stdin without stripping newlines:
```bash
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  --data-binary @- <<'EOF'
JSON.stringify(Array.from(document.querySelectorAll('a')).map(a => ({text: a.innerText, href: a.href})))
EOF
```
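
The same request can be prepared from Python, which avoids shell quoting for longer scripts. This is a sketch under assumptions: `build_eval_request` is a hypothetical helper, and wrapping the script in an immediately invoked function is a choice made here so that `const` declarations do not persist between calls, not something the proxy requires.

```python
import urllib.request

BASE_URL = "http://localhost:3456"


def build_eval_request(script: str) -> urllib.request.Request:
    """Prepare a POST /eval request carrying the script as a text/plain body."""
    return urllib.request.Request(
        f"{BASE_URL}/eval",
        data=script.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


script = """
(() => {
  const links = Array.from(document.querySelectorAll('a'));
  return JSON.stringify(links.map(a => ({text: a.innerText, href: a.href})));
})()
"""
request = build_eval_request(script)
# Sending it requires a running proxy:
# with urllib.request.urlopen(request) as resp:
#     print(resp.read().decode("utf-8"))
```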

GET /click?selector=<CSS>

Click an element matching a CSS selector. The selector must be URL-encoded; `%23submit-btn` below is `#submit-btn`.
```bash
curl -s "http://localhost:3456/click?selector=%23submit-btn"
```
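
Selectors often contain characters such as `#`, `[`, `=`, and quotes that are not valid in a query string as-is. A small sketch of encoding them with the standard library follows; `click_url` is a hypothetical helper name.

```python
from urllib.parse import quote

BASE_URL = "http://localhost:3456"


def click_url(selector: str) -> str:
    """Build a /click request URL for the given CSS selector."""
    # safe="" forces every reserved character, including '/', to be encoded.
    return f"{BASE_URL}/click?selector={quote(selector, safe='')}"


print(click_url("#submit-btn"))
# http://localhost:3456/click?selector=%23submit-btn
print(click_url('button[type="submit"]'))
# http://localhost:3456/click?selector=button%5Btype%3D%22submit%22%5D
```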

GET /new

Open a new browser tab and return its target ID.
```bash
curl -s http://localhost:3456/new
```
Response:
```json
{ "id": "NEW_TAB_ID", "title": "about:blank", "url": "about:blank" }
```
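
The returned `id` is what later calls pass as `target` to keep working in that tab. The sketch below parses the documented response and builds a follow-up `/navigate` URL; the helper `navigate_in_tab` is hypothetical, and the response string is the sample from this section, not live output.

```python
import json
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:3456"


def navigate_in_tab(new_tab_response: str, url: str) -> str:
    """Build a /navigate URL aimed at the tab described by a /new response."""
    tab_id = json.loads(new_tab_response)["id"]
    query = urlencode({"url": url, "target": tab_id}, quote_via=quote)
    return f"{BASE_URL}/navigate?{query}"


sample = '{ "id": "NEW_TAB_ID", "title": "about:blank", "url": "about:blank" }'
print(navigate_in_tab(sample, "https://example.com"))
# http://localhost:3456/navigate?url=https%3A%2F%2Fexample.com&target=NEW_TAB_ID
```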

Common Workflows

Navigate and extract page content

```bash
# Open a page
curl -s "http://localhost:3456/navigate?url=https://example.com"

# Extract all text content
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "document.body.innerText"

# Extract all links
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "JSON.stringify([...document.querySelectorAll('a')].map(a => ({text: a.textContent.trim(), href: a.href})))"
```

Take a screenshot

```bash
curl -s "http://localhost:3456/navigate?url=https://example.com"
curl -s -o page.png http://localhost:3456/screenshot
```

Search and install a Chrome extension

```bash
# Search the Chrome Web Store (no login required for search)
curl -s "http://localhost:3456/navigate?url=<Chrome Web Store search URL>"

# Extract extension IDs from search results
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "JSON.stringify([...document.querySelectorAll('a[data-id]')].map(a => ({id: a.dataset.id, title: a.textContent.trim()})))"

# Install an extension (requires the extension ID)
curl -s "http://localhost:3456/navigate?url=<Chrome Web Store extension page URL>"

# Then click the "Add to Chrome" button
curl -s "http://localhost:3456/click?selector=<CSS>"
```

Fill a form and submit

```bash
# Navigate to the form
curl -s "http://localhost:3456/navigate?url=<URL>"

# Fill in fields
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "document.querySelector('#username').value = 'myuser'"
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "document.querySelector('#password').value = 'mypass'"

# Submit
curl -s "http://localhost:3456/click?selector=<CSS>"
```
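
Hand-written scripts like the ones in this workflow break as soon as a value contains a quote character. One way around that, sketched here with a hypothetical `fill_field_script` helper, is to let `json.dumps` produce the JavaScript string literals, since JSON string syntax is accepted by JavaScript.

```python
import json


def fill_field_script(selector: str, value: str) -> str:
    """Return a JS statement that sets the value of the element at selector."""
    # json.dumps yields a double-quoted, escaped literal that JavaScript accepts.
    return f"document.querySelector({json.dumps(selector)}).value = {json.dumps(value)}"


print(fill_field_script("#username", 'my"user'))
# document.querySelector("#username").value = "my\"user"
```

The resulting string is what would be sent as the `text/plain` body of a `POST /eval` request.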

Notes

  • The CDP proxy must be running before using any commands
  • If the proxy is not running, ask the user to start it: `python3 skills/browser-cdp/scripts/cdp_proxy.py`
  • Use URL encoding for query parameters with special characters
  • The `/eval` endpoint returns the result of the last expression (like a REPL)
  • Screenshots are returned as PNG binary data
  • For complex multi-step interactions, chain `/eval` and `/click` calls
  • The proxy supports a `?target=<targetId>` parameter on most endpoints to target a specific tab