browser-cdp
Browser CDP
Control a real browser through a Chrome DevTools Protocol proxy.
Overview
This skill provides browser automation via a lightweight HTTP proxy that wraps CDP. The proxy exposes REST endpoints for navigation, screenshots, JS evaluation, clicking, and more — no Playwright/Puppeteer dependency needed.
Prerequisites
Install the required Python dependency:
```bash
pip install psutil
```
A CDP proxy must be running on `http://localhost:3456`. Start it from the repository root with:
```bash
python3 skills/browser-cdp/scripts/cdp_proxy.py
```
This launches Chrome/Edge with remote debugging enabled and proxies CDP commands over HTTP.
When to Use
✅ USE this skill when:
- "Open this URL and tell me what's on the page"
- "Take a screenshot of the current page"
- "Run this JavaScript on the page"
- "Click the button that says..."
- "Search for and install a Chrome extension"
- "Log into this site and do something"
- Any task requiring a real browser context
❌ DON'T use this skill when:
- Simple HTTP API calls → use `curl` directly
- Downloading files → use `curl -O`
- Parsing HTML from a saved file → use `python3` with BeautifulSoup
- No CDP proxy running → ask the user to start it first
API Reference
All endpoints are relative to `http://localhost:3456`.
GET /targets
List all open browser tabs.
```bash
curl -s http://localhost:3456/targets | python3 -m json.tool
```
Response:
```json
[
  { "id": "ABC123", "title": "Google", "url": "https://google.com" }
]
```
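The tab list can also be searched programmatically to find the `id` of a particular tab. The following is a minimal Python sketch, assuming the response has the shape shown in the example above. `find_target_id` and `SAMPLE_TARGETS` are hypothetical names used only for illustration.

```python
import json
from typing import Optional

# Sample payload copied from the response example in this section.
SAMPLE_TARGETS = '[{ "id": "ABC123", "title": "Google", "url": "https://google.com" }]'


def find_target_id(targets_json: str, url_substring: str) -> Optional[str]:
    """Return the id of the first tab whose URL contains url_substring, else None."""
    for tab in json.loads(targets_json):
        if url_substring in tab.get("url", ""):
            return tab["id"]
    return None


print(find_target_id(SAMPLE_TARGETS, "google.com"))
```

The returned id is the value to use wherever an endpoint takes a `<targetId>`.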
GET /navigate?url=<URL>
Navigate a tab to a URL. Uses the most recently created tab, or specify `?target=<targetId>`.
```bash
curl -s "http://localhost:3456/navigate?url=https://example.com"
```
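When the destination URL carries its own query string, it needs to be percent-encoded so that its `?` and `&` are not read as parameters of the proxy request. The sketch below builds such a request URL in Python. It assumes the proxy decodes percent-encoded values in the usual way and accepts `target` as an ordinary query parameter next to `url`; `navigate_url` is a hypothetical helper, not part of the proxy.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE = "http://localhost:3456"  # proxy address from the Prerequisites section


def navigate_url(page_url: str, target: Optional[str] = None) -> str:
    """Build a /navigate request URL with every parameter percent-encoded."""
    params = {"url": page_url}
    if target is not None:
        # Assumption: target is passed as a second query parameter.
        params["target"] = target
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return f"{BASE}/navigate?{urlencode(params, quote_via=quote)}"


print(navigate_url("https://example.com/search?q=a b&x=1", target="ABC123"))
```

The resulting string can be passed to `curl -s` unchanged.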
GET /screenshot
Take a PNG screenshot of the current page.
```bash
# Save to file
curl -s -o screenshot.png http://localhost:3456/screenshot
```
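Because `curl -s -o` writes whatever the server returns, it can be worth confirming that the saved file really is a PNG before using it. A minimal Python sketch follows, relying only on the fixed 8-byte signature that every PNG file begins with. `looks_like_png` and `check_screenshot` are hypothetical helper names.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte header of every PNG file


def looks_like_png(data: bytes) -> bool:
    """Return True if data starts with the PNG file signature."""
    return data.startswith(PNG_SIGNATURE)


def check_screenshot(path: str) -> bool:
    """Read the first 8 bytes of a saved screenshot and check the signature."""
    with open(path, "rb") as f:
        return looks_like_png(f.read(8))


print(looks_like_png(PNG_SIGNATURE + b"...image data..."))
print(looks_like_png(b"not an image"))
```

Calling `check_screenshot("screenshot.png")` after the `curl` command above would flag a response that is not image data.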
POST /eval
Execute JavaScript in the page. The request body is plain text (not JSON), sent as `Content-Type: text/plain`.
```bash
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "document.title"
```
For multi-line scripts, pipe from stdin or use a heredoc:
```bash
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "JSON.stringify(Array.from(document.querySelectorAll('a')).map(a => ({text: a.innerText, href: a.href})))"
```
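For longer scripts, shell quoting becomes awkward, and sending the body from a small script can be easier. The following Python sketch builds the same plain-text POST with the standard library and only prints what would be sent; the actual send is left commented out because it needs the proxy to be running. `build_eval_request` is a hypothetical helper name.

```python
import urllib.request

BASE = "http://localhost:3456"  # proxy address from the Prerequisites section


def build_eval_request(script: str) -> urllib.request.Request:
    """Wrap a JavaScript source string in a text/plain POST to /eval."""
    return urllib.request.Request(
        f"{BASE}/eval",
        data=script.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


# The link-extraction expression from this section, spread over several lines.
SCRIPT = """JSON.stringify(
  Array.from(document.querySelectorAll('a'))
    .map(a => ({text: a.innerText, href: a.href}))
)"""

request = build_eval_request(SCRIPT)
print(request.get_method(), request.full_url)

# To send it (requires the proxy to be running):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```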
GET /click?selector=<CSS>
Click an element matching a CSS selector.
```bash
curl -s "http://localhost:3456/click?selector=%23submit-btn"
```
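The `%23` in the example is the percent-encoded form of `#`, which would otherwise be treated as the start of a URL fragment and never reach the proxy. A short Python sketch of encoding an arbitrary selector is shown below; `click_url` is a hypothetical helper name.

```python
from urllib.parse import quote

BASE = "http://localhost:3456"  # proxy address from the Prerequisites section


def click_url(selector: str) -> str:
    """Build a /click request URL with the CSS selector percent-encoded."""
    # safe="" makes quote() encode every reserved character, including '/'.
    return f"{BASE}/click?selector={quote(selector, safe='')}"


print(click_url("#submit-btn"))
print(click_url("button[type='submit']"))
```

The first call reproduces the URL used in the `curl` example above.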
GET /new
Open a new browser tab and return its target ID.
```bash
curl -s http://localhost:3456/new
```
Response:
```json
{ "id": "NEW_TAB_ID", "title": "about:blank", "url": "about:blank" }
```
Common Workflows
Navigate and extract page content
```bash
# Open a page
curl -s "http://localhost:3456/navigate?url=https://example.com"

# Extract all text content
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "document.body.innerText"

# Extract all links
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "JSON.stringify([...document.querySelectorAll('a')].map(a => ({text: a.textContent.trim(), href: a.href})))"
```
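The same navigate-then-evaluate sequence can be driven from a script. The Python sketch below builds the two requests in order and sends them through a replaceable opener, so it can be tried without a running proxy. `extract_steps`, `run_steps`, and `fake_opener` are hypothetical names, and the shape of the real response bodies is not assumed here.

```python
import io
import urllib.request
from urllib.parse import quote

BASE = "http://localhost:3456"  # proxy address from the Prerequisites section


def extract_steps(page_url):
    """Build the navigate and eval requests for the workflow, in order."""
    navigate = urllib.request.Request(
        f"{BASE}/navigate?url={quote(page_url, safe='')}"
    )
    read_text = urllib.request.Request(
        f"{BASE}/eval",
        data=b"document.body.innerText",
        headers={"Content-Type": "text/plain"},
        method="POST",
    )
    return [navigate, read_text]


def run_steps(steps, opener=urllib.request.urlopen):
    """Send each request in order and return the decoded response bodies."""
    bodies = []
    for request in steps:
        with opener(request) as response:
            bodies.append(response.read().decode("utf-8"))
    return bodies


def fake_opener(request):
    """Stand-in for urlopen that echoes the method and URL instead of sending."""
    return io.BytesIO(f"{request.get_method()} {request.full_url}".encode("utf-8"))


print(run_steps(extract_steps("https://example.com"), opener=fake_opener))
```

Dropping the `opener=fake_opener` argument sends the requests to the real proxy.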
Take a screenshot
```bash
curl -s "http://localhost:3456/navigate?url=https://example.com"
curl -s -o page.png http://localhost:3456/screenshot
```
Search and install a Chrome extension
```bash
# Search the Chrome Web Store (no login required for search)
curl -s "http://localhost:3456/navigate?url=<search-URL>"

# Extract extension IDs from search results
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "JSON.stringify([...document.querySelectorAll('a[data-id]')].map(a => ({id: a.dataset.id, title: a.textContent.trim()})))"

# Install an extension (requires the extension ID)
curl -s "http://localhost:3456/navigate?url=https://chromewebstore.google.com/detail/<extension-id>"

# Then click the "Add to Chrome" button
curl -s "http://localhost:3456/click?selector=<CSS>"
```
Fill a form and submit
```bash
# Navigate to the form
curl -s "http://localhost:3456/navigate?url=<URL>"

# Fill in fields
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "document.querySelector('#username').value = 'myuser'"
curl -s -X POST http://localhost:3456/eval \
  -H "Content-Type: text/plain" \
  -d "document.querySelector('#password').value = 'mypass'"

# Submit
curl -s "http://localhost:3456/click?selector=<CSS>"
```
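Hard-coding field values inside a quoted JavaScript string breaks as soon as a value contains a quote or backslash. One way to avoid this, sketched below in Python, is to generate the statement with `json.dumps`, whose output is also a valid JavaScript string literal. `set_value_script` is a hypothetical helper name, and the selectors are the ones used in this workflow.

```python
import json


def set_value_script(selector: str, value: str) -> str:
    """Build a JS statement that assigns value to the element matching selector."""
    # json.dumps emits a double-quoted, fully escaped string literal.
    return f"document.querySelector({json.dumps(selector)}).value = {json.dumps(value)}"


print(set_value_script("#username", "myuser"))
print(set_value_script("#password", 'pa"ss\\word'))
```

Each generated statement can then be sent as the plain-text body of a `POST /eval` request.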
Notes
- The CDP proxy must be running before using any commands
- If the proxy is not running, ask the user to start it: `python3 skills/browser-cdp/scripts/cdp_proxy.py`
- Use URL encoding for query parameters with special characters
- The `/eval` endpoint returns the result of the last expression (like a REPL)
- Screenshots are returned as PNG binary data
- For complex multi-step interactions, chain `/eval` and `/click` calls
- The proxy supports a `?target=<targetId>` parameter on most endpoints to target a specific tab