cloudflare-browser-rendering
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseCloudflare Browser Rendering - Complete Reference
Cloudflare Browser Rendering - 完整参考文档
Production-ready knowledge domain for building browser automation workflows with Cloudflare Browser Rendering.
Status: Production Ready ✅
Last Updated: 2025-10-22
Dependencies: cloudflare-worker-base (for Worker setup)
Latest Versions: @cloudflare/puppeteer@1.0.4, @cloudflare/playwright@1.0.0, wrangler@4.43.0
基于 Cloudflare Browser Rendering 构建浏览器自动化工作流的生产级知识域。
状态:生产就绪 ✅
最后更新:2025-10-22
依赖项:cloudflare-worker-base(用于 Worker 配置)
最新版本:@cloudflare/puppeteer@1.0.4、@cloudflare/playwright@1.0.0、wrangler@4.43.0
Table of Contents
目录
Quick Start (5 minutes)
快速入门(5分钟)
1. Add Browser Binding
1. 添加浏览器绑定
wrangler.jsonc:
jsonc
{
"name": "browser-worker",
"main": "src/index.ts",
"compatibility_date": "2023-03-14",
"compatibility_flags": ["nodejs_compat"],
"browser": {
"binding": "MYBROWSER"
}
}Why nodejs_compat? Browser Rendering requires Node.js APIs and polyfills.
wrangler.jsonc:
jsonc
{
"name": "browser-worker",
"main": "src/index.ts",
"compatibility_date": "2023-03-14",
"compatibility_flags": ["nodejs_compat"],
"browser": {
"binding": "MYBROWSER"
}
}为什么需要 nodejs_compat? 浏览器渲染需要 Node.js API 和 polyfill。
2. Install Puppeteer
2. 安装 Puppeteer
bash
npm install @cloudflare/puppeteerbash
npm install @cloudflare/puppeteer3. Take Your First Screenshot
3. 捕获第一张截图
typescript
import puppeteer from "@cloudflare/puppeteer";
interface Env {
MYBROWSER: Fetcher;
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const { searchParams } = new URL(request.url);
const url = searchParams.get("url") || "https://example.com";
// Launch browser
const browser = await puppeteer.launch(env.MYBROWSER);
const page = await browser.newPage();
// Navigate and capture
await page.goto(url);
const screenshot = await page.screenshot();
// Clean up
await browser.close();
return new Response(screenshot, {
headers: { "content-type": "image/png" }
});
}
};typescript
import puppeteer from "@cloudflare/puppeteer";
interface Env {
MYBROWSER: Fetcher;
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const { searchParams } = new URL(request.url);
const url = searchParams.get("url") || "https://example.com";
// 启动浏览器
const browser = await puppeteer.launch(env.MYBROWSER);
const page = await browser.newPage();
// 导航并捕获截图
await page.goto(url);
const screenshot = await page.screenshot();
// 清理资源
await browser.close();
return new Response(screenshot, {
headers: { "content-type": "image/png" }
});
}
};4. Deploy
4. 部署
bash
npx wrangler deployTest at:
https://your-worker.workers.dev/?url=https://example.comCRITICAL:
- Always pass to
env.MYBROWSER(not undefined)puppeteer.launch() - Always call when done (or use
browser.close()for session reuse)browser.disconnect() - Use compatibility flag
nodejs_compat
bash
npx wrangler deploy测试地址:
https://your-worker.workers.dev/?url=https://example.com关键注意事项:
- 必须将 传递给
env.MYBROWSER(不能传undefined)puppeteer.launch() - 使用完成后必须调用 (或使用
browser.close()实现会话复用)browser.disconnect() - 启用 兼容性标志
nodejs_compat
Browser Rendering Overview
浏览器渲染概述
What is Browser Rendering?
什么是浏览器渲染?
Cloudflare Browser Rendering provides headless Chromium browsers running on Cloudflare's global network. Use familiar tools like Puppeteer and Playwright to automate browser tasks:
- Screenshots - Capture visual snapshots of web pages
- PDF Generation - Convert HTML/URLs to PDFs
- Web Scraping - Extract content from dynamic websites
- Testing - Automate frontend tests
- Crawling - Navigate multi-page workflows
Cloudflare Browser Rendering 在 Cloudflare 全球网络上提供无头 Chromium 浏览器。使用熟悉的 Puppeteer 和 Playwright 工具自动化浏览器任务:
- 截图 - 捕获网页的视觉快照
- PDF生成 - 将HTML/URL转换为PDF
- 网页爬取 - 从动态网站提取内容
- 测试 - 自动化前端测试
- 抓取 - 导航多页面工作流
Two Integration Methods
两种集成方式
| Method | Best For | Complexity |
|---|---|---|
| Workers Bindings | Complex automation, custom workflows, session management | Advanced |
| REST API | Simple screenshot/PDF tasks | Simple |
This skill covers Workers Bindings (the advanced method with full Puppeteer/Playwright APIs).
| 方法 | 最佳适用场景 | 复杂度 |
|---|---|---|
| Workers 绑定 | 复杂自动化、自定义工作流、会话管理 | 高级 |
| REST API | 简单的截图/PDF任务 | 简单 |
本指南覆盖 Workers 绑定(支持完整 Puppeteer/Playwright API 的高级方法)。
Puppeteer vs Playwright
Puppeteer vs Playwright
| Feature | Puppeteer | Playwright |
|---|---|---|
| API Familiarity | Most popular | Growing adoption |
| Package | | |
| Session Management | ✅ Advanced APIs | ⚠️ Basic |
| Browser Support | Chromium only | Chromium only (Firefox/Safari not yet supported) |
| Best For | Screenshots, PDFs, scraping | Testing, frontend automation |
Recommendation: Use Puppeteer for most use cases. Playwright is ideal if you're already using it for testing.
| 特性 | Puppeteer | Playwright |
|---|---|---|
| API 熟悉度 | 最受欢迎 | 采用率增长中 |
| 包版本 | | |
| 会话管理 | ✅ 高级API | ⚠️ 基础功能 |
| 浏览器支持 | 仅 Chromium | 仅 Chromium(暂不支持Firefox/Safari) |
| 最佳适用场景 | 截图、PDF、爬取 | 测试、前端自动化 |
推荐: 大多数场景使用 Puppeteer。如果已有 Playwright 测试用例,可选择 Playwright。
Puppeteer API Reference
Puppeteer API 参考
puppeteer.launch()
puppeteer.launch()
Launch a new browser instance.
Signature:
typescript
await puppeteer.launch(binding: Fetcher, options?: LaunchOptions): Promise<Browser>Parameters:
- (required) - Browser binding from
bindingenv.MYBROWSER - (optional):
options- (number) - Keep browser alive for N milliseconds (max: 600000 = 10 minutes)
keep_alive
Returns: - Browser instance
Promise<Browser>Example:
typescript
const browser = await puppeteer.launch(env.MYBROWSER, {
keep_alive: 60000 // Keep alive for 60 seconds
});CRITICAL: Must pass binding. Error "Cannot read properties of undefined (reading 'fetch')" means the binding wasn't passed.
env.MYBROWSER启动新的浏览器实例。
签名:
typescript
await puppeteer.launch(binding: Fetcher, options?: LaunchOptions): Promise<Browser>参数:
- (必填)- 来自
binding的浏览器绑定env.MYBROWSER - (可选):
options- (数字)- 保持浏览器存活N毫秒(最大值:600000 = 10分钟)
keep_alive
返回值: - 浏览器实例
Promise<Browser>示例:
typescript
const browser = await puppeteer.launch(env.MYBROWSER, {
keep_alive: 60000 // 保持存活60秒
});关键注意事项: 必须传递 绑定。错误“Cannot read properties of undefined (reading 'fetch')”意味着未传递绑定。
env.MYBROWSERpuppeteer.connect()
puppeteer.connect()
Connect to an existing browser session.
Signature:
typescript
await puppeteer.connect(binding: Fetcher, sessionId: string): Promise<Browser>Use Cases:
- Reuse existing browser sessions for performance
- Share browser instance across multiple Workers
- Reduce startup time
Example:
typescript
const sessionId = "478f4d7d-e943-40f6-a414-837d3736a1dc";
const browser = await puppeteer.connect(env.MYBROWSER, sessionId);连接到现有的浏览器会话。
签名:
typescript
await puppeteer.connect(binding: Fetcher, sessionId: string): Promise<Browser>适用场景:
- 复用现有浏览器会话以提升性能
- 在多个 Worker 之间共享浏览器实例
- 减少启动时间
示例:
typescript
const sessionId = "478f4d7d-e943-40f6-a414-837d3736a1dc";
const browser = await puppeteer.connect(env.MYBROWSER, sessionId);puppeteer.sessions()
puppeteer.sessions()
List currently running browser sessions.
Signature:
typescript
await puppeteer.sessions(binding: Fetcher): Promise<SessionInfo[]>Returns:
typescript
interface SessionInfo {
sessionId: string;
startTime: number;
connectionId?: string; // Present if worker is connected
connectionStartTime?: number;
}Example:
typescript
const sessions = await puppeteer.sessions(env.MYBROWSER);
// Find sessions without active connections
const freeSessions = sessions.filter(s => !s.connectionId);列出当前运行的浏览器会话。
签名:
typescript
await puppeteer.sessions(binding: Fetcher): Promise<SessionInfo[]>返回值:
typescript
interface SessionInfo {
sessionId: string;
startTime: number;
connectionId?: string; // 如果Worker已连接则存在
connectionStartTime?: number;
}示例:
typescript
const sessions = await puppeteer.sessions(env.MYBROWSER);
// 查找无活跃连接的会话
const freeSessions = sessions.filter(s => !s.connectionId);puppeteer.history()
puppeteer.history()
List recent sessions (both open and closed).
Signature:
typescript
await puppeteer.history(binding: Fetcher): Promise<HistoryEntry[]>Returns:
typescript
interface HistoryEntry {
sessionId: string;
startTime: number;
endTime?: number;
closeReason?: number;
closeReasonText?: string; // "NormalClosure", "BrowserIdle", etc.
}Use Case: Monitor usage patterns and debug session issues.
列出最近的会话(包括已关闭和正在运行的)。
签名:
typescript
await puppeteer.history(binding: Fetcher): Promise<HistoryEntry[]>返回值:
typescript
interface HistoryEntry {
sessionId: string;
startTime: number;
endTime?: number;
closeReason?: number;
closeReasonText?: string; // "NormalClosure", "BrowserIdle"等
}适用场景: 监控使用模式并调试会话问题。
puppeteer.limits()
puppeteer.limits()
Check current account limits and available sessions.
Signature:
typescript
await puppeteer.limits(binding: Fetcher): Promise<LimitsInfo>Returns:
typescript
interface LimitsInfo {
activeSessions: Array<{ id: string }>;
maxConcurrentSessions: number;
allowedBrowserAcquisitions: number;
timeUntilNextAllowedBrowserAcquisition: number; // milliseconds
}Example:
typescript
const limits = await puppeteer.limits(env.MYBROWSER);
if (limits.allowedBrowserAcquisitions === 0) {
return new Response("Rate limit reached", { status: 429 });
}检查当前账户的限制和可用会话数。
签名:
typescript
await puppeteer.limits(binding: Fetcher): Promise<LimitsInfo>返回值:
typescript
interface LimitsInfo {
activeSessions: Array<{ id: string }>;
maxConcurrentSessions: number;
allowedBrowserAcquisitions: number;
timeUntilNextAllowedBrowserAcquisition: number; // 毫秒
}示例:
typescript
const limits = await puppeteer.limits(env.MYBROWSER);
if (limits.allowedBrowserAcquisitions === 0) {
return new Response("达到速率限制", { status: 429 });
}Browser API
Browser API
Methods available on the object returned by or .
Browserlaunch()connect()launch()connect()Browserbrowser.newPage()
browser.newPage()
Create a new page (tab) in the browser.
Signature:
typescript
await browser.newPage(): Promise<Page>Example:
typescript
const page = await browser.newPage();
await page.goto("https://example.com");Performance Tip: Reuse browser instances and open multiple tabs instead of launching new browsers.
在浏览器中创建新页面(标签页)。
签名:
typescript
await browser.newPage(): Promise<Page>示例:
typescript
const page = await browser.newPage();
await page.goto("https://example.com");性能提示: 复用浏览器实例并打开多个标签页,而非启动新浏览器。
browser.sessionId()
browser.sessionId()
Get the current browser session ID.
Returns: - Session ID
stringExample:
typescript
const sessionId = browser.sessionId();
console.log("Current session:", sessionId);获取当前浏览器会话ID。
返回值: - 会话ID
string示例:
typescript
const sessionId = browser.sessionId();
console.log("当前会话:", sessionId);browser.close()
browser.close()
Close the browser and terminate the session.
Signature:
typescript
await browser.close(): Promise<void>When to use: When you're completely done with the browser and want to free resources.
关闭浏览器并终止会话。
签名:
typescript
await browser.close(): Promise<void>适用场景: 完全使用完浏览器后,释放资源时使用。
browser.disconnect()
browser.disconnect()
Disconnect from the browser WITHOUT closing it.
Signature:
typescript
await browser.disconnect(): Promise<void>When to use: Session reuse - allows another Worker to connect to the same session later.
Example:
typescript
// Keep session alive for reuse
const sessionId = browser.sessionId();
await browser.disconnect(); // Don't close, just disconnect
// Later: puppeteer.connect(env.MYBROWSER, sessionId)断开与浏览器的连接但不关闭它。
签名:
typescript
await browser.disconnect(): Promise<void>适用场景: 会话复用 - 允许其他Worker稍后连接到同一个会话。
示例:
typescript
// 保持会话存活以便复用
const sessionId = browser.sessionId();
await browser.disconnect(); // 不关闭,仅断开连接
// 后续:puppeteer.connect(env.MYBROWSER, sessionId)browser.createBrowserContext()
browser.createBrowserContext()
Create an isolated incognito browser context.
Signature:
typescript
await browser.createBrowserContext(): Promise<BrowserContext>Use Cases:
- Isolate cookies and cache between operations
- Test multi-user scenarios
- Maintain session isolation while reusing browser
Example:
typescript
const context1 = await browser.createBrowserContext();
const context2 = await browser.createBrowserContext();
const page1 = await context1.newPage();
const page2 = await context2.newPage();
// page1 and page2 have separate cookies/cache创建隔离的无痕浏览器上下文。
签名:
typescript
await browser.createBrowserContext(): Promise<BrowserContext>适用场景:
- 在操作之间隔离Cookie和缓存
- 测试多用户场景
- 复用浏览器的同时保持会话隔离
示例:
typescript
const context1 = await browser.createBrowserContext();
const context2 = await browser.createBrowserContext();
const page1 = await context1.newPage();
const page2 = await context2.newPage();
// page1和page2拥有独立的Cookie/缓存Page API
Page API
Methods available on the object returned by .
Pagebrowser.newPage()browser.newPage()Pagepage.goto()
page.goto()
Navigate to a URL.
Signature:
typescript
await page.goto(url: string, options?: NavigationOptions): Promise<Response>Options:
- - When to consider navigation complete:
waitUntil- - Wait for load event (default)
"load" - - Wait for DOMContentLoaded
"domcontentloaded" - - Wait until no network connections for 500ms
"networkidle0" - - Wait until ≤2 network connections for 500ms
"networkidle2"
- - Maximum navigation time in milliseconds (default: 30000)
timeout
Example:
typescript
await page.goto("https://example.com", {
waitUntil: "networkidle0",
timeout: 60000
});Best Practice: Use for dynamic content, for static pages.
"networkidle0""load"导航到指定URL。
签名:
typescript
await page.goto(url: string, options?: NavigationOptions): Promise<Response>选项:
- - 何时认为导航完成:
waitUntil- - 等待load事件(默认)
"load" - - 等待DOMContentLoaded
"domcontentloaded" - - 等待500ms内无网络连接
"networkidle0" - - 等待500ms内网络连接数≤2
"networkidle2"
- - 最大导航时间(毫秒,默认:30000)
timeout
示例:
typescript
await page.goto("https://example.com", {
waitUntil: "networkidle0",
timeout: 60000
});最佳实践: 动态内容使用 ,静态页面使用 。
"networkidle0""load"page.screenshot()
page.screenshot()
Capture a screenshot of the page.
Signature:
typescript
await page.screenshot(options?: ScreenshotOptions): Promise<Buffer>Options:
- (boolean) - Capture full scrollable page (default: false)
fullPage - (string) -
typeor"png"(default:"jpeg")"png" - (number) - JPEG quality 0-100 (only for jpeg)
quality - (object) - Capture specific region:
clip{ x, y, width, height }
Examples:
typescript
// Full page screenshot
const screenshot = await page.screenshot({ fullPage: true });
// JPEG with compression
const screenshot = await page.screenshot({
type: "jpeg",
quality: 80
});
// Specific region
const screenshot = await page.screenshot({
clip: { x: 0, y: 0, width: 800, height: 600 }
});捕获页面截图。
签名:
typescript
await page.screenshot(options?: ScreenshotOptions): Promise<Buffer>选项:
- (布尔值)- 捕获完整可滚动页面(默认:false)
fullPage - (字符串)-
type或"png"(默认:"jpeg")"png" - (数字)- JPEG质量0-100(仅适用于jpeg)
quality - (对象)- 捕获特定区域:
clip{ x, y, width, height }
示例:
typescript
// 全页面截图
const screenshot = await page.screenshot({ fullPage: true });
// 带压缩的JPEG截图
const screenshot = await page.screenshot({
type: "jpeg",
quality: 80
});
// 特定区域截图
const screenshot = await page.screenshot({
clip: { x: 0, y: 0, width: 800, height: 600 }
});page.pdf()
page.pdf()
Generate a PDF of the page.
Signature:
typescript
await page.pdf(options?: PDFOptions): Promise<Buffer>Options:
- (string) - Page format:
format,"Letter", etc."A4" - (boolean) - Include background graphics (default: false)
printBackground - (object) -
margin(e.g.,{ top, right, bottom, left })"1cm" - (boolean) - Landscape orientation (default: false)
landscape - (number) - Scale factor 0.1-2 (default: 1)
scale
Example:
typescript
const pdf = await page.pdf({
format: "A4",
printBackground: true,
margin: { top: "1cm", right: "1cm", bottom: "1cm", left: "1cm" }
});
return new Response(pdf, {
headers: { "content-type": "application/pdf" }
});生成页面的PDF。
签名:
typescript
await page.pdf(options?: PDFOptions): Promise<Buffer>选项:
- (字符串)- 页面格式:
format,"Letter"等"A4" - (布尔值)- 包含背景图形(默认:false)
printBackground - (对象)-
margin(例如:{ top, right, bottom, left })"1cm" - (布尔值)- 横向排版(默认:false)
landscape - (数字)- 缩放因子0.1-2(默认:1)
scale
示例:
typescript
const pdf = await page.pdf({
format: "A4",
printBackground: true,
margin: { top: "1cm", right: "1cm", bottom: "1cm", left: "1cm" }
});
return new Response(pdf, {
headers: { "content-type": "application/pdf" }
});page.content()
page.content()
Get the full HTML content of the page.
Signature:
typescript
await page.content(): Promise<string>Example:
typescript
const html = await page.content();
console.log(html); // Full HTML source获取页面的完整HTML内容。
签名:
typescript
await page.content(): Promise<string>示例:
typescript
const html = await page.content();
console.log(html); // 完整HTML源码page.setContent()
page.setContent()
Set custom HTML content.
Signature:
typescript
await page.setContent(html: string, options?: NavigationOptions): Promise<void>Use Case: Generate PDFs from custom HTML.
Example:
typescript
await page.setContent(`
<!DOCTYPE html>
<html>
<head><style>body { font-family: Arial; }</style></head>
<body><h1>Hello World</h1></body>
</html>
`);
const pdf = await page.pdf({ format: "A4" });设置自定义HTML内容。
签名:
typescript
await page.setContent(html: string, options?: NavigationOptions): Promise<void>适用场景: 从自定义HTML生成PDF。
示例:
typescript
await page.setContent(`
<!DOCTYPE html>
<html>
<head><style>body { font-family: Arial; }</style></head>
<body><h1>Hello World</h1></body>
</html>
`);
const pdf = await page.pdf({ format: "A4" });page.evaluate()
page.evaluate()
Execute JavaScript in the browser context.
Signature:
typescript
await page.evaluate<T>(fn: () => T): Promise<T>Use Cases:
- Extract data from the DOM
- Manipulate page content
- Workaround for XPath (not directly supported)
Examples:
typescript
// Extract text content
const title = await page.evaluate(() => document.title);
// Extract structured data
const data = await page.evaluate(() => ({
title: document.title,
url: window.location.href,
headings: Array.from(document.querySelectorAll("h1, h2")).map(el => el.textContent),
links: Array.from(document.querySelectorAll("a")).map(el => el.href)
}));
// XPath workaround (XPath selectors not directly supported)
const innerHtml = await page.evaluate(() => {
return new XPathEvaluator()
.createExpression("/html/body/div/h1")
.evaluate(document, XPathResult.FIRST_ORDERED_NODE_TYPE)
.singleNodeValue.innerHTML;
});在浏览器上下文中执行JavaScript。
签名:
typescript
await page.evaluate<T>(fn: () => T): Promise<T>适用场景:
- 从DOM提取数据
- 操作页面内容
- 绕过XPath不支持的限制
示例:
typescript
// 提取文本内容
const title = await page.evaluate(() => document.title);
// 提取结构化数据
const data = await page.evaluate(() => ({
title: document.title,
url: window.location.href,
headings: Array.from(document.querySelectorAll("h1, h2")).map(el => el.textContent),
links: Array.from(document.querySelectorAll("a")).map(el => el.href)
}));
// XPath替代方案(XPath选择器不直接支持)
const innerHtml = await page.evaluate(() => {
return new XPathEvaluator()
.createExpression("/html/body/div/h1")
.evaluate(document, XPathResult.FIRST_ORDERED_NODE_TYPE)
.singleNodeValue.innerHTML;
});page.waitForSelector()
page.waitForSelector()
Wait for an element to appear in the DOM.
Signature:
typescript
await page.waitForSelector(selector: string, options?: WaitForOptions): Promise<ElementHandle>Options:
- (number) - Maximum wait time in milliseconds
timeout - (boolean) - Wait for element to be visible
visible
Example:
typescript
await page.goto("https://example.com");
await page.waitForSelector("#content", { visible: true });
const screenshot = await page.screenshot();等待元素出现在DOM中。
签名:
typescript
await page.waitForSelector(selector: string, options?: WaitForOptions): Promise<ElementHandle>选项:
- (数字)- 最大等待时间(毫秒)
timeout - (布尔值)- 等待元素可见
visible
示例:
typescript
await page.goto("https://example.com");
await page.waitForSelector("#content", { visible: true });
const screenshot = await page.screenshot();page.type()
page.type()
Type text into an input field.
Signature:
typescript
await page.type(selector: string, text: string): Promise<void>Example:
typescript
await page.type('input[name="email"]', 'user@example.com');在输入框中输入文本。
签名:
typescript
await page.type(selector: string, text: string): Promise<void>示例:
typescript
await page.type('input[name="email"]', 'user@example.com');page.click()
page.click()
Click an element.
Signature:
typescript
await page.click(selector: string): Promise<void>Example:
typescript
await page.click('button[type="submit"]');
await page.waitForNavigation();点击元素。
签名:
typescript
await page.click(selector: string): Promise<void>示例:
typescript
await page.click('button[type="submit"]');
await page.waitForNavigation();Playwright API Reference
Playwright API 参考
Playwright provides a similar API to Puppeteer with slight differences.
Playwright 提供与 Puppeteer 类似的API,但有一些细微差异。
Installation
安装
bash
npm install @cloudflare/playwrightbash
npm install @cloudflare/playwrightBasic Example
基础示例
typescript
import { env } from "cloudflare:test";
import { chromium } from "@cloudflare/playwright";
interface Env {
BROWSER: Fetcher;
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const browser = await chromium.launch(env.BROWSER);
const page = await browser.newPage();
await page.goto("https://example.com");
const screenshot = await page.screenshot();
await browser.close();
return new Response(screenshot, {
headers: { "content-type": "image/png" }
});
}
};typescript
import { env } from "cloudflare:test";
import { chromium } from "@cloudflare/playwright";
interface Env {
BROWSER: Fetcher;
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const browser = await chromium.launch(env.BROWSER);
const page = await browser.newPage();
await page.goto("https://example.com");
const screenshot = await page.screenshot();
await browser.close();
return new Response(screenshot, {
headers: { "content-type": "image/png" }
});
}
};Key Differences from Puppeteer
与 Puppeteer 的主要差异
| Feature | Puppeteer | Playwright |
|---|---|---|
| Import | | |
| Launch | | |
| Session API | ✅ Advanced (sessions, history, limits) | ⚠️ Basic |
| Auto-waiting | Manual | Built-in auto-waiting |
| Selectors | CSS only | CSS, text, XPath (via evaluate workaround) |
Recommendation: Stick with Puppeteer unless you have existing Playwright tests to migrate.
| 特性 | Puppeteer | Playwright |
|---|---|---|
| 导入方式 | | |
| 启动方式 | | |
| 会话API | ✅ 高级功能(sessions, history, limits) | ⚠️ 基础功能 |
| 自动等待 | 手动调用 | 内置自动等待 |
| 选择器 | 仅CSS | CSS、文本、XPath(通过evaluate替代方案) |
推荐: 除非需要迁移现有Playwright测试用例,否则优先使用Puppeteer。
Session Management
会话管理
Why Session Management Matters
会话管理的重要性
Problem: Launching new browsers is slow and consumes concurrency limits.
Solution: Reuse browser sessions across requests.
Benefits:
- ⚡ Faster (no cold start)
- 💰 Lower concurrency usage
- 📊 Better resource utilization
问题: 启动新浏览器速度慢且会消耗并发限制。
解决方案: 在多个请求之间复用浏览器会话。
优势:
- ⚡ 速度更快(无冷启动)
- 💰 降低并发使用量
- 📊 资源利用率更高
Session Reuse Pattern
会话复用模式
typescript
import puppeteer from "@cloudflare/puppeteer";
async function getBrowser(env: Env): Promise<{ browser: Browser; launched: boolean }> {
// Check for available sessions
const sessions = await puppeteer.sessions(env.MYBROWSER);
const freeSessions = sessions.filter(s => !s.connectionId);
if (freeSessions.length > 0) {
// Reuse existing session
try {
const browser = await puppeteer.connect(env.MYBROWSER, freeSessions[0].sessionId);
return { browser, launched: false };
} catch (e) {
console.log("Failed to connect, launching new browser");
}
}
// Launch new session
const browser = await puppeteer.launch(env.MYBROWSER);
return { browser, launched: true };
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const { browser, launched } = await getBrowser(env);
try {
const page = await browser.newPage();
await page.goto("https://example.com");
const screenshot = await page.screenshot();
// Disconnect (don't close) to allow reuse
await browser.disconnect();
return new Response(screenshot, {
headers: { "content-type": "image/png" }
});
} catch (error) {
await browser.close(); // Close on error
throw error;
}
}
};CRITICAL:
- Use to keep session alive
browser.disconnect() - Use on errors
browser.close() - Always handle connection failures
typescript
import puppeteer from "@cloudflare/puppeteer";
async function getBrowser(env: Env): Promise<{ browser: Browser; launched: boolean }> {
// 检查可用会话
const sessions = await puppeteer.sessions(env.MYBROWSER);
const freeSessions = sessions.filter(s => !s.connectionId);
if (freeSessions.length > 0) {
// 复用现有会话
try {
const browser = await puppeteer.connect(env.MYBROWSER, freeSessions[0].sessionId);
return { browser, launched: false };
} catch (e) {
console.log("连接失败,启动新浏览器");
}
}
// 启动新会话
const browser = await puppeteer.launch(env.MYBROWSER);
return { browser, launched: true };
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const { browser, launched } = await getBrowser(env);
try {
const page = await browser.newPage();
await page.goto("https://example.com");
const screenshot = await page.screenshot();
// 断开连接(不关闭)以便复用
await browser.disconnect();
return new Response(screenshot, {
headers: { "content-type": "image/png" }
});
} catch (error) {
await browser.close(); // 出错时关闭
throw error;
}
}
};关键注意事项:
- 使用 保持会话存活
browser.disconnect() - 出错时使用
browser.close() - 始终处理连接失败的情况
Incognito Browser Contexts
无痕浏览器上下文
Use browser contexts to isolate cookies and cache while sharing a browser instance.
Benefits:
- Share browser (reduce concurrency)
- Isolate sessions (separate cookies/cache)
- Test multi-user scenarios
Example:
typescript
const browser = await puppeteer.launch(env.MYBROWSER);
// Create isolated contexts
const context1 = await browser.createBrowserContext();
const context2 = await browser.createBrowserContext();
// Each context has its own cookies/cache
const page1 = await context1.newPage();
const page2 = await context2.newPage();
await page1.goto("https://app.example.com"); // User 1
await page2.goto("https://app.example.com"); // User 2
await context1.close();
await context2.close();
await browser.close();使用浏览器上下文在复用浏览器实例的同时隔离Cookie和缓存。
优势:
- 共享浏览器(减少并发使用)
- 隔离会话(独立的Cookie/缓存)
- 测试多用户场景
示例:
typescript
const browser = await puppeteer.launch(env.MYBROWSER);
// 创建隔离的上下文
const context1 = await browser.createBrowserContext();
const context2 = await browser.createBrowserContext();
const page1 = await context1.newPage();
const page2 = await context2.newPage();
await page1.goto("https://app.example.com"); // 用户1
await page2.goto("https://app.example.com"); // 用户2
await context1.close();
await context2.close();
await browser.close();Multiple Tabs vs Multiple Browsers
多标签页 vs 多浏览器
Scenario: Scrape 10 URLs
❌ Bad (10 browsers):
typescript
for (const url of urls) {
const browser = await puppeteer.launch(env.MYBROWSER); // 10 launches!
// ... scrape ...
await browser.close();
}✅ Good (1 browser, 10 tabs):
typescript
const browser = await puppeteer.launch(env.MYBROWSER);
const results = await Promise.all(
urls.map(async (url) => {
const page = await browser.newPage();
await page.goto(url);
const data = await page.evaluate(() => ({
title: document.title,
text: document.body.innerText
}));
await page.close();
return { url, data };
})
);
await browser.close();Benefit: Uses 1 concurrent browser instead of 10.
场景: 爬取10个URL
❌ 不良实践(10个浏览器):
typescript
for (const url of urls) {
const browser = await puppeteer.launch(env.MYBROWSER); // 启动10次!
// ... 爬取 ...
await browser.close();
}✅ 最佳实践(1个浏览器,10个标签页):
typescript
const browser = await puppeteer.launch(env.MYBROWSER);
const results = await Promise.all(
urls.map(async (url) => {
const page = await browser.newPage();
await page.goto(url);
const data = await page.evaluate(() => ({
title: document.title,
text: document.body.innerText
}));
await page.close();
return { url, data };
})
);
await browser.close();优势: 仅使用1个并发浏览器,而非10个。
Common Patterns
常见模式
Pattern 1: Screenshot with KV Caching
模式1:带KV缓存的截图
Cache screenshots to reduce browser usage and improve performance.
typescript
import puppeteer from "@cloudflare/puppeteer";
interface Env {
MYBROWSER: Fetcher;
CACHE: KVNamespace;
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const { searchParams } = new URL(request.url);
const url = searchParams.get("url");
if (!url) {
return new Response("Missing ?url parameter", { status: 400 });
}
const normalizedUrl = new URL(url).toString();
// Check cache
let screenshot = await env.CACHE.get(normalizedUrl, { type: "arrayBuffer" });
if (!screenshot) {
// Generate screenshot
const browser = await puppeteer.launch(env.MYBROWSER);
const page = await browser.newPage();
await page.goto(normalizedUrl);
screenshot = await page.screenshot();
await browser.close();
// Cache for 24 hours
await env.CACHE.put(normalizedUrl, screenshot, {
expirationTtl: 60 * 60 * 24
});
}
return new Response(screenshot, {
headers: { "content-type": "image/png" }
});
}
};缓存截图以减少浏览器使用量并提升性能。
typescript
import puppeteer from "@cloudflare/puppeteer";
interface Env {
MYBROWSER: Fetcher;
CACHE: KVNamespace;
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const { searchParams } = new URL(request.url);
const url = searchParams.get("url");
if (!url) {
return new Response("缺少 ?url 参数", { status: 400 });
}
const normalizedUrl = new URL(url).toString();
// 检查缓存
let screenshot = await env.CACHE.get(normalizedUrl, { type: "arrayBuffer" });
if (!screenshot) {
// 生成截图
const browser = await puppeteer.launch(env.MYBROWSER);
const page = await browser.newPage();
await page.goto(normalizedUrl);
screenshot = await page.screenshot();
await browser.close();
// 缓存24小时
await env.CACHE.put(normalizedUrl, screenshot, {
expirationTtl: 60 * 60 * 24
});
}
return new Response(screenshot, {
headers: { "content-type": "image/png" }
});
}
};Pattern 2: PDF Generation from HTML
模式2:从HTML生成PDF
Convert custom HTML to PDF.
typescript
import puppeteer from "@cloudflare/puppeteer";
interface Env {
MYBROWSER: Fetcher;
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
if (request.method !== "POST") {
return new Response("Method not allowed", { status: 405 });
}
const { html } = await request.json<{ html: string }>();
const browser = await puppeteer.launch(env.MYBROWSER);
const page = await browser.newPage();
// Set custom HTML
await page.setContent(html, { waitUntil: "networkidle0" });
// Generate PDF
const pdf = await page.pdf({
format: "A4",
printBackground: true,
margin: {
top: "1cm",
right: "1cm",
bottom: "1cm",
left: "1cm"
}
});
await browser.close();
return new Response(pdf, {
headers: {
"content-type": "application/pdf",
"content-disposition": "attachment; filename=document.pdf"
}
});
}
};将自定义HTML转换为PDF。
typescript
import puppeteer from "@cloudflare/puppeteer";
interface Env {
MYBROWSER: Fetcher;
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
if (request.method !== "POST") {
return new Response("方法不允许", { status: 405 });
}
const { html } = await request.json<{ html: string }>();
const browser = await puppeteer.launch(env.MYBROWSER);
const page = await browser.newPage();
// 设置自定义HTML
await page.setContent(html, { waitUntil: "networkidle0" });
// 生成PDF
const pdf = await page.pdf({
format: "A4",
printBackground: true,
margin: {
top: "1cm",
right: "1cm",
bottom: "1cm",
left: "1cm"
}
});
await browser.close();
return new Response(pdf, {
headers: {
"content-type": "application/pdf",
"content-disposition": "attachment; filename=document.pdf"
}
});
}
};Pattern 3: Web Scraping with Structured Data
模式3:结构化数据网页爬取
Extract structured data from web pages.
typescript
import puppeteer from "@cloudflare/puppeteer";
interface Env {
MYBROWSER: Fetcher;
}
interface ProductData {
title: string;
price: string;
description: string;
image: string;
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const { searchParams } = new URL(request.url);
const url = searchParams.get("url");
const browser = await puppeteer.launch(env.MYBROWSER);
const page = await browser.newPage();
await page.goto(url!, { waitUntil: "networkidle0" });
// Extract structured data
const data = await page.evaluate<ProductData>(() => {
return {
title: document.querySelector("h1")?.textContent || "",
price: document.querySelector(".price")?.textContent || "",
description: document.querySelector(".description")?.textContent || "",
image: document.querySelector("img")?.src || ""
};
});
await browser.close();
return Response.json({ url, data });
}
};从网页提取结构化数据。
typescript
import puppeteer from "@cloudflare/puppeteer";
interface Env {
MYBROWSER: Fetcher;
}
interface ProductData {
title: string;
price: string;
description: string;
image: string;
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const { searchParams } = new URL(request.url);
const url = searchParams.get("url");
const browser = await puppeteer.launch(env.MYBROWSER);
const page = await browser.newPage();
await page.goto(url!, { waitUntil: "networkidle0" });
// 提取结构化数据
const data = await page.evaluate<ProductData>(() => {
return {
title: document.querySelector("h1")?.textContent || "",
price: document.querySelector(".price")?.textContent || "",
description: document.querySelector(".description")?.textContent || "",
image: document.querySelector("img")?.src || ""
};
});
await browser.close();
return Response.json({ url, data });
}
};Pattern 4: Batch Scraping Multiple URLs
模式4:批量爬取多个URL
Efficiently scrape multiple URLs using tabs.
typescript
import puppeteer from "@cloudflare/puppeteer";
interface Env {
MYBROWSER: Fetcher;
}
async function scrapeUrl(browser: Browser, url: string): Promise<any> {
const page = await browser.newPage();
try {
await page.goto(url, { waitUntil: "networkidle0", timeout: 30000 });
const data = await page.evaluate(() => ({
title: document.title,
url: window.location.href,
text: document.body.innerText.slice(0, 500) // First 500 chars
}));
await page.close();
return { success: true, url, data };
} catch (error) {
await page.close();
return {
success: false,
url,
error: error instanceof Error ? error.message : "Unknown error"
};
}
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const { urls } = await request.json<{ urls: string[] }>();
if (!urls || urls.length === 0) {
return new Response("Missing urls array", { status: 400 });
}
const browser = await puppeteer.launch(env.MYBROWSER);
// Scrape all URLs in parallel (each in its own tab)
const results = await Promise.all(
urls.map(url => scrapeUrl(browser, url))
);
await browser.close();
return Response.json({ results });
}
};使用标签页高效爬取多个URL。
typescript
import puppeteer from "@cloudflare/puppeteer";
interface Env {
MYBROWSER: Fetcher;
}
async function scrapeUrl(browser: Browser, url: string): Promise<any> {
const page = await browser.newPage();
try {
await page.goto(url, { waitUntil: "networkidle0", timeout: 30000 });
const data = await page.evaluate(() => ({
title: document.title,
url: window.location.href,
text: document.body.innerText.slice(0, 500) // 前500个字符
}));
await page.close();
return { success: true, url, data };
} catch (error) {
await page.close();
return {
success: false,
url,
error: error instanceof Error ? error.message : "未知错误"
};
}
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const { urls } = await request.json<{ urls: string[] }>();
if (!urls || urls.length === 0) {
return new Response("缺少 urls 数组", { status: 400 });
}
const browser = await puppeteer.launch(env.MYBROWSER);
// 并行爬取所有URL(每个URL对应一个标签页)
const results = await Promise.all(
urls.map(url => scrapeUrl(browser, url))
);
await browser.close();
return Response.json({ results });
}
};Pattern 5: AI-Enhanced Scraping
模式5:AI增强爬取
Combine Browser Rendering with Workers AI to extract structured data.
typescript
import puppeteer from "@cloudflare/puppeteer";
interface Env {
MYBROWSER: Fetcher;
AI: Ai;
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const { searchParams } = new URL(request.url);
const url = searchParams.get("url");
// Scrape page content
const browser = await puppeteer.launch(env.MYBROWSER);
const page = await browser.newPage();
await page.goto(url!, { waitUntil: "networkidle0" });
const bodyContent = await page.$eval("body", el => el.innerHTML);
await browser.close();
// Extract structured data with AI
const response = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
messages: [
{
role: "user",
content: `Extract product information as JSON from this HTML. Include: name, price, description, availability.\n\nHTML:\n${bodyContent.slice(0, 4000)}`
}
]
});
// Parse AI response
let productData;
try {
productData = JSON.parse(response.response);
} catch {
productData = { raw: response.response };
}
return Response.json({ url, product: productData });
}
};将浏览器渲染与Workers AI结合以提取结构化数据。
typescript
import puppeteer from "@cloudflare/puppeteer";
interface Env {
MYBROWSER: Fetcher;
AI: Ai;
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const { searchParams } = new URL(request.url);
const url = searchParams.get("url");
// 爬取页面内容
const browser = await puppeteer.launch(env.MYBROWSER);
const page = await browser.newPage();
await page.goto(url!, { waitUntil: "networkidle0" });
const bodyContent = await page.$eval("body", el => el.innerHTML);
await browser.close();
// 使用AI提取结构化数据
const response = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
messages: [
{
role: "user",
content: `从以下HTML中提取产品信息为JSON格式。包含:名称、价格、描述、库存状态。\n\nHTML:\n${bodyContent.slice(0, 4000)}`
}
]
});
// 解析AI响应
let productData;
try {
productData = JSON.parse(response.response);
} catch {
productData = { raw: response.response };
}
return Response.json({ url, product: productData });
}
};Pattern 6: Form Filling and Automation
模式6:表单填写与自动化
Automate form submissions and multi-step workflows.
typescript
import puppeteer from "@cloudflare/puppeteer";
interface Env {
MYBROWSER: Fetcher;
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const { email, password } = await request.json<{
email: string;
password: string;
}>();
const browser = await puppeteer.launch(env.MYBROWSER);
const page = await browser.newPage();
// Navigate to login page
await page.goto("https://example.com/login");
// Fill form
await page.type('input[name="email"]', email);
await page.type('input[name="password"]', password);
// Submit and wait for navigation
await page.click('button[type="submit"]');
await page.waitForNavigation();
// Extract result
const result = await page.evaluate(() => ({
url: window.location.href,
title: document.title,
loggedIn: document.querySelector(".user-profile") !== null
}));
await browser.close();
return Response.json(result);
}
};自动化表单提交和多步骤工作流。
typescript
import puppeteer from "@cloudflare/puppeteer";
interface Env {
MYBROWSER: Fetcher;
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const { email, password } = await request.json<{
email: string;
password: string;
}>();
const browser = await puppeteer.launch(env.MYBROWSER);
const page = await browser.newPage();
// 导航到登录页面
await page.goto("https://example.com/login");
// 填写表单
await page.type('input[name="email"]', email);
await page.type('input[name="password"]', password);
// 提交并等待导航完成
await page.click('button[type="submit"]');
await page.waitForNavigation();
// 提取结果
const result = await page.evaluate(() => ({
url: window.location.href,
title: document.title,
loggedIn: document.querySelector(".user-profile") !== null
}));
await browser.close();
return Response.json(result);
}
};Pricing & Limits
定价与限制
Free Tier (Workers Free)
免费套餐(Workers Free)
| Feature | Limit |
|---|---|
| Browser Duration | 10 minutes per day |
| Concurrent Browsers | 3 per account |
| New Browsers per Minute | 3 per minute |
| REST API Requests | 6 per minute |
| Browser Timeout | 60 seconds (idle) |
| 特性 | 限制 |
|---|---|
| 浏览器运行时长 | 每天10分钟 |
| 并发浏览器数 | 每个账户3个 |
| 每分钟新浏览器启动数 | 每分钟3个 |
| REST API请求数 | 每分钟6个 |
| 浏览器超时时间 | 60秒(空闲) |
Paid Tier (Workers Paid)
付费套餐(Workers Paid)
| Feature | Included | Beyond Included |
|---|---|---|
| Browser Duration | 10 hours per month | $0.09 per additional browser hour |
| Concurrent Browsers | 10 (monthly average) | $2.00 per additional concurrent browser |
| New Browsers per Minute | 30 per minute | Request higher limit |
| REST API Requests | 180 per minute | Request higher limit |
| Browser Timeout | 60 seconds (can extend to 10 minutes with | - |
| Max Concurrent Browsers | 30 per account | Request higher limit |
| 特性 | 包含额度 | 超出部分 |
|---|---|---|
| 浏览器运行时长 | 每月10小时 | 每额外浏览器小时$0.09 |
| 并发浏览器数 | 10个(月均) | 每额外并发浏览器$2.00 |
| 每分钟新浏览器启动数 | 每分钟30个 | 可申请更高限制 |
| REST API请求数 | 每分钟180个 | 可申请更高限制 |
| 浏览器超时时间 | 60秒(可通过 | - |
| 最大并发浏览器数 | 每个账户30个 | 可申请更高限制 |
Pricing Calculation
定价计算
Duration Charges:
- Charged per browser hour
- Rounded to nearest hour at end of billing cycle
- Failed requests (timeouts) are NOT charged
Concurrency Charges:
- Monthly average of daily peak usage
- Example: 10 browsers for 15 days, 20 browsers for 15 days = (10×15 + 20×15) / 30 = 15 average
- 15 average - 10 included = 5 × $2.00 = $10.00
Example Monthly Bill:
- 50 browser hours used: (50 - 10) × $0.09 = $3.60
- 15 concurrent browsers average: (15 - 10) × $2.00 = $10.00
- Total: $13.60
时长费用:
- 按浏览器小时计费
- 计费周期结束时四舍五入到最近的小时
- 失败请求(超时)不收费
并发费用:
- 每日峰值使用量的月平均值
- 示例:10个浏览器运行15天,20个浏览器运行15天 = (10×15 + 20×15) / 30 = 15个平均值
- 15个平均值 - 10个包含额度 = 5个 × $2.00 = $10.00
月度账单示例:
- 使用50个浏览器小时:(50 - 10) × $0.09 = $3.60
- 并发浏览器月均15个:(15 - 10) × $2.00 = $10.00
- 总计:$13.60
Rate Limiting
速率限制
Per-Second Rate:
Rate limits are enforced per-second. Example:
- 180 requests per minute = 3 requests per second
- You cannot send all 180 at once; they must be spread evenly
Handling Rate Limits:
typescript
async function launchWithRetry(env: Env, maxRetries = 3): Promise<Browser> {
for (let i = 0; i < maxRetries; i++) {
try {
return await puppeteer.launch(env.MYBROWSER);
} catch (error) {
if (i === maxRetries - 1) throw error;
// Check if rate limited
const limits = await puppeteer.limits(env.MYBROWSER);
if (limits.allowedBrowserAcquisitions === 0) {
// Wait before retry
const delay = limits.timeUntilNextAllowedBrowserAcquisition || 1000;
await new Promise(resolve => setTimeout(resolve, delay));
}
}
}
throw new Error("Failed to launch browser");
}每秒速率:
速率限制按秒执行。示例:
- 每分钟180个请求 = 每秒3个请求
- 不能一次性发送180个请求;必须均匀分布
处理速率限制:
typescript
async function launchWithRetry(env: Env, maxRetries = 3): Promise<Browser> {
for (let i = 0; i < maxRetries; i++) {
try {
return await puppeteer.launch(env.MYBROWSER);
} catch (error) {
if (i === maxRetries - 1) throw error;
// 检查是否达到速率限制
const limits = await puppeteer.limits(env.MYBROWSER);
if (limits.allowedBrowserAcquisitions === 0) {
// 等待后重试
const delay = limits.timeUntilNextAllowedBrowserAcquisition || 1000;
await new Promise(resolve => setTimeout(resolve, delay));
}
}
}
throw new Error("启动浏览器失败");
}Known Issues Prevention
已知问题预防
This skill prevents 6 documented issues:
本指南可预防6个已记录的问题:
Issue #1: XPath Selectors Not Supported
问题1:XPath选择器不支持
Error: "XPath selector not supported" or selector failures
Source: https://developers.cloudflare.com/browser-rendering/faq/#why-cant-i-use-an-xpath-selector-when-using-browser-rendering-with-puppeteer
Why It Happens: XPath poses a security risk to Workers
Prevention: Use CSS selectors or with XPathEvaluator
page.evaluate()Solution:
typescript
// ❌ Don't use XPath directly (not supported)
// await page.$x('/html/body/div/h1')
// ✅ Use CSS selector
const heading = await page.$("div > h1");
// ✅ Or use XPath in page.evaluate()
const innerHtml = await page.evaluate(() => {
return new XPathEvaluator()
.createExpression("/html/body/div/h1")
.evaluate(document, XPathResult.FIRST_ORDERED_NODE_TYPE)
.singleNodeValue.innerHTML;
});错误: "XPath selector not supported" 或选择器失败
来源: https://developers.cloudflare.com/browser-rendering/faq/#why-cant-i-use-an-xpath-selector-when-using-browser-rendering-with-puppeteer
原因: XPath对Workers存在安全风险
预防方案: 使用CSS选择器或 结合XPathEvaluator
page.evaluate()解决方案:
typescript
// ❌ 不要直接使用XPath(不支持)
// await page.$x('/html/body/div/h1')
// ✅ 使用CSS选择器
const heading = await page.$("div > h1");
// ✅ 或在page.evaluate()中使用XPathEvaluator
const innerHtml = await page.evaluate(() => {
return new XPathEvaluator()
.createExpression("/html/body/div/h1")
.evaluate(document, XPathResult.FIRST_ORDERED_NODE_TYPE)
.singleNodeValue.innerHTML;
});Issue #2: Browser Binding Not Passed
问题2:未传递浏览器绑定
Error: "Cannot read properties of undefined (reading 'fetch')"
Source: https://developers.cloudflare.com/browser-rendering/faq/#cannot-read-properties-of-undefined-reading-fetch
Why It Happens: called without browser binding
Prevention: Always pass to launch
puppeteer.launch()env.MYBROWSERSolution:
typescript
// ❌ Missing browser binding
const browser = await puppeteer.launch(); // Error!
// ✅ Pass binding
const browser = await puppeteer.launch(env.MYBROWSER);错误: "Cannot read properties of undefined (reading 'fetch')"
来源: https://developers.cloudflare.com/browser-rendering/faq/#cannot-read-properties-of-undefined-reading-fetch
原因: 调用 时未传递浏览器绑定
预防方案: 始终将 传递给launch方法
puppeteer.launch()env.MYBROWSER解决方案:
typescript
// ❌ 缺少浏览器绑定
const browser = await puppeteer.launch(); // 错误!
// ✅ 传递绑定
const browser = await puppeteer.launch(env.MYBROWSER);Issue #3: Browser Timeout (60 seconds)
问题3:浏览器超时(60秒)
Error: Browser closes unexpectedly after 60 seconds
Source: https://developers.cloudflare.com/browser-rendering/platform/limits/#note-on-browser-timeout
Why It Happens: Default timeout is 60 seconds of inactivity
Prevention: Use option to extend up to 10 minutes
keep_aliveSolution:
typescript
// Extend timeout to 5 minutes for long-running tasks
const browser = await puppeteer.launch(env.MYBROWSER, {
keep_alive: 300000 // 5 minutes = 300,000 ms
});Note: Browser closes if no devtools commands for the specified duration.
错误: 浏览器在60秒后意外关闭
来源: https://developers.cloudflare.com/browser-rendering/platform/limits/#note-on-browser-timeout
原因: 默认空闲超时为60秒
预防方案: 使用 选项延长至最多10分钟
keep_alive解决方案:
typescript
// 为长时间运行的任务将超时延长至5分钟
const browser = await puppeteer.launch(env.MYBROWSER, {
keep_alive: 300000 // 5分钟 = 300,000毫秒
});注意: 如果在指定时长内没有devtools命令,浏览器将关闭。
Issue #4: Concurrency Limits Reached
问题4:达到并发限制
Error: "Rate limit exceeded" or new browser launch fails
Source: https://developers.cloudflare.com/browser-rendering/platform/limits/
Why It Happens: Exceeded concurrent browser limit (3 free, 10-30 paid)
Prevention: Reuse sessions, use tabs instead of multiple browsers, check limits before launching
Solutions:
typescript
// 1. Check limits before launching
const limits = await puppeteer.limits(env.MYBROWSER);
if (limits.allowedBrowserAcquisitions === 0) {
return new Response("Concurrency limit reached", { status: 429 });
}
// 2. Reuse sessions
const sessions = await puppeteer.sessions(env.MYBROWSER);
const freeSessions = sessions.filter(s => !s.connectionId);
if (freeSessions.length > 0) {
const browser = await puppeteer.connect(env.MYBROWSER, freeSessions[0].sessionId);
}
// 3. Use tabs instead of multiple browsers
const browser = await puppeteer.launch(env.MYBROWSER);
const page1 = await browser.newPage();
const page2 = await browser.newPage(); // Same browser, different tabs错误: "Rate limit exceeded" 或新浏览器启动失败
来源: https://developers.cloudflare.com/browser-rendering/platform/limits/
原因: 超出并发浏览器限制(免费套餐3个,付费套餐10-30个)
预防方案: 复用会话、使用标签页而非多个浏览器、启动前检查限制
解决方案:
typescript
// 1. 启动前检查限制
const limits = await puppeteer.limits(env.MYBROWSER);
if (limits.allowedBrowserAcquisitions === 0) {
return new Response("达到并发限制", { status: 429 });
}
// 2. 复用会话
const sessions = await puppeteer.sessions(env.MYBROWSER);
const freeSessions = sessions.filter(s => !s.connectionId);
if (freeSessions.length > 0) {
const browser = await puppeteer.connect(env.MYBROWSER, freeSessions[0].sessionId);
}
// 3. 使用标签页而非多个浏览器
const browser = await puppeteer.launch(env.MYBROWSER);
const page1 = await browser.newPage();
const page2 = await browser.newPage(); // 同一个浏览器,不同标签页Issue #5: Local Development Request Size Limit
问题5:本地开发请求大小限制
Error: Request larger than 1MB fails in
Source: https://developers.cloudflare.com/browser-rendering/faq/#does-local-development-support-all-browser-rendering-features
Why It Happens: Local development limitation
Prevention: Use in browser binding for local dev
wrangler devremote: trueSolution:
jsonc
// wrangler.jsonc for local development
{
"browser": {
"binding": "MYBROWSER",
"remote": true // Use real headless browser during dev
}
}错误: 中请求超过1MB失败
来源: https://developers.cloudflare.com/browser-rendering/faq/#does-local-development-support-all-browser-rendering-features
原因: 本地开发限制
预防方案: 本地开发时在浏览器绑定中使用
wrangler devremote: true解决方案:
jsonc
// 本地开发的wrangler.jsonc
{
"browser": {
"binding": "MYBROWSER",
"remote": true // 开发时使用真实无头浏览器
}
}Issue #6: Bot Protection Always Triggered
问题6:始终触发机器人防护
Error: Website blocks requests as bot traffic
Source: https://developers.cloudflare.com/browser-rendering/faq/#will-browser-rendering-bypass-cloudflares-bot-protection
Why It Happens: Browser Rendering requests always identified as bots
Prevention: Cannot bypass; if scraping your own zone, create WAF skip rule
Solution:
typescript
// ❌ Cannot bypass bot protection
// Requests will always be identified as bots
// ✅ If scraping your own Cloudflare zone:
// 1. Go to Security > WAF > Custom rules
// 2. Create skip rule with custom header:
// Header: X-Custom-Auth
// Value: your-secret-token
// 3. Pass header in your scraping requests
// Note: Automatic headers are included:
// - cf-biso-request-id
// - cf-biso-devtools错误: 网站将请求识别为机器人流量并阻止
来源: https://developers.cloudflare.com/browser-rendering/faq/#will-browser-rendering-bypass-cloudflares-bot-protection
原因: 浏览器渲染请求始终被识别为机器人
预防方案: 无法绕过;如果爬取自己的zone,创建WAF跳过规则
解决方案:
typescript
// ❌ 无法绕过机器人防护
// 请求始终会被识别为机器人
// ✅ 如果爬取自己的Cloudflare zone:
// 1. 进入Security > WAF > Custom rules
// 2. 使用自定义头创建跳过规则:
// 头:X-Custom-Auth
// 值:your-secret-token
// 3. 在爬取请求中传递该头
// 注意:会自动包含以下头:
// - cf-biso-request-id
// - cf-biso-devtoolsProduction Checklist
生产环境检查清单
Before deploying Browser Rendering Workers to production:
在将浏览器渲染Worker部署到生产环境前:
Configuration
配置
- Browser binding configured in wrangler.jsonc
- nodejs_compat flag enabled (required for Browser Rendering)
- Keep-alive timeout set if tasks take > 60 seconds
- Remote binding enabled for local development if needed
- 已配置浏览器绑定 在wrangler.jsonc中
- 已启用nodejs_compat标志(浏览器渲染必需)
- 已设置keep-alive超时 如果任务耗时超过60秒
- 已启用远程绑定 (如果本地开发需要)
Error Handling
错误处理
- Retry logic implemented for rate limits
- Timeout handling for page.goto()
- Browser cleanup in try-finally blocks
- Concurrency limit checks before launching browsers
- Graceful degradation when browser unavailable
- 已实现重试逻辑 处理速率限制
- 已处理page.goto()超时
- 已在try-finally块中清理浏览器资源
- 已在启动浏览器前检查并发限制
- 已实现优雅降级 当浏览器不可用时
Performance
性能
- Session reuse implemented for high-traffic routes
- Multiple tabs used instead of multiple browsers
- Incognito contexts for session isolation
- KV caching for repeated screenshots/PDFs
- Batch operations to maximize browser utilization
- 已实现会话复用 针对高流量路由
- 已使用标签页 而非多个浏览器
- 已使用无痕上下文 实现会话隔离
- 已使用KV缓存 存储重复的截图/PDF
- 已实现批量操作 最大化浏览器利用率
Monitoring
监控
- Log browser session IDs for debugging
- Track browser duration for billing estimates
- Monitor concurrency usage with puppeteer.limits()
- Alert on rate limit errors
- Dashboard monitoring at https://dash.cloudflare.com/?to=/:account/workers/browser-rendering
- 已记录浏览器会话ID 用于调试
- 已跟踪浏览器运行时长 用于账单估算
- 已使用puppeteer.limits()监控并发使用情况
- 已设置速率限制错误警报
- 已在https://dash.cloudflare.com/?to=/:account/workers/browser-rendering配置仪表板监控
Security
安全
- Input validation for URLs (prevent SSRF)
- Timeout limits to prevent abuse
- Rate limiting on public endpoints
- Authentication for sensitive scraping endpoints
- WAF rules if scraping your own zone
- 已实现URL输入验证 防止SSRF
- 已设置超时限制 防止滥用
- 已在公共端点设置速率限制
- 已为敏感爬取端点实现认证
- 已为自己的zone设置WAF规则
Testing
测试
- Test screenshot capture with various page sizes
- Test PDF generation with custom HTML
- Test scraping with dynamic content (networkidle0)
- Test error scenarios (invalid URLs, timeouts)
- Load test concurrency limits
- 已测试不同页面尺寸的截图捕获
- 已测试自定义HTML的PDF生成
- 已测试动态内容的爬取(使用networkidle0)
- 已测试错误场景(无效URL、超时)
- 已测试并发限制的负载
Error Handling Template
错误处理模板
Complete error handling for production use:
typescript
import puppeteer from "@cloudflare/puppeteer";
interface Env {
MYBROWSER: Fetcher;
}
async function withBrowser<T>(
env: Env,
fn: (browser: Browser) => Promise<T>
): Promise<T> {
let browser: Browser | null = null;
try {
// Check limits
const limits = await puppeteer.limits(env.MYBROWSER);
if (limits.allowedBrowserAcquisitions === 0) {
throw new Error("Rate limit reached. Retry after: " + limits.timeUntilNextAllowedBrowserAcquisition + "ms");
}
// Launch or connect
const sessions = await puppeteer.sessions(env.MYBROWSER);
const freeSessions = sessions.filter(s => !s.connectionId);
if (freeSessions.length > 0) {
try {
browser = await puppeteer.connect(env.MYBROWSER, freeSessions[0].sessionId);
} catch {
browser = await puppeteer.launch(env.MYBROWSER);
}
} else {
browser = await puppeteer.launch(env.MYBROWSER);
}
// Execute user function
const result = await fn(browser);
// Disconnect (keep session alive)
await browser.disconnect();
return result;
} catch (error) {
// Close on error
if (browser) {
await browser.close();
}
throw error;
}
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
try {
const screenshot = await withBrowser(env, async (browser) => {
const page = await browser.newPage();
await page.goto("https://example.com", {
waitUntil: "networkidle0",
timeout: 30000
});
return await page.screenshot();
});
return new Response(screenshot, {
headers: { "content-type": "image/png" }
});
} catch (error) {
console.error("Browser error:", error);
return new Response(
JSON.stringify({
error: error instanceof Error ? error.message : "Unknown error"
}),
{ status: 500, headers: { "content-type": "application/json" } }
);
}
}
};生产环境使用的完整错误处理模板:
typescript
import puppeteer from "@cloudflare/puppeteer";
interface Env {
MYBROWSER: Fetcher;
}
async function withBrowser<T>(
env: Env,
fn: (browser: Browser) => Promise<T>
): Promise<T> {
let browser: Browser | null = null;
try {
// 检查限制
const limits = await puppeteer.limits(env.MYBROWSER);
if (limits.allowedBrowserAcquisitions === 0) {
throw new Error("达到速率限制。重试等待时间:" + limits.timeUntilNextAllowedBrowserAcquisition + "毫秒");
}
// 启动或连接
const sessions = await puppeteer.sessions(env.MYBROWSER);
const freeSessions = sessions.filter(s => !s.connectionId);
if (freeSessions.length > 0) {
try {
browser = await puppeteer.connect(env.MYBROWSER, freeSessions[0].sessionId);
} catch {
browser = await puppeteer.launch(env.MYBROWSER);
}
} else {
browser = await puppeteer.launch(env.MYBROWSER);
}
// 执行用户函数
const result = await fn(browser);
// 断开连接(保持会话存活)
await browser.disconnect();
return result;
} catch (error) {
// 出错时关闭
if (browser) {
await browser.close();
}
throw error;
}
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
try {
const screenshot = await withBrowser(env, async (browser) => {
const page = await browser.newPage();
await page.goto("https://example.com", {
waitUntil: "networkidle0",
timeout: 30000
});
return await page.screenshot();
});
return new Response(screenshot, {
headers: { "content-type": "image/png" }
});
} catch (error) {
console.error("浏览器错误:", error);
return new Response(
JSON.stringify({
error: error instanceof Error ? error.message : "未知错误"
}),
{ status: 500, headers: { "content-type": "application/json" } }
);
}
}
};Using Bundled Resources
使用捆绑资源
Templates (templates/)
模板(templates/)
Ready-to-use code templates for common patterns:
- - Minimal screenshot example
basic-screenshot.ts - - Screenshot with KV caching
screenshot-with-kv-cache.ts - - Generate PDFs from HTML or URLs
pdf-generation.ts - - Basic web scraping pattern
web-scraper-basic.ts - - Batch scrape multiple URLs
web-scraper-batch.ts - - Session reuse for performance
session-reuse.ts - - Scraping with Workers AI
ai-enhanced-scraper.ts - - Playwright alternative example
playwright-example.ts - - Browser binding configuration
wrangler-browser-config.jsonc
Usage:
bash
undefined适用于常见模式的即用型代码模板:
- - 极简截图示例
basic-screenshot.ts - - 带KV缓存的截图
screenshot-with-kv-cache.ts - - 从HTML或URL生成PDF
pdf-generation.ts - - 基础网页爬取模式
web-scraper-basic.ts - - 批量爬取多个URL
web-scraper-batch.ts - - 会话复用提升性能
session-reuse.ts - - 结合Workers AI的爬取
ai-enhanced-scraper.ts - - Playwright替代示例
playwright-example.ts - - 浏览器绑定配置
wrangler-browser-config.jsonc
使用方法:
bash
undefinedCopy template to your project
将模板复制到你的项目
cp ~/.claude/skills/cloudflare-browser-rendering/templates/basic-screenshot.ts src/index.ts
undefinedcp ~/.claude/skills/cloudflare-browser-rendering/templates/basic-screenshot.ts src/index.ts
undefinedReferences (references/)
参考文档(references/)
Deep-dive documentation:
- - Complete session reuse guide
session-management.md - - Detailed pricing breakdown
pricing-and-limits.md - - All known issues and solutions
common-errors.md - - Feature comparison and migration
puppeteer-vs-playwright.md
When to load: Reference when implementing advanced patterns or debugging specific issues.
深度文档:
- - 完整会话复用指南
session-management.md - - 详细定价分解
pricing-and-limits.md - - 所有已知问题及解决方案
common-errors.md - - 特性对比及迁移指南
puppeteer-vs-playwright.md
加载时机: 实现高级模式或调试特定问题时参考。
Dependencies
依赖项
Required:
- - Puppeteer for Workers
@cloudflare/puppeteer@1.0.4 - - Cloudflare CLI
wrangler@4.43.0+
Optional:
- - Playwright for Workers (alternative)
@cloudflare/playwright@1.0.0 - - TypeScript types
@cloudflare/workers-types@4.20251014.0+
Related Skills:
- - Worker setup with Hono
cloudflare-worker-base - - KV caching for screenshots
cloudflare-kv - - R2 storage for generated files
cloudflare-r2 - - AI-enhanced scraping
cloudflare-workers-ai
必需:
- - Workers版Puppeteer
@cloudflare/puppeteer@1.0.4 - - Cloudflare CLI
wrangler@4.43.0+
可选:
- - Workers版Playwright(替代方案)
@cloudflare/playwright@1.0.0 - - TypeScript类型定义
@cloudflare/workers-types@4.20251014.0+
相关技能:
- - 基于Hono的Worker配置
cloudflare-worker-base - - 用于截图缓存的KV存储
cloudflare-kv - - 用于生成文件的R2存储
cloudflare-r2 - - AI增强爬取
cloudflare-workers-ai
Official Documentation
官方文档
- Browser Rendering Docs: https://developers.cloudflare.com/browser-rendering/
- Puppeteer API: https://pptr.dev/api/
- Playwright API: https://playwright.dev/docs/api/class-playwright
- Cloudflare Puppeteer Fork: https://github.com/cloudflare/puppeteer
- Cloudflare Playwright Fork: https://github.com/cloudflare/playwright
- Pricing: https://developers.cloudflare.com/browser-rendering/platform/pricing/
- Limits: https://developers.cloudflare.com/browser-rendering/platform/limits/
- 浏览器渲染文档:https://developers.cloudflare.com/browser-rendering/
- Puppeteer API:https://pptr.dev/api/
- Playwright API:https://playwright.dev/docs/api/class-playwright
- Cloudflare Puppeteer分支:https://github.com/cloudflare/puppeteer
- Cloudflare Playwright分支:https://github.com/cloudflare/playwright
- 定价:https://developers.cloudflare.com/browser-rendering/platform/pricing/
- 限制:https://developers.cloudflare.com/browser-rendering/platform/limits/
Package Versions (Verified 2025-10-22)
依赖版本(2025-10-22验证)
json
{
"dependencies": {
"@cloudflare/puppeteer": "^1.0.4"
},
"devDependencies": {
"@cloudflare/workers-types": "^4.20251014.0",
"wrangler": "^4.43.0"
}
}Alternative (Playwright):
json
{
"dependencies": {
"@cloudflare/playwright": "^1.0.0"
}
}json
{
"dependencies": {
"@cloudflare/puppeteer": "^1.0.4"
},
"devDependencies": {
"@cloudflare/workers-types": "^4.20251014.0",
"wrangler": "^4.43.0"
}
}替代方案(Playwright):
json
{
"dependencies": {
"@cloudflare/playwright": "^1.0.0"
}
}Troubleshooting
故障排除
Problem: "Cannot read properties of undefined (reading 'fetch')"
问题:"Cannot read properties of undefined (reading 'fetch')"
Solution: Pass browser binding to puppeteer.launch():
typescript
const browser = await puppeteer.launch(env.MYBROWSER); // Not just puppeteer.launch()解决方案: 将浏览器绑定传递给puppeteer.launch():
typescript
const browser = await puppeteer.launch(env.MYBROWSER); // 不要只传puppeteer.launch()Problem: XPath selectors not working
问题:XPath选择器不工作
Solution: Use CSS selectors or page.evaluate() with XPathEvaluator (see Issue #1)
解决方案: 使用CSS选择器或page.evaluate()结合XPathEvaluator(见问题1)
Problem: Browser closes after 60 seconds
问题:浏览器60秒后关闭
Solution: Extend timeout with keep_alive:
typescript
const browser = await puppeteer.launch(env.MYBROWSER, { keep_alive: 300000 });解决方案: 使用keep_alive延长超时:
typescript
const browser = await puppeteer.launch(env.MYBROWSER, { keep_alive: 300000 });Problem: Rate limit reached
问题:达到速率限制
Solution: Reuse sessions, use tabs, check limits before launching (see Issue #4)
解决方案: 复用会话、使用标签页、启动前检查限制(见问题4)
Problem: Local dev request > 1MB fails
问题:本地开发请求超过1MB失败
Solution: Enable remote binding in wrangler.jsonc:
jsonc
{ "browser": { "binding": "MYBROWSER", "remote": true } }解决方案: 在wrangler.jsonc中启用远程绑定:
jsonc
{ "browser": { "binding": "MYBROWSER", "remote": true } }Problem: Website blocks as bot
问题:网站将请求识别为机器人并阻止
Solution: Cannot bypass. If your own zone, create WAF skip rule (see Issue #6)
Questions? Issues?
- Check for detailed solutions
references/common-errors.md - Review for performance optimization
references/session-management.md - Verify browser binding is configured in wrangler.jsonc
- Check official docs: https://developers.cloudflare.com/browser-rendering/
- Ensure compatibility flag is enabled
nodejs_compat
解决方案: 无法绕过。如果是自己的zone,创建WAF跳过规则(见问题6)
有问题?遇到错误?
- 查看 获取详细解决方案
references/common-errors.md - 查看 获取性能优化建议
references/session-management.md - 验证wrangler.jsonc中是否配置了浏览器绑定
- 查看官方文档:https://developers.cloudflare.com/browser-rendering/
- 确保已启用兼容性标志
nodejs_compat