cloudflare-browser-rendering
Cloudflare Browser Rendering
Control headless browsers with Cloudflare's Workers Browser Rendering API. Automate tasks, take screenshots, convert pages to PDFs, extract data, and test web apps.
When to Use This Skill
Use Cloudflare Browser Rendering when you need to:
- Take screenshots of web pages (PNG, JPEG, WebP)
- Generate PDFs from HTML/CSS or web pages
- Scrape dynamic content that requires JavaScript execution
- Extract structured data from websites (JSON-LD, Schema.org, Open Graph)
- Convert web pages to Markdown or extract links
- Automate browser interactions for testing or workflows
- Integrate browser automation with Cloudflare Workers
- Build AI-powered web scrapers with Workers AI
- Deploy MCP servers for LLM agent browser control
- Create web crawlers with Queues integration
Integration Approaches
1. REST API (Simple, No Worker Required)
Quick integration using HTTP endpoints. Ideal for one-off tasks or external service integration.
Available Endpoints:
- /screenshot - Capture PNG/JPEG/WebP screenshots
- /pdf - Generate PDF documents
- /content - Extract fully rendered HTML
- /markdown - Convert pages to Markdown
- /scrape - Extract data via CSS selectors
- /links - Extract and analyze page links
- /json - Extract JSON-LD, Schema.org metadata
- /snapshot - Debug with multi-step browser states
Authentication:
bash
curl "https://api.cloudflare.com/client/v4/accounts/{account_id}/browser-rendering/screenshot" \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com"}'
Rate Limits:
- 60 requests/minute
- 10 concurrent requests
- 100 burst per 5 minutes
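The endpoints above share one URL pattern, so a request can be assembled by a small helper. The following is a minimal sketch, assuming the `{account_id}` path and bearer-token header shown in the curl example; `buildRestRequest`, `Endpoint`, and `RestRequest` are hypothetical names, not part of any Cloudflare SDK.

```typescript
// Hypothetical helper: builds (but does not send) a Browser Rendering REST request.
type Endpoint =
  | 'screenshot' | 'pdf' | 'content' | 'markdown'
  | 'scrape' | 'links' | 'json' | 'snapshot';

interface RestRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

function buildRestRequest(
  accountId: string,
  apiToken: string,
  endpoint: Endpoint,
  payload: Record<string, unknown>,
): RestRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${encodeURIComponent(accountId)}/browser-rendering/${endpoint}`,
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiToken}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(payload),
  };
}

// Placeholder credentials, as in the curl example.
const req = buildRestRequest('ACCOUNT_ID', 'YOUR_API_TOKEN', 'screenshot', {
  url: 'https://example.com',
});
```

The result can be handed to `fetch(req.url, req)` in any runtime that provides `fetch`. Keeping construction separate from sending makes the request easy to inspect or log before it counts against the rate limits.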
2. Workers Bindings with Puppeteer (Low-Level Control)
Full Puppeteer API access within Cloudflare Workers for maximum control.
Setup (wrangler.toml):
toml
name = "browser-worker"
main = "src/index.ts"
compatibility_date = "2024-01-01"
# @cloudflare/puppeteer relies on the Node.js compatibility layer
compatibility_flags = ["nodejs_compat"]
browser = { binding = "MYBROWSER" }
[[kv_namespaces]]
binding = "KV"
id = "your-kv-namespace-id"
Basic Screenshot Worker:
typescript
import puppeteer from '@cloudflare/puppeteer';
interface Env {
  MYBROWSER: Fetcher; // browser binding from wrangler.toml
}
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const browser = await puppeteer.launch(env.MYBROWSER);
    try {
      const page = await browser.newPage();
      await page.goto('https://example.com', { waitUntil: 'networkidle2' });
      const screenshot = await page.screenshot({ type: 'png' });
      return new Response(screenshot, {
        headers: { 'Content-Type': 'image/png' },
      });
    } finally {
      // End the session even if navigation throws
      await browser.close();
    }
  },
};
Key Puppeteer Methods:
- puppeteer.launch(binding) - Start new browser
- browser.newPage() - Create new page
- page.goto(url, options) - Navigate to URL
- page.screenshot(options) - Capture screenshot
- page.content() - Get HTML content
- page.pdf(options) - Generate PDF
- page.evaluate(fn) - Execute JS in page context
- browser.disconnect() - Disconnect keeping session alive
- browser.close() - Close and end session
- puppeteer.connect(binding, sessionId) - Reconnect to session
Session Reuse (Critical for Cost Optimization):
typescript
// At the end of a request: disconnect instead of close to keep the session alive
await browser.disconnect();
// In a later request: list sessions and pick one with no active connection
const sessions = await puppeteer.sessions(env.MYBROWSER);
const freeSession = sessions.find((s) => !s.connectionId);
// Reconnect when a free session exists, otherwise start a new browser
const reused = freeSession
  ? await puppeteer.connect(env.MYBROWSER, freeSession.sessionId)
  : await puppeteer.launch(env.MYBROWSER);
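Picking a reusable session amounts to filtering the session list for entries with no active connection. Below is a minimal sketch of that selection step, assuming each session object carries `sessionId` and an optional `connectionId` as in the snippet above. `SessionInfo` and `freeSessionIds` are hypothetical names, and the sample data is invented for illustration.

```typescript
// Assumed shape of one entry returned by puppeteer.sessions(binding).
interface SessionInfo {
  sessionId: string;
  connectionId?: string; // present while a Worker is attached to the session
}

// Returns the IDs of sessions that no Worker is currently connected to.
function freeSessionIds(sessions: SessionInfo[]): string[] {
  return sessions.filter((s) => !s.connectionId).map((s) => s.sessionId);
}

// Illustrative data only.
const candidates = freeSessionIds([
  { sessionId: 'a', connectionId: 'conn-1' },
  { sessionId: 'b' },
  { sessionId: 'c' },
]);
// candidates is ['b', 'c']
```

Because another Worker may claim a session between listing and connecting, it is reasonable to wrap `puppeteer.connect` in a try/catch, move on to the next candidate on failure, and fall back to `puppeteer.launch` when none succeed.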
3. Workers Bindings with Playwright (Testing Focus)
Playwright provides advanced testing features, assertions, and debugging.
Setup:
bash
npm create cloudflare@latest -- browser-worker
cd browser-worker
npm install @cloudflare/playwright hono
wrangler dev # Local testing
wrangler deploy # Production
Advanced Playwright Worker:
typescript
import { Hono } from 'hono';
import { launch } from '@cloudflare/playwright';
interface Env {
  MYBROWSER: Fetcher; // browser binding from wrangler.toml
}
const app = new Hono<{ Bindings: Env }>();
// The target is passed as a query parameter (/screenshot?url=...) because a
// full URL contains slashes and would not match a single path segment.
app.get('/screenshot', async (c) => {
  const target = c.req.query('url');
  if (!target) return c.text('Missing url parameter', 400);
  const browser = await launch(c.env.MYBROWSER);
  try {
    const page = await browser.newPage();
    await page.goto(target);
    await page.waitForLoadState('networkidle');
    const screenshot = await page.screenshot({ fullPage: true });
    return new Response(screenshot, {
      headers: { 'Content-Type': 'image/png' },
    });
  } finally {
    await browser.close();
  }
});
export default app;
Playwright-Specific Features:
- Storage state persistence with KV
- Tracing for debugging
- Advanced assertions (expect(page).toHaveTitle())
- Network interception
- Multiple contexts for tab pooling
Storage State Caching:
typescript
// Save authentication state
const state = await page.context().storageState();
await env.KV.put('auth-state', JSON.stringify(state));
// Restore authentication state (KV returns null when the key is missing)
const savedState = await env.KV.get('auth-state', 'json');
const context = await browser.newContext({ storageState: savedState ?? undefined });
4. MCP Server (AI Agent Integration)
Deploy Model Context Protocol server for LLM agent browser control.
Features:
- No vision models needed (uses accessibility tree)
- Simple natural language instructions
- Built on Playwright with Browser Rendering
- Pre-configured server templates available
Use Case: Enable AI agents to interact with web pages using structured accessibility data instead of screenshots.
5. Stagehand (AI-Powered Automation)
Natural language browser automation powered by AI.
Example:
typescript
// Illustrative only: the constructor options, and whether act()/extract() live on
// the Stagehand instance or on its page object, depend on the Stagehand version.
import { Stagehand } from '@browserbasehq/stagehand';
const stagehand = new Stagehand(env.MYBROWSER);
await stagehand.init();
// Natural language instructions
await stagehand.act('click the login button');
await stagehand.act('fill in email with user@example.com');
const data = await stagehand.extract('get all product prices');
await stagehand.close();
Configuration Patterns
Wrangler Configuration (Browser Binding)
Basic Setup:
toml
name = "my-browser-worker"
main = "src/index.ts"
compatibility_date = "2024-01-01"
# Needed by @cloudflare/puppeteer
compatibility_flags = ["nodejs_compat"]
browser = { binding = "MYBROWSER" }
Advanced Setup with Durable Objects and R2:
toml
browser = { binding = "MYBROWSER" }
[[durable_objects.bindings]]
name = "BROWSER"
class_name = "Browser"
[[r2_buckets]]
binding = "BUCKET"
bucket_name = "my-screenshots"
[[migrations]]
tag = "v1"
new_classes = ["Browser"]
Timeout Configuration
Default Timeouts:
- goToOptions.timeout: 30s (max 60s)
- waitForSelector: up to 60s
- actionTimeout: up to 5 minutes
- Workers CPU time: 30s (standard), 15 minutes (unbound)
Custom Timeout Examples:
typescript
// Puppeteer
await page.goto(url, {
  timeout: 60000, // 60 seconds
  waitUntil: 'networkidle2',
});
await page.waitForSelector('.content', { timeout: 45000 });
// Playwright
await page.goto(url, {
  timeout: 60000,
  waitUntil: 'networkidle',
});
await page.locator('.element').click({ timeout: 10000 });
Viewport and Screenshot Options
typescript
// Set viewport size
await page.setViewport({ width: 1920, height: 1080 });
// Full-page capture; quality applies to JPEG/WebP only (0-100)
const fullPageShot = await page.screenshot({
  type: 'jpeg', // "png" | "jpeg" | "webp"
  quality: 90,
  fullPage: true, // Capture full scrollable page
});
// Cropped capture; clip and fullPage are alternatives, so use one or the other
const clippedShot = await page.screenshot({
  type: 'png',
  clip: { x: 0, y: 0, width: 800, height: 600 }, // Crop to specific area
});
PDF Generation Options
typescript
const pdf = await page.pdf({
  format: 'A4',
  printBackground: true,
  margin: {
    top: '1cm',
    right: '1cm',
    bottom: '1cm',
    left: '1cm',
  },
  displayHeaderFooter: true,
  headerTemplate: '<div>Header</div>',
  footerTemplate: '<div>Footer</div>',
});
Limits and Pricing
Free Plan
- Usage: 10 minutes/day
- Concurrent: 3 browsers max
- Rate Limits: 3 new browsers/minute, 6 requests/minute
- Cost: Free
Paid Plan (Workers Paid)
- Usage: 10 hours/month included
- Concurrent: 30 browsers max
- Rate Limits: 30 new browsers/minute, 180 requests/minute
- Overage Pricing:
- Additional usage: $0.09/hour
- Additional concurrency: $2.00/concurrent browser
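The overage arithmetic for browser hours can be worked through in code. This is a minimal sketch, assuming only the figures listed above (10 included hours, $0.09 per additional hour) and ignoring the separate concurrency charge. `estimateHourlyOverageUsd` is a hypothetical helper, not a Cloudflare API.

```typescript
const INCLUDED_HOURS = 10; // hours/month included on the paid plan (from the list above)
const OVERAGE_CENTS_PER_HOUR = 9; // $0.09 per additional browser hour

// Returns the estimated overage in US dollars for one month's browser time.
function estimateHourlyOverageUsd(browserHours: number): number {
  const extraHours = Math.max(0, browserHours - INCLUDED_HOURS);
  // Work in cents and round once to avoid floating-point drift
  return Math.round(extraHours * OVERAGE_CENTS_PER_HOUR) / 100;
}

const withinAllowance = estimateHourlyOverageUsd(8); // 0
const heavyMonth = estimateHourlyOverageUsd(25); // 15 extra hours -> 1.35
```

At these rates, usage-based cost stays small unless sessions are left open for long periods, which is why the session reuse and cleanup advice matters more for concurrency than for hours.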
REST API Pricing
- Free: 100 requests/day
- Paid: 10,000 requests/month included
- Overage: $0.09/additional hour of browser time
Cost Optimization Tips:
- Use disconnect() instead of close() for session reuse
- Enable Keep-Alive (up to 10 minutes)
- Pool tabs using browser contexts instead of multiple browsers
- Cache authentication state with KV storage
- Implement Durable Objects for persistent sessions
Common Use Cases
1. Screenshot Capture with Caching
typescript
import puppeteer from '@cloudflare/puppeteer';
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const targetUrl = new URL(request.url).searchParams.get('url');
    // searchParams.get returns null when the parameter is absent
    if (!targetUrl) {
      return new Response('Missing url parameter', { status: 400 });
    }
    // Check cache
    const cached = await env.KV.get(targetUrl, 'arrayBuffer');
    if (cached) {
      return new Response(cached, {
        headers: { 'Content-Type': 'image/png' },
      });
    }
    // Generate screenshot
    const browser = await puppeteer.launch(env.MYBROWSER);
    const page = await browser.newPage();
    await page.goto(targetUrl);
    const screenshot = await page.screenshot();
    await browser.close();
    // Cache for 24 hours
    await env.KV.put(targetUrl, screenshot, {
      expirationTtl: 86400,
    });
    return new Response(screenshot, {
      headers: { 'Content-Type': 'image/png' },
    });
  },
};
2. PDF Certificate Generator
typescript
import puppeteer from '@cloudflare/puppeteer';
async function generateCertificate(name: string, env: Env) {
  const browser = await puppeteer.launch(env.MYBROWSER);
  const page = await browser.newPage();
  // name is inserted as-is; escape it first if it comes from user input
  const html = `
    <!DOCTYPE html>
    <html>
      <head>
        <style>
          body { font-family: Arial; text-align: center; padding: 50px; }
          h1 { color: #2c3e50; font-size: 48px; }
        </style>
      </head>
      <body>
        <h1>Certificate of Achievement</h1>
        <p>Awarded to: <strong>${name}</strong></p>
      </body>
    </html>
  `;
  await page.setContent(html);
  const pdf = await page.pdf({
    format: 'A4',
    printBackground: true,
  });
  await browser.close();
  return pdf;
}
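Since `name` is interpolated straight into the HTML template, a value containing markup would be rendered by the browser. Below is a minimal sketch of escaping it first. `escapeHtml` is a hypothetical helper written for this example, not part of Puppeteer or Workers.

```typescript
const HTML_ESCAPES: Record<string, string> = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Replaces the five characters that are significant in HTML text and attributes.
function escapeHtml(value: string): string {
  return value.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

const safeName = escapeHtml('<b>Ada</b> & "Grace"');
// safeName is '&lt;b&gt;Ada&lt;/b&gt; &amp; &quot;Grace&quot;'
```

In `generateCertificate`, the template would then use `${escapeHtml(name)}` in place of `${name}`.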
3. AI-Powered Web Scraper
typescript
import puppeteer from '@cloudflare/puppeteer';
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Render page with Browser Rendering
    const browser = await puppeteer.launch(env.MYBROWSER);
    const page = await browser.newPage();
    await page.goto('https://news.ycombinator.com');
    const content = await page.content();
    await browser.close();
    // Extract data with Workers AI (env.AI is the Workers AI binding)
    const response = await env.AI.run('@hf/thebloke/deepseek-coder-6.7b-instruct-awq', {
      messages: [
        {
          role: 'system',
          content: 'Extract top 5 article titles and URLs as JSON array',
        },
        {
          role: 'user',
          content: content,
        },
      ],
    });
    return Response.json(response);
  },
};
4. Web Crawler with Queues
typescript
import puppeteer from '@cloudflare/puppeteer';
export default {
  async queue(batch: MessageBatch<{ url: string }>, env: Env): Promise<void> {
    const browser = await puppeteer.launch(env.MYBROWSER);
    try {
      for (const message of batch.messages) {
        const page = await browser.newPage();
        await page.goto(message.body.url);
        // Extract links
        const links = await page.evaluate(() => {
          return Array.from(document.querySelectorAll('a')).map((a) => a.href);
        });
        // Queue new links
        for (const link of links) {
          await env.QUEUE.send({ url: link });
        }
        await page.close();
      }
    } finally {
      // Release the browser even if a page fails to load
      await browser.close();
    }
  },
};
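As written, the crawler enqueues every link it finds, including duplicates and off-site URLs, so the queue can grow without bound. The following is a minimal sketch of one way to narrow the list before calling `env.QUEUE.send`. `selectCrawlTargets`, `parseUrl`, the same-origin rule, and the `limit` parameter are all assumptions made for this example.

```typescript
// Returns a parsed URL, or null when the string is not a valid absolute URL.
function parseUrl(raw: string): URL | null {
  try {
    return new URL(raw);
  } catch {
    return null;
  }
}

// Keeps unique http(s) links on the same origin as the page, up to `limit`.
function selectCrawlTargets(
  links: string[],
  pageUrl: string,
  seen: Set<string>,
  limit = 50,
): string[] {
  const page = parseUrl(pageUrl);
  if (!page) return [];
  const targets: string[] = [];
  for (const link of links) {
    if (targets.length >= limit) break;
    const parsed = parseUrl(link);
    if (!parsed) continue; // skip malformed hrefs
    if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') continue;
    if (parsed.origin !== page.origin) continue;
    parsed.hash = ''; // treat #fragments as the same page
    const normalized = parsed.toString();
    if (seen.has(normalized)) continue;
    seen.add(normalized);
    targets.push(normalized);
  }
  return targets;
}

const seen = new Set<string>();
const targets = selectCrawlTargets(
  [
    'https://example.com/a',
    'https://example.com/a#top',
    'https://other.example/b',
    'mailto:hi@example.com',
  ],
  'https://example.com/',
  seen,
);
// targets is ['https://example.com/a']
```

In a deployed crawler the `seen` set would need to live in KV or a Durable Object, since Worker memory is not shared across invocations.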
5. Durable Objects for Persistent Sessions
typescript
import puppeteer from '@cloudflare/puppeteer';
export class Browser {
  state: DurableObjectState;
  env: Env;
  browser: any;
  lastUsed: number;
  constructor(state: DurableObjectState, env: Env) {
    this.state = state;
    this.env = env; // bindings are passed to the constructor, not to fetch()
    this.lastUsed = Date.now();
  }
  async fetch(request: Request) {
    const targetUrl = new URL(request.url).searchParams.get('url');
    if (!targetUrl) {
      return new Response('Missing url parameter', { status: 400 });
    }
    // Initialize browser on first request
    if (!this.browser) {
      this.browser = await puppeteer.launch(this.env.MYBROWSER);
    }
    // Record activity and schedule the idle check
    this.lastUsed = Date.now();
    await this.state.storage.setAlarm(Date.now() + 10000);
    const page = await this.browser.newPage();
    await page.goto(targetUrl);
    const screenshot = await page.screenshot();
    await page.close();
    return new Response(screenshot, {
      headers: { 'Content-Type': 'image/png' },
    });
  }
  async alarm() {
    // Close browser if idle for 60 seconds
    if (Date.now() - this.lastUsed > 60000) {
      await this.browser?.close();
      this.browser = null;
    } else {
      await this.state.storage.setAlarm(Date.now() + 10000);
    }
  }
}
Best Practices
1. Session Management
- Always use disconnect() instead of close() to keep sessions alive for reuse
- Implement session pooling to reduce concurrency costs
- Set Keep-Alive to maximum (10 minutes) for sustained workflows
- Track session IDs and connection states
2. Performance Optimization
- Cache frequently accessed content in KV storage
- Use browser contexts instead of multiple browsers for tab pooling
- Implement Durable Objects for persistent, reusable sessions
- Choose an appropriate waitUntil strategy (load, networkidle0, networkidle2)
- Set realistic timeouts to avoid unnecessary waiting
3. Error Handling
- Implement Retry-After awareness for 429 rate limit errors
- Handle timeout errors gracefully with fallback strategies
- Check session availability before attempting reconnection
- Validate responses before caching or returning data
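Retry-After awareness for 429 responses can be reduced to one delay calculation. This is a minimal sketch, assuming the header carries a number of seconds (the HTTP-date form is not parsed here) and falling back to capped exponential backoff otherwise. `retryDelayMs` and its default values are hypothetical.

```typescript
// Returns how long to wait (ms) before retry number `attempt` (0-based).
function retryDelayMs(
  attempt: number,
  retryAfterHeader: string | null,
  baseMs = 1000,
  maxMs = 60000,
): number {
  if (retryAfterHeader !== null && retryAfterHeader.trim() !== '') {
    const seconds = Number(retryAfterHeader);
    if (isFinite(seconds) && seconds >= 0) {
      return Math.min(seconds * 1000, maxMs); // the server's hint wins
    }
  }
  // No usable header: double the delay on each attempt, up to maxMs
  return Math.min(baseMs * Math.pow(2, attempt), maxMs);
}

const hinted = retryDelayMs(0, '5'); // 5000
const backoff = retryDelayMs(3, null); // 8000
const capped = retryDelayMs(10, null); // 60000
```

A caller would read the header with `response.headers.get('Retry-After')`, wait for the returned delay, and then retry.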
4. Cost Management
- Monitor usage via Cloudflare dashboard
- Use session reuse to dramatically reduce concurrency costs
- Implement intelligent caching strategies
- Consider batch processing for multiple URLs
- Set appropriate alarm intervals for Durable Objects cleanup
5. Security
- Validate all user-provided URLs before navigation
- Implement proper authentication for Workers endpoints
- Use Web Bot Auth signatures for additional protection
- Sanitize extracted content before processing
- Set appropriate CORS headers
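Validating user-provided URLs before navigation usually means restricting the scheme and rejecting obviously internal hosts. Below is a minimal sketch under those assumptions. `isAllowedTarget` and its blocklist are hypothetical and deliberately incomplete: they do not resolve DNS, and they only catch private IPv4 addresses written literally.

```typescript
// Hostnames that should never be rendered on behalf of a user (illustrative list).
const BLOCKED_HOSTS = new Set<string>(['localhost', '0.0.0.0', '[::1]']);

// Matches a dotted-quad IPv4 literal.
const IPV4 = /^\d{1,3}(\.\d{1,3}){3}$/;
// Loopback, private, and link-local IPv4 prefixes.
const PRIVATE_IPV4 = /^(10\.|127\.|192\.168\.|169\.254\.|172\.(1[6-9]|2\d|3[01])\.)/;

function isAllowedTarget(raw: string | null): boolean {
  if (!raw) return false;
  try {
    const { protocol, hostname } = new URL(raw);
    if (protocol !== 'http:' && protocol !== 'https:') return false;
    if (BLOCKED_HOSTS.has(hostname)) return false;
    if (IPV4.test(hostname) && PRIVATE_IPV4.test(hostname)) return false;
    return true;
  } catch {
    return false; // not a parseable absolute URL
  }
}

const ok = isAllowedTarget('https://example.com/page'); // true
const blocked = isAllowedTarget('http://192.168.1.10/'); // false
```

A Worker would call this on the `url` query parameter and return a 400 response before launching a browser when it yields `false`.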
Troubleshooting
Common Issues
Timeout Errors:
- Increase timeout: page.goto(url, { timeout: 60000 })
- Change waitUntil: { waitUntil: "domcontentloaded" }
- Check network conditions and target site performance
Rate Limit (429) Errors:
- Implement exponential backoff with Retry-After header
- Reduce request frequency
- Upgrade to paid plan for higher limits
Session Connection Failures:
- Check session availability before connecting
- Handle race conditions with try-catch
- Verify browser hasn't timed out (10-minute Keep-Alive limit)
Memory Issues:
- Close pages when done: await page.close()
- Disconnect browsers properly: await browser.disconnect()
- Implement Durable Objects cleanup alarms
Font Rendering Issues:
- Use supported fonts (100+ pre-installed)
- Inject custom fonts via CDN or base64
- Check font-family declarations in CSS
API Reference Quick Lookup
REST API Global Parameters
- url (required) - Target webpage URL
- waitDelay - Wait time in milliseconds (0-30000)
- goto.timeout - Navigation timeout (0-60000ms)
- goto.waitUntil - Wait strategy (load, domcontentloaded, networkidle)
Puppeteer Key Methods
- puppeteer.launch(binding) - Start browser
- puppeteer.connect(binding, sessionId) - Reconnect to session
- puppeteer.sessions(binding) - List active sessions
- browser.newPage() - Create new page
- browser.disconnect() - Disconnect keeping session alive
- browser.close() - Close and terminate session
- page.goto(url, options) - Navigate
- page.screenshot(options) - Capture screenshot
- page.pdf(options) - Generate PDF
- page.content() - Get HTML
- page.evaluate(fn) - Execute JavaScript
Playwright Key Methods
- launch(env.MYBROWSER) - Start browser (launch is imported from @cloudflare/playwright)
- browser.newPage() - Create new page
- browser.newContext(options) - Create context with state
- page.goto(url, options) - Navigate
- page.screenshot(options) - Capture screenshot
- page.pdf(options) - Generate PDF
- page.locator(selector) - Find element
- page.waitForLoadState(state) - Wait for load
- context.storageState() - Get authentication state
Supported Fonts
Pre-installed fonts include:
- System: Arial, Verdana, Times New Roman, Georgia, Courier New
- Open Source: Noto Sans, Noto Serif, Roboto, Open Sans, Lato
- International: Noto Sans CJK (Chinese, Japanese, Korean), Noto Sans Arabic, Hebrew, Thai
- Emoji: Noto Color Emoji
Custom Font Injection:
html
<link href="https://fonts.googleapis.com/css2?family=Poppins" rel="stylesheet" />
Deployment Checklist
- Setup:
  - Install Wrangler: npm install -g wrangler
  - Login: wrangler login
  - Create project: npm create cloudflare@latest
- Configuration:
  - Add browser binding to wrangler.toml
  - Configure KV namespaces for caching (optional)
  - Set up R2 buckets for storage (optional)
  - Define Durable Objects if using persistent sessions
- Testing:
  - Test locally: wrangler dev
  - Verify session management
  - Test timeout configurations
  - Validate error handling
- Deployment:
  - Deploy to production: wrangler deploy
  - Monitor usage in Cloudflare dashboard
  - Set up alerts for rate limits
  - Verify cost optimization strategies
Resources
- Official Documentation: https://developers.cloudflare.com/browser-rendering/
- Puppeteer Docs: https://pptr.dev/
- Playwright Docs: https://playwright.dev/
- Workers Documentation: https://developers.cloudflare.com/workers/
- Wrangler CLI: https://developers.cloudflare.com/workers/wrangler/
Implementation Workflow
When implementing Cloudflare Browser Rendering:
- Choose Integration Method:
  - REST API for simple, external integration
  - Workers + Puppeteer for low-level control
  - Workers + Playwright for testing and advanced features
  - MCP Server for AI agent integration
  - Stagehand for natural language automation
- Set Up Configuration:
  - Create wrangler.toml with appropriate bindings
  - Install dependencies (@cloudflare/puppeteer or @cloudflare/playwright)
  - Configure KV, R2, or Durable Objects as needed
- Implement Core Logic:
  - Browser lifecycle management (launch, disconnect, close)
  - Navigation and waiting strategies
  - Content extraction or screenshot/PDF generation
  - Error handling and retries
- Optimize for Cost:
  - Implement session reuse with disconnect()
  - Add Keep-Alive for sustained usage
  - Cache results in KV storage
  - Use Durable Objects for persistent sessions
- Deploy and Monitor:
  - Test locally with wrangler dev
  - Deploy with wrangler deploy
  - Monitor usage and costs in dashboard
  - Adjust rate limiting and caching strategies
Version Support
- Puppeteer: v22.13.1
- Playwright: v1.55.0
- Node.js Compatibility: Required for Workers integration
- Browser Version: Chromium-based (updated regularly by Cloudflare)