Puppeteer
Puppeteer is a Node library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol.
When to Use
- Chrome-specific: When cross-browser testing isn't a priority (or you only care about Chromium).
- Web Scraping: Well suited to scraping SPAs because it renders JavaScript in a real browser.
- PDF/Screenshots: A common choice for "HTML to PDF" generation.
Quick Start
```javascript
import puppeteer from "puppeteer";

(async () => {
  // Launch a headless browser and open a new tab.
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // Load the page, then save it as an A4 PDF.
  await page.goto("https://developer.chrome.com/");
  await page.pdf({ path: "dv.pdf", format: "A4" });
  await browser.close();
})();
```
Core Concepts
DevTools Protocol (CDP)
Puppeteer talks directly to Chrome via the Chrome DevTools Protocol (CDP). This allows deeper control than WebDriver, such as low-level network interception and CPU profiling.
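As a rough illustration of that lower-level control, the sketch below blocks some resource types with request interception and reads raw metrics through a CDP session. The helper names (`shouldBlock`, `enableBlocking`, `readCdpMetrics`) and the blocking policy are assumptions made for this example, and `page.createCDPSession()` assumes a reasonably recent Puppeteer version.

```javascript
// Resource types this sketch chooses to block (an illustrative policy, not a recommendation).
const BLOCKED_TYPES = new Set(["image", "font", "media"]);

// Pure helper: decide whether a request of the given resource type is blocked.
function shouldBlock(resourceType) {
  return BLOCKED_TYPES.has(resourceType);
}

// Turn on request interception for a Puppeteer Page and abort blocked requests.
async function enableBlocking(page) {
  await page.setRequestInterception(true);
  page.on("request", (request) => {
    if (shouldBlock(request.resourceType())) {
      request.abort();
    } else {
      request.continue();
    }
  });
}

// Open a raw CDP session and return Performance metrics as a { name: value } object.
async function readCdpMetrics(page) {
  const client = await page.createCDPSession();
  await client.send("Performance.enable");
  const { metrics } = await client.send("Performance.getMetrics");
  await client.detach();
  return Object.fromEntries(metrics.map((m) => [m.name, m.value]));
}
```

Both functions take an existing `page`, so they can be called right after `browser.newPage()` and before `page.goto(...)`.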
Headless by Default
Puppeteer launches Chrome in headless mode by default. Pass `headless: false` to `puppeteer.launch()` to see the browser window.
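A minimal sketch of switching between the two modes might look like the following. The `debug` flag and the helper names are hypothetical, introduced only for this example; `headless` and `slowMo` are standard launch options.

```javascript
// Build launch options; `debug` is a hypothetical flag for this sketch.
function launchOptions(debug = false) {
  return debug
    ? { headless: false, slowMo: 100 } // visible window, each operation slowed by 100 ms
    : { headless: true }; // no visible window, same as the default
}

// Launch a browser in the chosen mode.
// Puppeteer is imported lazily so launchOptions() can be used on its own.
async function openBrowser(debug = false) {
  const { default: puppeteer } = await import("puppeteer");
  return puppeteer.launch(launchOptions(debug));
}
```

Calling `openBrowser(true)` while developing shows each step in a real window; the default call keeps the browser hidden, which is usually what CI needs.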
Best Practices (2025)
Do:
- Use `page.waitForSelector`: Before clicking or scraping.
- Use `stealth` plugins: If scraping, use `puppeteer-extra-plugin-stealth` to reduce the chance of bot detection.
- Use Playwright: Consider switching. Playwright was created by engineers from the original Puppeteer team (after they moved to Microsoft); it adds cross-browser support and built-in auto-waiting.
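The first item above can be sketched as a small helper that waits for an element before reading it. The name `scrapeText`, the 10-second default timeout, and the `https://example.com/` target are assumptions made for this illustration.

```javascript
// Wait for `selector` to appear, then return its trimmed text content.
// `page` is a Puppeteer Page; waitForSelector rejects if the timeout elapses first.
async function scrapeText(page, selector, timeoutMs = 10_000) {
  const handle = await page.waitForSelector(selector, { timeout: timeoutMs });
  // The callback runs inside the page, with the matched element as its argument.
  return handle.evaluate((el) => el.textContent.trim());
}

// Example wiring against a real browser (not invoked automatically).
async function main() {
  const { default: puppeteer } = await import("puppeteer");
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto("https://example.com/");
    console.log(await scrapeText(page, "h1"));
  } finally {
    await browser.close();
  }
}
```

Calling `main()` runs the sketch end to end; on a single-page app, the wait gives client-side rendering time to produce the element before it is read.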
Don't:
- Don't leak browsers: Always ensure `browser.close()` is called in a `finally` block or via a test runner hook.
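One way to make the cleanup hard to forget is a small wrapper that owns the browser's lifetime. The sketch below is an assumption-laden example rather than a Puppeteer API: `withBrowser` and `example` are hypothetical names, and `launch` is any function that returns a browser.

```javascript
// Run `task` with a freshly launched browser and always close it afterwards,
// even when `task` throws.
async function withBrowser(launch, task) {
  const browser = await launch();
  try {
    return await task(browser);
  } finally {
    await browser.close();
  }
}

// Example wiring against a real browser (not invoked automatically).
async function example() {
  const { default: puppeteer } = await import("puppeteer");
  return withBrowser(
    () => puppeteer.launch(),
    async (browser) => {
      const page = await browser.newPage();
      await page.goto("https://developer.chrome.com/");
      return page.title();
    }
  );
}
```

Passing `launch` in as a function keeps the wrapper independent of how the browser is started, so the same helper works with custom launch options.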