dev-browser
Dev Browser Skill
Browser automation that maintains page state across script executions. Write small, focused scripts to accomplish tasks incrementally. Once you've proven out part of a workflow and there is repeated work to be done, you can write a script to do the repeated work in a single execution.
Choosing Your Approach
- Local/source-available sites: Read the source code first to write selectors directly
- Unknown page layouts: Use getAISnapshot() to discover elements and selectSnapshotRef() to interact with them
- Visual feedback: Take screenshots to see what the user sees
Setup
Installation: See references/installation.md for detailed setup instructions including Windows support.
Two modes available. Ask the user if unclear which to use.
Standalone Mode (Default)
Launches a new Chromium browser for fresh automation sessions.
bash
./skills/dev-browser/server.sh &
Add the --headless flag if the user requests it. Wait for the Ready message before running scripts.
Extension Mode
Connects to user's existing Chrome browser. Use this when:
- The user is already logged into sites and wants you to do things behind an authed experience that isn't local dev.
- The user asks you to use the extension
Important: The core flow is still the same. You create named pages inside of their browser.
Start the relay server:
bash
cd skills/dev-browser && npm i && npm run start-extension &
Wait for the "Waiting for extension to connect..." message followed by "Extension connected" in the console. The second message confirms that a client has connected and the browser is ready to be controlled.
Workflow:
- Scripts call client.page("name") just like the normal mode to create new pages / connect to existing ones.
- Automation runs on the user's actual browser session
If the extension hasn't connected yet, tell the user to launch and activate it. Download link: https://github.com/SawyerHood/dev-browser/releases
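If the relay's console output is captured (for example, redirected to a log file), the readiness condition described above can be checked in code. The sketch below is only an illustration: relayIsReady is a hypothetical helper, not part of dev-browser, and it assumes the two messages appear verbatim in the captured output.

```typescript
// Hypothetical helper (not part of dev-browser): decides whether captured
// relay output shows that the extension has connected.
const WAITING_MESSAGE = "Waiting for extension to connect...";
const CONNECTED_MESSAGE = "Extension connected";

function relayIsReady(output: string): boolean {
  const waitingAt = output.indexOf(WAITING_MESSAGE);
  if (waitingAt === -1) {
    return false; // the relay has not started listening yet
  }
  // "Extension connected" only counts if it appears after the waiting message.
  const searchFrom = waitingAt + WAITING_MESSAGE.length;
  return output.indexOf(CONNECTED_MESSAGE, searchFrom) !== -1;
}
```

If this returns false after the relay has been running for a while, that is the point at which to ask the user to launch and activate the extension.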
Writing Scripts
Run all scripts from the skills/dev-browser/ directory. The @/ import alias requires this directory's config.
Execute scripts inline using heredocs:
bash
cd skills/dev-browser && npx tsx <<'EOF'
import { connect, waitForPageLoad } from "@/client.js";
const client = await connect();
// Create page with custom viewport size (optional)
const page = await client.page("example", { viewport: { width: 1920, height: 1080 } });
await page.goto("https://example.com");
await waitForPageLoad(page);
console.log({ title: await page.title(), url: page.url() });
await client.disconnect();
EOF
Write to tmp/ files only when the script needs reuse, is complex, or the user explicitly requests it.
Key Principles
- Small scripts: Each script does ONE thing (navigate, click, fill, check)
- Evaluate state: Log/return state at the end to decide next steps
- Descriptive page names: Use "checkout", "login", not "main"
- Disconnect to exit: await client.disconnect() - pages persist on server
- Plain JS in evaluate: page.evaluate() runs in browser - no TypeScript syntax
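The "evaluate state" principle can be wrapped in a small reusable function. The following is a sketch only: summarizeState and PageLike are hypothetical names, not part of dev-browser. PageLike lists just the Playwright Page methods the helper calls (url(), title(), textContent()), so the helper can also be exercised with a stub.

```typescript
// Structural subset of a Playwright Page; a real Page satisfies this shape.
interface PageLike {
  url(): string;
  title(): Promise<string>;
  textContent(selector: string): Promise<string | null>;
}

interface PageState {
  url: string;
  title: string;
  bodyPreview: string;
}

// Hypothetical helper (not part of dev-browser): collects the state a script
// would log at the end of a run.
async function summarizeState(page: PageLike, previewLength = 200): Promise<PageState> {
  // textContent() resolves to null when nothing matches, so fall back to "".
  const body = (await page.textContent("body")) ?? "";
  return {
    url: page.url(),
    title: await page.title(),
    bodyPreview: body.slice(0, previewLength),
  };
}
```

Logging the result of summarizeState(page) as the last step of a script gives the output needed to decide what the next script should do.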
Workflow Loop
Follow this pattern for complex tasks:
- Write a script to perform one action
- Run it and observe the output
- Evaluate - did it work? What's the current state?
- Decide - is the task complete or do we need another script?
- Repeat until task is done
No TypeScript in Browser Context
Code passed to page.evaluate() runs in the browser, which doesn't understand TypeScript:
typescript
// ✅ Correct: plain JavaScript
const text = await page.evaluate(() => {
  return document.body.innerText;
});
// ❌ Wrong: TypeScript syntax will fail at runtime
const text = await page.evaluate(() => {
  const el: HTMLElement = document.body; // Type annotation breaks in browser!
  return el.innerText;
});
Scraping Data
For scraping large datasets, intercept and replay network requests rather than scrolling the DOM. See references/scraping.md for the complete guide covering request capture, schema discovery, and paginated API replay.
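As a rough illustration of paginated API replay, the sketch below loops over a cursor-based endpoint until no further page is reported. Everything in it is assumed for illustration: the PageResult shape, the nextCursor field, and the collectAllPages name are hypothetical, and the real schema has to be discovered from the captured requests as described in references/scraping.md.

```typescript
// Hypothetical response shape; real APIs differ, so discover the schema first.
interface PageResult<T> {
  items: T[];
  nextCursor: string | null; // null when there are no more pages
}

// Caller-supplied function that replays one captured request for a given cursor.
type FetchPage<T> = (cursor: string | null) => Promise<PageResult<T>>;

async function collectAllPages<T>(fetchPage: FetchPage<T>, maxPages = 100): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | null = null;
  // maxPages guards against an endpoint that never reports a final page.
  for (let i = 0; i < maxPages; i++) {
    const result: PageResult<T> = await fetchPage(cursor);
    all.push(...result.items);
    if (result.nextCursor === null) {
      break;
    }
    cursor = result.nextCursor;
  }
  return all;
}
```

Keeping the request logic behind fetchPage means the loop stays the same whether the replay happens inside the page or from the script itself.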
Client API
typescript
const client = await connect();
// Get or create named page (viewport only applies to new pages)
const page = await client.page("name");
const pageWithSize = await client.page("name", { viewport: { width: 1920, height: 1080 } });
const pages = await client.list(); // List all page names
await client.close("name"); // Close a page
await client.disconnect(); // Disconnect (pages persist)
// ARIA Snapshot methods
const snapshot = await client.getAISnapshot("name"); // Get accessibility tree
const element = await client.selectSnapshotRef("name", "e5"); // Get element by ref
The page object is a standard Playwright Page.
Waiting
typescript
import { waitForPageLoad } from "@/client.js";
await waitForPageLoad(page); // After navigation
await page.waitForSelector(".results"); // For specific elements
await page.waitForURL("**/success"); // For specific URL
Inspecting Page State
Screenshots
typescript
await page.screenshot({ path: "tmp/screenshot.png" });
await page.screenshot({ path: "tmp/full.png", fullPage: true });
ARIA Snapshot (Element Discovery)
Use getAISnapshot() to discover page elements. It returns a YAML-formatted accessibility tree:
yaml
- banner:
  - link "Hacker News" [ref=e1]
  - navigation:
    - link "new" [ref=e2]
- main:
  - list:
    - listitem:
      - link "Article Title" [ref=e8]
      - link "328 comments" [ref=e9]
- contentinfo:
  - textbox [ref=e10]
    - /placeholder: "Search"
Interpreting refs:
- [ref=eN] - Element reference for interaction (visible, clickable elements only)
- [checked], [disabled], [expanded] - Element states
- [level=N] - Heading level
- /url:, /placeholder: - Element properties
Interacting with refs:
typescript
const snapshot = await client.getAISnapshot("hackernews");
console.log(snapshot); // Find the ref you need
const element = await client.selectSnapshotRef("hackernews", "e2");
await element.click();
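On larger pages it can help to list the refs from the snapshot text instead of scanning it by eye. The sketch below is a hypothetical helper, not part of dev-browser: it assumes each interactive line has the shape shown in the sample snapshot (a role, an optional quoted name, then [ref=eN]) and ignores every other line.

```typescript
interface SnapshotRef {
  role: string;
  name: string | null; // null when the element has no quoted name
  ref: string;
}

// Hypothetical helper: extracts role, name and ref from lines such as
//   - link "Hacker News" [ref=e1]
// Names containing escaped quotes are not handled in this sketch.
function listRefs(snapshot: string): SnapshotRef[] {
  const pattern = /^\s*-\s+([\w-]+)(?:\s+"([^"]*)")?.*\[ref=(e\d+)\]/;
  const refs: SnapshotRef[] = [];
  for (const line of snapshot.split("\n")) {
    const match = pattern.exec(line);
    if (match) {
      refs.push({ role: match[1], name: match[2] ?? null, ref: match[3] });
    }
  }
  return refs;
}
```

A script could then filter the result, for example by role and name, and pass the matching ref string to client.selectSnapshotRef().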
Error Recovery
Page state persists after failures. Debug with:
bash
cd skills/dev-browser && npx tsx <<'EOF'
import { connect } from "@/client.js";
const client = await connect();
const page = await client.page("hackernews");
await page.screenshot({ path: "tmp/debug.png" });
console.log({
  url: page.url(),
  title: await page.title(),
  bodyText: await page.textContent("body").then((t) => t?.slice(0, 200)),
});
await client.disconnect();
EOF