dev-browser

Dev Browser Skill

Browser automation that maintains page state across script executions. Write small, focused scripts to accomplish tasks incrementally. Once you've proven out part of a workflow and there is repeated work to be done, you can write a script to do the repeated work in a single execution.

Choosing Your Approach

  • Local/source-available sites: Read the source code first to write selectors directly
  • Unknown page layouts: Use getAISnapshot() to discover elements and selectSnapshotRef() to interact with them
  • Visual feedback: Take screenshots to see what the user sees

Setup

Installation: See references/installation.md for detailed setup instructions including Windows support.
Two modes are available. If it is unclear which one to use, ask the user.

Standalone Mode (Default)

Launches a new Chromium browser for fresh automation sessions.
bash
./skills/dev-browser/server.sh &
Add the --headless flag if the user requests it. Wait for the Ready message before running scripts.

Extension Mode

Connects to user's existing Chrome browser. Use this when:
  • The user is already logged into sites and wants you to act inside an authenticated session that isn't local dev.
  • The user asks you to use the extension
Important: The core flow is still the same. You create named pages inside of their browser.
Start the relay server:
bash
cd skills/dev-browser && npm i && npm run start-extension &
Wait for Waiting for extension to connect... followed by Extension connected in the console. The second message confirms that a client has connected and the browser is ready to be controlled.
Workflow:
  1. Scripts call client.page("name"), just as in standalone mode, to create new pages or connect to existing ones.
  2. Automation runs on the user's actual browser session
If the extension hasn't connected yet, tell the user to launch and activate it. Download link: https://github.com/SawyerHood/dev-browser/releases

Writing Scripts

Run all scripts from the skills/dev-browser/ directory. The @/ import alias requires this directory's config.
Execute scripts inline using heredocs:
bash
cd skills/dev-browser && npx tsx <<'EOF'
import { connect, waitForPageLoad } from "@/client.js";

const client = await connect();
// Create page with custom viewport size (optional)
const page = await client.page("example", { viewport: { width: 1920, height: 1080 } });

await page.goto("https://example.com");
await waitForPageLoad(page);

console.log({ title: await page.title(), url: page.url() });
await client.disconnect();
EOF
Write to tmp/ files only when the script needs reuse, is complex, or the user explicitly requests it.

Key Principles

  1. Small scripts: Each script does ONE thing (navigate, click, fill, check)
  2. Evaluate state: Log/return state at the end to decide next steps
  3. Descriptive page names: Use "checkout", "login", not "main"
  4. Disconnect to exit: await client.disconnect() - pages persist on server
  5. Plain JS in evaluate: page.evaluate() runs in browser - no TypeScript syntax
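The "evaluate state" principle can be sketched as a small summary step at the end of each script. This is only an illustration: summarizeState and PageLike are hypothetical names, not part of the dev-browser client, and the sketch assumes nothing about the page beyond url() and title(), which a Playwright Page provides.

```typescript
// Minimal structural type: a Playwright Page satisfies it, and so does a test double.
interface PageLike {
  url(): string;
  title(): Promise<string>;
}

// Hypothetical helper: gather a small state summary to print as the script's last step.
async function summarizeState(
  page: PageLike,
  extra: Record<string, unknown> = {},
) {
  return { url: page.url(), title: await page.title(), ...extra };
}

// Stand-in page so the sketch runs without a browser.
const fakePage: PageLike = {
  url: () => "https://example.com/checkout",
  title: async () => "Checkout",
};

summarizeState(fakePage, { step: "filled shipping form" }).then((state) =>
  console.log(JSON.stringify(state, null, 2)),
);
```

In a real script the object returned by client.page("checkout") would take the place of fakePage, and the printed summary is what you read to decide on the next script.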

Workflow Loop

Follow this pattern for complex tasks:
  1. Write a script to perform one action
  2. Run it and observe the output
  3. Evaluate - did it work? What's the current state?
  4. Decide - is the task complete or do we need another script?
  5. Repeat until task is done

No TypeScript in Browser Context

Code passed to page.evaluate() runs in the browser, which doesn't understand TypeScript:
typescript
// ✅ Correct: plain JavaScript
const text = await page.evaluate(() => {
  return document.body.innerText;
});

// ❌ Wrong: TypeScript syntax will fail at runtime
const text = await page.evaluate(() => {
  const el: HTMLElement = document.body; // Type annotation breaks in browser!
  return el.innerText;
});

Scraping Data

For scraping large datasets, intercept and replay network requests rather than scrolling the DOM. See references/scraping.md for the complete guide covering request capture, schema discovery, and paginated API replay.
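The replay step can be sketched as a loop over a paginated endpoint. Everything in this block is an assumption made for illustration: the PageResult shape, the nextCursor field, and the fetchPage callback are hypothetical, and the real request details would come from whatever was captured on the site. references/scraping.md remains the authoritative guide.

```typescript
// Hypothetical response shape; a real API will differ.
interface PageResult<T> {
  items: T[];
  nextCursor: string | null; // null when there are no more pages
}

// fetchPage is supplied by the caller, e.g. a wrapper that replays a captured request.
async function collectAllPages<T>(
  fetchPage: (cursor: string | null) => Promise<PageResult<T>>,
  maxPages = 50, // safety cap so a misbehaving API cannot loop forever
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | null = null;
  for (let i = 0; i < maxPages; i++) {
    const result: PageResult<T> = await fetchPage(cursor);
    all.push(...result.items);
    if (result.nextCursor === null) break;
    cursor = result.nextCursor;
  }
  return all;
}

// In-memory stand-in for a real endpoint, used only to demonstrate the loop.
const fakePages: Record<string, PageResult<number>> = {
  start: { items: [1, 2], nextCursor: "p2" },
  p2: { items: [3], nextCursor: null },
};
const fakeFetch = async (cursor: string | null) => fakePages[cursor ?? "start"];

collectAllPages(fakeFetch).then((items) => console.log(items));
```

Passing the fetcher in as a callback keeps the pagination logic separate from how each request is actually replayed, so the same loop can be tried against a stub first.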

Client API

typescript
const client = await connect();

// Get or create named page (viewport only applies to new pages)
const page = await client.page("name");
const pageWithSize = await client.page("name", { viewport: { width: 1920, height: 1080 } });

const pages = await client.list(); // List all page names
await client.close("name"); // Close a page
await client.disconnect(); // Disconnect (pages persist)

// ARIA Snapshot methods
const snapshot = await client.getAISnapshot("name"); // Get accessibility tree
const element = await client.selectSnapshotRef("name", "e5"); // Get element by ref
The page object is a standard Playwright Page.

Waiting

typescript
import { waitForPageLoad } from "@/client.js";

await waitForPageLoad(page); // After navigation
await page.waitForSelector(".results"); // For specific elements
await page.waitForURL("**/success"); // For specific URL

Inspecting Page State

Screenshots

typescript
await page.screenshot({ path: "tmp/screenshot.png" });
await page.screenshot({ path: "tmp/full.png", fullPage: true });

ARIA Snapshot (Element Discovery)

Use getAISnapshot() to discover page elements. It returns a YAML-formatted accessibility tree:
yaml
- banner:
  - link "Hacker News" [ref=e1]
  - navigation:
    - link "new" [ref=e2]
- main:
  - list:
    - listitem:
      - link "Article Title" [ref=e8]
      - link "328 comments" [ref=e9]
- contentinfo:
  - textbox [ref=e10]
    - /placeholder: "Search"
Interpreting refs:
  • [ref=eN] - Element reference for interaction (visible, clickable elements only)
  • [checked], [disabled], [expanded] - Element states
  • [level=N] - Heading level
  • /url:, /placeholder: - Element properties
Interacting with refs:
typescript
const snapshot = await client.getAISnapshot("hackernews");
console.log(snapshot); // Find the ref you need

const element = await client.selectSnapshotRef("hackernews", "e2");
await element.click();
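When a snapshot is long, it may be easier to filter it in the script than to read the whole tree. The sketch below uses a hypothetical helper, findRefs, which is not part of the dev-browser client. It assumes the snapshot is a string in the YAML shape shown earlier and extracts the role, name, and ref from each line that carries a [ref=eN] marker.

```typescript
// Hypothetical helper, not part of the dev-browser client API.
// Assumes snapshot lines look like: - link "Hacker News" [ref=e1]
interface SnapshotRef {
  role: string;
  name: string; // empty string when the line has no quoted name
  ref: string;
}

function findRefs(snapshot: string): SnapshotRef[] {
  // role, optional quoted name, then a [ref=eN] marker somewhere later on the line
  const linePattern = /^\s*-\s+(\w+)(?:\s+"([^"]*)")?.*\[ref=(e\d+)\]/;
  const refs: SnapshotRef[] = [];
  for (const line of snapshot.split("\n")) {
    const match = linePattern.exec(line);
    if (match) {
      refs.push({ role: match[1], name: match[2] ?? "", ref: match[3] });
    }
  }
  return refs;
}

// Sample text in the same shape as the snapshot example.
const sample = [
  "- banner:",
  '  - link "Hacker News" [ref=e1]',
  "  - navigation:",
  '    - link "new" [ref=e2]',
  "- contentinfo:",
  "  - textbox [ref=e10]",
].join("\n");

const links = findRefs(sample).filter((r) => r.role === "link");
console.log(links.map((r) => `${r.ref}: ${r.name}`));
```

A ref found this way can then be passed to client.selectSnapshotRef() in the same way as the hard-coded "e2" in the interaction example.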

Error Recovery

Page state persists after failures. Debug with:
bash
cd skills/dev-browser && npx tsx <<'EOF'
import { connect } from "@/client.js";

const client = await connect();
const page = await client.page("hackernews");

await page.screenshot({ path: "tmp/debug.png" });
console.log({
  url: page.url(),
  title: await page.title(),
  bodyText: await page.textContent("body").then((t) => t?.slice(0, 200)),
});

await client.disconnect();
EOF