better-agent-browser

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

Better Agent Browser

更优Agent浏览器

Extends

agent-browser

CLI with four capabilities:

Login Watch — Auto-detect login walls, notify user, poll until login completes
CAPTCHA / Automation Block Detection — Distinguish solvable CAPTCHAs from unsolvable automation blocks
Site Experience — Per-domain patterns that accumulate over time as agents learn site quirks
Parallel Tabs — Operate many tabs simultaneously via CDP proxy, sharing one Chrome and one login state

This skill complements agent-browser, not replaces it.

agent-browser

is the execution layer. This skill adds detection, experience, and coordination on top.

${SKILL_PATH}

is set by skills.sh to this skill's install directory.

扩展

agent-browser

CLI的四项能力：

登录监听（Login Watch） — 自动检测登录拦截墙，通知用户，轮询直到登录完成
CAPTCHA/自动化拦截检测 — 区分可人工解决的CAPTCHA和无法绕过的自动化拦截
站点适配经验（Site Experience） — 按域名存储的适配规则，随Agent遇到和解决站点问题不断积累
并行标签（Parallel Tabs） — 通过CDP代理同时操作多个标签页，共享同一个Chrome实例和登录状态

本技能是agent-browser的补充，而非替代。

agent-browser

是执行层，本技能在其之上增加了检测、经验沉淀和协同能力。

${SKILL_PATH}

由skills.sh自动设置为本技能的安装目录。

Language

语言要求

Match user's language.

匹配用户使用的语言。

Layers — Start Simple, Escalate When Needed

层级架构 — 从简单起步，按需升级

Layer 0a: agent-browser open              ← local dev / public pages (no CDP)
Layer 0b: agent-browser connect <port>    ← external sites needing login / anti-bot
Layer 1:  + login-watch / captcha-watch   ← when hitting login walls or anti-bot
        + site-patterns                 ← read before, write after
Layer 2:  + CDP proxy                     ← parallel multi-tab only

Always start at Layer 0. Escalate only when you hit a specific problem.

Layer 0a: agent-browser open              ← 本地开发/公开页面（无需CDP）
Layer 0b: agent-browser connect <port>    ← 需要登录/反爬的外部站点
Layer 1:  + login-watch / captcha-watch   ← 遇到登录墙或反爬拦截时启用
        + site-patterns                 ← 操作前读取，操作后回写
Layer 2:  + CDP proxy                     ← 仅在需要并行多标签时启用

始终从Layer 0开始执行，仅在遇到明确问题时才升级层级。

Layer 0: Direct agent-browser

Layer 0: 直接使用agent-browser

Do you need the user's login state or need to bypass anti-bot?

你需要用户的登录状态，或是需要绕过反爬机制吗？

0a: No login needed

0a: 无需登录

Local dev, testing, public pages. Just use

agent-browser open

— launches its own browser, no setup.

bash

agent-browser open http://localhost:3000
agent-browser snapshot -i
agent-browser click @e3
agent-browser screenshot page.png

本地开发、测试、公开页面场景，直接使用

agent-browser open

即可，会启动独立浏览器，无需额外配置：

bash

agent-browser open http://localhost:3000
agent-browser snapshot -i
agent-browser click @e3
agent-browser screenshot page.png

0b: External sites

0b: 外部站点

Connect to the agent's dedicated Chrome profile at

~/.chrome-debug-profile

. This is a persistent profile — login state, cookies, and site data survive across sessions and agents. Once the user logs into a site here, it stays logged in for all future agent sessions.

Use
browser-connect.sh
to connect safely — it checks if another agent is already using Layer 0b and auto-routes to Layer 2 if so:

bash

RESULT=$(bash ${SKILL_PATH}/scripts/browser-connect.sh 9333)
MODE=$(echo "$RESULT" | jq -r '.mode')

`mode`	Meaning	Action
`layer0b`	Lock acquired, connected	Use `agent-browser` normally
`layer2`	Another agent holds the lock	Use CDP proxy (Layer 2) instead
`unavailable`	Chrome not running	Guide user to start Chrome

After connecting:

bash

agent-browser open https://github.com/settings
agent-browser snapshot -i

Default assumption: already logged in. This profile accumulates login state over time. Navigate directly — don't pre-check login. Only escalate to Layer 1's login-watch if you hit a login wall.

Tab management:

bash

agent-browser tab list           # [index] title - url
agent-browser tab 0              # Switch by index (0-based)
agent-browser tab new https://...  # Open new tab
agent-browser tab close          # Close current tab
agent-browser get url            # Which tab am I on?

Tab discipline: Don't touch existing tabs. Work in tabs you create (

tab new

open

). Close your tabs when done.

When done, always disconnect to release the lock for other agents:

bash

bash ${SKILL_PATH}/scripts/browser-disconnect.sh 9333

First-time Chrome setup (once per machine boot, when debug port is not running):

bash

undefined

连接到存放在

~/.chrome-debug-profile

的Agent专属Chrome配置文件，这是持久化配置 — 登录状态、Cookie、站点数据会在会话和Agent之间保留。用户在此配置中登录站点后，后续所有Agent会话都会保持登录状态。

使用
browser-connect.sh
安全连接 — 它会检查是否有其他Agent正在使用Layer 0b，如果是会自动路由到Layer 2：

bash

RESULT=$(bash ${SKILL_PATH}/scripts/browser-connect.sh 9333)
MODE=$(echo "$RESULT" | jq -r '.mode')

`mode`	含义	操作
`layer0b`	已获取锁，连接成功	正常使用 `agent-browser` 即可
`layer2`	其他Agent持有锁	改用CDP代理（Layer 2）
`unavailable`	Chrome未运行	引导用户启动Chrome

连接成功后：

bash

agent-browser open https://github.com/settings
agent-browser snapshot -i

默认假设：已登录。 该配置会逐步积累登录状态，直接导航即可，不要预先检查登录状态。仅在遇到登录墙时才升级到Layer 1的登录监听功能。

标签管理：

bash

agent-browser tab list           # [索引] 标题 - url
agent-browser tab 0              # 按索引切换标签（从0开始计数）
agent-browser tab new https://...  # 打开新标签页
agent-browser tab close          # 关闭当前标签页
agent-browser get url            # 查看当前标签页地址

标签使用规范： 不要操作已存在的标签页，仅在你创建的标签页（

tab new

open

）中工作，使用完毕后关闭你创建的标签页。

操作完成后必须断开连接，释放锁供其他Agent使用：

bash

bash ${SKILL_PATH}/scripts/browser-disconnect.sh 9333

Chrome首次启动配置（每台机器重启后执行一次，调试端口未运行时需要）：

bash

undefined

1. Quit Chrome (Cmd+Q)

1. 退出Chrome（Cmd+Q）

2. Start the agent's dedicated Chrome profile

2. 启动Agent专属Chrome配置

/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome
--remote-debugging-port=9333 '--remote-allow-origins=*'
--user-data-dir="$HOME/.chrome-debug-profile" 2>/dev/null &

3. Verify

3. 验证启动成功

curl -s http://127.0.0.1:9333/json/version


Port 9222 is often occupied by Electron — prefer **9333**.

**Get CDP port programmatically:**

```bash
agent-browser get cdp-url   # → ws://127.0.0.1:9333/devtools/browser/...

Diagnostics:

bash ${SKILL_PATH}/scripts/check-deps.sh

→ JSON with agent-browser, Node, CDP, proxy status.

curl -s http://127.0.0.1:9333/json/version


端口9222常被Electron占用，优先使用**9333**端口。

**程序化获取CDP端口：**

```bash
agent-browser get cdp-url   # → ws://127.0.0.1:9333/devtools/browser/...

诊断命令：

bash ${SKILL_PATH}/scripts/check-deps.sh

→ 返回包含agent-browser、Node、CDP、代理状态的JSON。

Layer 1: Login Watch + CAPTCHA Watch + Site Patterns

Layer 1: 登录监听 + CAPTCHA监听 + 站点适配规则

Site Patterns — Read Before, Write After

站点适配规则 — 操作前读取，操作后回写

Accumulated experience about specific domains. Grows as agents encounter and solve problems.

Before navigating to an external domain:

bash

PATTERN_FILE="${SKILL_PATH}/references/site-patterns/${DOMAIN}.md"

File exists → Read it. Treat as hints (may be outdated), not guarantees.
- ```
requires_cdp: true
```
  and not connected to real Chrome → STOP,
```
connect
```
  first.
File doesn't exist → Proceed. You may create one later.

After successful operations, write back what you learned:

markdown

---
domain: example.com
aliases: [示例, Example]
requires_cdp: true
updated: 2026-04-05
---

积累的特定域名适配经验，随Agent遇到和解决问题不断更新。

导航到外部域名前：

bash

PATTERN_FILE="${SKILL_PATH}/references/site-patterns/${DOMAIN}.md"

文件存在 → 读取内容，作为参考提示（可能过时），不保证完全有效。
- 若
```
requires_cdp: true
```
  且未连接到真实Chrome → 停止操作，先执行
```
connect
```
  连接。
文件不存在 → 继续操作，后续可以创建对应规则文件。

操作成功后，回写你总结的适配经验：

markdown

---
domain: example.com
aliases: [示例, Example]
requires_cdp: true
updated: 2026-04-05
---

Platform Characteristics

平台特性

Architecture, anti-bot, login needs, content loading.

架构、反爬机制、登录要求、内容加载方式。

Effective Patterns

有效适配方案

Verified URLs, selectors, strategies. Tag each with date.

验证过的URL、选择器、操作策略，每条都标注日期。

Known Traps

已知坑点

What fails and why. Tag each with date.


Rules:
- Only write verified facts, not guesses
- Tag findings with dates — patterns decay
- If a pattern stops working, update the file

**List available patterns:**

```bash
ls ${SKILL_PATH}/references/site-patterns/*.md | grep -v _template

失效的操作及原因，每条都标注日期。


规则：
- 仅写入已验证的事实，不要写猜测内容
- 所有发现都标注日期 — 适配规则会随站点更新失效
- 如果某条规则不再生效，更新对应文件

**列出所有可用适配规则：**

```bash
ls ${SKILL_PATH}/references/site-patterns/*.md | grep -v _template

Login Watch — Auto-Detect and Wait

登录监听 — 自动检测并等待

Don't pre-assume login is needed. Try to access content first. Only if the page shows a login wall:

Tell the user: "Please log in to [site] in your Chrome."
Start login-watch in the background:

bash

bash ${SKILL_PATH}/scripts/login-watch.sh --interval 5 --timeout 300

MUST run in background with long timeout — user may take minutes to log in.

When it returns
```
logged_in: true
```
, navigate to the target URL and continue.

Exit codes:

Exit	Meaning	Action
`0`	Login detected	Navigate to target and continue
`1`	Timeout	Tell user login-watch timed out, ask them to confirm when done
`2`	Error	agent-browser not connected, fix connection

Output:

{ "logged_in": bool, "url": string, "elapsed": number, "hint": string }

Login detection keywords: "Sign in", "Log in", "Create account", "Enter your password", etc. If a site uses unusual login wall text, the detection may miss it — fall back to asking the user.

不要预先假设需要登录，先尝试访问目标内容，仅当页面出现登录墙时：

告知用户："请在你的Chrome中登录[站点名称]。"
后台启动登录监听：

bash

bash ${SKILL_PATH}/scripts/login-watch.sh --interval 5 --timeout 300

必须后台运行并设置较长超时时间 — 用户可能需要几分钟完成登录。

当返回
```
logged_in: true
```
时，导航到目标URL继续操作。

退出码说明：

退出码	含义	操作
`0`	检测到登录完成	导航到目标页面继续操作
`1`	超时	告知用户登录监听超时，让用户完成登录后确认
`2`	错误	agent-browser未连接，修复连接问题

输出：

{ "logged_in": bool, "url": string, "elapsed": number, "hint": string }

登录检测关键词： "Sign in"、"Log in"、"Create account"、"Enter your password"等。如果站点使用不常见的登录墙文本，检测可能失效，降级为直接询问用户。

CAPTCHA Watch — Check After Navigating

CAPTCHA监听 — 导航后检查

Run only when you suspect a block (blank page, challenge page, Cloudflare interstitial):

bash

bash ${SKILL_PATH}/scripts/captcha-watch.sh --cdp <PORT>

Use the port Chrome is running on. Get it from

agent-browser get cdp-url

if unsure.

Exit	Meaning	Action
`0`	No CAPTCHA	Continue
`1`	Solvable CAPTCHA (real Chrome)	Screenshot → notify user → poll (background, same pattern as login-watch)
`2`	Automation block (unsolvable)	STOP. Must use real Chrome. Never retry in session mode.
`3`	Error	Retry once

Solvable CAPTCHA polling:

bash

agent-browser screenshot captcha.png
echo "CAPTCHA detected. Please solve it in Chrome."

仅当你怀疑出现拦截时运行（空白页、验证页、Cloudflare拦截页）：

bash

bash ${SKILL_PATH}/scripts/captcha-watch.sh --cdp <PORT>

使用Chrome运行的端口，不确定的话可以通过

agent-browser get cdp-url

获取。

退出码	含义	操作
`0`	无CAPTCHA	继续操作
`1`	可人工解决的CAPTCHA（真实Chrome环境）	截图 → 通知用户 → 轮询（后台运行，和登录监听模式一致）
`2`	自动化拦截（无法解决）	停止操作，必须使用真实Chrome，不要在会话模式下重试。
`3`	错误	重试一次

可解决CAPTCHA轮询逻辑：

bash

agent-browser screenshot captcha.png
echo "检测到CAPTCHA，请在Chrome中完成验证。"

Run in background — same pattern as login-watch

后台运行 — 和登录监听模式一致

for i in $(seq 1 24); do sleep 5 RESULT=$(bash ${SKILL_PATH}/scripts/captcha-watch.sh --cdp <PORT>) if echo "$RESULT" | jq -e '.captcha == false' >/dev/null 2>&1; then break fi done

---

for i in $(seq 1 24); do sleep 5 RESULT=$(bash ${SKILL_PATH}/scripts/captcha-watch.sh --cdp <PORT>) if echo "$RESULT" | jq -e '.captcha == false' >/dev/null 2>&1; then break fi done

---

Layer 2: CDP Proxy (Parallel Multi-Tab)

Layer 2: CDP代理（并行多标签）

HTTP API to manage multiple tabs in the user's real Chrome. The only way to do parallel multi-tab while sharing login state and clean fingerprint.

Why not
agent-browser --session
? Sessions spawn Playwright instances —

navigator.webdriver=true

, blocked by anti-bot, no shared login.

通过HTTP API管理用户真实Chrome中的多个标签页，是实现并行多标签同时共享登录状态、干净指纹的唯一方案。

为什么不使用
agent-browser --session
？会话模式会启动Playwright实例，

navigator.webdriver=true

，会被反爬机制拦截，也无法共享登录状态。

Preflight

前置检查

bash

bash ${SKILL_PATH}/scripts/check-deps.sh [CDP_PORT]

Output:

{ ready, mode, agent_browser, node, chrome_cdp, proxy, hint }

`ready`	`mode`	Action
`true`	`cdp`	Proceed
`true`	`session`	Layer 2 unavailable. Fall back to Layer 0.
`false`	—	STOP. Fix per `hint` .

bash

bash ${SKILL_PATH}/scripts/check-deps.sh [CDP_PORT]

输出：

{ ready, mode, agent_browser, node, chrome_cdp, proxy, hint }

。

`ready`	`mode`	操作
`true`	`cdp`	继续操作
`true`	`session`	Layer 2不可用，降级到Layer 0
`false`	—	停止操作，按照 `hint` 提示修复问题

Start Proxy

启动代理

bash

undefined

bash

undefined

Reuse if already running

如果已经运行则复用

curl -sf http://127.0.0.1:3456/health >/dev/null 2>&1 ||
node ${SKILL_PATH}/scripts/cdp-proxy.mjs & PROXY="http://127.0.0.1:3456"

undefined

curl -sf http://127.0.0.1:3456/health >/dev/null 2>&1 ||
node ${SKILL_PATH}/scripts/cdp-proxy.mjs & PROXY="http://127.0.0.1:3456"

undefined

Batch Operations

批量操作

bash

TABS=$(curl -s -X POST "$PROXY/batch" \
  -H 'Content-Type: application/json' \
  -d '{"urls":["https://site1.com","https://site2.com"]}')
TARGETS=$(echo "$TABS" | jq '[.[].targetId]')

curl -s -X POST "$PROXY/batch-eval" \
  -H 'Content-Type: application/json' \
  -d "{\"targets\":$TARGETS,\"expression\":\"document.title\"}"

curl -s -X POST "$PROXY/batch-close" \
  -H 'Content-Type: application/json' \
  -d "{\"targets\":$TARGETS}"

bash

TABS=$(curl -s -X POST "$PROXY/batch" \
  -H 'Content-Type: application/json' \
  -d '{"urls":["https://site1.com","https://site2.com"]}')
TARGETS=$(echo "$TABS" | jq '[.[].targetId]')

curl -s -X POST "$PROXY/batch-eval" \
  -H 'Content-Type: application/json' \
  -d "{\"targets\":$TARGETS,\"expression\":\"document.title\"}"

curl -s -X POST "$PROXY/batch-close" \
  -H 'Content-Type: application/json' \
  -d "{\"targets\":$TARGETS}"

Parallel Sub-Agents

并行子Agent

Main Agent
├── Start CDP proxy (once)
├── Sub-Agent A: /new → targetId-A → /eval → /close
├── Sub-Agent B: /new → targetId-B → /eval → /close
└── Sub-Agent C: /new → targetId-C → /eval → /close
    All share same Chrome, login state, cookies

主Agent
├── 启动CDP代理（仅一次）
├── 子Agent A: /new → targetId-A → /eval → /close
├── 子Agent B: /new → targetId-B → /eval → /close
└── 子Agent C: /new → targetId-C → /eval → /close
    所有子Agent共享同一个Chrome、登录状态、Cookie

CAPTCHA Watch for Proxy Tabs

代理标签页的CAPTCHA监听

bash

bash ${SKILL_PATH}/scripts/captcha-watch.sh --proxy-eval <targetId>

bash

bash ${SKILL_PATH}/scripts/captcha-watch.sh --proxy-eval <targetId>

Cleanup

清理

Tab discipline applies. Only close tabs you opened. Never close user's tabs. Don't kill the proxy if other agents may be using it.

bash

curl -s -X POST "$PROXY/batch-close" \
  -H 'Content-Type: application/json' \
  -d "{\"targets\":$TARGETS}"

Full API:

references/api.md

适用标签使用规范。 仅关闭你打开的标签页，永远不要关闭用户的标签页。如果有其他Agent可能在使用代理，不要停止代理进程。

bash

curl -s -X POST "$PROXY/batch-close" \
  -H 'Content-Type: application/json' \
  -d "{\"targets\":$TARGETS}"

完整API参考：

references/api.md

。

Scripts Reference

脚本参考

Script	Purpose	Layer	Run in background?
`browser-connect.sh [PORT]`	Safe connect with lock-based conflict detection	0b	No
`browser-disconnect.sh [PORT]`	Release lock and close browser	0b	No
`check-deps.sh [PORT]`	Environment diagnostics	0+	No
`login-watch.sh --interval N --timeout N`	Poll until login wall disappears	1	Yes
`captcha-watch.sh --cdp PORT`	Detect CAPTCHA/automation block	1	Polling loop: yes
`captcha-watch.sh --proxy-eval ID`	Same, for proxy tabs	2	Polling loop: yes
`cdp-proxy.mjs`	HTTP API for parallel tabs	2	Yes (daemon)

脚本	用途	适用层级	是否需要后台运行？
`browser-connect.sh [PORT]`	基于锁的冲突检测的安全连接	0b	否
`browser-disconnect.sh [PORT]`	释放锁并关闭浏览器	0b	否
`check-deps.sh [PORT]`	环境诊断	0+	否
`login-watch.sh --interval N --timeout N`	轮询直到登录墙消失	1	是
`captcha-watch.sh --cdp PORT`	检测CAPTCHA/自动化拦截	1	轮询循环：是
`captcha-watch.sh --proxy-eval ID`	同上，适用于代理标签页	2	轮询循环：是
`cdp-proxy.mjs`	并行标签页的HTTP API	2	是（守护进程）

Degradation

降级方案

Missing	Impact
Chrome CDP	Layer 0b/1/2 unavailable. 0a still works.
CDP proxy	Layer 2 unavailable. 0/1 still work.
agent-browser	Fatal.
Node.js 22+	Layer 2 unavailable. 0/1 still work.
Site pattern file	Proceed without experience. Risk: unexpected blocks.

缺失依赖	影响
Chrome CDP	Layer 0b/1/2不可用，0a仍可正常工作
CDP代理	Layer 2不可用，0/1仍可正常工作
agent-browser	致命错误
Node.js 22+	Layer 2不可用，0/1仍可正常工作
站点适配规则文件	无适配经验继续操作，风险：出现意外拦截

Troubleshooting

故障排查

Symptom	Fix
`connect` fails	Chrome not running with debug port. See setup in Layer 0.
CAPTCHA unsolvable (exit 2)	`navigator.webdriver=true` . Switch to real Chrome.
`webdriver=true` in CDP	Connected to Playwright/Electron. Use real Chrome.
Proxy port 3456 in use	`CDP_PROXY_PORT=3457 node cdp-proxy.mjs &`
`/new` hangs	Slow page. `?timeout=30000` .
Tabs accumulate	`/batch-close` . Check `/list` .
login-watch misses login wall	Site uses unusual text. Fall back to asking user.

现象	修复方案
`connect` 失败	Chrome未启动调试端口，参考Layer 0的启动配置
CAPTCHA无法解决（退出码2）	`navigator.webdriver=true` ，切换到真实Chrome
CDP中 `webdriver=true`	连接到了Playwright/Electron，使用真实Chrome
代理端口3456被占用	`CDP_PROXY_PORT=3457 node cdp-proxy.mjs &`
`/new` 请求挂起	页面加载慢，添加 `?timeout=30000` 参数
标签页堆积	调用 `/batch-close` ，查看 `/list` 接口
登录监听未识别登录墙	站点使用不常见文本，降级为直接询问用户

References (read on demand)

参考文档（按需读取）

Document	When
`references/api.md`	Layer 2: proxy endpoints, batch examples
`references/site-patterns/*.md`	Layer 1: before navigating to known domains

文档	适用场景
`references/api.md`	Layer 2：代理端点、批量操作示例
`references/site-patterns/*.md`	Layer 1：导航到已知域名前参考