better-agent-browser

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

Better Agent Browser

更优Agent浏览器

Extends
agent-browser
CLI with four capabilities:
  1. Login Watch — Auto-detect login walls, notify user, poll until login completes
  2. CAPTCHA / Automation Block Detection — Distinguish solvable CAPTCHAs from unsolvable automation blocks
  3. Site Experience — Per-domain patterns that accumulate over time as agents learn site quirks
  4. Parallel Tabs — Operate many tabs simultaneously via CDP proxy, sharing one Chrome and one login state
This skill complements agent-browser, not replaces it.
agent-browser
is the execution layer. This skill adds detection, experience, and coordination on top.
${SKILL_PATH}
is set by skills.sh to this skill's install directory.
扩展
agent-browser
CLI的四项能力:
  1. 登录监听(Login Watch) — 自动检测登录拦截墙,通知用户,轮询直到登录完成
  2. CAPTCHA/自动化拦截检测 — 区分可人工解决的CAPTCHA和无法绕过的自动化拦截
  3. 站点适配经验(Site Experience) — 按域名存储的适配规则,随Agent遇到和解决站点问题不断积累
  4. 并行标签(Parallel Tabs) — 通过CDP代理同时操作多个标签页,共享同一个Chrome实例和登录状态
本技能是agent-browser的补充,而非替代。
agent-browser
是执行层,本技能在其之上增加了检测、经验沉淀和协同能力。
${SKILL_PATH}
由skills.sh自动设置为本技能的安装目录。

Language

语言要求

Match user's language.

匹配用户使用的语言。

Layers — Start Simple, Escalate When Needed

层级架构 — 从简单起步,按需升级

Layer 0a: agent-browser open              ← local dev / public pages (no CDP)
Layer 0b: agent-browser connect <port>    ← external sites needing login / anti-bot
Layer 1:  + login-watch / captcha-watch   ← when hitting login walls or anti-bot
        + site-patterns                 ← read before, write after
Layer 2:  + CDP proxy                     ← parallel multi-tab only
Always start at Layer 0. Escalate only when you hit a specific problem.

Layer 0a: agent-browser open              ← 本地开发/公开页面(无需CDP)
Layer 0b: agent-browser connect <port>    ← 需要登录/反爬的外部站点
Layer 1:  + login-watch / captcha-watch   ← 遇到登录墙或反爬拦截时启用
        + site-patterns                 ← 操作前读取,操作后回写
Layer 2:  + CDP proxy                     ← 仅在需要并行多标签时启用
始终从Layer 0开始执行,仅在遇到明确问题时才升级层级。

Layer 0: Direct agent-browser

Layer 0: 直接使用agent-browser

Do you need the user's login state or need to bypass anti-bot?
你需要用户的登录状态,或是需要绕过反爬机制吗?

0a: No login needed

0a: 无需登录

Local dev, testing, public pages. Just use
agent-browser open
— launches its own browser, no setup.
bash
agent-browser open http://localhost:3000
agent-browser snapshot -i
agent-browser click @e3
agent-browser screenshot page.png
本地开发、测试、公开页面场景,直接使用
agent-browser open
即可,会启动独立浏览器,无需额外配置:
bash
agent-browser open http://localhost:3000
agent-browser snapshot -i
agent-browser click @e3
agent-browser screenshot page.png

0b: External sites

0b: 外部站点

Connect to the agent's dedicated Chrome profile at
~/.chrome-debug-profile
. This is a persistent profile — login state, cookies, and site data survive across sessions and agents. Once the user logs into a site here, it stays logged in for all future agent sessions.
Use
browser-connect.sh
to connect safely
— it checks if another agent is already using Layer 0b and auto-routes to Layer 2 if so:
bash
RESULT=$(bash ${SKILL_PATH}/scripts/browser-connect.sh 9333)
MODE=$(echo "$RESULT" | jq -r '.mode')
mode
MeaningAction
layer0b
Lock acquired, connectedUse
agent-browser
normally
layer2
Another agent holds the lockUse CDP proxy (Layer 2) instead
unavailable
Chrome not runningGuide user to start Chrome
After connecting:
bash
agent-browser open https://github.com/settings
agent-browser snapshot -i
Default assumption: already logged in. This profile accumulates login state over time. Navigate directly — don't pre-check login. Only escalate to Layer 1's login-watch if you hit a login wall.
Tab management:
bash
agent-browser tab list           # [index] title - url
agent-browser tab 0              # Switch by index (0-based)
agent-browser tab new https://...  # Open new tab
agent-browser tab close          # Close current tab
agent-browser get url            # Which tab am I on?
Tab discipline: Don't touch existing tabs. Work in tabs you create (
tab new
/
open
). Close your tabs when done.
When done, always disconnect to release the lock for other agents:
bash
bash ${SKILL_PATH}/scripts/browser-disconnect.sh 9333
First-time Chrome setup (once per machine boot, when debug port is not running):
bash
undefined
连接到存放在
~/.chrome-debug-profile
Agent专属Chrome配置文件,这是持久化配置 — 登录状态、Cookie、站点数据会在会话和Agent之间保留。用户在此配置中登录站点后,后续所有Agent会话都会保持登录状态。
使用
browser-connect.sh
安全连接
— 它会检查是否有其他Agent正在使用Layer 0b,如果是会自动路由到Layer 2:
bash
RESULT=$(bash ${SKILL_PATH}/scripts/browser-connect.sh 9333)
MODE=$(echo "$RESULT" | jq -r '.mode')
mode
含义操作
layer0b
已获取锁,连接成功正常使用
agent-browser
即可
layer2
其他Agent持有锁改用CDP代理(Layer 2)
unavailable
Chrome未运行引导用户启动Chrome
连接成功后:
bash
agent-browser open https://github.com/settings
agent-browser snapshot -i
默认假设:已登录。 该配置会逐步积累登录状态,直接导航即可,不要预先检查登录状态。仅在遇到登录墙时才升级到Layer 1的登录监听功能。
标签管理:
bash
agent-browser tab list           # [索引] 标题 - url
agent-browser tab 0              # 按索引切换标签(从0开始计数)
agent-browser tab new https://...  # 打开新标签页
agent-browser tab close          # 关闭当前标签页
agent-browser get url            # 查看当前标签页地址
标签使用规范: 不要操作已存在的标签页,仅在你创建的标签页(
tab new
/
open
)中工作,使用完毕后关闭你创建的标签页。
操作完成后必须断开连接,释放锁供其他Agent使用:
bash
bash ${SKILL_PATH}/scripts/browser-disconnect.sh 9333
Chrome首次启动配置(每台机器重启后执行一次,调试端口未运行时需要):
bash
undefined

1. Quit Chrome (Cmd+Q)

1. 退出Chrome(Cmd+Q)

2. Start the agent's dedicated Chrome profile

2. 启动Agent专属Chrome配置

/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome
--remote-debugging-port=9333 '--remote-allow-origins=*'
--user-data-dir="$HOME/.chrome-debug-profile" 2>/dev/null &
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome
--remote-debugging-port=9333 '--remote-allow-origins=*'
--user-data-dir="$HOME/.chrome-debug-profile" 2>/dev/null &

3. Verify

3. 验证启动成功


Port 9222 is often occupied by Electron — prefer **9333**.

**Get CDP port programmatically:**

```bash
agent-browser get cdp-url   # → ws://127.0.0.1:9333/devtools/browser/...
Diagnostics:
bash ${SKILL_PATH}/scripts/check-deps.sh
→ JSON with agent-browser, Node, CDP, proxy status.


端口9222常被Electron占用,优先使用**9333**端口。

**程序化获取CDP端口:**

```bash
agent-browser get cdp-url   # → ws://127.0.0.1:9333/devtools/browser/...
诊断命令:
bash ${SKILL_PATH}/scripts/check-deps.sh
→ 返回包含agent-browser、Node、CDP、代理状态的JSON。

Layer 1: Login Watch + CAPTCHA Watch + Site Patterns

Layer 1: 登录监听 + CAPTCHA监听 + 站点适配规则

Site Patterns — Read Before, Write After

站点适配规则 — 操作前读取,操作后回写

Accumulated experience about specific domains. Grows as agents encounter and solve problems.
Before navigating to an external domain:
bash
PATTERN_FILE="${SKILL_PATH}/references/site-patterns/${DOMAIN}.md"
  • File exists → Read it. Treat as hints (may be outdated), not guarantees.
    • requires_cdp: true
      and not connected to real Chrome → STOP,
      connect
      first.
  • File doesn't exist → Proceed. You may create one later.
After successful operations, write back what you learned:
markdown
---
domain: example.com
aliases: [示例, Example]
requires_cdp: true
updated: 2026-04-05
---
积累的特定域名适配经验,随Agent遇到和解决问题不断更新。
导航到外部域名前:
bash
PATTERN_FILE="${SKILL_PATH}/references/site-patterns/${DOMAIN}.md"
  • 文件存在 → 读取内容,作为参考提示(可能过时),不保证完全有效。
    • requires_cdp: true
      且未连接到真实Chrome → 停止操作,先执行
      connect
      连接。
  • 文件不存在 → 继续操作,后续可以创建对应规则文件。
操作成功后,回写你总结的适配经验:
markdown
---
domain: example.com
aliases: [示例, Example]
requires_cdp: true
updated: 2026-04-05
---

Platform Characteristics

平台特性

Architecture, anti-bot, login needs, content loading.
架构、反爬机制、登录要求、内容加载方式。

Effective Patterns

有效适配方案

Verified URLs, selectors, strategies. Tag each with date.
验证过的URL、选择器、操作策略,每条都标注日期。

Known Traps

已知坑点

What fails and why. Tag each with date.

Rules:
- Only write verified facts, not guesses
- Tag findings with dates — patterns decay
- If a pattern stops working, update the file

**List available patterns:**

```bash
ls ${SKILL_PATH}/references/site-patterns/*.md | grep -v _template
失效的操作及原因,每条都标注日期。

规则:
- 仅写入已验证的事实,不要写猜测内容
- 所有发现都标注日期 — 适配规则会随站点更新失效
- 如果某条规则不再生效,更新对应文件

**列出所有可用适配规则:**

```bash
ls ${SKILL_PATH}/references/site-patterns/*.md | grep -v _template

Login Watch — Auto-Detect and Wait

登录监听 — 自动检测并等待

Don't pre-assume login is needed. Try to access content first. Only if the page shows a login wall:
  1. Tell the user: "Please log in to [site] in your Chrome."
  2. Start login-watch in the background:
bash
bash ${SKILL_PATH}/scripts/login-watch.sh --interval 5 --timeout 300
MUST run in background with long timeout — user may take minutes to log in.
  1. When it returns
    logged_in: true
    , navigate to the target URL and continue.
Exit codes:
ExitMeaningAction
0
Login detectedNavigate to target and continue
1
TimeoutTell user login-watch timed out, ask them to confirm when done
2
Erroragent-browser not connected, fix connection
Output:
{ "logged_in": bool, "url": string, "elapsed": number, "hint": string }
Login detection keywords: "Sign in", "Log in", "Create account", "Enter your password", etc. If a site uses unusual login wall text, the detection may miss it — fall back to asking the user.
不要预先假设需要登录,先尝试访问目标内容,仅当页面出现登录墙时:
  1. 告知用户:"请在你的Chrome中登录[站点名称]。"
  2. 后台启动登录监听:
bash
bash ${SKILL_PATH}/scripts/login-watch.sh --interval 5 --timeout 300
必须后台运行并设置较长超时时间 — 用户可能需要几分钟完成登录。
  1. 当返回
    logged_in: true
    时,导航到目标URL继续操作。
退出码说明:
退出码含义操作
0
检测到登录完成导航到目标页面继续操作
1
超时告知用户登录监听超时,让用户完成登录后确认
2
错误agent-browser未连接,修复连接问题
输出:
{ "logged_in": bool, "url": string, "elapsed": number, "hint": string }
登录检测关键词: "Sign in"、"Log in"、"Create account"、"Enter your password"等。如果站点使用不常见的登录墙文本,检测可能失效,降级为直接询问用户。

CAPTCHA Watch — Check After Navigating

CAPTCHA监听 — 导航后检查

Run only when you suspect a block (blank page, challenge page, Cloudflare interstitial):
bash
bash ${SKILL_PATH}/scripts/captcha-watch.sh --cdp <PORT>
Use the port Chrome is running on. Get it from
agent-browser get cdp-url
if unsure.
ExitMeaningAction
0
No CAPTCHAContinue
1
Solvable CAPTCHA (real Chrome)Screenshot → notify user → poll (background, same pattern as login-watch)
2
Automation block (unsolvable)STOP. Must use real Chrome. Never retry in session mode.
3
ErrorRetry once
Solvable CAPTCHA polling:
bash
agent-browser screenshot captcha.png
echo "CAPTCHA detected. Please solve it in Chrome."
仅当你怀疑出现拦截时运行(空白页、验证页、Cloudflare拦截页):
bash
bash ${SKILL_PATH}/scripts/captcha-watch.sh --cdp <PORT>
使用Chrome运行的端口,不确定的话可以通过
agent-browser get cdp-url
获取。
退出码含义操作
0
无CAPTCHA继续操作
1
可人工解决的CAPTCHA(真实Chrome环境)截图 → 通知用户 → 轮询(后台运行,和登录监听模式一致)
2
自动化拦截(无法解决)停止操作,必须使用真实Chrome,不要在会话模式下重试。
3
错误重试一次
可解决CAPTCHA轮询逻辑:
bash
agent-browser screenshot captcha.png
echo "检测到CAPTCHA,请在Chrome中完成验证。"

Run in background — same pattern as login-watch

后台运行 — 和登录监听模式一致

for i in $(seq 1 24); do sleep 5 RESULT=$(bash ${SKILL_PATH}/scripts/captcha-watch.sh --cdp <PORT>) if echo "$RESULT" | jq -e '.captcha == false' >/dev/null 2>&1; then break fi done

---
for i in $(seq 1 24); do sleep 5 RESULT=$(bash ${SKILL_PATH}/scripts/captcha-watch.sh --cdp <PORT>) if echo "$RESULT" | jq -e '.captcha == false' >/dev/null 2>&1; then break fi done

---

Layer 2: CDP Proxy (Parallel Multi-Tab)

Layer 2: CDP代理(并行多标签)

HTTP API to manage multiple tabs in the user's real Chrome. The only way to do parallel multi-tab while sharing login state and clean fingerprint.
Why not
agent-browser --session
?
Sessions spawn Playwright instances —
navigator.webdriver=true
, blocked by anti-bot, no shared login.
通过HTTP API管理用户真实Chrome中的多个标签页,是实现并行多标签同时共享登录状态、干净指纹的唯一方案
为什么不使用
agent-browser --session
会话模式会启动Playwright实例,
navigator.webdriver=true
,会被反爬机制拦截,也无法共享登录状态。

Preflight

前置检查

bash
bash ${SKILL_PATH}/scripts/check-deps.sh [CDP_PORT]
Output:
{ ready, mode, agent_browser, node, chrome_cdp, proxy, hint }
.
ready
mode
Action
true
cdp
Proceed
true
session
Layer 2 unavailable. Fall back to Layer 0.
false
STOP. Fix per
hint
.
bash
bash ${SKILL_PATH}/scripts/check-deps.sh [CDP_PORT]
输出:
{ ready, mode, agent_browser, node, chrome_cdp, proxy, hint }
ready
mode
操作
true
cdp
继续操作
true
session
Layer 2不可用,降级到Layer 0
false
停止操作,按照
hint
提示修复问题

Start Proxy

启动代理

bash
undefined
bash
undefined

Reuse if already running

如果已经运行则复用

curl -sf http://127.0.0.1:3456/health >/dev/null 2>&1 ||
node ${SKILL_PATH}/scripts/cdp-proxy.mjs & PROXY="http://127.0.0.1:3456"
undefined
curl -sf http://127.0.0.1:3456/health >/dev/null 2>&1 ||
node ${SKILL_PATH}/scripts/cdp-proxy.mjs & PROXY="http://127.0.0.1:3456"
undefined

Batch Operations

批量操作

bash
TABS=$(curl -s -X POST "$PROXY/batch" \
  -H 'Content-Type: application/json' \
  -d '{"urls":["https://site1.com","https://site2.com"]}')
TARGETS=$(echo "$TABS" | jq '[.[].targetId]')

curl -s -X POST "$PROXY/batch-eval" \
  -H 'Content-Type: application/json' \
  -d "{\"targets\":$TARGETS,\"expression\":\"document.title\"}"

curl -s -X POST "$PROXY/batch-close" \
  -H 'Content-Type: application/json' \
  -d "{\"targets\":$TARGETS}"
bash
TABS=$(curl -s -X POST "$PROXY/batch" \
  -H 'Content-Type: application/json' \
  -d '{"urls":["https://site1.com","https://site2.com"]}')
TARGETS=$(echo "$TABS" | jq '[.[].targetId]')

curl -s -X POST "$PROXY/batch-eval" \
  -H 'Content-Type: application/json' \
  -d "{\"targets\":$TARGETS,\"expression\":\"document.title\"}"

curl -s -X POST "$PROXY/batch-close" \
  -H 'Content-Type: application/json' \
  -d "{\"targets\":$TARGETS}"

Parallel Sub-Agents

并行子Agent

Main Agent
├── Start CDP proxy (once)
├── Sub-Agent A: /new → targetId-A → /eval → /close
├── Sub-Agent B: /new → targetId-B → /eval → /close
└── Sub-Agent C: /new → targetId-C → /eval → /close
    All share same Chrome, login state, cookies
主Agent
├── 启动CDP代理(仅一次)
├── 子Agent A: /new → targetId-A → /eval → /close
├── 子Agent B: /new → targetId-B → /eval → /close
└── 子Agent C: /new → targetId-C → /eval → /close
    所有子Agent共享同一个Chrome、登录状态、Cookie

CAPTCHA Watch for Proxy Tabs

代理标签页的CAPTCHA监听

bash
bash ${SKILL_PATH}/scripts/captcha-watch.sh --proxy-eval <targetId>
bash
bash ${SKILL_PATH}/scripts/captcha-watch.sh --proxy-eval <targetId>

Cleanup

清理

Tab discipline applies. Only close tabs you opened. Never close user's tabs. Don't kill the proxy if other agents may be using it.
bash
curl -s -X POST "$PROXY/batch-close" \
  -H 'Content-Type: application/json' \
  -d "{\"targets\":$TARGETS}"
Full API:
references/api.md
.

适用标签使用规范。 仅关闭你打开的标签页,永远不要关闭用户的标签页。如果有其他Agent可能在使用代理,不要停止代理进程。
bash
curl -s -X POST "$PROXY/batch-close" \
  -H 'Content-Type: application/json' \
  -d "{\"targets\":$TARGETS}"
完整API参考:
references/api.md

Scripts Reference

脚本参考

ScriptPurposeLayerRun in background?
browser-connect.sh [PORT]
Safe connect with lock-based conflict detection0bNo
browser-disconnect.sh [PORT]
Release lock and close browser0bNo
check-deps.sh [PORT]
Environment diagnostics0+No
login-watch.sh --interval N --timeout N
Poll until login wall disappears1Yes
captcha-watch.sh --cdp PORT
Detect CAPTCHA/automation block1Polling loop: yes
captcha-watch.sh --proxy-eval ID
Same, for proxy tabs2Polling loop: yes
cdp-proxy.mjs
HTTP API for parallel tabs2Yes (daemon)

脚本用途适用层级是否需要后台运行?
browser-connect.sh [PORT]
基于锁的冲突检测的安全连接0b
browser-disconnect.sh [PORT]
释放锁并关闭浏览器0b
check-deps.sh [PORT]
环境诊断0+
login-watch.sh --interval N --timeout N
轮询直到登录墙消失1
captcha-watch.sh --cdp PORT
检测CAPTCHA/自动化拦截1轮询循环:是
captcha-watch.sh --proxy-eval ID
同上,适用于代理标签页2轮询循环:是
cdp-proxy.mjs
并行标签页的HTTP API2(守护进程)

Degradation

降级方案

MissingImpact
Chrome CDPLayer 0b/1/2 unavailable. 0a still works.
CDP proxyLayer 2 unavailable. 0/1 still work.
agent-browserFatal.
Node.js 22+Layer 2 unavailable. 0/1 still work.
Site pattern fileProceed without experience. Risk: unexpected blocks.
缺失依赖影响
Chrome CDPLayer 0b/1/2不可用,0a仍可正常工作
CDP代理Layer 2不可用,0/1仍可正常工作
agent-browser致命错误
Node.js 22+Layer 2不可用,0/1仍可正常工作
站点适配规则文件无适配经验继续操作,风险:出现意外拦截

Troubleshooting

故障排查

SymptomFix
connect
fails
Chrome not running with debug port. See setup in Layer 0.
CAPTCHA unsolvable (exit 2)
navigator.webdriver=true
. Switch to real Chrome.
webdriver=true
in CDP
Connected to Playwright/Electron. Use real Chrome.
Proxy port 3456 in use
CDP_PROXY_PORT=3457 node cdp-proxy.mjs &
/new
hangs
Slow page.
?timeout=30000
.
Tabs accumulate
/batch-close
. Check
/list
.
login-watch misses login wallSite uses unusual text. Fall back to asking user.
现象修复方案
connect
失败
Chrome未启动调试端口,参考Layer 0的启动配置
CAPTCHA无法解决(退出码2)
navigator.webdriver=true
,切换到真实Chrome
CDP中
webdriver=true
连接到了Playwright/Electron,使用真实Chrome
代理端口3456被占用
CDP_PROXY_PORT=3457 node cdp-proxy.mjs &
/new
请求挂起
页面加载慢,添加
?timeout=30000
参数
标签页堆积调用
/batch-close
,查看
/list
接口
登录监听未识别登录墙站点使用不常见文本,降级为直接询问用户

References (read on demand)

参考文档(按需读取)

DocumentWhen
references/api.md
Layer 2: proxy endpoints, batch examples
references/site-patterns/*.md
Layer 1: before navigating to known domains
文档适用场景
references/api.md
Layer 2:代理端点、批量操作示例
references/site-patterns/*.md
Layer 1:导航到已知域名前参考