mini-browser

mini-browser (mb) — Browser CLI for Agents

`mb` is a browser CLI where each command is a small Unix tool. It talks to Chrome over CDP (port 9222) via puppeteer-core.

Setup (only if not already available)

Setup is only needed when `mb` is not installed or Chrome is not reachable. Run these checks first; if both pass, skip straight to the Command Reference.

Check if ready

bash
# 1. Is mb installed?
which mb && echo "mb: ok" || echo "mb: MISSING"

# 2. Is Chrome listening on CDP?
curl -sf http://127.0.0.1:9222/json/version > /dev/null && echo "chrome: ok" || echo "chrome: NOT RUNNING"

If **both** print "ok", everything is ready; go use `mb` commands directly.

Install (only if mb is missing)

bash
npm install -g @runablehq/mini-browser

Start Chrome (only if not running)

bash
mb-start-chrome

This launches Chrome with `--remote-debugging-port=9222`, a fresh profile, and a 1024×768 window. It no-ops if Chrome is already running.

To kill and relaunch:
bash
mb-restart-chrome

Verify

bash
mb go "https://example.com" && mb text

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| `CHROME_PORT` | `9222` | CDP port |
| `CHROME_BIN` | auto-detected | Path to Chrome/Chromium binary |
| `CHROME_PID_FILE` | `<scripts>/.chrome-pid` | PID file location |
| `CHROME_USER_DATA_DIR` | `<scripts>/.chrome-profile` | Chrome profile directory |
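
The setup steps can be folded into one idempotent helper that installs or starts things only when a check fails. This is a minimal sketch, not part of `mb` itself: the function name `ensure_mb` is hypothetical, it assumes `CHROME_PORT` is honored by `mb` and `mb-start-chrome` as the table suggests, and it assumes `mb-start-chrome` returns once Chrome has been launched.

```bash
#!/usr/bin/env bash
# Hypothetical helper: make sure mb and Chrome are usable,
# doing setup work only when a check fails.
ensure_mb() {
  local port="${CHROME_PORT:-9222}"   # same default as the documented CDP port
  local version_url="http://127.0.0.1:${port}/json/version"

  # 1. Install mb only if it is not on PATH.
  if ! command -v mb >/dev/null 2>&1; then
    npm install -g @runablehq/mini-browser || return 1
  fi

  # 2. Start Chrome only if the CDP endpoint does not answer.
  if ! curl -sf "$version_url" >/dev/null; then
    mb-start-chrome || return 1
    # Assumption: Chrome may need a moment before the port answers,
    # so poll for up to about 10 seconds.
    local i
    for i in 1 2 3 4 5 6 7 8 9 10; do
      curl -sf "$version_url" >/dev/null && return 0
      sleep 1
    done
    echo "Chrome did not come up on port ${port}" >&2
    return 1
  fi
}
```

Calling `ensure_mb && mb go "https://example.com"` at the top of an agent script would then skip setup entirely when both checks already pass.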

Command Reference

Navigation

| Command | Description |
| --- | --- |
| `mb go <url>` | Navigate to URL (waits for networkidle) |
| `mb url` | Print current URL |
| `mb back` | Go back |
| `mb forward` | Go forward |

Observation

| Command | Description |
| --- | --- |
| `mb text [selector]` | Visible text content (default: body) |
| `mb shot [file]` | Screenshot to PNG (default: ./shot.png) |
| `mb snap` | List interactive elements with coordinates |

Interaction

| Command | Description |
| --- | --- |
| `mb click <x> <y>` | Click at coordinates |
| `mb type [x y] <text>` | Type text (with coords: selects first) |
| `mb fill <k=v...>` | Fill form fields by label/name/placeholder |
| `mb key <key...>` | Press keys (Enter, Tab, Meta+a) |
| `mb move <x> <y>` | Hover at coordinates |
| `mb drag <x1> <y1> <x2> <y2>` | Drag between points |
| `mb scroll [dir] [px]` | Scroll (default: down 500) |

Recording

| Command | Description |
| --- | --- |
| `mb record start <file>` | Start recording (.webm, .mp4, .gif) |
| `mb record stop` | Stop recording and save |
| `mb record status` | Check if recording is active |

Tabs

| Command | Description |
| --- | --- |
| `mb tab list` | List open tabs |
| `mb tab new [url]` | Open new tab, print index |
| `mb tab close [n]` | Close tab (default: last) |

Other

| Command | Description |
| --- | --- |
| `mb js <code>` | Run JavaScript in page context |
| `mb wait <target>` | Wait for ms / selector / networkidle / url:pattern |
| `mb audit` | Design audit (palette, typography, contrast, a11y, SEO) |
| `mb logs` | Stream console logs (Ctrl+C to stop) |

Flags

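| Flag | Default | Description |
| --- | --- | --- |
| `--timeout <ms>` | `30000` | Command timeout |
| `--tab <n>` | `0` | Target tab index |
| `--json` | `false` | Structured JSON output |
| `--right` | `false` | Right-click |
| `--double` | `false` | Double-click |
| `--fps <n>` | `30` | Recording frame rate |
| `--scale <n>` | `1` | Recording scale factor |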
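
The `--tab` flag pairs naturally with `mb tab new`, which prints the index of the tab it opens. The sketch below reads a page's text in a separate tab and leaves tab 0 untouched. The function name `text_in_new_tab` is hypothetical, and two things are assumed rather than documented: that `mb tab new` prints only the index on stdout, and that flags may follow the command's arguments (as they do in `mb record start demo.mp4 --fps 30`).

```bash
# Hypothetical helper: fetch the visible text of a URL in a throwaway tab.
text_in_new_tab() {
  local url="$1" idx status
  # "mb tab new" prints the new tab's index; capture it for --tab.
  idx="$(mb tab new "$url")" || return 1
  # Assumption: the new tab may still be loading, so wait for it to settle.
  mb wait networkidle --tab "$idx" >/dev/null
  mb text --tab "$idx"
  status=$?
  # Close the tab we opened; the original tab keeps the session alive.
  mb tab close "$idx" >/dev/null
  return "$status"
}
```

For example, `text_in_new_tab "https://example.com" > page.txt` would save the text while whatever is open in tab 0 stays in place.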

Usage Patterns

Observe → Act loop

The standard agent loop: snapshot the page, pick an element, act on it.
bash
mb snap                          # list interactive elements with (x, y)
mb click 512 380                 # click the button at those coordinates
mb wait networkidle              # wait for the page to settle
mb snap                          # observe again
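
When the target element has a known accessible name, the "pick an element" step can be scripted instead of read by eye. This sketch relies on the documented `snap --json` shape, `[{role, name, x, y, state}]`, and on `jq`, which is an extra dependency not mentioned in this README. The function name `click_by_name` is hypothetical, and it assumes the reported `x` and `y` are valid click coordinates.

```bash
# Hypothetical helper: click the first snapped element whose name matches exactly.
click_by_name() {
  local name="$1" coords
  # Pick the first element with a matching name and print "x y";
  # print nothing when there is no match.
  coords="$(mb snap --json | jq -r --arg n "$name" \
    'map(select(.name == $n)) | first
     | if . == null then empty else "\(.x) \(.y)" end')" || return 1
  if [ -z "$coords" ]; then
    echo "no element named: $name" >&2
    return 1
  fi
  # Intentionally unquoted so "x y" splits into two arguments.
  mb click $coords
}
```

A loop step then becomes `click_by_name "Sign in" && mb wait networkidle`. Since `snap` only covers the current viewport, a miss may simply mean the element is further down the page.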

Fill and submit a form

bash
mb go "https://example.com/login"
mb fill "Email=user@example.com" "Password=hunter2"
mb key Enter
mb wait url:/dashboard

Take a screenshot

bash
mb shot page.png

Extract text

bash
mb text "main"                   # text from <main>
mb text "#content"               # text from #content
mb text                          # full body text

Run JavaScript

bash
mb js 'document.title'
echo 'document.querySelectorAll("a").length' | mb js -
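
For anything longer than a one-liner, a quoted heredoc avoids shell-quoting problems when piping to `mb js -`. The sketch below wraps the script in a single expression (an IIFE), on the assumption that `mb js` prints the value of the expression it evaluates. The function name `count_http_links` is hypothetical.

```bash
# Hypothetical helper: count links whose resolved URL uses http or https.
count_http_links() {
  # The quoted 'JS' delimiter stops the shell from expanding anything inside.
  mb js - <<'JS'
(() => {
  const links = [...document.querySelectorAll("a[href]")];
  return links.filter((a) => a.href.startsWith("http")).length;
})()
JS
}
```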

Record a screencast

bash
mb record start demo.mp4 --fps 30 --scale 1
# ... interact with the page ...
mb record stop
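
Because only one recording can be active at a time, it helps to guarantee that `mb record stop` runs even when a step in the middle fails. A minimal sketch of such a wrapper follows; the name `record_session` is hypothetical, and the commands to record are passed in as arguments.

```bash
# Hypothetical wrapper: record while running a command, and always stop afterwards.
record_session() {
  local file="$1" status
  shift
  mb record start "$file" --fps 30 --scale 1 || return 1
  "$@"              # run the caller's command (or function) while recording
  status=$?
  mb record stop || return 1
  return "$status"  # report the wrapped command's exit status
}
```

With a shell function `demo_flow` that holds the `mb` interactions, `record_session demo.mp4 demo_flow` would save the recording whether or not `demo_flow` succeeds.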

Design audit

bash
mb audit                         # human-readable report
mb audit --json                  # structured JSON output

Dismiss overlays

Cookie banners and modals block clicks. Remove them with JS:
bash
mb js 'document.querySelector("[class*=cookie]")?.remove()'
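
The one-liner above removes only the first match. A page may stack several overlays, so a variant using `querySelectorAll` can clear them in one call. This is a sketch: the function name `dismiss_overlays` is hypothetical and the selector list is purely illustrative; `[role=dialog]` in particular can also match dialogs you want to keep.

```bash
# Hypothetical helper: remove every element matching a list of overlay selectors.
dismiss_overlays() {
  # Illustrative selectors; adjust them for the site at hand.
  local selectors='[class*=cookie], [id*=cookie], [class*=consent], [role=dialog]'
  mb js "document.querySelectorAll('${selectors}').forEach(el => el.remove())"
}
```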

Wait strategies

bash
mb wait 2000                     # sleep 2 seconds
mb wait ".modal"                 # wait for selector to appear
mb wait networkidle              # wait for no network activity
mb wait url:/dashboard           # wait for URL to contain string
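
These strategies combine well with navigation: `mb go` already waits for networkidle, but a page that renders late may still need a selector wait with its own timeout. A minimal sketch, assuming `--timeout` can be appended after the wait target; the function name `go_and_wait` is hypothetical.

```bash
# Hypothetical helper: navigate, then wait for a specific element to appear.
go_and_wait() {
  local url="$1" selector="$2" timeout_ms="${3:-30000}"  # 30000 is the documented default
  mb go "$url" || return 1
  mb wait "$selector" --timeout "$timeout_ms"
}
```

For example, `go_and_wait "https://example.com/app" "#root" 10000` would give up after ten seconds instead of the default thirty.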

Important Notes

  • Viewport is 1024×768. `snap` only returns elements in the current viewport; scroll and snap again to find more.
  • `text` uses querySelector and returns the first match only. Use `text "main"` over `text "p"` for better results.
  • `go` waits for networkidle. For heavy SPAs, follow up with `wait ".selector"`.
  • `type` with coordinates triple-clicks first to select existing text, then types the replacement.
  • `fill` field matching order: aria-label → placeholder → name attr → id → label text → CSS selector (use `#` / `.` / `[` prefix).
  • `--json` output: `snap` → `[{role, name, x, y, state}]`, `tab list` → `[{index, url, title}]`, `logs` → JSON lines, `audit` → full audit object.
  • Recording state is stored in `~/.mb-recorder.json`. Only one recording at a time.
  • `tab close` cannot close the last remaining tab.
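
The first note, scroll and snap again, is easy to automate. The sketch below keeps snapping and scrolling until some text shows up in the `snap` output or a scroll budget runs out. The function name `find_in_snap` is hypothetical, and matching is a plain case-insensitive substring search, so it makes no assumption about the exact layout of the human-readable `snap` output.

```bash
# Hypothetical helper: scroll down until a snap line contains the given text.
find_in_snap() {
  local needle="$1" max_scrolls="${2:-5}" i=0 out
  while :; do
    out="$(mb snap)" || return 1
    # Print matching lines (and succeed) as soon as the text appears.
    if printf '%s\n' "$out" | grep -iF -- "$needle"; then
      return 0
    fi
    if [ "$i" -ge "$max_scrolls" ]; then
      break
    fi
    mb scroll down 500 >/dev/null || return 1
    i=$((i + 1))
  done
  echo "not found after ${max_scrolls} scrolls: ${needle}" >&2
  return 1
}
```

Running `find_in_snap "Checkout"` prints the matching lines from the first viewport that contains them, which can then be fed to `mb click`.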

Troubleshooting

| Problem | Fix |
| --- | --- |
| "Chrome not found" | Set `CHROME_BIN=/path/to/chrome` |
| Connection refused | Run `mb-start-chrome` first |
| Stale recording state | Delete `~/.mb-recorder.json` |
| Chrome window wrong size | `mb-restart-chrome` (creates fresh profile) |
| Element not in snap output | `mb scroll down 500` then `mb snap` again |