cli-anything-browser

A command-line interface for browser automation using DOMShell's MCP server. Navigate web pages using filesystem commands: ls, cd, cat, grep, click.

Installation

Prerequisites

  1. Node.js and npx (for DOMShell MCP server):
    bash
    # Install Node.js from https://nodejs.org/
    npx --version
  2. Chrome/Chromium with DOMShell extension:
    • Install extension in Chrome
    • Ensure Chrome is running before using CLI
  3. Python 3.10+

Install CLI

bash
cd browser/agent-harness
pip install -e .

Command Groups

page: Page Navigation

  • page open <url>: Navigate to URL
  • page reload: Reload current page
  • page back: Navigate back in history
  • page forward: Navigate forward in history
  • page info: Show current page info

fs: Filesystem Commands (Accessibility Tree)

  • fs ls [path]: List elements at path
  • fs cd <path>: Change directory
  • fs cat [path]: Read element content
  • fs grep <pattern> [path]: Search for text pattern
  • fs pwd: Print working directory

act: Action Commands

  • act click <path>: Click an element
  • act type <path> <text>: Type text into input

session: Session Management

  • session status: Show session state
  • session daemon-start: Start persistent daemon mode
  • session daemon-stop: Stop daemon mode

Usage Examples

Basic Navigation

bash
# Open a page
cli-anything-browser page open https://example.com

# Explore structure
cli-anything-browser fs ls /
cli-anything-browser fs cd /main
cli-anything-browser fs ls

# Go back to root
cli-anything-browser fs cd /

Search and Click

bash
cli-anything-browser fs grep "Login"
cli-anything-browser act click /main/button[0]

Form Fill

bash
cli-anything-browser act type /main/input[0] "user@example.com"
cli-anything-browser act click /main/button[0]

JSON Output

bash
cli-anything-browser --json fs ls /
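
When the CLI is driven from a script rather than a shell, the same invocation can be wrapped in a small helper. The following is a minimal sketch, not part of cli-anything-browser: the names build_argv and run_json are hypothetical, and it assumes that the CLI exits with a non-zero status on failure and prints only JSON to stdout when --json is given.

```python
from __future__ import annotations

import json
import subprocess

CLI = "cli-anything-browser"


def build_argv(*args: str, json_output: bool = True) -> list[str]:
    """Build the argument list, placing --json before the command group
    as in the example above."""
    argv = [CLI]
    if json_output:
        argv.append("--json")
    argv.extend(args)
    return argv


def run_json(*args: str, timeout: float = 30.0):
    """Run one command and decode its stdout as JSON.

    Assumption: a failed command exits non-zero, so check=True raises
    subprocess.CalledProcessError instead of returning partial output.
    """
    result = subprocess.run(
        build_argv(*args),
        capture_output=True,
        text=True,
        check=True,
        timeout=timeout,
    )
    return json.loads(result.stdout)
```

With Chrome and the DOMShell extension running, run_json("fs", "ls", "/") would execute the command shown above and return the decoded result.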

Daemon Mode (Faster Interactive Use)

bash
# Start persistent connection
cli-anything-browser session daemon-start

# Run commands (uses persistent connection)
cli-anything-browser fs ls /
cli-anything-browser fs cd /main

# Stop daemon when done
cli-anything-browser session daemon-stop

Interactive REPL

bash
cli-anything-browser

Path Syntax

DOMShell uses a filesystem-like path for the Accessibility Tree:
/                           — Root (document)
/main                       — Main landmark
/main/div[0]                — First div in main
/main/div[0]/button[2]      — Third button in first div
  • Array indices are 0-based: button[0] is the first button
  • Use .. to go up one level
  • Use / for root
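
To make the path grammar concrete, here is a minimal sketch of a parser that splits an absolute path into (name, index) pairs. It is an illustration only: parse_path is a hypothetical helper, not something DOMShell or this CLI provides, and it assumes every segment has the form name or name[N] as in the examples above.

```python
from __future__ import annotations

import re

# One segment: a name, optionally followed by a 0-based index in brackets.
_SEGMENT = re.compile(r"^(?P<name>[^\[\]/]+)(?:\[(?P<index>\d+)\])?$")


def parse_path(path: str) -> list[tuple[str, int | None]]:
    """Split an absolute path into (name, index) pairs.

    The index is None when the segment has no [N] suffix.
    """
    if not path.startswith("/"):
        raise ValueError(f"expected an absolute path, got {path!r}")
    segments: list[tuple[str, int | None]] = []
    for raw in path.split("/"):
        if not raw:
            continue  # empty strings come from the leading or a doubled slash
        match = _SEGMENT.match(raw)
        if match is None:
            raise ValueError(f"malformed segment: {raw!r}")
        index = match.group("index")
        segments.append((match.group("name"), int(index) if index is not None else None))
    return segments


print(parse_path("/main/div[0]/button[2]"))
# [('main', None), ('div', 0), ('button', 2)]
```

The root path / parses to an empty list, which matches its role as the document node with no segments below it.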

Agent-Specific Guidance

JSON Output for Parsing

All commands support the --json flag for machine-readable output:
bash
cli-anything-browser --json fs ls /
Returns:
json
{
  "path": "/",
  "entries": [
    {"name": "main", "role": "landmark", "path": "/main"}
  ]
}
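
An agent would typically decode this output rather than match on text. The sketch below parses the sample shown above and collects entry paths, optionally filtered by role. The function name entry_paths is hypothetical, and the only fields it relies on are the ones visible in the sample (entries, role, path); other commands may return different shapes.

```python
from __future__ import annotations

import json


def entry_paths(raw: str, role: str | None = None) -> list[str]:
    """Return the path of every entry, optionally keeping only one role."""
    data = json.loads(raw)
    entries = data.get("entries", [])
    return [e["path"] for e in entries if role is None or e.get("role") == role]


sample = """
{
  "path": "/",
  "entries": [
    {"name": "main", "role": "landmark", "path": "/main"}
  ]
}
"""

print(entry_paths(sample))                 # ['/main']
print(entry_paths(sample, role="button"))  # []
```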

Error Handling

The CLI provides clear error messages for common issues:
  • npx not found: Install Node.js from https://nodejs.org/
  • DOMShell not found: Run npx @apireno/domshell --version to check the installation
  • MCP call failed: Install the DOMShell Chrome extension
Check the return value of is_available() before running commands.
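
A script can also run its own preflight check before invoking the CLI at all. The following is a rough sketch that only tests whether the Node.js tools from the prerequisites are on PATH. It does not replace is_available(), whose module location is not stated here, and missing_prerequisites is a hypothetical name.

```python
from __future__ import annotations

import shutil
from typing import Callable, Optional


def missing_prerequisites(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return one message per required executable that is not on PATH.

    The lookup function is injectable so the check can be tested
    without changing the real environment.
    """
    problems = []
    for tool in ("node", "npx"):
        if which(tool) is None:
            problems.append(f"{tool} not found: install Node.js from https://nodejs.org/")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

This check says nothing about the Chrome extension or the MCP connection; those failures only surface when a command actually runs.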

Daemon Mode for Efficiency

For agent workflows with multiple commands, use daemon mode:
  1. Start daemon: cli-anything-browser session daemon-start
  2. Run commands: Each command reuses the MCP connection
  3. Stop daemon: cli-anything-browser session daemon-stop
This avoids the 1-3 second cold start overhead for each command.
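
The start, run, stop sequence maps naturally onto a context manager, which guarantees that the daemon is stopped even if a command in the middle fails. This is a sketch under assumptions: daemon_session is a hypothetical wrapper that simply shells out to the two session commands above, and the runner is injectable so the ordering can be checked without a browser.

```python
from __future__ import annotations

import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, Sequence

Runner = Callable[[Sequence[str]], object]

CLI = "cli-anything-browser"


def _run(argv: Sequence[str]) -> object:
    """Default runner: execute the command and raise on a non-zero exit."""
    return subprocess.run(list(argv), check=True)


@contextmanager
def daemon_session(run: Runner = _run) -> Iterator[Runner]:
    """Start the daemon, hand the runner to the caller, always stop the daemon."""
    run([CLI, "session", "daemon-start"])
    try:
        yield run
    finally:
        run([CLI, "session", "daemon-stop"])
```

Typical use would be `with daemon_session() as run:` followed by calls such as `run([CLI, "fs", "ls", "/"])` inside the block.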

Security Considerations

IMPORTANT: When using this CLI with AI agents, be aware of the following security considerations:

URL Restrictions

The browser harness validates all URLs before navigation:
  • Explicit scheme required: URLs must include an http:// or https:// scheme (scheme-less URLs like example.com are rejected)
  • Blocked schemes: file://, javascript:, data:, vbscript:, about:, chrome://, and browser-internal schemes
  • Allowed schemes: http:// and https:// only by default (configurable via CLI_ANYTHING_BROWSER_ALLOWED_SCHEMES)
  • Private network blocking: Optional via CLI_ANYTHING_BROWSER_BLOCK_PRIVATE=true (disabled by default)
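
An agent that builds URLs itself may want to apply the same rules before calling page open. The sketch below approximates the checks listed above using only the standard library. It is not the harness's actual implementation: check_url and its parameters are hypothetical, the allowed schemes and private-network switch are passed as arguments rather than read from the environment variables, and DNS names are not resolved, so a public hostname that points at a private address would pass.

```python
from __future__ import annotations

import ipaddress
from urllib.parse import urlsplit


def _is_private_host(host: str) -> bool:
    """True for localhost names and for private, loopback, or link-local IP literals."""
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name; this sketch does not resolve it
    return addr.is_private or addr.is_loopback or addr.is_link_local


def check_url(
    url: str,
    allowed_schemes: tuple[str, ...] = ("http", "https"),
    block_private: bool = False,
) -> str | None:
    """Return None if the URL passes, otherwise a short rejection reason."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if not scheme:
        return "explicit http:// or https:// scheme required"
    if scheme not in allowed_schemes:
        return f"scheme {scheme!r} is not allowed"
    host = parts.hostname
    if not host:
        return "missing host"
    if block_private and _is_private_host(host):
        return "private or loopback host blocked"
    return None


print(check_url("https://example.com"))  # None
print(check_url("example.com"))          # explicit http:// or https:// scheme required
```

Using an allowlist of schemes means blocked schemes such as file or javascript are rejected without being enumerated.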

DOM Content Risks

The Accessibility Tree includes all visible and hidden elements on a page. Malicious websites could:
  • Craft ARIA labels with manipulative text (e.g., "Ignore previous instructions")
  • Use aria-hidden elements to inject content not visible to users
  • Create confusing DOM structures that mislead navigation
Mitigation: When interacting with untrusted websites, consider:
  1. Using the --json flag for structured output that's easier to parse safely
  2. Sanitizing or filtering DOM content before including it in prompts
  3. Limiting navigation to trusted domains
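
As one possible shape for the sanitizing step, the sketch below normalizes a label taken from the tree and flags text that resembles an instruction aimed at the agent. Both function names and the phrase list are hypothetical and purely illustrative. A short phrase list is easy to evade, so this should be read as a first filter and not as a complete defense.

```python
from __future__ import annotations

import re

# ASCII control characters, which have no place in a visible label.
_CONTROL = re.compile(r"[\x00-\x1f\x7f]")

# Illustrative phrases only; a real filter would need a broader policy.
SUSPICIOUS_PHRASES = (
    "ignore previous instructions",
    "ignore all previous",
    "system prompt",
)


def sanitize_label(text: str, max_len: int = 200) -> str:
    """Replace control characters, collapse whitespace, and truncate."""
    cleaned = _CONTROL.sub(" ", text)
    cleaned = " ".join(cleaned.split())
    return cleaned[:max_len]


def looks_like_injection(text: str) -> bool:
    """True when the whitespace-normalized, lowercased text contains a listed phrase."""
    lowered = " ".join(text.lower().split())
    return any(phrase in lowered for phrase in SUSPICIOUS_PHRASES)


print(sanitize_label("Log\nin\t now"))                                 # Log in now
print(looks_like_injection("Please IGNORE   previous instructions"))  # True
```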

Private Network Access

By default, the browser can access localhost and private networks (192.168.x.x, 10.x.x.x, etc.). To block:
bash
export CLI_ANYTHING_BROWSER_BLOCK_PRIVATE=true
cli-anything-browser page open http://localhost:8080  # Will be blocked

Session Isolation

Multiple browser sessions share the same Chrome instance. Cookies and authentication state may persist across sessions. For sensitive operations, consider:
  1. Using Chrome's guest mode or incognito
  2. Clearing cookies between sessions
  3. Using separate Chrome profiles for different security contexts