cli-anything-browser
A command-line interface for browser automation using DOMShell's MCP server. Navigate web pages using filesystem-style commands such as ls, cd, and grep.
Installation
Prerequisites
- Node.js and npx (for DOMShell MCP server):
bash
# Install Node.js from https://nodejs.org/
npx --version
- Chrome/Chromium with the DOMShell extension:
- Install extension in Chrome
- Ensure Chrome is running before using CLI
- Python 3.10+
Install CLI
bash
cd browser/agent-harness
pip install -e .
Command Groups
page: Page Navigation
- open: Navigate to URL
- Reload current page
- Navigate back in history
- Navigate forward in history
- Show current page info
fs: Filesystem Commands (Accessibility Tree)
- ls: List elements at path
- cd: Change directory
- Read element content
- grep: Search for text pattern
- Print working directory
act: Action Commands
- click: Click an element
- type: Type text into input
session: Session Management
- Show session state
- daemon-start: Start persistent daemon mode
- daemon-stop: Stop daemon mode
Usage Examples
Basic Navigation
bash
# Open a page
cli-anything-browser page open https://example.com
# Explore structure
cli-anything-browser fs ls /
cli-anything-browser fs cd /main
cli-anything-browser fs ls
# Go back to root
cli-anything-browser fs cd /
Search and Click
bash
cli-anything-browser fs grep "Login"
cli-anything-browser act click /main/button[0]
Form Fill
bash
cli-anything-browser act type /main/input[0] "user@example.com"
cli-anything-browser act click /main/button[0]
JSON Output
bash
cli-anything-browser --json fs ls /
Daemon Mode (Faster Interactive Use)
bash
# Start persistent connection
cli-anything-browser session daemon-start
# Run commands (uses persistent connection)
cli-anything-browser fs ls /
cli-anything-browser fs cd /main
# Stop daemon when done
cli-anything-browser session daemon-stop
Interactive REPL
Path Syntax
DOMShell uses a filesystem-like path for the Accessibility Tree:
/ — Root (document)
/main — Main landmark
/main/div[0] — First div in main
/main/div[0]/button[2] — Third button in first div
- Array indices are 0-based: button[0] is the first button
- Use .. to go up one level
- Use / for root
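The path rules above can be sketched as a small parser. This is an illustrative helper, not part of the CLI: `Segment` and `parse_path` are hypothetical names, and the grammar (a name with an optional `[n]` index) is inferred only from the sample paths shown in this section.

```python
import re
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    name: str
    index: Optional[int]  # None when the segment carries no [n] suffix


# A segment is a name optionally followed by a 0-based index in brackets.
_SEGMENT_RE = re.compile(r"^(?P<name>[^\[\]/]+)(?:\[(?P<index>\d+)\])?$")


def parse_path(path: str) -> list[Segment]:
    """Split an absolute path such as /main/div[0]/button[2] into segments."""
    if not path.startswith("/"):
        raise ValueError(f"expected an absolute path, got {path!r}")
    segments = []
    for part in path.split("/"):
        if not part:
            continue  # skip the empty pieces produced by the leading slash
        match = _SEGMENT_RE.match(part)
        if match is None:
            raise ValueError(f"malformed segment {part!r} in {path!r}")
        index = match.group("index")
        segments.append(
            Segment(match.group("name"), int(index) if index is not None else None)
        )
    return segments
```

With this sketch, the root path `/` yields an empty list, and `/main/div[0]/button[2]` yields three segments whose last index is 2 (the third button).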
Agent-Specific Guidance
JSON Output for Parsing
All commands support the --json flag for machine-readable output:
bash
cli-anything-browser --json fs ls /
Returns:
json
{
"path": "/",
"entries": [
{"name": "main", "role": "landmark", "path": "/main"}
]
}
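An agent might consume this output along the following lines. This is a minimal sketch: `list_entries` and `entry_paths` are hypothetical helper names, and it assumes the JSON document is written to stdout and that failures produce a non-zero exit code, neither of which is confirmed here.

```python
import json
import subprocess
from typing import Optional


def list_entries(path: str = "/") -> dict:
    """Run `cli-anything-browser --json fs ls <path>` and decode the result.

    Assumes JSON arrives on stdout and that errors yield a non-zero exit code.
    """
    result = subprocess.run(
        ["cli-anything-browser", "--json", "fs", "ls", path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)


def entry_paths(listing: dict, role: Optional[str] = None) -> list[str]:
    """Collect entry paths from a decoded listing, optionally filtered by role."""
    return [
        entry["path"]
        for entry in listing.get("entries", [])
        if role is None or entry.get("role") == role
    ]
```

Keeping the decoding step separate from the subprocess call means the filtering logic can be exercised against a saved sample without a running browser.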
Error Handling
The CLI provides clear error messages for common issues:
- npx not found: Install Node.js from https://nodejs.org/
- DOMShell not found: Run npx @apireno/domshell --version
- MCP call failed: Install DOMShell Chrome extension
Check
return value before running commands.
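A script can rule out the first of these errors before invoking the CLI at all. The sketch below only checks that the Node.js tooling is on PATH; `missing_prerequisites` is a hypothetical helper, not something the CLI provides, and it does not verify the DOMShell extension or a running Chrome.

```python
import shutil
from typing import Callable, Optional


def missing_prerequisites(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the required executables that cannot be found on PATH."""
    required = ("node", "npx")
    return [tool for tool in required if which(tool) is None]


if __name__ == "__main__":
    missing = missing_prerequisites()
    if missing:
        print("Missing tools:", ", ".join(missing))
        print("Install Node.js from https://nodejs.org/")
```

The `which` parameter exists only so the lookup can be swapped out in tests.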
Daemon Mode for Efficiency
For agent workflows with multiple commands, use daemon mode:
- Start daemon: cli-anything-browser session daemon-start
- Run commands: Each command reuses the MCP connection
- Stop daemon: cli-anything-browser session daemon-stop
This avoids the 1-3 second cold start overhead for each command.
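In an agent script, the start/run/stop sequence fits naturally into a context manager so the daemon is stopped even when a command fails midway. This is a sketch under assumptions: `run_cli` and `daemon_session` are hypothetical names, and the CLI is assumed to signal failure with a non-zero exit code.

```python
import subprocess
from contextlib import contextmanager

CLI = "cli-anything-browser"


def run_cli(*args: str) -> subprocess.CompletedProcess:
    """Invoke the CLI with the given arguments (assumes non-zero exit on error)."""
    return subprocess.run([CLI, *args], capture_output=True, text=True, check=True)


@contextmanager
def daemon_session(run=run_cli):
    """Start the daemon, yield the runner, and always stop the daemon afterwards."""
    run("session", "daemon-start")
    try:
        yield run
    finally:
        # Runs on normal exit and on exceptions raised inside the with-block.
        run("session", "daemon-stop")
```

Usage would look like `with daemon_session() as run: run("fs", "ls", "/")`, with every call inside the block reusing the persistent connection.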
Security Considerations
IMPORTANT: When using this CLI with AI agents, be aware of the following security considerations:
URL Restrictions
The browser harness validates all URLs before navigation:
- Explicit scheme required: URLs must include an http:// or https:// scheme (scheme-less URLs are rejected)
- Blocked schemes: , , , , , , and browser-internal schemes
- Allowed schemes: http and https only (configurable via CLI_ANYTHING_BROWSER_ALLOWED_SCHEMES)
- Private network blocking: Optional via CLI_ANYTHING_BROWSER_BLOCK_PRIVATE=true (disabled by default)
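An agent wrapper can apply a similar check before handing a URL to the CLI. The following is a rough sketch, not the harness's actual validation code: `allowed_schemes` and `check_url` are hypothetical names, the default of http and https is assumed, and the comma-separated format of CLI_ANYTHING_BROWSER_ALLOWED_SCHEMES is also an assumption.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urlsplit


def allowed_schemes(env: Mapping[str, str] = os.environ) -> set[str]:
    """Read the allowed schemes; the comma-separated format is an assumption."""
    raw = env.get("CLI_ANYTHING_BROWSER_ALLOWED_SCHEMES", "http,https")
    return {item.strip().lower() for item in raw.split(",") if item.strip()}


def check_url(url: str, schemes: Optional[set[str]] = None) -> str:
    """Return the URL unchanged if it passes, otherwise raise ValueError."""
    schemes = allowed_schemes() if schemes is None else schemes
    parts = urlsplit(url)
    if not parts.scheme:
        raise ValueError(f"explicit scheme required: {url!r}")
    if parts.scheme.lower() not in schemes:
        raise ValueError(f"scheme {parts.scheme!r} is not allowed: {url!r}")
    if not parts.netloc:
        raise ValueError(f"missing host: {url!r}")
    return url
```

Rejecting early in the wrapper keeps untrusted input from reaching the browser, while the harness's own validation remains the authoritative check.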
DOM Content Risks
The Accessibility Tree includes all visible and hidden elements on a page. Malicious websites could:
- Craft ARIA labels with manipulative text (e.g., "Ignore previous instructions")
- Use aria-hidden elements to inject content not visible to users
- Create confusing DOM structures that mislead navigation
Mitigation: When interacting with untrusted websites, consider:
- Using the --json flag for structured output that's easier to parse safely
- Sanitizing or filtering DOM content before including it in prompts
- Limiting navigation to trusted domains
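The filtering step might look like the sketch below. It is illustrative only: `sanitize_label` and `looks_like_injection` are hypothetical helpers, the single regular expression covers just the example phrase given above, and pattern matching of this kind is a weak signal rather than a reliable defense.

```python
import re
import unicodedata

# Matches phrases like "ignore previous instructions"; deliberately narrow.
SUSPICIOUS = re.compile(
    r"ignore\s+(?:all\s+|any\s+)?(?:previous|prior|above)\s+instructions",
    re.IGNORECASE,
)


def sanitize_label(text: str, max_len: int = 200) -> str:
    """Normalize whitespace, drop invisible characters, and cap the length."""
    # Turn every whitespace character (newlines, tabs) into a plain space.
    spaced = "".join(" " if ch.isspace() else ch for ch in text)
    # Remove control and format characters such as zero-width spaces.
    visible = "".join(
        ch for ch in spaced if not unicodedata.category(ch).startswith("C")
    )
    collapsed = " ".join(visible.split())
    if len(collapsed) > max_len:
        return collapsed[:max_len] + "..."
    return collapsed


def looks_like_injection(text: str) -> bool:
    """Flag text that contains an instruction-override phrase."""
    return SUSPICIOUS.search(text) is not None
```

Flagged entries are probably best dropped or quoted as data rather than passed to the model as instructions.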
Private Network Access
By default, the browser can access localhost and private networks (192.168.x.x, 10.x.x.x, etc.). To block:
bash
export CLI_ANYTHING_BROWSER_BLOCK_PRIVATE=true
cli-anything-browser page open http://localhost:8080 # Will be blocked
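To make the notion of a private target concrete, here is a sketch of the kind of check such blocking implies. It is not the harness's implementation: `is_private_target` is a hypothetical name, and the sketch inspects only literal IP addresses and localhost names without resolving DNS.

```python
import ipaddress
from urllib.parse import urlsplit


def is_private_target(url: str) -> bool:
    """Report whether the URL's host is localhost or a private/loopback IP."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name; resolving it is out of scope for this sketch.
        return False
    return address.is_private or address.is_loopback or address.is_link_local
```

A public hostname that resolves to a private address would slip past this check, so it complements the CLI setting rather than replacing it.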
Session Isolation
Multiple browser sessions share the same Chrome instance. Cookies and authentication state may persist across sessions. For sensitive operations, consider:
- Using Chrome's guest mode or incognito
- Clearing cookies between sessions
- Using separate Chrome profiles for different security contexts