PhoneBase Cloud Phone Control
You have access to an Android cloud phone through the `pb` CLI. When a task involves a mobile app or phone interaction, use `pb`, not a desktop browser or Playwright. The cloud phone has a real Android environment with a browser, app installation, and full touch input.
Installation Check
Before doing anything else, verify that `pb` is installed. If the command is not found, install it with the official one-liner:
curl -fsSL https://get.phonebase.cloud | sh
This downloads a prebuilt binary for the current platform (macOS arm64/x64, Linux arm64/x64) from https://github.com/phonebase-cloud/phonebase-cli/releases. The installer prints the exact download URL before fetching and places `pb` in one of two install directories. If it lands in a directory that is not on your PATH, follow the printed PATH hint, then re-run the installation check to confirm.
Do not invent alternative install commands or build from source; always use the curl one-liner above. Proceed to Authentication only after the installation check succeeds.
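The check-then-install flow above can be sketched as a pair of shell functions. This is a minimal sketch, not an official script: `have_cmd` and `ensure_pb` are hypothetical helper names, and only the curl one-liner comes from the instructions above.

```shell
# have_cmd NAME: succeed if NAME resolves to a command in the current shell.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# ensure_pb: run the official installer only when pb is missing, then
# re-check so the caller knows whether it is safe to move on.
ensure_pb() {
    if have_cmd pb; then
        return 0
    fi
    echo "pb not found; running the official installer" >&2
    curl -fsSL https://get.phonebase.cloud | sh
    # The installer may place pb outside PATH; follow its printed hint if so.
    have_cmd pb
}
```

Calling `ensure_pb && pb status` would then continue to Authentication only when the binary is reachable.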
Authentication
If pb is not yet configured, the user needs to authenticate first:
pb login # browser-based login (interactive)
pb apikey <key> # or set API key directly
pb status # verify authentication works
If any pb command returns a successful response, authentication is already in place — skip this step.
Connection
pb devices # list available devices
pb connect <id> # connect to a device (starts daemon automatically)
pb disconnect # disconnect when done
Why Aliases Matter
pb wraps common Android operations (am start, input tap, pm list, etc.) as simple CLI aliases. These aliases return structured JSON and handle errors consistently. Using `pb shell` bypasses this: you lose structured output and error handling, and the command is harder to read.
The alias exists because it is the right interface, in the same way that a purpose-built git command is preferable to assembling raw arguments for the git binary by hand.
| Shell command (avoid) | Alias (use this) |
|---|---|
| `pb shell "am start -a ACTION"` | |
| `pb shell "am force-stop PKG"` | |
| `pb shell "pm list packages"` | |
| `pb shell "input tap X Y"` | `pb tap X Y` |
| `pb shell "input text STR"` | |
| `pb shell "input swipe X1 Y1 X2 Y2"` | `pb swipe X1 Y1 X2 Y2` |
| `pb shell "input keyevent KEY"` | |
Alias parameter limits: the start-activity alias supports `-a` (action), `-n` (component), `-d` (data), `-t` (type), and a positional package name. It does not support extras flags. When you need extras or other advanced intent parameters, use the `-j` JSON mode instead of falling back to `pb shell`:
pb -j '{"action":"android.settings.ADD_ACCOUNT_SETTINGS","extras":{"account_types":"com.google"}}' activity/start_activity
The `-j` flag sends a raw JSON body directly to the API path, bypassing alias parsing. This gives you full control over parameters while still getting structured JSON output. Reserve `pb shell` for commands that are not Android API calls, such as `pb shell "cat /proc/cpuinfo"` or `pb shell "getprop ro.build.version.sdk"`.
Observing the Screen
`pb dumpc` is the primary way to see what's on screen. It returns a compact text tree with every UI element's text, resource ID, bounds (coordinates), and whether it's clickable. This is everything you need to decide what to tap next, and because it's text, you can reason about it directly.
The screenshot command captures the screen as an image. It is only useful when the screen contains visual-only content with no text elements, such as a video player, game, or canvas. In every other case, dumpc gives you more actionable information faster.
Example: If dumpc shows `android.widget.Button text="NEXT" bounds=[756,2194][1020,2338]`, you know to tap the center: `pb tap 888 2266`. No screenshot needed.
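The center calculation in the example above is plain integer arithmetic on the two corner points. A minimal sketch, assuming bounds always appear in the `[x1,y1][x2,y2]` form shown; `center_of_bounds` is a hypothetical helper, not part of pb.

```shell
# center_of_bounds BOUNDS: print "x y" for the center of "[x1,y1][x2,y2]".
center_of_bounds() {
    # Turn every non-digit into a space, then let word splitting
    # produce the four integers: "[756,2194][1020,2338]" -> 756 2194 1020 2338
    set -- $(printf '%s\n' "$1" | tr -c '0-9\n' ' ')
    [ "$#" -eq 4 ] || return 1
    # Integer midpoint of each axis.
    echo "$(( ($1 + $3) / 2 )) $(( ($2 + $4) / 2 ))"
}

center_of_bounds '[756,2194][1020,2338]'   # prints: 888 2266
```

With a connected device, `pb tap $(center_of_bounds '[756,2194][1020,2338]')` would tap the NEXT button from the example.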
Command Reference
Observe
| Command | Purpose |
|---|---|
| `pb dumpc` | Compact UI tree: text, bounds, clickable state (preferred) |
| | Full XML accessibility tree (when you need resource IDs or hierarchy) |
| | Screenshot image (only for visual-only content like video/game) |
| | UI inspection: accessibility tree + marked screenshot |
Touch & Input
| Command | Purpose |
|---|---|
| `pb tap <x> <y>` | Tap at coordinates |
| `pb swipe <x1> <y1> <x2> <y2>` | Swipe between two points |
| | Type text into focused field |
| | Send key event (4=Back, 3=Home, 66=Enter, 82=Menu) |
App Management
| Command | Purpose |
|---|---|
| | Launch app by package name |
| | Start activity with flags (-a/-n/-d/-t) |
| | Force stop an app |
| | List all installed packages |
| `pb install <path\|--uri url>` | Install APK from local file or download URL |
| | Uninstall an app |
Browser & Navigation
| Command | Purpose |
|---|---|
| | Open URL in best available browser on the phone |
| | Show current foreground activity |
Files & Clipboard
| Command | Purpose |
|---|---|
| | List files on device |
| | Upload file to device |
| | Download file from device |
| | Get or set clipboard content |
System
| Command | Purpose |
|---|---|
| `pb shell "<command>"` | Raw shell command (only for non-API commands) |
| | Screen resolution and density info |
Discovery
| Command | Purpose |
|---|---|
| | List all available API paths (filtered, hides aliased ones) |
| | Filter API paths by keyword |
| | Show details of a specific alias or API path |
| | Full help with alias list and usage |
When you encounter a task not covered by the aliases above, use the discovery commands to find additional API paths and to get their parameter details.
Advanced: JSON Mode
For complex API calls that go beyond what aliases support, use `-j` to pass a full JSON body:
pb -j '{"package_name":"com.example","class_name":".MainActivity"}' activity/start_activity
You can also read JSON from a file with `-f`:
pb -f params.json activity/start_activity
pb -f - activity/start_activity # read from stdin
This is the preferred escape hatch when aliases don't cover your parameters: it still goes through the structured API and returns JSON. Only use `pb shell` for raw Linux commands that aren't part of the phone's control API.
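When the JSON body grows past a comfortable one-line `-j` argument, the stdin form of `-f` pairs naturally with a heredoc. A sketch that uses only the flags and API path shown above; the wrapper name `start_main_activity` is hypothetical, and the package and class values are the same placeholders as in the example.

```shell
# start_main_activity: send a multi-line JSON body on stdin.
# `-f -` tells pb to read the body from stdin instead of a file.
start_main_activity() {
    pb -f - activity/start_activity <<'EOF'
{
  "package_name": "com.example",
  "class_name": ".MainActivity"
}
EOF
}
```

The quoted `'EOF'` delimiter keeps the shell from expanding `$` or backticks inside the body, so the JSON reaches pb exactly as written.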
Output Format
Every pb command returns JSON to stdout:
{"code": 200, "data": ..., "msg": "OK"}
Human-readable messages and logs go to stderr — ignore stderr when parsing responses.
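Because the JSON arrives on stdout and logs go to stderr, a caller can discard stderr and test the `code` field. A rough sketch, assuming `200` is the success code as in the example response; `response_ok` is a hypothetical helper, and a real JSON parser would be more robust than this pattern match (a nested `"code"` key inside `data` could produce a false positive).

```shell
# response_ok: read one pb response on stdin and succeed when it contains
# a "code" field equal to 200. Pattern matching only; no JSON parser.
response_ok() {
    grep -Eq '"code"[[:space:]]*:[[:space:]]*200([^0-9]|$)'
}
```

For example, `pb status 2>/dev/null | response_ok && echo "authenticated"` would report success only when the response carries code 200.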
Interaction Pattern
The core loop for operating the phone:
- Observe: run `pb dumpc` to see the current screen state
- Locate: find the target element's bounds in the output
- Act: run `pb tap <center_x> <center_y>` to interact (calculate the center from the bounds)
- Verify: run `pb dumpc` again to confirm the action worked
- Repeat as needed
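The observe, locate, and act steps above can be sketched as a helper that taps the first element carrying a given label. This assumes `pb dumpc` prints element lines in the form shown earlier (`text="..." bounds=[x1,y1][x2,y2]`) directly on stdout; if the tree arrives wrapped in the JSON envelope, extract it first. `find_bounds` and `tap_text` are hypothetical names, and the verify step is left to the caller.

```shell
# find_bounds TEXT: read a dumpc-style tree on stdin and print the bounds
# of the first line whose text attribute equals TEXT.
find_bounds() {
    grep -F "text=\"$1\"" \
        | sed -n 's/.*bounds=\(\[[0-9]*,[0-9]*]\[[0-9]*,[0-9]*]\).*/\1/p' \
        | head -n 1
}

# tap_text TEXT: observe the screen, locate TEXT, and tap its center.
tap_text() {
    bounds=$(pb dumpc 2>/dev/null | find_bounds "$1")
    if [ -z "$bounds" ]; then
        echo "no element with text \"$1\"" >&2
        return 1
    fi
    # "[x1,y1][x2,y2]" -> four integers via word splitting.
    set -- $(printf '%s\n' "$bounds" | tr -c '0-9\n' ' ')
    pb tap "$(( ($1 + $3) / 2 ))" "$(( ($2 + $4) / 2 ))"
}
```

After `tap_text NEXT`, running the observe command again confirms whether the screen changed as expected.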
Common gestures:
- Scroll down: `pb swipe 540 1500 540 500`
- Scroll up: `pb swipe 540 500 540 1500`
- Go back: send key event 4 (Back)
- Go home: send key event 3 (Home)
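The fixed swipe coordinates above appear to assume a screen roughly 1080 pixels wide. A sketch that derives equivalent gestures for any resolution; `scroll_args` is a hypothetical helper, and the choice of the one-quarter and three-quarter points is an assumption made to reproduce the example values.

```shell
# scroll_args DIRECTION WIDTH HEIGHT: print "x1 y1 x2 y2" for a vertical
# swipe along the horizontal center of the screen.
scroll_args() {
    x=$(( $2 / 2 ))
    upper=$(( $3 / 4 ))
    lower=$(( $3 * 3 / 4 ))
    case "$1" in
        down) echo "$x $lower $x $upper" ;;   # finger moves up, content scrolls down
        up)   echo "$x $upper $x $lower" ;;   # finger moves down, content scrolls up
        *)    return 1 ;;
    esac
}

scroll_args down 1080 2000   # prints: 540 1500 540 500
```

The output can be passed straight to the swipe alias, for example `pb swipe $(scroll_args down 1080 2400)`.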
Detailed Operation Guides
For multi-step tasks, read the relevant guide from `~/.phonebase/skills/` before starting. Each guide is a Claude-standard skill living at `~/.phonebase/skills/<name>/SKILL.md`:
| Guide | Path | Read it when... |
|---|---|---|
| install-app | `~/.phonebase/skills/install-app/SKILL.md` | You need to search for, download, and install an Android app |
| web-search | `~/.phonebase/skills/web-search/SKILL.md` | You need to search the web using the phone's browser |
List the installed guides to see their enabled/disabled status, and only read guides that are enabled. Use `pb skills install <path-or-url>` to add third-party guides; they appear alongside the built-in ones. Additional guides may exist: list the contents of `~/.phonebase/skills/` to see what is there, then read the `SKILL.md` inside each one.