phonebase
Control Android cloud phones via the `pb` CLI. Make sure to use this skill whenever the user mentions logging into apps, installing apps, browsing on a phone, opening an app (Twitter, Google Play, WeChat, Instagram, TikTok, WhatsApp, Telegram, etc.), searching on phone, checking what's on screen, taking screenshots, tapping buttons, typing text, swiping, or any task that involves an Android device. This skill applies even if the user doesn't say "phone" or "pb" — phrases like "help me log in to Twitter", "install WeChat", "open Google Play", "search for something", or "check the screen" all imply phone operation. Always prefer pb over desktop browsers or Playwright for mobile and app tasks.
NPX Install
npx skill4agent add phonebase-cloud/phonebase-skills phonebase
SKILL.md Content
PhoneBase Cloud Phone Control
You have access to an Android cloud phone through the `pb` CLI. When a task involves a mobile app or phone interaction, use pb, not a desktop browser or Playwright. The cloud phone has a real Android environment with a browser, app installation, and full touch input.
Installation Check
Before doing anything else, verify `pb` is installed:
pb --version
If the command is not found, install it with the official one-liner:
curl -fsSL https://get.phonebase.cloud | sh
This downloads a prebuilt binary for the current platform (macOS arm64/x64, Linux arm64/x64) from https://github.com/phonebase-cloud/phonebase-cli/releases. The installer prints the exact download URL before fetching, and places `pb` in `/usr/local/bin` or `~/.local/bin`. If it lands in `~/.local/bin`, follow the printed PATH hint, then re-run `pb --version` to confirm.
Do not try to invent alternative install commands or build from source; always use the curl one-liner above. Only proceed to Authentication after `pb --version` succeeds.
Authentication
If pb is not yet configured, the user needs to authenticate first:
pb login # browser-based login (interactive)
pb apikey <key> # or set API key directly
pb status # verify authentication works
If any pb command returns a successful response, authentication is already in place, so skip this step.
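The installation and authentication checks can be chained into one preflight step. The following is a minimal sketch, not part of the PhoneBase tooling: the function name `pb_preflight` is hypothetical, and it assumes that `pb status` exits non-zero when authentication is missing, which the text above does not state.

```shell
# Hypothetical helper: confirm pb is installed and authenticated before use.
# Assumption: `pb status` exits non-zero when authentication is missing.
pb_preflight() {
    # Step 1: installation check. This does not auto-install; it only
    # points at the official one-liner.
    if ! command -v pb >/dev/null 2>&1; then
        echo "pb not found; install with: curl -fsSL https://get.phonebase.cloud | sh" >&2
        return 1
    fi
    # Step 2: authentication check.
    if ! pb status >/dev/null 2>&1; then
        echo "pb is not authenticated; run 'pb login' or 'pb apikey <key>'" >&2
        return 2
    fi
    return 0
}
```

A caller could run `pb_preflight || exit 1` once at the start of a session and continue to the Connection step only when it succeeds.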
Connection
pb devices # list available devices
pb connect <id> # connect to a device (starts daemon automatically)
pb disconnect # disconnect when done
Why Aliases Matter
pb wraps common Android operations (am start, input tap, pm list, etc.) as simple CLI aliases. These aliases return structured JSON and handle errors consistently. Using `pb shell "am start ..."` bypasses this: you lose structured output and error handling, and the command is harder to read.
Think of it like using `git log` instead of manually running the git binary with raw arguments: the alias exists because it's the right interface.
| Shell command (avoid) | Alias (use this) |
|---|---|
| |
| |
| |
| |
| |
| |
| |
Alias parameter limits: `pb start` supports `-a` (action), `-n` (component), `-d` (data), `-t` (type), and positional package name. It does not support extras flags like `--es` or `--ei`. When you need extras or other advanced intent parameters, use the `-j` JSON mode instead of falling back to `pb shell`:
pb -j '{"action":"android.settings.ADD_ACCOUNT_SETTINGS","extras":{"account_types":"com.google"}}' activity/start_activity
The `-j` flag sends a raw JSON body directly to the API path, bypassing alias parsing. This gives you full control over parameters while still getting structured JSON output. Reserve `pb shell` only for commands that are not Android API calls, like `pb shell "cat /proc/cpuinfo"` or `pb shell "getprop ro.build.version.sdk"`.
Observing the Screen
Prefer `pb dumpc` for reading the screen, and use `pb screencap` only for visual-only content.
Example: If dumpc shows `android.widget.Button text="NEXT" bounds=[756,2194][1020,2338]`, you know to tap the center: `pb tap 888 2266`. No screenshot needed.
Command Reference
Observe
| Command | Purpose |
|---|---|
| `pb dumpc` | Compact UI tree: text, bounds, clickable state (preferred) |
| | Full XML accessibility tree (when you need resource IDs or hierarchy) |
| `pb screencap` | Screenshot image (only for visual-only content like video/game) |
| | UI inspection: accessibility tree + marked screenshot |
Touch & Input
| Command | Purpose |
|---|---|
| `pb tap <x> <y>` | Tap at coordinates |
| `pb swipe <x1> <y1> <x2> <y2>` | Swipe between two points |
| | Type text into focused field |
| `pb keyevent <code>` | Send key event (4=Back, 3=Home, 66=Enter, 82=Menu) |
App Management
| Command | Purpose |
|---|---|
| | Launch app by package name |
| `pb start` | Start activity with flags (-a/-n/-d/-t) |
| | Force stop an app |
| | List all installed packages |
| | Install APK from local file or download URL |
| | Uninstall an app |
Browser & Navigation
| Command | Purpose |
|---|---|
| | Open URL in best available browser on the phone |
| | Show current foreground activity |
Files & Clipboard
| Command | Purpose |
|---|---|
| | List files on device |
| | Upload file to device |
| | Download file from device |
| | Get or set clipboard content |
System
| Command | Purpose |
|---|---|
| `pb shell "<command>"` | Raw shell command (only for non-API commands) |
| | Screen resolution and density info |
Discovery
| Command | Purpose |
|---|---|
| `pb list` | List all available API paths (filtered, hides aliased ones) |
| | Filter API paths by keyword |
| `pb info <name>` | Show details of a specific alias or API path |
| | Full help with alias list and usage |
When you encounter a task not covered by the aliases above, run `pb list` to discover additional API paths, or `pb info <name>` to get parameter details.
Advanced: JSON Mode
For complex API calls that go beyond what aliases support, use `-j` to pass a full JSON body:
pb -j '{"package_name":"com.example","class_name":".MainActivity"}' activity/start_activity
You can also read JSON from a file with `-f`:
pb -f params.json activity/start_activity
pb -f - activity/start_activity # read from stdin
This is the preferred escape hatch when aliases don't cover your parameters: it still goes through the structured API and returns JSON. Only use `pb shell` for raw Linux commands that aren't part of the phone's control API.
Output Format
Every pb command returns JSON to stdout:
{"code": 200, "data": ..., "msg": "OK"}
Human-readable messages and logs go to stderr; ignore stderr when parsing responses.
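The response envelope can be checked from a script without extra tooling. This is a rough sketch under stated assumptions: `pb_code` and `pb_ok` are hypothetical helper names, the response is assumed to arrive on a single line, and `code` is assumed to be the first key, as in the example above. A real JSON parser would be more robust if one is available.

```shell
# Hypothetical helper: read a pb JSON response on stdin and print its "code".
# Assumptions: the envelope is on one line and "code" is its first key.
pb_code() {
    sed -n 's/^[[:space:]]*{[[:space:]]*"code"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: succeed only when the response on stdin reports code 200.
pb_ok() {
    [ "$(pb_code)" = "200" ]
}
```

For example, `pb devices 2>/dev/null | pb_ok` discards stderr, as recommended above, and succeeds only when the envelope carries code 200.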
Interaction Pattern
The core loop for operating the phone:
- Observe: `pb dumpc` to see current screen state
- Locate: find the target element's bounds in the output
- Act: `pb tap <center_x> <center_y>` to interact (calculate center from bounds)
- Verify: `pb dumpc` again to confirm the action worked
- Repeat as needed
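The "calculate center from bounds" step in the loop above is plain integer arithmetic on the `[x1,y1][x2,y2]` string. A minimal sketch follows; `bounds_center` is a hypothetical helper name, not a pb command.

```shell
# Hypothetical helper: print the center "x y" of a bounds string "[x1,y1][x2,y2]".
bounds_center() {
    # Replace every non-digit with a space, then split into four integers.
    set -- $(printf '%s' "$1" | tr -c '0-9' ' ')
    [ "$#" -eq 4 ] || return 1
    echo "$(( ($1 + $3) / 2 )) $(( ($2 + $4) / 2 ))"
}

# bounds_center '[756,2194][1020,2338]'   prints: 888 2266
```

The output can be passed straight to the tap alias, for example `pb tap $(bounds_center '[756,2194][1020,2338]')`, which expands to `pb tap 888 2266`.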
Common gestures:
- Scroll down: `pb swipe 540 1500 540 500`
- Scroll up: `pb swipe 540 500 540 1500`
- Go back: `pb keyevent 4`
- Go home: `pb keyevent 3`
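The swipe coordinates above appear to assume a screen roughly 1080 pixels wide. On other resolutions the same gesture can be derived from the screen size. This is a sketch under assumptions: `scroll_cmd` is a hypothetical helper, and the 75%/25% travel points are an illustrative choice that reproduces the coordinates above for a 1080x2000 screen. It only prints the command and does not run it.

```shell
# Hypothetical helper: print a pb swipe command that scrolls on a screen of
# the given width and height in pixels. Prints only; it does not call pb.
scroll_cmd() {
    direction=$1 width=$2 height=$3
    x=$(( width / 2 ))            # swipe along the horizontal center
    low=$(( height * 3 / 4 ))     # point 75% of the way down the screen
    high=$(( height / 4 ))        # point 25% of the way down the screen
    case "$direction" in
        down) echo "pb swipe $x $low $x $high" ;;   # finger moves up, content scrolls down
        up)   echo "pb swipe $x $high $x $low" ;;
        *)    echo "usage: scroll_cmd up|down <width> <height>" >&2; return 1 ;;
    esac
}

# scroll_cmd down 1080 2000   prints: pb swipe 540 1500 540 500
```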
Detailed Operation Guides
For multi-step tasks, read the relevant guide from `~/.phonebase/skills/` before starting. Each guide is a Claude-standard skill living at `<name>/SKILL.md`:
| Guide | Path | Read it when... |
|---|---|---|
| install-app | | You need to search for, download, and install an Android app |
| web-search | | You need to search the web using the phone's browser |
Run `pb skills list` to see all installed guides with their enabled/disabled status. Only read guides that show `enabled`. Use `pb skills install <path-or-url>` to add third-party guides; they appear alongside the built-in ones. Additional guides may exist: run `ls ~/.phonebase/skills/` to see what's there, then read `<dir>/SKILL.md` inside.
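Checking which directories actually hold a guide can be scripted. The sketch below relies only on the directory layout described above; `list_guides` is a hypothetical helper name, and it does not report enabled/disabled status, for which `pb skills list` remains the source of truth.

```shell
# Hypothetical helper: print the name of every directory under the skills
# root that contains a SKILL.md file. Defaults to ~/.phonebase/skills.
list_guides() {
    root=${1:-"$HOME/.phonebase/skills"}
    for file in "$root"/*/SKILL.md; do
        [ -f "$file" ] || continue       # skip the unexpanded glob when nothing matches
        basename "$(dirname "$file")"
    done
}
```

Calling `list_guides` with no argument scans `~/.phonebase/skills`; passing a directory scans that location instead.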