serve-sim
Drive an Apple Simulator (iOS, iPad, Apple Watch) from an agent using the serve-sim CLI. serve-sim spawns a Swift helper that captures the simulator framebuffer via `simctl io`, exposes it as an MJPEG stream plus a binary WebSocket input channel, and serves a React preview UI on top. This skill teaches an agent the exact CLI surface, the gesture JSON shape, the gotchas, and the recommended workflows.
When to use
- The user wants an agent to tap, swipe, drag, pinch, or send hardware buttons to a running Apple Simulator.
- The user wants to stream a simulator to a browser (local, LAN, or tunneled) for review or remote control.
- The user wants to inject a synthetic camera feed (file, webcam, or animated placeholder) into a specific app on the simulator.
- The user wants to toggle CoreAnimation debug overlays (off-screen rendering, blended layers, slow animations) for performance work.
- The user wants to simulate a memory warning or rotate the device programmatically.
- The user wants to read the simulator's accessibility tree to find UI elements without pixel hunting.
- The user wants to grant, revoke, or reset an app's privacy permissions — camera, photos, location, contacts, or push notifications.
When NOT to use
- Android emulators → use `adb shell` tooling.
- Building or installing an iOS app → use `xcodebuild` or `xcrun simctl install`.
- React Native in-app runtime debugging (Redux state, network inspection, component tree) → use rn-debugger tooling.
- Real iOS hardware devices → use `xcrun devicectl` or Xcode.
Prerequisites
Before any other action, verify the host satisfies these. If something is missing, tell the user exactly what to install — do not proceed.
| Requirement | Check command | Why |
|---|---|---|
| macOS host | | serve-sim only runs on macOS |
| Xcode CLI tools | | |
| Node.js ≥18 | | serve-sim is an npm package run via `npx` |
| macOS 14+ (optional) | | Required ONLY for camera injection |
A bundled helper script is available: `scripts/check-prereqs.sh`. Run it; if it exits non-zero, surface the message to the user.
A booted simulator is required for most subcommands. Check with `xcrun simctl list devices booted`. If none are booted, tell the user to open Xcode → Simulator or to run `xcrun simctl boot <UDID>`.
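The requirements above can be approximated with standard macOS and Node.js commands. The sketch below is not the bundled `scripts/check-prereqs.sh`: the function names `major` and `check_prereqs` are hypothetical, and the commands used for each check (`uname -s`, `xcode-select -p`, `node --version`, `sw_vers -productVersion`) are assumptions about how each requirement might be verified.

```sh
#!/bin/sh
# Hypothetical prerequisite check; NOT the bundled scripts/check-prereqs.sh.

# Print the major component of a version string such as "v18.17.0" or "14.5".
major() {
  v=${1#v}                  # drop a leading "v" (node prints "v18.17.0")
  printf '%s\n' "${v%%.*}"  # keep everything before the first dot
}

# Return 0 when the host looks usable, non-zero with a message otherwise.
check_prereqs() {
  if [ "$(uname -s)" != "Darwin" ]; then
    echo "serve-sim only runs on macOS" >&2
    return 1
  fi
  if ! xcode-select -p >/dev/null 2>&1; then
    echo "Xcode CLI tools missing: run 'xcode-select --install'" >&2
    return 1
  fi
  if ! command -v node >/dev/null 2>&1 ||
     [ "$(major "$(node --version)")" -lt 18 ]; then
    echo "Node.js 18 or newer is required" >&2
    return 1
  fi
  # macOS 14+ matters only for camera injection, so warn instead of failing.
  if [ "$(major "$(sw_vers -productVersion)")" -lt 14 ]; then
    echo "note: macOS 14+ is needed for camera injection" >&2
  fi
  return 0
}
```

Calling `check_prereqs || exit 1` at the start of a session would stop early and leave the printed message to relay to the user.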
Mental model
```text
┌──────────────┐  simctl io  ┌─────────────────┐  MJPEG / WS  ┌─────────┐
│ iOS Simulator│ ──────────► │  serve-sim-bin  │ ───────────► │ Browser │
└──────────────┘   (Swift)   │  (per-device)   │              └─────────┘
                             └─────────────────┘
                                      ▲
                                state file in
                             $TMPDIR/serve-sim/
                                      ▲
                             ┌──────────────────┐
                             │  serve-sim CLI   │
                             └──────────────────┘
```
Key invariants the agent must respect:
- All coordinates are normalized 0..1, with `(0, 0)` at top-left and `(1, 1)` at bottom-right of the display. Never pass pixel coordinates.
- One helper per device. Multiple booted simulators are supported by passing several device names or by attaching to all.
- State lives in `$TMPDIR/serve-sim/server-{udid}.json`. Use `serve-sim --list` to query it; do not read the JSON directly unless you know what you are doing.
- The orientation set via `rotate` is remembered by the helper, and subsequent gestures are rotated client-side. An agent that sends raw coords after a rotation does not need to compensate manually.
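When a target is only known in pixels, the normalization rule above amounts to dividing by the screen size. A minimal sketch, assuming the pixel width and height have already been obtained elsewhere; the helper name `px_to_norm` is hypothetical and not part of serve-sim.

```sh
# Convert pixel coordinates to the normalized 0..1 range serve-sim expects.
# Usage: px_to_norm <x_px> <y_px> <width_px> <height_px>
# Prints "<x> <y>" with four decimals; fails if a dimension is not positive.
px_to_norm() {
  awk -v x="$1" -v y="$2" -v w="$3" -v h="$4" 'BEGIN {
    if (w <= 0 || h <= 0) exit 1
    printf "%.4f %.4f\n", x / w, y / h
  }'
}

# Example (not run here): tap the point at pixel (100, 200) on a 400x800 screen.
# The unquoted substitution is intentional so the two numbers become two arguments.
# npx serve-sim tap $(px_to_norm 100 200 400 800)
```

For the example in the comment, `px_to_norm 100 200 400 800` prints `0.2500 0.2500`.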
Common operations
| Goal | Command | Notes |
|---|---|---|
| Start preview server | `npx serve-sim` | Default preview at `http://localhost:3200` |
| Start headless / daemon | `npx serve-sim --detach -q` | Returns JSON with `pid`, `port`, `url` |
| Show stream in host's preview | | See "Showing the stream in your agent's preview" section. |
| List running streams | `npx serve-sim --list` | Add `-q` for JSON |
| Stop all helpers | `npx serve-sim --kill` | Pass a device name to stop one |
| Single tap | `npx serve-sim tap <x> <y>` | |
| Multi-step gesture | | See references/gestures.md. |
| Hardware button | `npx serve-sim button <name>` | Names: `home`, `swipe_home`, `app_switcher`, `lock`, `siri`, `side_button` |
| Rotate device | | |
| Simulate memory warning | | Equivalent to Debug → Simulate Memory Warning. |
| CoreAnimation debug | | Options: |
| Inject camera feed | | (Re)launches the app with the camera dylib attached. macOS 14+ only. See references/camera.md. |
| Hot-swap camera source | | No app relaunch. |
| Manage app permissions | | Camera, photos, location, push notifications, contacts, etc. See references/permissions.md. |
| Read accessibility tree | | Returns axe-style JSON. See references/endpoints.md for all endpoints. |
Most subcommands accept `-d <udid|name>` to target a specific device when several are booted.
Critical gotcha: prefer `tap` over `gesture` for taps
Each `serve-sim gesture` call opens its own WebSocket. If you issue two back-to-back `gesture` calls (one with `{"type":"begin",...}` and one with `{"type":"end",...}`), the simulator receives them with enough latency between them that the touch is interpreted as a long-press, not a tap. This is a deliberate constraint of the protocol, not a bug to work around.
Rule: for any single-shot tap, use `serve-sim tap <x> <y>`. Only use `gesture` for drags, swipes, or multi-step interactions where you must thread the same socket across `begin` → `move` × N → `end`.
Targeting a specific device
When multiple simulators are booted, every subcommand accepts `-d <udid|name>`. The name match is case-insensitive against the device name returned by `xcrun simctl list devices booted`. Examples:
```sh
npx serve-sim tap 0.5 0.5 -d "iPhone 16 Pro"
npx serve-sim button home -d ABC12345-...
npx serve-sim --list # show all running streams
```
If the user has only one booted simulator, omit `-d` entirely. The skill should prefer auto-detection over hard-coding device names.
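One way to decide whether `-d` is needed is to count the booted devices first. The sketch below assumes the usual `xcrun simctl list devices booted` output, where each booted device line ends in `(Booted)`; the helper names `count_booted` and `targeting_mode` are hypothetical.

```sh
# Count booted simulators from `xcrun simctl list devices booted` output on stdin.
count_booted() {
  # grep -c prints 0 and exits 1 when nothing matches; keep the 0, ignore the status.
  grep -c '(Booted)' || true
}

# Classify the situation from the same input:
#   none   -> nothing is booted, ask the user to boot a simulator
#   auto   -> exactly one device, omit -d
#   choose -> several devices, pass -d <udid|name>
targeting_mode() {
  case "$(count_booted)" in
    0) echo none ;;
    1) echo auto ;;
    *) echo choose ;;
  esac
}

# Usage (not run here):
# mode=$(xcrun simctl list devices booted | targeting_mode)
```

Only the `choose` case needs a device name or UDID, which keeps hard-coded names out of the common single-simulator path.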
Output modes
By default, serve-sim prints human-readable status to stdout. For agent loops, prefer JSON output:
```sh
npx serve-sim --list -q # JSON array of running streams
npx serve-sim --detach -q # JSON with pid/port/url after spawn
npx serve-sim camera status -q # JSON with {alive, source, mirror, ...}
```
Parse `-q` output programmatically. Never parse the non-`-q` human output, because it can change between versions.
Showing the stream in your agent's preview
When the user asks to "see the simulator here", "view it in preview", "open it in this tool", or similar, the goal is to stream the simulator into the same surface the user is chatting with. serve-sim returns a regular HTTP URL; the agent's job is to surface that URL and, if the host exposes a preview tool, hand it off.
Steps:
1. Start serve-sim and capture the URL:
   ```sh
   npx serve-sim --detach -q
   ```
   This returns JSON like `{"pid":..., "port":3200, "url":"http://localhost:3200", "streamUrl":"http://localhost:3100", ...}`. The `url` field is the human-facing preview UI; `streamUrl` is the raw MJPEG endpoint.
2. Always surface the URL plainly in your response so the user can fall back to opening it manually in any browser.
3. Probe your host's preview tool and hand off the URL if one exists. Examples of tool names you may see in your toolset:
   - `preview_start` (Claude Code): call it with `{ url: "<url>" }`.
   - `mcp__Claude_Preview__preview_start` (some MCP setups).
   - A `browser_open`, `open_url`, or similar URL-opening tool: pass the URL.
   - Cursor / Codex CLI / others may not expose a preview tool to the agent. In that case, just print the URL and tell the user how to open it (their browser, their IDE's built-in browser pane, etc.).
4. Do not assume any specific preview tool exists. Inspect the tools available to you in the current session. If one matches the description above, use it. If none does, fall back to step 2 (print the URL prominently).
The stream stays alive until `npx serve-sim --kill`. Multiple clients (the host's preview + the user's browser + a tunnel) can read the same URL simultaneously.
See references/workflows.md workflow "Show the simulator stream in the host's preview" for the full recipe.
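Steps 1 and 2 can be sketched in shell as follows. The helper `json_string_field` is hypothetical and deliberately minimal: it assumes the flat, single-level JSON shape shown in step 1 and does not handle nesting or escaped quotes. Where a real JSON parser such as `jq` is available, that is the more robust choice.

```sh
# Extract a top-level string field from flat JSON on stdin.
# Deliberately minimal: no nesting, no escaped quotes inside values.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage (not run here): start detached, then surface the preview URL plainly.
# url=$(npx serve-sim --detach -q | json_string_field url)
# echo "Simulator preview: $url"
```

Printing the URL before attempting any preview-tool handoff keeps the manual fallback available even when the handoff fails.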
Workflows
For complete end-to-end recipes (UI automation, camera testing, accessibility-driven taps, deep-link flows, preview handoff), see references/workflows.md. The reference covers the patterns documented in serve-sim's own `AGENTS.md`.
Cleanup
Always stop helpers when finished, unless the user explicitly wants them to keep running:
```sh
npx serve-sim --kill # stop all
npx serve-sim --kill "iPhone 16 Pro" # stop one
```
Orphan helpers occupy ports 3200/3100 and prevent fresh starts.
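To make the cleanup hard to forget, a task can be wrapped so the helpers are stopped whether it succeeds or fails. A minimal sketch; `run_with_cleanup` is a hypothetical wrapper, not a serve-sim feature, and it should be skipped when the user wants the stream left running.

```sh
# Run a command, then stop all serve-sim helpers regardless of its result.
# The wrapped command's exit status is preserved for the caller.
run_with_cleanup() {
  status=0
  "$@" || status=$?
  npx serve-sim --kill
  return "$status"
}

# Usage (not run here; ./my-ui-test.sh is a placeholder for the real task):
# run_with_cleanup ./my-ui-test.sh
```

This version does not handle signals such as Ctrl-C; a `trap` could be added on top if interrupted runs also need to free the ports.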
Anti-patterns
- Do not pass pixel coordinates. All coords are normalized `0..1`. If the user gives pixel values, divide by the screen dimensions reported by `GET /config`.
- Do not use `gesture` for plain taps. Use `tap`. See "Critical gotcha" above.
- Do not assume `npx serve-sim` is already running. Verify with `--list` or by checking `$TMPDIR/serve-sim/server-{udid}.json`. If absent, start it explicitly.
- Do not skip the prerequisites check on the first invocation in a session. Wrong macOS version, missing Xcode CLI tools, or Node <18 produce confusing errors downstream.
- Do not invent button names. Only these six are valid: `home`, `swipe_home`, `app_switcher`, `lock`, `siri`, `side_button`. See references/buttons-rotation.md for the source-of-truth list.
- Do not parse the non-quiet human output. Use `-q` for JSON.
- Do not leave camera helpers running across unrelated tasks. Stop them with `npx serve-sim camera --stop-webcam` when done.
- Do not guess coordinates when an accessibility lookup returns no match. If you fetched the AX tree (e.g. `GET /ax`) to find a target element and the query returned no result, fail loudly: tapping a guessed spot is almost always worse than reporting "target not found" back to the user. See references/workflows.md workflow 1 for the guard pattern.
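The button rule above is easy to enforce before anything reaches the simulator. A small sketch using the six names listed; the helpers `is_valid_button` and `press_button` are hypothetical wrappers around the `npx serve-sim button <name>` form shown earlier.

```sh
# Succeed only for the six button names the document lists as valid.
is_valid_button() {
  case "$1" in
    home|swipe_home|app_switcher|lock|siri|side_button) return 0 ;;
    *) return 1 ;;
  esac
}

# Press a button, refusing unknown names instead of passing them through.
press_button() {
  if ! is_valid_button "$1"; then
    echo "unknown button: $1" >&2
    return 2
  fi
  npx serve-sim button "$1"
}
```

Rejecting the name locally gives a clear error to report to the user rather than an opaque failure from the CLI.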
Reference index
- references/gestures.md — exact gesture JSON shapes, edge values, multi-touch, drag/swipe recipes.
- references/buttons-rotation.md — the six valid buttons and the four orientations, with behavioral notes.
- references/camera.md — synthetic camera injection: placeholder, file, webcam, mirror modes, hot-swap.
- references/permissions.md — granting/revoking app privacy permissions, including push notifications.
- references/ca-debug.md — the five CoreAnimation debug flags and when each one helps.
- references/endpoints.md — HTTP and WebSocket endpoints for agents that bypass the CLI.
- references/workflows.md — end-to-end recipes for UI automation, camera testing, deep-link flows.