serve-sim

Drive an Apple Simulator (iOS, iPad, Apple Watch) from an agent using the serve-sim CLI. serve-sim spawns a Swift helper that captures the simulator framebuffer via `simctl io`, exposes it as an MJPEG stream plus a binary WebSocket input channel, and serves a React preview UI on top. This skill teaches an agent the exact CLI surface, the gesture JSON shape, the gotchas, and the recommended workflows.

When to use

  • The user wants an agent to tap, swipe, drag, pinch, or send hardware buttons to a running Apple Simulator.
  • The user wants to stream a simulator to a browser (local, LAN, or tunneled) for review or remote control.
  • The user wants to inject a synthetic camera feed (file, webcam, or animated placeholder) into a specific app on the simulator.
  • The user wants to toggle CoreAnimation debug overlays (off-screen rendering, blended layers, slow animations) for performance work.
  • The user wants to simulate a memory warning or rotate the device programmatically.
  • The user wants to read the simulator's accessibility tree to find UI elements without pixel hunting.
  • The user wants to grant, revoke, or reset an app's privacy permissions — camera, photos, location, contacts, or push notifications.

When NOT to use

  • Android emulators → use `adb shell` tooling.
  • Building or installing an iOS app → use `xcodebuild` or `xcrun simctl install`.
  • React Native in-app runtime debugging (Redux state, network inspection, component tree) → use rn-debugger tooling.
  • Real iOS hardware devices → use `xcrun devicectl` or Xcode.

Prerequisites

Before any other action, verify the host satisfies these. If something is missing, tell the user exactly what to install — do not proceed.
| Requirement | Check command | Why |
| --- | --- | --- |
| macOS host | `uname -s` returns `Darwin` | serve-sim only runs on macOS |
| Xcode CLI tools | `xcrun --version` exits 0 | `simctl` is the underlying simulator driver |
| Node.js ≥18 | `node --version` ≥18 | serve-sim is an npm package run via `npx` |
| macOS 14+ (optional) | `sw_vers -productVersion` ≥14 | Required ONLY for the `camera` subcommand |

A bundled helper script is available: `scripts/check-prereqs.sh`. Run it; if it exits non-zero, surface the message to the user.
A booted simulator is required for most subcommands. Check with `xcrun simctl list devices booted`. If none are booted, tell the user to open Xcode → Simulator or to run `xcrun simctl boot <UDID>`.
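The host checks above can be approximated by hand when the bundled script is not at hand. The following is a minimal sketch, not the contents of `scripts/check-prereqs.sh`; the function names `major_version` and `check_prereqs` are hypothetical, and the error messages are illustrative.

```sh
# major_version STRING
# Hypothetical helper: prints the major component of a version string
# such as "v18.19.0" or "14.5".
major_version() {
  printf '%s\n' "$1" | sed -e 's/^v//' -e 's/\..*$//'
}

# check_prereqs
# Sketch of the documented host checks; prints the first failure to stderr
# and returns non-zero. Not the bundled script, only an illustration.
check_prereqs() {
  [ "$(uname -s)" = "Darwin" ] || { echo "serve-sim only runs on macOS" >&2; return 1; }
  xcrun --version >/dev/null 2>&1 || { echo "Xcode CLI tools are missing" >&2; return 1; }
  command -v node >/dev/null 2>&1 || { echo "Node.js is not installed" >&2; return 1; }
  [ "$(major_version "$(node --version)")" -ge 18 ] || { echo "Node.js >= 18 is required" >&2; return 1; }
  # macOS 14+ matters only for the camera subcommand, so warn instead of failing.
  [ "$(major_version "$(sw_vers -productVersion)")" -ge 14 ] || echo "note: the camera subcommand needs macOS 14+" >&2
  return 0
}
```

Calling `check_prereqs || exit 1` at the start of a session would stop before any serve-sim command runs on an unsupported host.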

Mental model

```text
┌──────────────┐  simctl io  ┌─────────────────┐  MJPEG / WS  ┌─────────┐
│ iOS Simulator│ ──────────► │ serve-sim-bin   │ ───────────► │ Browser │
└──────────────┘   (Swift)   │ (per-device)    │              └─────────┘
                             └─────────────────┘
                              state file in
                            $TMPDIR/serve-sim/
                            ┌──────────────────┐
                            │ serve-sim CLI    │
                            └──────────────────┘
```
Key invariants the agent must respect:
  • All coordinates are normalized 0..1, with `(0, 0)` at top-left and `(1, 1)` at bottom-right of the display. Never pass pixel coordinates.
  • One helper per device. Multiple booted simulators are supported by passing several device names or by attaching to all.
  • State lives in `$TMPDIR/serve-sim/server-{udid}.json`. Use `serve-sim --list` to query it; do not read the JSON directly unless you know what you are doing.
  • The orientation set via `rotate` is remembered by the helper, and subsequent gestures are rotated client-side. An agent that sends raw coords after a rotation does not need to compensate manually.
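When a target position is known only in pixels, it has to be converted before it reaches the CLI. A minimal sketch of that conversion follows; `to_normalized` is a hypothetical helper name, and the screen width and height are assumed to be known already. The 393×852 size used below is only an example value.

```sh
# to_normalized PX PY WIDTH HEIGHT
# Hypothetical helper: converts pixel coordinates to the normalized 0..1
# range and prints "x y" with four decimal places.
to_normalized() {
  awk -v px="$1" -v py="$2" -v w="$3" -v h="$4" 'BEGIN {
    if (w <= 0 || h <= 0) {
      print "width and height must be positive" > "/dev/stderr"
      exit 1
    }
    printf "%.4f %.4f\n", px / w, py / h
  }'
}
```

For example, `to_normalized 196 426 393 852` prints `0.4987 0.5000`, which can then be passed to `npx serve-sim tap`.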

Common operations

| Goal | Command | Notes |
| --- | --- | --- |
| Start preview server | `npx serve-sim [device]` | Default preview at `http://localhost:3200`, stream at `:3100`. Foreground process. |
| Start headless / daemon | `npx serve-sim --detach [device]` | Returns JSON with `pid`, `port`, `url`. Use for agent loops. |
| Show stream in host's preview | `npx serve-sim --detach -q` → hand off `url` to host preview tool | See "Showing the stream in your agent's preview" section. |
| List running streams | `npx serve-sim --list` | Add `-q` for JSON-only output. |
| Stop all helpers | `npx serve-sim --kill` | Pass `[device]` to stop a specific one. |
| Single tap | `npx serve-sim tap <x> <y>` | `<x> <y>` in `0..1`. Use this, not `gesture`, for plain taps. See "Critical gotcha" below. |
| Multi-step gesture | `npx serve-sim gesture '<json>'` | See references/gestures.md. |
| Hardware button | `npx serve-sim button <name>` | Names: `home`, `swipe_home`, `app_switcher`, `lock`, `siri`, `side_button`. See references/buttons-rotation.md. |
| Rotate device | `npx serve-sim rotate <orientation>` | `portrait`, `portrait_upside_down`, `landscape_left`, `landscape_right`. |
| Simulate memory warning | `npx serve-sim memory-warning` | Equivalent to Debug → Simulate Memory Warning. |
| CoreAnimation debug | `npx serve-sim ca-debug <option> <on\|off>` | Options: `blended`, `copies`, `misaligned`, `offscreen`, `slow-animations`. See references/ca-debug.md. |
| Inject camera feed | `npx serve-sim camera <bundle-id> [--file <path>\|--webcam [name]]` | (Re)launches the app with the camera dylib attached. macOS 14+ only. See references/camera.md. |
| Hot-swap camera source | `npx serve-sim camera switch <placeholder\|webcam\|file> [arg]` | No app relaunch. |
| Manage app permissions | `npx serve-sim permissions <grant\|revoke\|reset\|list> <permission> <bundle-id>` | Camera, photos, location, push notifications, contacts, etc. See references/permissions.md. |
| Read accessibility tree | `curl http://localhost:3100/ax` | Returns axe-style JSON. See references/endpoints.md for all endpoints. |

Most subcommands accept `-d <udid|name>` to target a specific device when several are booted.

Critical gotcha: prefer `tap` over `gesture` for taps

Each `serve-sim gesture` call opens its own WebSocket. If you issue two back-to-back `gesture` calls (one with `{"type":"begin",...}` and one with `{"type":"end",...}`), the simulator receives them with enough latency between them that the touch is interpreted as a long-press, not a tap. This is a deliberate constraint of the protocol, not a bug to work around.
Rule: for any single-shot tap, use `serve-sim tap <x> <y>`. Only use `gesture` for drags, swipes, or multi-step interactions where you must thread the same socket across `begin` → `move` × N → `end`.
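One way to make the rule hard to break in an agent loop is to route every plain tap through a small wrapper. The sketch below is an assumption about how an agent might do that, not part of serve-sim; `in_unit_range` and `sim_tap` are hypothetical names. It rejects values outside 0..1 and then issues a single `tap` call rather than separate begin/end `gesture` calls.

```sh
# in_unit_range VALUE
# Succeeds only if VALUE is a plain decimal number between 0 and 1 inclusive.
in_unit_range() {
  awk -v v="$1" 'BEGIN { exit !(v ~ /^[0-9]*[.]?[0-9]+$/ && v >= 0 && v <= 1) }'
}

# sim_tap X Y [extra serve-sim arguments...]
# Hypothetical wrapper: validates the coordinates, then sends one tap.
sim_tap() {
  if ! in_unit_range "$1" || ! in_unit_range "$2"; then
    echo "sim_tap: coordinates must be normalized to 0..1" >&2
    return 1
  fi
  npx serve-sim tap "$@"
}
```

With this in place, `sim_tap 0.5 0.5` forwards to `npx serve-sim tap 0.5 0.5`, while `sim_tap 540 1200` fails before anything is sent to the simulator.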

Targeting a specific device

When multiple simulators are booted, every subcommand accepts `-d <udid|name>`. The name match is case-insensitive against the device name returned by `xcrun simctl list devices booted`. Examples:
```sh
npx serve-sim tap 0.5 0.5 -d "iPhone 16 Pro"
npx serve-sim button home -d ABC12345-...
npx serve-sim --list                                # show all running streams
```
If the user has only one booted simulator, omit `-d` entirely. The skill should prefer auto-detection over hard-coding device names.
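Deciding whether `-d` is needed comes down to counting booted simulators. The sketch below assumes that `xcrun simctl list devices booted` marks each running device with `(Booted)`; `count_booted` and `tap_center` are hypothetical helper names used only for illustration.

```sh
# count_booted
# Reads `xcrun simctl list devices booted` output on stdin and prints the
# number of lines marked "(Booted)". Prints 0 when there are none.
count_booted() {
  grep -c '(Booted)' || true
}

# tap_center [DEVICE]
# Hypothetical example: taps the middle of the screen, passing -d only
# when more than one simulator is booted.
tap_center() {
  booted=$(xcrun simctl list devices booted | count_booted)
  if [ "$booted" -le 1 ]; then
    npx serve-sim tap 0.5 0.5
  elif [ -n "$1" ]; then
    npx serve-sim tap 0.5 0.5 -d "$1"
  else
    echo "several simulators are booted; pass a device name or UDID" >&2
    return 1
  fi
}
```

Failing when several simulators are booted and no device was given keeps the agent from tapping on whichever device happens to be picked.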

Output modes

By default, serve-sim prints human-readable status to stdout. For agent loops, prefer JSON output:
```sh
npx serve-sim --list -q          # JSON array of running streams
npx serve-sim --detach -q        # JSON with pid/port/url after spawn
npx serve-sim camera status -q   # JSON with {alive, source, mirror, ...}
```
Parse `-q` output programmatically. Never parse the human-readable output produced without `-q`; it can change between versions.
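As an illustration of reading `-q` output, the sketch below pulls a single string field out of a flat JSON object with `sed`. `json_string_field` is a hypothetical helper, and the approach is deliberately limited: it assumes a top-level string value with no escaped quotes, so a real JSON parser is the safer choice for anything nested.

```sh
# json_string_field NAME
# Reads a flat JSON object on stdin and prints the value of the string
# field NAME. Sketch only: no support for nesting or escaped quotes.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

A typical use would be `url=$(npx serve-sim --detach -q | json_string_field url)`, after which `$url` holds the preview address to hand to the user.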

Showing the stream in your agent's preview

When the user asks to "see the simulator here", "view it in preview", "open it in this tool", or similar, the goal is to stream the simulator into the same surface the user is chatting with. serve-sim returns a regular HTTP URL; the agent's job is to surface that URL and, if the host exposes a preview tool, hand it off.
Steps:
  1. Start serve-sim and capture the URL:
    ```sh
    npx serve-sim --detach -q
    ```
    This returns JSON like `{"pid":..., "port":3200, "url":"http://localhost:3200", "streamUrl":"http://localhost:3100", ...}`. The `url` field is the human-facing preview UI; `streamUrl` is the raw MJPEG endpoint.
  2. Always surface the URL plainly in your response so the user can fall back to opening it manually in any browser.
  3. Probe your host's preview tool and hand off the URL if one exists. Examples of tool names you may see in your toolset:
    • `preview_start` (Claude Code): call it with `{ url: "<url>" }`.
    • `mcp__Claude_Preview__preview_start` (some MCP setups).
    • A `browser_open`, `open_url`, or similar URL-opening tool: pass the URL.
    • Cursor / Codex CLI / others may not expose a preview tool to the agent. In that case, just print the URL and tell the user how to open it (their browser, their IDE's built-in browser pane, etc.).
  4. Do not assume any specific preview tool exists. Inspect the tools available to you in the current session. If one matches the description above, use it. If none does, fall back to step 2 (print the URL prominently).
The stream stays alive until `npx serve-sim --kill`. Multiple clients (the host's preview + the user's browser + a tunnel) can read the same URL simultaneously.
See references/workflows.md workflow "Show the simulator stream in the host's preview" for the full recipe.

Workflows

For complete end-to-end recipes (UI automation, camera testing, accessibility-driven taps, deep-link flows, preview handoff), see references/workflows.md. The reference covers the patterns documented in serve-sim's own `AGENTS.md`.

Cleanup

Always stop helpers when finished, unless the user explicitly wants them to keep running:
```sh
npx serve-sim --kill                  # stop all
npx serve-sim --kill "iPhone 16 Pro"  # stop one
```
Orphan helpers occupy ports 3200/3100 and prevent fresh starts.

Anti-patterns

  • Do not pass pixel coordinates. All coords are normalized `0..1`. If the user gives pixel values, divide by the screen dimensions reported by `GET /config`.
  • Do not use `gesture` for plain taps. Use `tap`. See "Critical gotcha" above.
  • Do not assume `npx serve-sim` is already running. Verify with `--list` or by checking `$TMPDIR/serve-sim/server-{udid}.json`. If absent, start it explicitly.
  • Do not skip the prerequisites check on the first invocation in a session. A wrong macOS version, missing Xcode CLI tools, or Node <18 produce confusing errors downstream.
  • Do not invent button names. Only these six are valid: `home`, `swipe_home`, `app_switcher`, `lock`, `siri`, `side_button`. See references/buttons-rotation.md for the source-of-truth list.
  • Do not parse the non-quiet human output. Use `-q` for JSON.
  • Do not leave camera helpers running across unrelated tasks. Stop them with `npx serve-sim camera --stop-webcam` when done.
  • Do not guess coordinates when an accessibility lookup returns no match. If you fetched the AX tree (e.g. `GET /ax`) to find a target element and the query returned no result, fail loudly: tapping a guessed spot is almost always worse than reporting "target not found" back to the user. See references/workflows.md workflow 1 for the guard pattern.

Reference index

  • references/gestures.md — exact gesture JSON shapes, edge values, multi-touch, drag/swipe recipes.
  • references/buttons-rotation.md — the six valid buttons and the four orientations, with behavioral notes.
  • references/camera.md — synthetic camera injection: placeholder, file, webcam, mirror modes, hot-swap.
  • references/permissions.md — granting/revoking app privacy permissions, including push notifications.
  • references/ca-debug.md — the five CoreAnimation debug flags and when each one helps.
  • references/endpoints.md — HTTP and WebSocket endpoints for agents that bypass the CLI.
  • references/workflows.md — end-to-end recipes for UI automation, camera testing, deep-link flows.