droid-control
Droid Control
Automate terminals and browsers. Three routing decisions, then atoms guide you the rest of the way.
Ground rules
- Real apps, real environments. Non-deterministic behavior (LLM responses, network latency, variable output) is expected. Handle it with `wait` / `wait-idle`. Never substitute fixtures or mocked data.
- Commit to execution. Once you've chosen a driver, run the plan. If something fails mid-run, recover and retry -- don't re-evaluate the approach.
- Atoms are self-contained. Load one and follow its mechanics. No cross-referencing needed.
- `tctl` is the ONLY way to launch recorded sessions. `tctl` manages recording by wrapping `asciinema rec` around the PTY; raw `tuistory` has no recording capability and never will. Never call `tuistory launch` directly; unknown flags crash `tuistory-relay`. Always resolve `TCTL` to its absolute filesystem path before use, especially when delegating to workers (they don't inherit `${DROID_PLUGIN_ROOT}`).
- Isolate every run. Multiple droids may be filming simultaneously on the same machine. Session names and output paths share a global namespace (`/tmp/tctl-sessions/`). At the start of every workflow, generate a run ID (`RUN_ID=$(date +%s)-$$` or similar) and use it as a prefix for all session names and a scoped temp directory for all output files:

  ```bash
  RUN_ID="$(date +%s)-$$"
  RUN_DIR="$(mktemp -d "/tmp/droid-run-${RUN_ID}-XXXXXX")"
  # Session names: -s ${RUN_ID}-before, -s ${RUN_ID}-after
  # Output paths: ${RUN_DIR}/before.cast, ${RUN_DIR}/after.cast
  ```

  Never use bare session names like `-s demo`, `-s before`, `-s after`; they will collide with concurrent runs.
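The "recover and retry" rule can be sketched as a small retry wrapper that re-runs the same step instead of switching approach. This is a minimal illustration: `retry` and `flaky_step` are hypothetical names, and the flaky step is a stand-in for a real driver command.

```bash
# retry MAX CMD [ARGS...]: run CMD until it succeeds or MAX attempts are used.
# Hypothetical helper, not part of tctl.
retry() {
  max="$1"; shift
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep 1   # brief pause, then retry the same step
  done
}

# Stand-in for a non-deterministic step: fails on the first call only.
COUNT_FILE="$(mktemp)"
echo 0 > "$COUNT_FILE"
flaky_step() {
  n=$(( $(cat "$COUNT_FILE") + 1 ))
  echo "$n" > "$COUNT_FILE"
  [ "$n" -ge 2 ]
}

retry 3 flaky_step && echo "step succeeded after $(cat "$COUNT_FILE") attempts"
```

In a real workflow the wrapped command would be one of the `tctl` steps, for example a `wait` with a timeout, passed with its resolved absolute path.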
Routing
Three independent lookups. Do all three, then load the union of skills they produce.
1. Target route — what are you driving?
| Target | Load these skills |
|---|---|
| Droid CLI | droid-cli + tuistory backend via `tctl` |
| Droid CLI (real terminal proof) | true-input + droid-cli |
| Other terminal TUI | tuistory backend via `tctl` |
| Other terminal TUI (real terminal proof) | true-input |
| Web page or Electron app | agent-browser |
| Raw terminal byte sequences | true-input + pty-capture |
tuistory is the default for terminal work. Use true-input only when you need real terminal rendering evidence.
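The target table is a plain lookup, which can be sketched as a `case` statement. The target keys (`droid-cli`, `tui-proof`, and so on) and the function name are hypothetical labels for the table rows, not identifiers the plugin defines.

```bash
# skills_for_target TARGET: print the skills to load, one per line.
# Keys are hypothetical labels for the rows of the target table.
skills_for_target() {
  case "$1" in
    droid-cli)       printf '%s\n' droid-cli tuistory ;;
    droid-cli-proof) printf '%s\n' true-input droid-cli ;;
    tui)             printf '%s\n' tuistory ;;
    tui-proof)       printf '%s\n' true-input ;;
    web|electron)    printf '%s\n' agent-browser ;;
    raw-bytes)       printf '%s\n' true-input pty-capture ;;
    *) echo "unknown target: $1" >&2; return 1 ;;
  esac
}

skills_for_target droid-cli
```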
2. Stage route — what does the workflow need?
Every workflow passes through stages. Load the atoms for each stage you'll use.
| Stage | Skill | When to load |
|---|---|---|
| Capture | capture | Always -- every workflow records or captures something |
| Compose | compose | When the deliverable is a produced artifact (video, annotated screenshots, comparison image) |
| Verify | verify | Always -- every deliverable gets checked against commitments |
3. Artifact route — does compose need polish tools?
Only relevant when compose is loaded.
| Artifact need | Also load |
|---|---|
| Showcase polish (window chrome, branded frame, cinematic background) | showcase |
| Effects and keystroke overlays | (compose handles this — they're fields in the Remotion props JSON) |
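Once the three lookups are done, "load the union" amounts to merging the three lists and dropping duplicates. A minimal sketch, with example values assumed for a polished Droid CLI demo video (the variable names are illustrative only):

```bash
# Results of the three lookups (assumed example values).
TARGET_SKILLS="droid-cli tuistory"
STAGE_SKILLS="capture compose verify"
ARTIFACT_SKILLS="showcase"

# Union: one skill per line, sorted, duplicates removed.
# The variables are left unquoted on purpose so each word becomes its own line.
SKILLS_TO_LOAD="$(printf '%s\n' $TARGET_SKILLS $STAGE_SKILLS $ARTIFACT_SKILLS | sort -u)"

echo "$SKILLS_TO_LOAD"
```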
Workflow shape
Command (intent + commitments)
→ Target route (load driver atoms)
→ Capture (record / screenshot / byte-capture)
→ Compose (assemble deliverable, if needed)
→ Verify (check against commitments)
→ Report

Commands declare what to produce. Atoms own how.
Layout default
Default: `single`. One clip showing the target/final state. Pick this unless the deliverable is fundamentally a comparison.

| Case | Layout |
|---|---|
| Brand-new feature (no meaningful prior state) | `single` |
| Bug fix, single-clip proof of the working path | `single` |
| Walkthrough / tutorial / readme hero | `single` |
| Regression proof (broken vs fixed) | `side-by-side` |
| Behavior-preserving refactor (visual parity is the point) | `side-by-side` |
| User explicitly asks for a comparison | `side-by-side` |

Do not synthesize a "before" state to justify `side-by-side`. If there is no real baseline, use `single`.
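The layout rule reduces to two questions: is the deliverable a comparison, and does a real baseline exist? A minimal sketch of that decision follows; `choose_layout` and its yes/no arguments are hypothetical and only illustrate the rule.

```bash
# choose_layout HAS_REAL_BASELINE IS_COMPARISON
# Prints "side-by-side" only when both answers are "yes"; otherwise "single".
choose_layout() {
  has_baseline="$1"
  is_comparison="$2"
  if [ "$is_comparison" = "yes" ] && [ "$has_baseline" = "yes" ]; then
    echo "side-by-side"
  else
    echo "single"
  fi
}

# A comparison was requested but there is no real baseline to film.
choose_layout no yes
```

The second argument alone never selects `side-by-side`, which mirrors the rule against synthesizing a "before" state.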
Delegation
The parent agent plans and orchestrates. Mechanical work runs in worker subagents via the Task tool. This keeps the parent's context clean and enables parallelism.
What to delegate
| Task | Delegate? | Why |
|---|---|---|
| Capture clip (single layout) | YES | Worker runs the interaction script end-to-end and returns the output file path |
| Capture both clips (comparison layout) | YES, one worker per clip | Branches are independent; run in parallel |
| Remotion render | YES | Needs only props JSON, clip paths, and output path |
| Planning, interaction scripting | NO — parent | Requires PR context and editorial judgment |
| Layout and prop construction | NO — parent | Requires editorial decisions about effects, timing, labels |
| Verification | NO — parent | Requires commitment context |
| Single ffprobe / file-existence check | NO — inline | Too trivial for subagent overhead |
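The inline check in the last row can be as small as the sketch below. `check_artifact` is a hypothetical helper; the `ffprobe` call only runs when `ffprobe` is installed and the file is an `.mp4`, and the demo file is a placeholder rather than a real recording.

```bash
# check_artifact PATH: succeed only if PATH exists and is non-empty.
check_artifact() {
  if [ ! -s "$1" ]; then
    echo "missing or empty: $1" >&2
    return 1
  fi
  case "$1" in
    *.mp4)
      # Print the video duration in seconds when ffprobe is available.
      if command -v ffprobe >/dev/null 2>&1; then
        ffprobe -v error -show_entries format=duration \
          -of default=noprint_wrappers=1:nokey=1 "$1"
      fi
      ;;
  esac
}

# Demo with a placeholder file.
DEMO_FILE="$(mktemp)"
echo "placeholder" > "$DEMO_FILE"
check_artifact "$DEMO_FILE" && echo "ok: ${DEMO_FILE}"
```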
How to delegate
Step 0: Resolve paths and generate a run ID. Workers don't inherit `${DROID_PLUGIN_ROOT}`. Resolve once, paste everywhere:

```bash
TCTL="$(realpath "${DROID_PLUGIN_ROOT}/bin/tctl")"
RENDER="$(realpath "${DROID_PLUGIN_ROOT}/scripts/render-showcase.sh")"
RUN_ID="$(date +%s)-$$"
RUN_DIR="$(mktemp -d "/tmp/droid-run-${RUN_ID}-XXXXXX")"
```

Use `${RUN_DIR}` for all output files (recordings, props, rendered video). Use `${RUN_ID}-` as a prefix for all session names. Never use bare names like `-s before` or hardcoded paths like `/tmp/before.cast`.

Give workers exact commands with the resolved absolute paths: not abstract instructions, not `tuistory`, not `${DROID_PLUGIN_ROOT}`. The parent does the thinking; the worker executes.

Task prompt for a capture worker:

```
"Run these commands in order. Report the output file path and any errors.
1. /abs/path/to/bin/tctl launch "droid-dev" -s 1712345678-42-before --backend tuistory \
     --repo-root /abs/path/to/baseline/worktree \
     --cols 120 --rows 36 --record /tmp/droid-run-1712345678-42-xxxx/before.cast \
     --env FORCE_COLOR=3 --env COLORTERM=truecolor
2. /abs/path/to/bin/tctl -s 1712345678-42-before wait ">" --timeout 15000
3. /abs/path/to/bin/tctl -s 1712345678-42-before type "hello world"
4. /abs/path/to/bin/tctl -s 1712345678-42-before press enter
5. /abs/path/to/bin/tctl -s 1712345678-42-before wait-idle
6. /abs/path/to/bin/tctl -s 1712345678-42-before close"
```

Task prompt for a Remotion render worker:

```
"Run this command. Report the output file path and any errors.
/abs/path/to/scripts/render-showcase.sh \
  --props /tmp/droid-run-1712345678-42-xxxx/showcase-props.json \
  --output /tmp/droid-run-1712345678-42-xxxx/demo.mp4 \
  /tmp/droid-run-1712345678-42-xxxx/before.cast /tmp/droid-run-1712345678-42-xxxx/after.cast"
```
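One way to make sure a worker never sees an unresolved variable is to build the prompt text in the parent shell, where every value expands before the text is handed off. The sketch below reuses the capture command sequence shown in this section; the placeholder paths and the `PROMPT` and `SESSION` variable names are assumptions for illustration.

```bash
# Assumed inputs. In a real run TCTL comes from Step 0 and REPO_ROOT from the plan.
TCTL="/abs/path/to/bin/tctl"
REPO_ROOT="/abs/path/to/baseline/worktree"
RUN_ID="$(date +%s)-$$"
RUN_DIR="$(mktemp -d "/tmp/droid-run-${RUN_ID}-XXXXXX")"
SESSION="${RUN_ID}-before"

# Unquoted heredoc: every variable expands here, so the worker gets literal strings.
PROMPT="$(cat <<EOF
Run these commands in order. Report the output file path and any errors.
1. ${TCTL} launch "droid-dev" -s ${SESSION} --backend tuistory \\
     --repo-root ${REPO_ROOT} \\
     --cols 120 --rows 36 --record ${RUN_DIR}/before.cast \\
     --env FORCE_COLOR=3 --env COLORTERM=truecolor
2. ${TCTL} -s ${SESSION} wait ">" --timeout 15000
3. ${TCTL} -s ${SESSION} type "hello world"
4. ${TCTL} -s ${SESSION} press enter
5. ${TCTL} -s ${SESSION} wait-idle
6. ${TCTL} -s ${SESSION} close
EOF
)"

echo "$PROMPT"
```

The doubled backslashes keep a literal line-continuation backslash in the generated prompt.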
Parallel capture pattern (comparison flows only)
Only applicable when the Layout default table above selects `side-by-side`. For a `single` layout, launch one capture worker and skip this section.

For before/after comparison demos, launch both capture workers simultaneously:

1. Parent constructs the interaction script (identical for both branches)
2. Launch worker A: capture the baseline/reference branch with `--repo-root` set to that worktree
3. Launch worker B: capture the candidate/change branch with `--repo-root` set to that worktree
4. Wait for both to complete (TaskOutput)
5. Collect `.cast` paths from results
6. Continue to compose
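Steps 2 to 5 have a fan-out/join shape. The parent normally does this through the Task tool; purely as an analogy, the same shape in plain shell uses background jobs and `wait`. `capture_branch` is a hypothetical stub that writes a marker file where a real worker would run the `tctl` command sequence.

```bash
RUN_ID="$(date +%s)-$$"
RUN_DIR="$(mktemp -d "/tmp/droid-run-${RUN_ID}-XXXXXX")"

# Hypothetical stand-in for one capture worker.
capture_branch() {
  label="$1"
  repo_root="$2"
  echo "captured ${label} from ${repo_root}" > "${RUN_DIR}/${label}.cast"
}

# Fan out: both captures start before either one is awaited.
capture_branch before /abs/path/to/baseline/worktree &
pid_a=$!
capture_branch after /abs/path/to/candidate/worktree &
pid_b=$!

# Join: wait for both and remember whether either failed.
status=0
wait "$pid_a" || status=1
wait "$pid_b" || status=1

# Collect the output paths for the compose stage.
ls "${RUN_DIR}"/*.cast
```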
Shared tooling
Terminal drivers use the unified `tctl` wrapper. agent-browser has its own CLI and does not use `tctl`.

Drivers can be combined in one workflow, e.g., `tctl` for a CLI and `agent-browser` for a web UI it interacts with.
Prerequisites
| Stage | Platform | Required | Optional |
|---|---|---|---|
| tuistory | All | | |
| true-input | Linux/Wayland | `cage`, `wtype` | `grim`, `wf-recorder` |
| true-input | Windows (KVM) | | |
| true-input | macOS (QEMU) | | — |
| agent-browser | All | | — |
| compose | All | | — |
| showcase | All | Node.js (>= 18), Chrome/Chromium | — |
Install commands
```bash
# tuistory driver + recording
npm install -g tuistory    # virtual PTY driver
pip install asciinema      # terminal recording (tctl wraps this)
cargo install --git https://github.com/asciinema/agg    # .cast -> .gif converter (compose needs this)

# true-input driver (Linux/Wayland)
sudo apt-get install -y cage wtype          # required: headless compositor + keystroke injection
sudo apt-get install -y grim wf-recorder    # optional: screenshots + video recording

# agent-browser driver
agent-browser install    # one-time: downloads bundled Chromium

# compose + showcase (video rendering)
sudo apt-get install -y ffmpeg    # video processing (includes ffprobe)
cd "${DROID_PLUGIN_ROOT}/remotion" && npm install    # Remotion dependencies
# Chrome or Chromium must be installed for Remotion rendering
```
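Before starting a workflow, a quick preflight can report which of these tools are missing from `PATH`. This is a minimal sketch: `missing_tools` is a hypothetical helper, and the tool list is taken from the install commands in this section, so trim it to the drivers and stages the workflow actually loads.

```bash
# missing_tools CMD...: print each command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

MISSING="$(missing_tools tuistory asciinema agg ffmpeg ffprobe agent-browser)"
if [ -n "$MISSING" ]; then
  echo "missing tools:"
  echo "$MISSING"
else
  echo "all tools found"
fi
```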