capture

Capture


The orchestrator routed you here. This atom owns the full recording lifecycle: launch a target, execute an interaction script, collect raw outputs.
You should already have a driver atom loaded (tuistory, true-input, or agent-browser) and optionally a target atom (droid-cli). This atom layers the recording discipline on top.

Inputs


The command that invoked you should have provided:
  • Target: what to launch and on which branch(es)
  • Interaction script: the sequence of actions to perform
  • What to capture: recordings (.cast/.mp4), screenshots, text snapshots, byte sequences
  • Keystroke logging: whether to emit a keystroke TSV for later overlay

Recording lifecycle


1. Pre-flight


Before recording anything:
  • Terminal size is consistent across all sessions (`--cols 120 --rows 36`)
  • Browser viewport size matches the composition layout (see "Browser viewport sizing" below); mismatched aspects letterbox in the final video
  • Branch/worktree paths and env vars are correct
  • Recording format matches the driver: `.cast` for tuistory, `.mp4` for true-input, screenshots for agent-browser
  • If comparing branches, both sessions use identical terminal / viewport dimensions and launch parameters
  • For `droid-dev` captures, `--repo-root` is mandatory: `tctl` will refuse to launch without it
  • Color env vars are set (see "Launch and record" below)
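Part of this checklist can be automated. The following is a minimal sketch, not part of tctl: `preflight` is a hypothetical helper name, and it only covers the checks that can be made from the shell (the `RUN_ID` / `RUN_DIR` variables provided by the parent, and the worktree path that will be passed as `--repo-root`).

```bash
# Hypothetical helper (not a tctl command): fail early if the run
# environment or the worktree path is missing.
# Usage: preflight /path/to/worktree
preflight() {
  repo_root=$1
  status=0
  [ -n "${RUN_ID:-}" ]  || { echo "RUN_ID is not set" >&2; status=1; }
  [ -n "${RUN_DIR:-}" ] || { echo "RUN_DIR is not set" >&2; status=1; }
  [ -d "$repo_root" ]   || { echo "repo root not found: $repo_root" >&2; status=1; }
  return "$status"
}
```

Calling `preflight /path/to/worktree || exit 1` before the first launch stops the run before a recording is started with a broken setup. Terminal size, viewport size, and color env vars still have to be checked against the launch command itself.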

Browser viewport sizing


Panel aspect ratio in the final composition is layout-dependent. At the default 1920×1080 output with factory preset margins, the window-chrome panels that clips render into come out roughly:

| Layout | Panel aspect | Recommended browser viewport |
| --- | --- | --- |
| `single` | ~1760×920 (≈16:9 landscape) | 1280×720 or 1440×810 |
| `side-by-side` | ~872×920 per panel (≈8:9, near-square / slight portrait) | 960×1000, 900×1000, or 1024×1080 |

Feeding a 16:9 landscape recording into a near-square side-by-side panel triggers `objectFit: "contain"` letterboxing: you get a thin strip of content with giant black bars above and below. Two ways to avoid it:
  1. Match aspects at capture time (preferred): pick the viewport from the table above based on the committed layout.
  2. Opt into cropping at compose time: pass `"objectFit": "cover"` in showcase props. This crops the edges of the clip instead of letterboxing. Use when the relevant UI is centered and the clip's edges are expendable.
If you're unsure of the layout when capturing, default to `960×1000`: it is workable in both layouts (slight horizontal letterbox in `single`, no letterbox in `side-by-side`).
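To see how much letterboxing a given viewport would produce, the "contain" fit can be computed by hand. The sketch below is an illustration only: `contain_fit` is a hypothetical helper, it assumes the approximate panel sizes from the table above, and it uses integer arithmetic, so results are rounded down to whole pixels.

```bash
# Hypothetical helper: size of a clip after a "contain" fit into a panel,
# plus the bar thickness on each side.
# Usage: contain_fit CLIP_W CLIP_H PANEL_W PANEL_H
contain_fit() (
  cw=$1; ch=$2; pw=$3; ph=$4
  # Compare aspect ratios by cross-multiplication to stay in integers.
  if [ $((cw * ph)) -ge $((pw * ch)) ]; then
    # Clip is relatively wider than the panel: width fills, bars above and below.
    w=$pw
    h=$((ch * pw / cw))
  else
    # Clip is relatively taller than the panel: height fills, bars left and right.
    h=$ph
    w=$((cw * ph / ch))
  fi
  echo "${w}x${h} bars_x=$(( (pw - w) / 2 )) bars_y=$(( (ph - h) / 2 ))"
)

contain_fit 1280 720 872 920   # 872x490 bars_x=0 bars_y=215
contain_fit 960 1000 872 920   # 872x908 bars_x=0 bars_y=6
```

With the assumed 872×920 side-by-side panel, a 1280×720 clip loses roughly half the panel height to bars, while a 960×1000 clip fills it almost completely.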
```bash
TCTL=${DROID_PLUGIN_ROOT}/bin/tctl
# RUN_ID and RUN_DIR should already be set by the parent (see droid-control ground rule 5)
```

2. Launch and record


CRITICAL: tuistory's virtual PTY does not advertise color support by default. Node.js apps (Ink/chalk) detect this and suppress ALL color escape codes, producing a monochrome recording. You must pass `FORCE_COLOR=3` and `COLORTERM=truecolor` to force full 24-bit color output. Without these, agg has nothing to theme and the video will look grey/desaturated regardless of the agg theme chosen.
Single branch:
```bash
$TCTL launch "droid-dev" -s ${RUN_ID}-demo --backend tuistory \
  --repo-root /path/to/worktree \
  --cols 120 --rows 36 --record ${RUN_DIR}/demo.cast \
  --env FORCE_COLOR=3 --env COLORTERM=truecolor
```
Comparison (before/after):
```bash
$TCTL launch "droid-dev" -s ${RUN_ID}-before --backend tuistory \
  --repo-root /path/to/baseline-worktree \
  --cols 120 --rows 36 --record ${RUN_DIR}/before.cast \
  --env FORCE_COLOR=3 --env COLORTERM=truecolor

$TCTL launch "droid-dev" -s ${RUN_ID}-after --backend tuistory \
  --repo-root /path/to/candidate-worktree \
  --cols 120 --rows 36 --record ${RUN_DIR}/after.cast \
  --env FORCE_COLOR=3 --env COLORTERM=truecolor
```
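Because comparison sessions must share every launch parameter except the worktree, it can help to route both launches through one function so the flags cannot drift apart. This is a sketch under that assumption: `launch_recorded` is a hypothetical wrapper, and it uses only the tctl flags shown in the commands above.

```bash
# Hypothetical wrapper: launch one recorded droid-dev session.
# Only the session name and worktree vary; size and color flags are fixed.
# Usage: launch_recorded NAME WORKTREE
launch_recorded() {
  name=$1
  worktree=$2
  "$TCTL" launch "droid-dev" -s "${RUN_ID}-${name}" --backend tuistory \
    --repo-root "$worktree" \
    --cols 120 --rows 36 --record "${RUN_DIR}/${name}.cast" \
    --env FORCE_COLOR=3 --env COLORTERM=truecolor
}

# Example calls (paths are placeholders):
# launch_recorded before /path/to/baseline-worktree
# launch_recorded after  /path/to/candidate-worktree
```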
Browser: size the viewport to match the composition layout (see table above).
```bash
# side-by-side layout → near-square panel
agent-browser open <url> --viewport 960x1000
agent-browser record start ${RUN_DIR}/demo.webm

# single layout → 16:9 panel
agent-browser open <url> --viewport 1280x720
agent-browser record start ${RUN_DIR}/demo.webm
```

3. Execute the interaction script


Film for a viewer with no context. You are a director, not an operator.
  • Record before setup -- the baseline state is act 1.
  • Hold after state changes -- 2-3 seconds so text is readable. Use `snapshot --trim` as natural verification beats.
  • Verify between steps -- `wait` or `snapshot` to confirm state before proceeding. Don't blindly fire the next key.
  • Verification IS evidence. A snapshot that shows nothing changed after pressing ESC proves the session is frozen. A snapshot that shows an error message proves the command was blocked. Always snapshot after actions where the absence of a response is the point -- the viewer needs to see it too.
For comparison recordings, both branches run identical interactions -- only the behavior differs.
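The press, hold, verify rhythm described above can be wrapped in a single step function. This is a rough sketch, not a tctl feature: `step` and `HOLD_SECONDS` are hypothetical names, and it relies only on the `press` and `snapshot --trim` subcommands that appear elsewhere in this document.

```bash
# Hypothetical helper: press a key, hold so the viewer can read the result,
# then snapshot to confirm the state before the next action.
# Usage: step SESSION KEY
step() {
  session=$1
  key=$2
  "$TCTL" -s "$session" press "$key"
  sleep "${HOLD_SECONDS:-2}"               # hold after the state change
  "$TCTL" -s "$session" snapshot --trim    # verification beat
}

# Example call:
# step "${RUN_ID}-demo" esc
```

Running the same list of `step` calls against both sessions keeps a comparison recording's interactions identical.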

4. Keystroke logging


If the workflow requires keystroke overlay, emit a TSV file during recording. Since every interaction is scripted, the timing data is already known.
Write each keystroke's timestamp (seconds from recording start) and a human-readable label, separated by a tab:
```
0.5	droid --fork
1.2	Enter
2.8	Ctrl+C
4.0	Esc
```
Use readable key names (`Ctrl+C`, not `\x03`). Save alongside the recording (e.g., `/tmp/keys.tsv`).
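One way to produce this file is to append a row from the interaction script each time a key is sent. The snippet below is a minimal sketch: `log_key` and `KEYS_TSV` are hypothetical names, and the offsets are supplied by the caller because the script already knows its own timing.

```bash
# Hypothetical helper: append one keystroke row (offset in seconds, tab, label).
KEYS_TSV=${KEYS_TSV:-/tmp/keys.tsv}

# Usage: log_key OFFSET_SECONDS LABEL
log_key() {
  printf '%s\t%s\n' "$1" "$2" >> "$KEYS_TSV"
}

# Example calls:
# log_key 0.5 "droid --fork"
# log_key 1.2 "Enter"
```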

5. Close and verify raw outputs


```bash
$TCTL -s ${RUN_ID}-demo close    # finalizes the .cast / stops recording
```
Before handing off, confirm every expected output file exists and is non-empty:
  • Recording files (.cast, .mp4, .webm)
  • Screenshot files (.png)
  • Keystroke TSV (if requested)
  • Text snapshot logs (if needed for the report)
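The exists-and-non-empty check maps directly onto the shell's `-s` file test. A small sketch, with `verify_outputs` as a hypothetical helper name:

```bash
# Hypothetical helper: return non-zero if any listed file is missing or empty.
# Usage: verify_outputs FILE...
verify_outputs() {
  status=0
  for f in "$@"; do
    if [ ! -s "$f" ]; then   # -s: file exists and has a size greater than zero
      echo "missing or empty: $f" >&2
      status=1
    fi
  done
  return "$status"
}

# Example call:
# verify_outputs "${RUN_DIR}/before.cast" "${RUN_DIR}/after.cast" || exit 1
```

The function reports every failing file rather than stopping at the first, so a single run shows the full list of outputs that need re-recording.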

Evidence capture patterns


| Proof type | How to capture |
| --- | --- |
| Functional behavior | Text snapshots: `$TCTL -s <name> snapshot --trim` |
| Visual rendering | Screenshots: `$TCTL -s <name> screenshot -o /tmp/proof-N.png` |
| Keyboard encoding | PTY bytes: `${DROID_PLUGIN_ROOT}/scripts/capture-terminal-bytes.py --backend <terminal> --combo <keys>` |
| Web/Electron | Screenshots: `agent-browser screenshot --annotate /tmp/proof-N.png` |
| Before/after | Run the same sequence on both branches at the same capture points |
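For before/after evidence, taking the snapshot on both sessions from one function keeps the capture points aligned. This sketch makes two assumptions that the source does not confirm: that `snapshot --trim` prints the screen text to standard output, and that the sessions are named `${RUN_ID}-before` and `${RUN_ID}-after` as in the comparison launch example. `capture_point` and the output file names are hypothetical.

```bash
# Hypothetical helper: take the same numbered text snapshot on both sessions.
# Assumes `snapshot --trim` writes the screen text to stdout.
# Usage: capture_point N
capture_point() {
  n=$1
  for side in before after; do
    "$TCTL" -s "${RUN_ID}-${side}" snapshot --trim \
      > "${RUN_DIR}/${side}-snapshot-${n}.txt"
  done
}

# Example call:
# capture_point 1
```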

Outputs


Hand these to the compose stage:

Capture outputs


  • clips: [/tmp/before.cast, /tmp/after.cast]
  • screenshots: [/tmp/proof-1.png, /tmp/proof-2.png]
  • keys: /tmp/keys.tsv (if keystroke logging was requested)
  • driver: tuistory | true-input | agent-browser
  • terminal_size: 120x36 # for tuistory / true-input
  • viewport: 960x1000 # for agent-browser; report so compose knows the clip aspect

Recovery


If a session gets stuck mid-recording:
```bash
$TCTL -s <name> press esc         # bail out of stuck dialog
$TCTL -s <name> snapshot --trim   # check visible state
$TCTL -s <name> close             # hard reset
```
For browser: `agent-browser close`.
Then re-launch and re-record. Partial recordings are not usable.