agent-tui

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

🚨 CRITICAL: macOS Daemon Workaround & Gemini CLI Usage 🚨

🚨 重要提示：macOS守护进程解决方案与Gemini CLI使用说明 🚨

When using

agent-tui

in this macOS environment, the default background daemonization process crashes, causing

Connection refused (os error 61)

errors.

You MUST start the daemon manually shielded from TTY hangups before running any
agent-tui
commands. Using

nohup

is insufficient; you must use

tmux

to provide a fully isolated pseudo-terminal.

To support parallel runs, only restart the daemon if it is not currently running:

bash

undefined

在macOS环境中使用

agent-tui

时，默认的后台守护进程化进程会崩溃，导致

Connection refused (os error 61)

错误。

**在运行任何

agent-tui

命令之前，您必须手动启动守护进程并避免TTY挂起问题。**仅使用

nohup

是不够的；您必须使用

tmux

提供完全隔离的伪终端。

为了支持并行运行，仅当守护进程未运行时才重启它：

bash

undefined

Check if daemon is alive, start it in tmux if it is not

检查守护进程是否存活，若未存活则在tmux中启动

if ! agent-tui sessions >/dev/null 2>&1; then tmux kill-session -t agent-tui 2>/dev/null || true agent-tui daemon stop 2>/dev/null || true rm -f /tmp/agent-tui* tmux new-session -d -s agent-tui 'agent-tui daemon start --foreground > /tmp/agent-tui-daemon.log 2>&1' sleep 1 fi

undefined

undefined

Session ID vs PID (Crucial for Reconnection)

会话ID与PID（重连关键）

When

agent-tui run

returns JSON, it includes both a

session_id

and a

pid

. The

pid

is purely informational (the OS process ID of the child command). You do not use the

pid

to reconnect or issue commands. You must always use the

session_id

(e.g.,

--session <id>

If the daemon crashes (

os error 61

), the pseudo-terminal is destroyed. Even if the child

pid

survives as an orphan, you cannot reconnect to it. You must restart the daemon using the workaround above and start a completely new session.

当

agent-tui run

返回JSON时，会包含

session_id

和

pid

。

pid

仅用于信息展示（子命令的操作系统进程ID）。您不能使用

pid

进行重连或发送命令。必须始终使用

session_id

（例如：

--session <id>

）。

如果守护进程崩溃（出现

os error 61

），伪终端会被销毁。即使子进程

pid

作为孤儿进程存活，您也无法重连到它。您必须使用上述解决方案重启守护进程，并启动一个全新的会话。

Testing the Gemini CLI

测试Gemini CLI

When testing the Gemini CLI with

agent-tui

, there are several strict requirements to ensure deterministic and accurate behavior:

Build Before Running:
```
agent-tui
```
runs the built JS files, not TypeScript. You MUST run
```
npm run build
```
or
```
npm run build:all
```
after making code changes and before launching the CLI with
```
agent-tui
```
.
Bypass Trust Modals: Always pass
```
GEMINI_CLI_TRUST_WORKSPACE=true
```
in the environment. If you don't, any new project-level agents or extensions will trigger a full-screen "Acknowledge and Enable" modal. This modal steals focus, swallows automation keystrokes, and causes
```
agent-tui wait
```
commands to time out.
Isolated Environments: If you need to test without real user credentials or existing agents interfering, isolate the global settings using
```
GEMINI_CLI_HOME=<some-test-dir>
```
.
Testing State Deltas (e.g., Reloads): If you are testing features that report deltas (e.g.,
```
/agents reload
```
outputting "1 new local subagent"), you MUST:
- Start the CLI first so it establishes its baseline registry.
- Use a separate shell command (outside of
```
agent-tui
```
  ) to write the new agent
```
.md
```
  /
```
.toml
```
  file.
- Use
```
agent-tui type
```
  and
```
press
```
  to trigger the
```
/agents reload
```
  command inside the running session.
- (If you add the files before starting the CLI, they become part of the baseline and won't trigger the delta logic).

bash

undefined

使用

agent-tui

测试Gemini CLI时，有几个严格要求以确保行为的确定性和准确性：

运行前构建：
```
agent-tui
```
运行的是已构建的JS文件，而非TypeScript文件。在修改代码后、使用
```
agent-tui
```
启动CLI之前，您必须运行
```
npm run build
```
或
```
npm run build:all
```
。
绕过信任弹窗：始终在环境变量中传入
```
GEMINI_CLI_TRUST_WORKSPACE=true
```
。如果不设置，任何新的项目级代理或扩展都会触发全屏的“Acknowledge and Enable”弹窗。该弹窗会抢占焦点、拦截自动化按键，并导致
```
agent-tui wait
```
命令超时。
隔离环境：如果需要在无真实用户凭据或现有代理干扰的情况下进行测试，请使用
```
GEMINI_CLI_HOME=<some-test-dir>
```
隔离全局设置。
测试状态增量（如重载）：如果您测试的功能会报告增量（例如
```
/agents reload
```
输出“1 new local subagent”），您必须：
- 先启动CLI，使其建立基线注册表。
- 使用单独的shell命令（在
```
agent-tui
```
  外部）写入新的代理
```
.md
```
  /
```
.toml
```
  文件。
- 使用
```
agent-tui type
```
  和
```
press
```
  在运行的会话中触发
```
/agents reload
```
  命令。
- （如果在启动CLI之前添加文件，它们会成为基线的一部分，不会触发增量逻辑）。

bash

undefined

Example: Standard isolated run (sandboxed config + bypass trust modals)

示例：标准隔离运行（沙箱配置 + 绕过信任弹窗）

env GEMINI_CLI_TRUST_WORKSPACE=true GEMINI_CLI_HOME=test-gemini-home agent-tui run -d "$(pwd)" node packages/cli/dist/index.js

undefined

env GEMINI_CLI_TRUST_WORKSPACE=true GEMINI_CLI_HOME=test-gemini-home agent-tui run -d "$(pwd)" node packages/cli/dist/index.js

undefined

Terminal Automation Mastery

终端自动化精通指南

Prerequisites

前提条件

Supported OS: macOS or Linux (Windows not supported yet).
Verify install:

bash

agent-tui --version

If not installed, use one of:

bash

undefined

支持的操作系统：macOS或Linux（暂不支持Windows）。
验证安装：

bash

agent-tui --version

如果未安装，可选择以下方式之一：

bash

undefined

Recommended: one-line install (macOS/Linux)

推荐：一键安装（macOS/Linux）

curl -fsSL https://raw.githubusercontent.com/pproenca/agent-tui/master/install.sh | sh


```bash

curl -fsSL https://raw.githubusercontent.com/pproenca/agent-tui/master/install.sh | sh


```bash

Package manager

包管理器安装

npm i -g agent-tui pnpm add -g agent-tui bun add -g agent-tui


```bash

npm i -g agent-tui pnpm add -g agent-tui bun add -g agent-tui


```bash

Build from source

从源码构建

cargo install --git https://github.com/pproenca/agent-tui.git --path cli/crates/agent-tui


If you used the install script, ensure `~/.local/bin` is on your PATH.

cargo install --git https://github.com/pproenca/agent-tui.git --path cli/crates/agent-tui


如果使用安装脚本，请确保`~/.local/bin`在您的PATH中。

Philosophy: Why Terminal Automation Is Different

理念：终端自动化为何与众不同

Terminal UIs are stateless from the observer's perspective. Unlike web browsers with a persistent DOM, terminal automation works with a constantly-refreshed character grid. This fundamental difference shapes everything:

Web Automation	Terminal Automation
DOM persists across interactions	Screen buffer is redrawn constantly
Selectors are stable	Text positions may shift
Query once, act many times	Must re-verify before EVERY action
Network events signal completion	Must detect visual stability

The Core Insight: agent-tui gives you vision without memory. Each screenshot is a fresh observation. Previous state means nothing after the UI changes. This isn't a limitation—it's the nature of terminal interaction.

终端UI从观察者的角度来看是无状态的。与具有持久DOM的Web浏览器不同，终端自动化处理的是不断刷新的字符网格。这一根本差异决定了所有操作逻辑：

Web自动化	终端自动化
DOM在交互过程中持久存在	屏幕缓冲区不断重绘
选择器稳定	文本位置可能偏移
查询一次，多次操作	每次操作前必须重新验证
网络事件标志完成	必须检测视觉稳定性

核心洞察：agent-tui为您提供“视觉”但无“记忆”。每个截图都是全新的观察结果。UI变化后，之前的状态毫无意义。这并非局限——而是终端交互的本质。

Mental Model: The Feedback Loop

思维模型：反馈循环

Think of terminal automation as a closed-loop control system:

    ┌──────────────────────────────────────────────┐
    │                                              │
    ▼                                              │
OBSERVE ──► DECIDE ──► ACT ──► WAIT ──► VERIFY ───┘
   │                                        │
   │                                        │
   └─────── NEVER skip ◄────────────────────┘

Each phase is mandatory. Skipping verification is the #1 cause of flaky automation.

将终端自动化视为一个闭环控制系统：

    ┌──────────────────────────────────────────────┐
    │                                              │
    ▼                                              │
观察 ──► 决策 ──► 执行 ──► 等待 ──► 验证 ───┘
   │                                        │
   │                                        │
   └─────── 绝不能跳过 ◄────────────────────┘

**每个阶段都是必需的。**跳过验证是自动化不稳定的首要原因。

The "Fresh Eyes" Principle

“全新视角”原则

Every time you need to interact with the UI:

Take a fresh screenshot — your previous one is now stale
Locate your target visually — text positions may have changed
Verify the state — the UI may have changed unexpectedly
Act only when stable — animations and loading states cause failures

This feels slower, but it's the only reliable approach. Optimistic reuse of stale state causes intermittent failures that are painful to debug.

每次需要与UI交互时：

截取全新截图——之前的截图已过期
视觉定位目标——文本位置可能已变化
验证状态——UI可能已意外变化
仅在稳定时执行操作——动画和加载状态会导致失败

这看起来更慢，但却是唯一可靠的方法。盲目复用过期状态会导致难以调试的间歇性失败。

Critical Rules (Non-Negotiable)

关键规则（不可协商）

RULE 1: Atomic Execution (No Pipelining) You are FORBIDDEN from chaining commands with
&&
(e.g.,
type "x" && press Enter && wait
). Modals or UI updates can intercept your keystrokes. You MUST execute one atomic action, wait, screenshot, and verify before taking the next action in a new turn.

RULE 2: Re-snapshot after EVERY action The UI state is invalidated by any change. Always take a fresh screenshot before acting again.

RULE 3: Never act on unstable UI If the UI is animating, loading, or transitioning,
wait --stable
first. Acting during transitions because race conditions.

RULE 4: Verify before claiming success Use
wait "expected text" --assert
to confirm outcomes. Don't assume an action worked—prove it.

RULE 5: Error Recovery If a
wait
command times out, DO NOT blindly restart or kill the session. Execute
screenshot
to visually diagnose what unexpected UI element (modal, error dialog, lost focus) intercepted the flow.

RULE 6: Clean up sessions Always end with
agent-tui kill
. Orphaned sessions consume resources and can interfere with future runs.

规则1：原子执行（禁止流水线） 禁止使用
&&
链式执行命令（例如：
type "x" && press Enter && wait
）。弹窗或UI更新可能会拦截您的按键。您必须执行一个原子操作，等待、截图、验证，然后在新的步骤中执行下一个操作。

规则2：每次操作后重新截图 任何变化都会使UI状态失效。再次执行操作前务必截取全新截图。

规则3：绝不在不稳定UI上执行操作 如果UI正在动画、加载或过渡，请先执行
wait --stable
。在过渡期间执行操作会导致竞态条件。

规则4：验证后再宣告成功 使用
wait "expected text" --assert
确认结果。不要假设操作成功——要证明它成功。

规则5：错误恢复 如果
wait
命令超时，请勿盲目重启或终止会话。执行
screenshot
以可视化诊断是什么意外UI元素（弹窗、错误对话框、失去焦点）拦截了流程。

规则6：清理会话 始终以
agent-tui kill
结束会话。孤立的会话会消耗资源并可能干扰后续运行。

Decision Framework

决策框架

Which Screenshot Mode?

选择哪种截图模式？

Use

screenshot --format json

when parsing automation output, or plain

screenshot

for human readable text.

解析自动化输出时使用

screenshot --format json

，普通

screenshot

用于人类可读文本。

How to Wait?

如何等待？

What are you waiting for?
│
├─► Specific text to appear
│   └─► `wait "text" --assert` (fails if not found)
│
├─► Specific text to disappear
│   └─► `wait "text" --gone --assert`
│
├─► UI to stop changing (animations, loading)
│   └─► `wait --stable`
│
└─► Multiple conditions
    └─► Chain waits sequentially

您在等待什么？
│
├─► 特定文本出现
│   └─► `wait "text" --assert`（未找到则失败）
│
├─► 特定文本消失
│   └─► `wait "text" --gone --assert`
│
├─► UI停止变化（动画、加载）
│   └─► `wait --stable`
│
└─► 多个条件
    └─► 按顺序链式等待

How to Act?

如何执行操作？

What do you need to do?
│
├─► Type text into the terminal
│   └─► `type "text"`
│
├─► Send keyboard shortcuts/navigation
│   └─► `press Ctrl+C` or `press ArrowDown Enter`

您需要做什么？
│
├─► 在终端中输入文本
│   └─► `type "text"`
│
├─► 发送键盘快捷键/导航
│   └─► `press Ctrl+C` 或 `press ArrowDown Enter`

Core Workflow

核心工作流

The canonical automation loop:

bash

undefined

标准自动化循环：

bash

undefined

1. START: Launch the TUI app

1. 启动：启动TUI应用

agent-tui run <command> [-- args...]

2. OBSERVE: Get current UI state

2. 观察：获取当前UI状态

agent-tui screenshot --format json

3. DECIDE: Based on text, determine next action

3. 决策：根据文本确定下一步操作

(This happens in your head/code)

（此步骤在您的头脑/代码中完成）

4. ACT: Execute the action

4. 执行：执行操作

agent-tui type "text" agent-tui press Enter

5. WAIT: Synchronize with UI changes

5. 等待：与UI变化同步

agent-tui wait "Expected" --assert # or wait --stable

agent-tui wait "Expected" --assert # 或 wait --stable

6. VERIFY: Confirm the outcome (often combined with step 5)

6. 验证：确认结果（通常与步骤5结合）

If verification fails, handle the error

如果验证失败，处理错误

7. REPEAT: Go back to step 2 until done

7. 重复：回到步骤2直到完成

8. CLEANUP: Always clean up

8. 清理：始终进行清理

agent-tui kill

undefined

agent-tui kill

undefined

Anti-Patterns (What NOT to Do)

反模式（切勿这样做）

❌ Acting During Animation/Loading

❌ 在动画/加载期间执行操作

bash

undefined

bash

undefined

WRONG: Acting immediately on dynamic UI

错误：在动态UI上立即执行操作

agent-tui run my-app agent-tui screenshot --format json # UI might still be loading! agent-tui type "value" # ❌ Might miss the input field

agent-tui run my-app agent-tui screenshot --format json # UI可能仍在加载！ agent-tui type "value" # ❌ 可能错过输入字段

RIGHT: Wait for stability first

正确：先等待稳定

agent-tui run my-app agent-tui wait --stable # Let UI settle agent-tui screenshot --format json # Now it's reliable agent-tui type "value"

undefined

agent-tui run my-app agent-tui wait --stable # 等待UI稳定 agent-tui screenshot --format json # 现在可靠了 agent-tui type "value"

undefined

❌ Assuming Success Without Verification

❌ 未验证就假设成功

bash

undefined

bash

undefined

WRONG: Assuming the type worked

错误：假设输入成功

agent-tui type "value" agent-tui press Enter

...proceed as if success... # ❌ What if it failed silently?

...继续操作，仿佛已成功... # ❌ 如果静默失败怎么办？

RIGHT: Verify the outcome

正确：验证结果

agent-tui type "value" agent-tui press Enter agent-tui wait "Success" --assert # ✓ Proves the action worked

undefined

agent-tui type "value" agent-tui press Enter agent-tui wait "Success" --assert # ✓ 证明操作成功

undefined

❌ Skipping Cleanup

❌ 跳过清理

bash

undefined

bash

undefined

WRONG: Forgetting to kill the session

错误：忘记终止会话

agent-tui run my-app

...do stuff...

...执行操作...

script ends # ❌ Session left running!

脚本结束 # ❌ 会话仍在运行！

RIGHT: Always clean up

正确：始终清理

agent-tui run my-app

...do stuff...

...执行操作...

agent-tui kill # ✓ Clean exit

undefined

agent-tui kill # ✓ 干净退出

undefined

Before You Start: Clarify Requirements

开始前：明确需求

Before automating any TUI, gather this information:

Command: What exactly to run? (
```
my-app --flag
```
or
```
npm start
```
?)
Success criteria: What text/state indicates success?
Input sequence: What keystrokes/data to enter, in what order?
Safety: Is it safe to submit forms, delete data, etc.?
Auth: Does it need login? Test credentials?
Live preview: Does the user want to watch? (
```
agent-tui live start --open
```
)

If any of these are unclear, ask before running.

在自动化任何TUI之前，请收集以下信息：

命令：具体要运行什么？（
```
my-app --flag
```
还是
```
npm start
```
？）
成功标准：什么文本/状态表示成功？
输入序列：要输入哪些按键/数据，顺序如何？
安全性：提交表单、删除数据等操作是否安全？
认证：是否需要登录？测试凭据？
实时预览：用户是否希望观看？（
```
agent-tui live start --open
```
）

如果有任何不清楚的地方，请在运行前询问。

Demo Mode: Showing What agent-tui Can Do

演示模式：展示agent-tui的功能

When a user asks what agent-tui is, wants a demo, or asks "show me how it works":

Don't explain—demonstrate. Actions speak louder than words.
Use the live preview so they can watch in real-time.
Run
top
—it's universal and shows dynamic real-time updates.

Quick demo trigger phrases:

"What is agent-tui?" / "What does agent-tui do?"
"Demo agent-tui" / "Show me agent-tui"
"How does agent-tui work?" / "See it in action"

当用户询问agent-tui是什么、想要演示，或询问“show me how it works”时：

**不要解释——直接演示。**行动胜于言语。
使用实时预览让用户可以实时观看。
运行
top
——它是通用工具，可展示动态实时更新。

快速演示触发短语：

"What is agent-tui?" / "What does agent-tui do?"
"Demo agent-tui" / "Show me agent-tui"
"How does agent-tui work?" / "See it in action"

Failure Recovery

故障恢复

Symptom	Diagnosis	Solution
"Text not found"	Stale view or text moved	Re-snapshot, locate text again
Wait times out	UI didn't reach expected state	Check screenshot, verify expectations
"Daemon not running"	Daemon crashed or not started	`agent-tui daemon start`
Unexpected layout	Wrong terminal size	`agent-tui resize --cols 120 --rows 40`
Session unresponsive	App crashed or hung	`agent-tui kill` , then re-run
Repeated failures	Something fundamentally wrong	Stop after 3-5 attempts, ask user

症状	诊断	解决方案
"Text not found"	视图过期或文本位置移动	重新截图，重新定位文本
等待超时	UI未达到预期状态	检查截图，验证预期结果
"Daemon not running"	守护进程崩溃或未启动	`agent-tui daemon start`
布局异常	终端尺寸错误	`agent-tui resize --cols 120 --rows 40`
会话无响应	应用崩溃或挂起	`agent-tui kill` ，然后重新运行
重复失败	存在根本性问题	尝试3-5次后停止，询问用户

Self-Discovery: Use --help

自我探索：使用--help

You don't need to memorize every flag. The CLI is self-documenting:

bash

agent-tui --help                     # List all commands
agent-tui run --help                 # Options for 'run'
agent-tui screenshot --help          # Options for 'screenshot'
agent-tui wait --help                # Options for 'wait'

When in doubt, ask the CLI. This skill teaches when and why to use commands. For exact flags and syntax,

--help

is authoritative.

您无需记住所有标志。CLI自带文档：

bash

agent-tui --help                     # 列出所有命令
agent-tui run --help                 # 'run'命令的选项
agent-tui screenshot --help          # 'screenshot'命令的选项
agent-tui wait --help                # 'wait'命令的选项

**如有疑问，请询问CLI。**本技能教授何时以及为何使用命令。对于确切的标志和语法，

--help

是权威来源。

Quick Reference

快速参考

bash

undefined

bash

undefined

Start app

启动应用

agent-tui run <cmd> [-- args] # Launch TUI under control

agent-tui run <cmd> [-- args] # 在控制下启动TUI

Observe

观察

agent-tui screenshot # Plain text view agent-tui screenshot --format json # Machine-readable output

agent-tui screenshot # 纯文本视图 agent-tui screenshot --format json # 机器可读输出

Act

执行操作

agent-tui press Enter # Press key(s) agent-tui press Ctrl+C # Keyboard shortcuts agent-tui type "text" # Type text

agent-tui press Enter # 按下按键 agent-tui press Ctrl+C # 键盘快捷键 agent-tui type "text" # 输入文本

Wait/Verify

等待/验证

agent-tui wait "text" --assert # Wait for text, fail if not found agent-tui wait "text" --gone --assert # Wait for text to disappear agent-tui wait --stable # Wait for UI to stop changing

agent-tui wait "text" --assert # 等待文本出现，未找到则失败 agent-tui wait "text" --gone --assert # 等待文本消失，未消失则失败 agent-tui wait --stable # 等待UI停止变化

Manage

管理

agent-tui sessions # List active sessions agent-tui live start --open # Start live preview agent-tui kill # End current session

undefined

agent-tui sessions # 列出活跃会话 agent-tui live start --open # 启动实时预览 agent-tui kill # 结束当前会话

undefined