cli-review-runner

Automates the 10-item agent-friendliness audit from cli-for-agents. Runs black-box probes against a target CLI and emits a structured report mapping each finding to a rule ID (e.g., `help-examples-in-help`, `err-non-zero-exit-codes`, `safe-dry-run-flag`). Default mode is read-only - probes never run destructive verbs with real arguments.

When to Apply

User asks to review or audit a CLI for agent-friendliness, automation readiness, or CI use
User has just finished building a CLI and wants a pre-ship sanity check
User is grading their own or a third-party CLI against the cli-for-agents catalog
User is asking why a CLI is hanging an agent, blowing up context, or failing to compose in a pipeline
PR review for a CLI change - quickly regress-test the `--help`, errors, and dry-run flags

How to Use

The skill is orchestrated by `scripts/review.sh`. Point it at the target CLI (absolute path or PATH-resolvable name) and pick an output format.

```bash
# Default: text table on stdout, exit 0 if all passed, 1 if any failed
bash scripts/review.sh --target /usr/local/bin/mycli

# Machine-readable output
bash scripts/review.sh --target gh --format json
bash scripts/review.sh --target kubectl --format ndjson

# Supply subcommand list when auto-discovery misses them
bash scripts/review.sh --target gh --subcommands pr,issue,repo

# Preview what would run without touching the target CLI
bash scripts/review.sh --target mycli --dry-run

# Include risky probes on destructive verbs (off by default)
bash scripts/review.sh --target mycli --include-destructive
```

See `bash scripts/review.sh --help` for the full flag list.

Workflow Overview

--target <cli>
   │
   ▼
[1] Validate target       fail fast if path missing or not executable
   │
   ▼
[2] Load rule catalog     references/rule-catalog.tsv (45 rules)
   │
   ▼
[3] Discover subcommands  parse top-level --help (gh/kubectl/commander shapes)
   │
   ▼
[4] Run probes P1..P10    each probe emits NDJSON findings to a temp file
   │
   ▼
[5] Render report         scripts/render.sh  ->  text | json | ndjson

Read references/workflow.md when you need the full probe-by-probe breakdown, failure modes, and how to extend the catalog.
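
Steps [1] and [3] of the diagram can be illustrated with a short sketch. It is not the code in `scripts/review.sh`: the function names `validate_target` and `discover_subcommands`, the exit code 2, and the assumption that subcommands sit on indented lines under an unindented header containing "commands" are all hypothetical. Real help layouts vary, so treat the parser as one possible shape.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; names and exit codes are illustrative only.

# Step [1]: resolve the target and fail fast if it is missing or not executable.
validate_target() {
  local target="$1" resolved
  if ! resolved="$(command -v "$target")"; then
    echo "error: target '$target' not found (use an absolute path or a name on PATH)" >&2
    return 2
  fi
  if [ ! -x "$resolved" ]; then
    echo "error: target '$target' is not an executable file" >&2
    return 2
  fi
  echo "$resolved"
}

# Step [3]: read top-level help text on stdin, print one subcommand per line.
discover_subcommands() {
  awk '
    # An unindented line opens a section; only "...commands" headers count.
    /^[^ \t]/ {
      line = tolower($0)
      in_section = (line ~ /commands/ && line !~ /^usage/)
      next
    }
    # Inside a commands section, the first word of an indented line is a name.
    in_section && $1 ~ /^[a-z][a-z0-9_-]*:?$/ {
      name = $1
      sub(/:$/, "", name)
      print name
    }
  '
}
```

A caller could chain the two as `cli="$(validate_target mycli)" && "$cli" --help </dev/null | discover_subcommands`. When the parser misses commands, the `--subcommands` flag remains the fallback.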

Probe Coverage

Probe	Rules tested	Coverage
P1 Non-interactive	`interact-no-hang-on-stdin`, `interact-no-input-flag`, `interact-flags-first`, `interact-detect-tty`, `interact-no-timed-prompts`, `interact-no-arrow-menus`, `input-no-prompt-fallback`	Run under `</dev/null` with timeout; inspect help for interactive language
P2 Layered help	`help-per-subcommand`, `help-no-flag-required`, `help-layered-discovery`	Top-level line count; per-subcommand `--help`; zero-arg invocation
P3 Help examples	`help-examples-in-help`, `help-flag-summary`, `help-suggest-next-steps`	Grep each subcommand help for `Examples:`, short+long flag pairs, "See also"
P4 Actionable errors	`err-actionable-fix`, `err-include-example-invocation`, `err-exit-fast-on-missing-required`, `err-no-stack-traces-by-default`	Invoke with bogus flag; grep stderr for fix + example; check for raw stack traces
P5 stderr channeling	`err-stderr-not-stdout`	Error text must land on fd 2
P6 Exit codes	`err-non-zero-exit-codes`	Usage error and runtime error must produce distinct non-zero codes
P7 stdin composition	`input-accept-stdin-dash`, `input-flags-over-positional`	Grep help for `-` stdin convention; count positionals vs flags
P8 Structured output	`output-json-flag`, `output-respect-no-color`	`--json` produces JSON; `NO_COLOR=1` suppresses ANSI
P9 Destructive safety	`safe-dry-run-flag`, `safe-force-bypass-flag`, `safe-no-prompts-with-no-input`	Inspect destructive verbs' `--help` for `--dry-run` / `--yes` / `--force` / `--no-input`
P10 Command structure	`struct-resource-verb`, `struct-standard-flag-names`, `struct-no-hidden-subcommand-catchall`, `struct-flag-order-independent`	Uniform shape, `--help` / `--version` present, unknown subcommand errors, flag position independence

Coverage: 30 of the 45 rules in cli-for-agents are black-box testable. The remaining 15 (idempotency, state reconciliation, NDJSON streaming, bounded output, crash-only recovery, env-var fallback, secret-stdin, confirm-by-typing-name) require either real invocation or source-code inspection - the report lists them as "manual review required".
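
To make the black-box idea concrete, here is a minimal sketch of what a P5/P6-style check might look like. The function name `probe_error_channel`, the bogus flag, and the NDJSON field names are assumptions, not the schema used by `scripts/lib/probes.sh`. It also covers only the usage-error half of P6 (a non-zero exit on an unknown flag), not the comparison against a runtime-error code.

```bash
#!/usr/bin/env bash
# Hypothetical probe; field names and the flag are illustrative only.

probe_error_channel() {
  local target="$1" out err code=0
  out="$(mktemp)"
  err="$(mktemp)"

  # Invoke with a flag that should not exist; stdin is /dev/null so nothing can prompt.
  "$target" --definitely-not-a-flag >"$out" 2>"$err" </dev/null || code=$?

  # P6 (partial): an unknown flag is a usage error, so the exit code must be non-zero.
  if [ "$code" -ne 0 ]; then
    printf '{"rule":"err-non-zero-exit-codes","status":"pass","exit_code":%d}\n' "$code"
  else
    printf '{"rule":"err-non-zero-exit-codes","status":"fail","exit_code":%d}\n' "$code"
  fi

  # P5: the error text must be on fd 2, with nothing written to stdout.
  if [ -s "$err" ] && [ ! -s "$out" ]; then
    printf '{"rule":"err-stderr-not-stdout","status":"pass"}\n'
  else
    printf '{"rule":"err-stderr-not-stdout","status":"fail"}\n'
  fi

  rm -f "$out" "$err"
}
```

Capturing stdout and stderr in separate files is what lets the check tell "printed an error" apart from "printed an error on the right channel".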

Configuration

`config.json` stores the verb classifier lists and default timeout. The skill works without any setup - defaults are reasonable. Override per-invocation via flags or edit the file for project-wide changes.

```json
{
  "timeout_seconds": 5,
  "safe_verbs": ["list", "get", "show", "status", "describe", "help", "version", "config", "ls", "inspect"],
  "destructive_verbs": ["delete", "drop", "destroy", "remove", "reset", "purge", "rm", "del"]
}
```
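
As a rough illustration of how the two lists might drive classification, the sketch below hard-codes the defaults from the JSON above. The real helper lives in `scripts/lib/common.sh` and may work differently. The name `classify_verb` and the `unknown` fallback for verbs in neither list are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical classifier; lists copied from the config.json defaults.
SAFE_VERBS="list get show status describe help version config ls inspect"
DESTRUCTIVE_VERBS="delete drop destroy remove reset purge rm del"

classify_verb() {
  local verb="$1"
  # Pad with spaces so only whole words match ("rm" must not match "format").
  case " $DESTRUCTIVE_VERBS " in
    *" $verb "*) echo destructive; return ;;
  esac
  case " $SAFE_VERBS " in
    *" $verb "*) echo safe; return ;;
  esac
  echo unknown
}
```

For example, `classify_verb delete` prints `destructive` and `classify_verb deploy` prints `unknown`. Checking the destructive list first means a verb that ends up in both lists is treated as destructive.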

Safety Model

All probes are read-only by default:

Safe verbs (list, get, show, ...) may be invoked with bogus flags to test error handling
Destructive verbs (delete, drop, ...) are ONLY inspected via `--help` - never executed with arguments
Every probe runs under `</dev/null` with a 5-second wall-clock timeout (the `timeout_seconds` default in `config.json`)
The target CLI is sandboxed to its own process; no shell metacharacters in arguments

When `--include-destructive` is passed, probes may invoke destructive verbs with bogus flags too. This exposes the rare case where a CLI does something destructive before validating flags. Only enable this against CLIs you trust, or in a disposable test environment.
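
The timeout and stdin guarantees can be sketched as a small wrapper. This is an illustration, not the helper in `scripts/lib/common.sh`: `run_probe` is a hypothetical name, and the sketch assumes GNU coreutils `timeout` is available (it is not part of a stock macOS install).

```bash
#!/usr/bin/env bash
# Hypothetical wrapper; assumes GNU coreutils `timeout`.

run_probe() {
  local seconds="$1"
  shift
  local code=0
  # "$@" is passed as separate argv entries, never through eval or sh -c,
  # so shell metacharacters in arguments are not interpreted.
  timeout "$seconds" "$@" </dev/null || code=$?
  # GNU timeout exits with 124 when it had to kill the command.
  if [ "$code" -eq 124 ]; then
    echo "probe timed out after ${seconds}s: $*" >&2
  fi
  return "$code"
}
```

With stdin bound to `/dev/null`, a CLI that reads input sees end-of-file immediately, and a CLI that keeps waiting anyway is stopped by the timeout instead of hanging the run.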

Self-test

Before shipping changes to probes, run the self-test - it generates a mock CLI that deliberately violates specific rules and asserts the probes detect them:

```bash
bash scripts/selftest.sh
```

Expected output: `Results: 8 passed, 0 failed`. Any failure points at a regression in `scripts/lib/probes.sh`.
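
The mock-CLI approach can be pictured with a reduced example that exercises a single rule, `help-examples-in-help`. Everything here is an assumption made for illustration: the helper names, the contents of the mock, and the two-case tally. The real `scripts/selftest.sh` covers more rules and may be structured differently.

```bash
#!/usr/bin/env bash
# Hypothetical miniature self-test for one rule.

# P3-style check: does the target's --help contain an "Examples:" section?
has_examples() {
  "$1" --help 2>&1 </dev/null | grep -q '^Examples:'
}

# Write a throwaway mock CLI; mode "good" includes examples, anything else omits them.
make_mock_cli() {
  local path="$1" mode="$2"
  {
    echo '#!/bin/sh'
    echo 'echo "Usage: mock <command> [flags]"'
    if [ "$mode" = "good" ]; then
      echo 'echo "Examples:"'
      echo 'echo "  mock list --json"'
    fi
  } >"$path"
  chmod +x "$path"
}

selftest_examples() {
  local dir passed=0 failed=0
  dir="$(mktemp -d)"
  make_mock_cli "$dir/good" good
  make_mock_cli "$dir/bad" bad

  # The compliant mock must pass the check and the violating mock must be caught.
  if has_examples "$dir/good"; then passed=$((passed + 1)); else failed=$((failed + 1)); fi
  if has_examples "$dir/bad"; then failed=$((failed + 1)); else passed=$((passed + 1)); fi

  rm -rf "$dir"
  echo "Results: $passed passed, $failed failed"
  [ "$failed" -eq 0 ]
}
```

Calling `selftest_examples` prints `Results: 2 passed, 0 failed` when the check behaves as intended. Testing both a compliant and a violating mock guards against a probe that always passes or always fails.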

Files

File	Purpose
scripts/review.sh	Main entry point - orchestrates probes, renders the report
scripts/render.sh	Output formatter: text / json / ndjson
scripts/selftest.sh	Sanity check against a deliberately-buggy mock CLI
scripts/lib/common.sh	Shared helpers: timeout, verb classifier, JSON escape, catalog loader
scripts/lib/probes.sh	Probe functions `probe_p1..probe_p10`
references/rule-catalog.tsv	45 rules from cli-for-agents, mapped to probes
references/workflow.md	Detailed probe-by-probe methodology, failure modes, extension guide
gotchas.md	Known edge cases discovered during use
config.json	Verb classifier lists and default timeout
