cli-review-runner

Automates the 10-item agent-friendliness audit from cli-for-agents. Runs black-box probes against a target CLI and emits a structured report mapping each finding to a rule ID (e.g., `help-examples-in-help`, `err-non-zero-exit-codes`, `safe-dry-run-flag`). Default mode is read-only - probes never run destructive verbs with real arguments.

When to Apply

  • User asks to review or audit a CLI for agent-friendliness, automation readiness, or CI use
  • User has just finished building a CLI and wants a pre-ship sanity check
  • User is grading their own or a third-party CLI against the cli-for-agents catalog
  • User is asking why a CLI is hanging an agent, blowing up context, or failing to compose in a pipeline
  • PR review for a CLI change - quickly regression-test `--help`, errors, and dry-run flags

How to Use

The skill is orchestrated by `scripts/review.sh`. Point it at the target CLI (absolute path or PATH-resolvable name) and pick an output format.
bash
# Default: text table on stdout, exit 0 if all passed, 1 if any failed
bash scripts/review.sh --target /usr/local/bin/mycli

# Machine-readable output
bash scripts/review.sh --target gh --format json
bash scripts/review.sh --target kubectl --format ndjson

# Supply subcommand list when auto-discovery misses them
bash scripts/review.sh --target gh --subcommands pr,issue,repo

# Preview what would run without touching the target CLI
bash scripts/review.sh --target mycli --dry-run

# Include risky probes on destructive verbs (off by default)
bash scripts/review.sh --target mycli --include-destructive

See `bash scripts/review.sh --help` for the full flag list.
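
Because the default mode exits 0 when every probe passes and 1 when any fails, the runner can gate a CI step directly. The following is a minimal sketch, not part of the skill: `gate_on_review` is a hypothetical wrapper that takes the review command as its arguments, so the example runs even where the skill is not installed.

```shell
# Hypothetical helper (not shipped with the skill): run a review command
# and turn its exit status into a one-line verdict for CI logs.
gate_on_review() {
  # "$@" is the full command, e.g.: bash scripts/review.sh --target mycli
  if "$@"; then
    echo "review: passed"
  else
    # Inside the else branch, $? still holds the command's exit status.
    local status=$?
    echo "review: failed (exit $status)"
    return "$status"
  fi
}
```

A CI job might call it as `gate_on_review bash scripts/review.sh --target mycli`; the report itself still goes to stdout ahead of the verdict line.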

Workflow Overview

--target <cli>
[1] Validate target       fail fast if path missing or not executable
[2] Load rule catalog     references/rule-catalog.tsv (45 rules)
[3] Discover subcommands  parse top-level --help (gh/kubectl/commander shapes)
[4] Run probes P1..P10    each probe emits NDJSON findings to a temp file
[5] Render report         scripts/render.sh  ->  text | json | ndjson
Read references/workflow.md when you need the full probe-by-probe breakdown, failure modes, and how to extend the catalog.
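
Step [1] above can be sketched roughly as follows. This is an illustration under assumptions, not the skill's actual code: the function name `validate_target`, the error wording, and the exit status 2 are all hypothetical.

```shell
# Hypothetical sketch of step [1]: resolve the target and fail fast.
validate_target() {
  local target=$1
  case $target in
    */*)
      # Contains a slash: treat it as a path to an executable regular file.
      if [ ! -f "$target" ] || [ ! -x "$target" ]; then
        echo "error: target missing or not executable: $target" >&2
        return 2
      fi
      printf '%s\n' "$target"
      ;;
    *)
      # Bare name: resolve it the way the shell would. Note that
      # command -v also matches builtins and prints just their name.
      if ! command -v "$target" >/dev/null 2>&1; then
        echo "error: target not found on PATH: $target" >&2
        return 2
      fi
      command -v "$target"
      ;;
  esac
}
```

Failing before any probe runs keeps a typo in `--target` from producing a report full of misleading timeouts.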

Probe Coverage

| Probe | Rules tested | Coverage |
|---|---|---|
| P1 Non-interactive | `interact-no-hang-on-stdin`, `interact-no-input-flag`, `interact-flags-first`, `interact-detect-tty`, `interact-no-timed-prompts`, `interact-no-arrow-menus`, `input-no-prompt-fallback` | Run under `</dev/null` with timeout; inspect help for interactive language |
| P2 Layered help | `help-per-subcommand`, `help-no-flag-required`, `help-layered-discovery` | Top-level line count; per-subcommand `--help`; zero-arg invocation |
| P3 Help examples | `help-examples-in-help`, `help-flag-summary`, `help-suggest-next-steps` | Grep each subcommand help for `Examples:`, short+long flag pairs, "See also" |
| P4 Actionable errors | `err-actionable-fix`, `err-include-example-invocation`, `err-exit-fast-on-missing-required`, `err-no-stack-traces-by-default` | Invoke with bogus flag; grep stderr for fix + example; check for raw stack traces |
| P5 stderr channeling | `err-stderr-not-stdout` | Error text must land on fd 2 |
| P6 Exit codes | `err-non-zero-exit-codes` | Usage error and runtime error must produce distinct non-zero codes |
| P7 stdin composition | `input-accept-stdin-dash`, `input-flags-over-positional` | Grep help for `-` stdin convention; count positionals vs flags |
| P8 Structured output | `output-json-flag`, `output-respect-no-color` | `--json` produces JSON; `NO_COLOR=1` suppresses ANSI |
| P9 Destructive safety | `safe-dry-run-flag`, `safe-force-bypass-flag`, `safe-no-prompts-with-no-input` | Inspect destructive verbs' `--help` for `--dry-run` / `--yes` / `--force` / `--no-input` |
| P10 Command structure | `struct-resource-verb`, `struct-standard-flag-names`, `struct-no-hidden-subcommand-catchall`, `struct-flag-order-independent` | Uniform shape, `--help` / `--version` present, unknown subcommand errors, flag position independence |

Coverage: 30 of the 45 rules in cli-for-agents are black-box testable. The remaining 15 (idempotency, state reconciliation, NDJSON streaming, bounded output, crash-only recovery, env-var fallback, secret-stdin, confirm-by-typing-name) require either real invocation or source-code inspection - the report lists them as "manual review required".
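
To make the black-box approach concrete, here is a simplified sketch in the spirit of P5 and P6. It is an assumption-laden illustration, not the code in `scripts/lib/probes.sh`: `check_error_channel`, the two mock CLIs, and the output wording are hypothetical, and the exit-code check only tests for non-zero rather than distinct usage and runtime codes.

```shell
# Hypothetical sketch: run a command with stdin closed off, capture stdout
# and stderr separately, and print one finding per rule.
check_error_channel() {
  local out err status
  out=$(mktemp)
  err=$(mktemp)
  "$@" </dev/null >"$out" 2>"$err"
  status=$?
  if [ "$status" -ne 0 ]; then
    echo "err-non-zero-exit-codes: pass"
  else
    echo "err-non-zero-exit-codes: fail"
  fi
  # Error text belongs on fd 2: stderr non-empty and stdout empty.
  if [ -s "$err" ] && [ ! -s "$out" ]; then
    echo "err-stderr-not-stdout: pass"
  else
    echo "err-stderr-not-stdout: fail"
  fi
  rm -f "$out" "$err"
}

# Stand-in CLIs for demonstration only.
good_cli() { echo "error: unknown flag $1" >&2; return 2; }
bad_cli()  { echo "error: unknown flag $1"; return 0; }
```

Running `check_error_channel good_cli --bogus` reports both rules as passing, while `check_error_channel bad_cli --bogus` reports both as failing.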

Configuration

`config.json` stores the verb classifier lists and default timeout. The skill works without any setup - defaults are reasonable. Override per-invocation via flags or edit the file for project-wide changes.
json
{
  "timeout_seconds": 5,
  "safe_verbs": ["list", "get", "show", "status", "describe", "help", "version", "config", "ls", "inspect"],
  "destructive_verbs": ["delete", "drop", "destroy", "remove", "reset", "purge", "rm", "del"]
}
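
One way to picture the verb classifier these lists feed is a plain `case` statement. The sketch below is hypothetical: it inlines the default lists so it runs without a JSON parser, `classify_verb` is an illustrative name, and the `unknown` bucket for unlisted verbs is an assumption, since the source does not say how they are treated.

```shell
# Hypothetical sketch: classify a verb using the default lists above.
classify_verb() {
  case $1 in
    list|get|show|status|describe|help|version|config|ls|inspect)
      echo safe ;;
    delete|drop|destroy|remove|reset|purge|rm|del)
      echo destructive ;;
    *)
      # Assumption: verbs in neither list get their own label.
      echo unknown ;;
  esac
}
```

For example, `classify_verb purge` prints `destructive` and `classify_verb deploy` prints `unknown`.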

Safety Model

All probes are read-only by default:
  • Safe verbs (list, get, show, ...) may be invoked with bogus flags to test error handling
  • Destructive verbs (delete, drop, ...) are ONLY inspected via `--help` - never executed with arguments
  • Every probe runs with a 5-second wall-clock timeout under `</dev/null`
  • The target CLI is sandboxed to its own process; no shell metacharacters in arguments

When `--include-destructive` is passed, probes may invoke destructive verbs with bogus flags too. This exposes the rare case where a CLI does something destructive before validating flags. Only enable this against CLIs you trust, or in a disposable test environment.
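
The rules above could be sketched as two small helpers. Both are hypothetical illustrations rather than the skill's implementation: the names `probe_arg_for` and `run_guarded`, the bogus flag string, and the use of the GNU coreutils `timeout` command are assumptions.

```shell
# Hypothetical sketch of the read-only gate: print the single argument a
# probe would pass after a verb.
probe_arg_for() {
  local verb_class=$1          # "safe" or "destructive"
  local include_destructive=$2 # "yes" when --include-destructive was passed
  if [ "$verb_class" = destructive ] && [ "$include_destructive" != yes ]; then
    echo "--help"                        # inspect only, never execute
  else
    echo "--definitely-not-a-real-flag"  # exercise error handling
  fi
}

# Run a command with stdin from /dev/null under a wall-clock limit.
# Arguments stay separate words and never pass through eval, so shell
# metacharacters in them are not interpreted.
run_guarded() {
  local limit=$1
  shift
  timeout "$limit" "$@" </dev/null
}
```

With GNU `timeout`, a command that overruns the limit is killed and the call returns status 124, which a probe can treat as a hang.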

Self-test

Before shipping changes to probes, run the self-test - it generates a mock CLI that deliberately violates specific rules and asserts the probes detect them:
bash
bash scripts/selftest.sh
Expected output: `Results: 8 passed, 0 failed`. Any failure points at a regression in `scripts/lib/probes.sh`.
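
The idea behind the self-test can be illustrated with a single deliberately broken mock. This is a sketch under assumptions, not the contents of `scripts/selftest.sh`: the function names and the one-check scope are hypothetical, and only the shape of the `Results:` line follows the expected output above.

```shell
# Hypothetical check: succeeds when the CLI exits 0 despite a bogus flag,
# i.e. when the violation is present and has been detected.
detects_zero_exit_on_error() {
  sh "$1" --bogus-flag </dev/null >/dev/null 2>&1
}

# Hypothetical one-case self-test: write a mock CLI that breaks a rule on
# purpose, then assert that the check notices.
run_mock_selftest() {
  local mock
  mock=$(mktemp)
  cat >"$mock" <<'EOF'
# Deliberate violation: reports an error but still exits 0.
echo "error: unknown flag $1" >&2
exit 0
EOF
  if detects_zero_exit_on_error "$mock"; then
    echo "Results: 1 passed, 0 failed"
  else
    echo "Results: 0 passed, 1 failed"
  fi
  rm -f "$mock"
}
```

A self-test "passes" when the probe catches the planted bug, so a probe that stops failing the mock is itself the regression.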

Files

| File | Purpose |
|---|---|
| `scripts/review.sh` | Main entry point - orchestrates probes, renders the report |
| `scripts/render.sh` | Output formatter: text / json / ndjson |
| `scripts/selftest.sh` | Sanity check against a deliberately-buggy mock CLI |
| `scripts/lib/common.sh` | Shared helpers: timeout, verb classifier, JSON escape, catalog loader |
| `scripts/lib/probes.sh` | Probe functions `probe_p1`..`probe_p10` |
| `references/rule-catalog.tsv` | 45 rules from cli-for-agents, mapped to probes |
| `references/workflow.md` | Detailed probe-by-probe methodology, failure modes, extension guide |
| `gotchas.md` | Known edge cases discovered during use |
| `config.json` | Verb classifier lists and default timeout |

Related Skills

  • cli-for-agents - the 45-rule design catalog this skill audits against. Read rule files there when the report flags an issue and you need the full explanation.