verify-this

Verify This

Verification is not a recap. It proves or disproves a specific claim with repeatable evidence.

When To Use

The user asks "verify this", "prove it works", "did this fix it", or "show me the evidence".
A bug fix needs a before/after repro.
A UI, CLI, API, performance, or memory claim needs measurement.
A test passes but the user-visible behavior still needs confirmation.

Do not use this for vague claims like "the code is cleaner". Ask for a measurable claim first.

Workflow

Restate the claim in falsifiable form: condition, metric, and threshold.
Pick the smallest local surface that can disprove it.
Capture a baseline from the old state: merge base, parent commit, failing branch, or current broken repro.
Capture treatment from the changed state with the same command, data, warmup, and environment.
Compare raw artifacts: numbers, screenshots, terminal transcripts, HTTP responses, profiles, heap snapshots, or test output.
Return exactly one verdict: `VERIFIED`, `NOT VERIFIED`, or `INCONCLUSIVE`.
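
The first steps of the workflow can be sketched in Python as follows. This is a minimal illustration rather than a prescribed implementation: the `Claim` dataclass, the `measure` helper, and the two `*_sum` functions are hypothetical names invented for the example, and the threshold of 0.20 is an arbitrary placeholder.

```python
import statistics
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Claim:
    """A claim restated in falsifiable form (hypothetical structure)."""
    condition: str    # when the claim is supposed to hold
    metric: str       # what is measured
    threshold: float  # minimum relative improvement that counts


def measure(fn: Callable[[], object], warmup: int = 3, runs: int = 10) -> float:
    """Return the median wall-clock seconds per call, after warmup calls."""
    for _ in range(warmup):
        fn()  # warmup results are discarded
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def old_sum(n: int = 50_000) -> int:
    """Stand-in for the old state (baseline)."""
    total = 0
    for i in range(n):
        total += i
    return total


def new_sum(n: int = 50_000) -> int:
    """Stand-in for the changed state (treatment)."""
    return sum(range(n))


claim = Claim(
    condition="summing 50,000 integers on this machine",
    metric="median wall time (s)",
    threshold=0.20,
)

# Same helper, same warmup, same run count for both states.
baseline = measure(old_sum)
treatment = measure(new_sum)
print(f"{claim.metric}: baseline={baseline:.6f}, treatment={treatment:.6f}")
```

Using one `measure` function for both states keeps the command, warmup, and run count identical, so any difference comes from the change under test rather than from the harness.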

Local Surfaces

Code behavior: focused unit/integration tests or a minimal repro script.
CLI/TUI behavior: `control-cli`, terminal transcript, or demo recording.
UI behavior: `control-ui`, screenshots, accessibility snapshots, or browser traces.
API behavior: local HTTP/RPC request and response diff.
Performance: same-machine baseline/treatment timings or CPU profiles.
Memory: heap snapshots before and after the suspected operation.
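
For the memory surface, one possible approach in Python is to compare two `tracemalloc` snapshots taken around the suspected operation. The sketch below assumes the code under test is Python and that `tracemalloc` is not already tracing; `heap_growth_bytes`, `_cache`, and `leaky_operation` are hypothetical names used only for illustration.

```python
import tracemalloc
from typing import Callable

_cache: list = []  # hypothetical module-level state that retains memory


def leaky_operation() -> None:
    """Stand-in for the suspected operation: retains about 1 MB per call."""
    _cache.append(bytearray(1_000_000))


def heap_growth_bytes(operation: Callable[[], object]) -> int:
    """Return net traced bytes still allocated after running `operation`."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        operation()
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # Each StatisticDiff carries size_diff: bytes gained or lost per source line.
    diffs = after.compare_to(before, "lineno")
    return sum(d.size_diff for d in diffs)


print(f"heap growth: {heap_growth_bytes(leaky_operation)} bytes")
```

Running the same helper against the baseline and treatment builds gives two comparable numbers instead of an impression that memory "looks better".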

Artifact Layout

When safe to write artifacts:

/tmp/verify-this/<claim-slug>/
├── claim.md
├── timeline.md
├── baseline/
├── treatment/
├── diff/
└── verdict.md

If artifacts may contain sensitive code, prompts, screenshots, HTTP bodies, or heap data, keep only the minimal inline evidence unless the user agrees to disk storage.
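
A small helper can create the layout shown above once the user has agreed to disk storage. This is a sketch only: `create_artifact_layout` is a hypothetical function name, and the `root` parameter is an assumption added so the location can be overridden.

```python
from pathlib import Path


def create_artifact_layout(claim_slug: str, root: Path = Path("/tmp/verify-this")) -> Path:
    """Create the artifact directory for one claim and return its path."""
    claim_dir = root / claim_slug
    # Directories that hold raw evidence for each state and their comparison.
    for sub in ("baseline", "treatment", "diff"):
        (claim_dir / sub).mkdir(parents=True, exist_ok=True)
    # Empty placeholder files; existing content is left untouched.
    for name in ("claim.md", "timeline.md", "verdict.md"):
        (claim_dir / name).touch(exist_ok=True)
    return claim_dir
```

Calling `create_artifact_layout("startup-time")` would create `/tmp/verify-this/startup-time/` with the three subdirectories and three Markdown files. The call is safe to repeat because nothing is overwritten.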

Verdict Rules

`VERIFIED`: baseline and treatment differ in the predicted direction, by the claimed threshold, with no obvious confound.
`NOT VERIFIED`: the behavior is unchanged, moves the wrong way, or misses the threshold.
`INCONCLUSIVE`: no valid baseline, noisy signal, failed measurement, or an environment difference invalidates the comparison.
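
These rules can be expressed as a small decision function. The version below is a sketch under stated assumptions: the metric is a single positive number, the threshold is a relative improvement (0.20 means 20%), and the caller sets `comparison_valid` to `False` when noise or an environment difference invalidates the run. The function name and parameters are hypothetical.

```python
from typing import Optional


def decide_verdict(
    baseline: Optional[float],
    treatment: Optional[float],
    threshold: float,
    lower_is_better: bool = True,
    comparison_valid: bool = True,
) -> str:
    """Map one baseline/treatment pair to exactly one verdict string."""
    # No valid baseline, a failed measurement, or an invalid comparison.
    if not comparison_valid or baseline is None or treatment is None:
        return "INCONCLUSIVE"
    if baseline <= 0:
        return "INCONCLUSIVE"  # relative change is undefined for this sketch

    # Positive improvement means movement in the predicted direction.
    if lower_is_better:
        improvement = (baseline - treatment) / baseline
    else:
        improvement = (treatment - baseline) / baseline

    # Unchanged, wrong direction, or short of the threshold all fail the claim.
    return "VERIFIED" if improvement >= threshold else "NOT VERIFIED"
```

For example, `decide_verdict(100, 70, 0.20)` returns `"VERIFIED"` because the metric dropped by 30%, while `decide_verdict(100, 95, 0.20)` returns `"NOT VERIFIED"` because a 5% drop misses the threshold.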

Output

Use this shape:

VERIFIED | NOT VERIFIED | INCONCLUSIVE
Claim: <falsifiable claim>

Evidence:
<metric/artifact>: baseline=<...>, treatment=<...>, delta=<...>, threshold=<...>

Reasoning:
<one tight paragraph naming the evidence and any confounds>

Do not soften a negative result. A clear `NOT VERIFIED` is useful.
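
As an illustration, the output shape can be produced by a short formatter. The function name `format_report` and its parameters are hypothetical, and the numbers in the example call are made up to show the layout; they are not real measurements.

```python
ALLOWED_VERDICTS = ("VERIFIED", "NOT VERIFIED", "INCONCLUSIVE")


def format_report(
    verdict: str,
    claim: str,
    metric: str,
    baseline: float,
    treatment: float,
    threshold: str,
    reasoning: str,
) -> str:
    """Render a verdict report in the shape described above."""
    if verdict not in ALLOWED_VERDICTS:
        raise ValueError(f"verdict must be one of {ALLOWED_VERDICTS}")
    delta = treatment - baseline
    return "\n".join([
        verdict,
        f"Claim: {claim}",
        "",
        "Evidence:",
        f"{metric}: baseline={baseline}, treatment={treatment}, "
        f"delta={delta}, threshold={threshold}",
        "",
        "Reasoning:",
        reasoning,
    ])


# Illustrative values only.
report = format_report(
    verdict="NOT VERIFIED",
    claim="Cold start is at least 20% faster after the change",
    metric="median cold start (ms)",
    baseline=412,
    treatment=398,
    threshold="-20%",
    reasoning="Cold start moved in the predicted direction but fell well short "
              "of the claimed threshold. No environment differences were observed.",
)
print(report)
```

Rejecting any verdict outside the three allowed strings keeps hedged labels such as "mostly verified" out of the report.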
