cc-canary-html

cc-canary-html — HTML nerf-detection report

Same analysis as /cc-canary, rendered as HTML.
The bundled script scripts/compute_stats.py does the work in ~2.5s: it scans JSONLs, runs inflection + transition-day detection, and builds pre/post aggregates, cross-version comparison, hour-of-day, word frequency, three-period thinking depth, visibility transition, per-turn rates, and abnormalities. It then renders a complete HTML skeleton with every table filled and ASCII bar charts preserved in monospace <pre> blocks. Narrative slots are marked <!-- C: ... -->.
Default window: 60d. Accepts 7d / 14d / 30d / 60d / 90d / 180d.

Your 4-step job

1. Run the script

Bash(python3 <SKILL_DIR>/scripts/compute_stats.py --window {window} --render-html /tmp/cc-canary-skeleton-{window}.html > /dev/null 2>&1)
<SKILL_DIR>: .claude/skills/cc-canary-html/ → fallback to ~/.claude/skills/cc-canary-html/.
Flags: --window {Nd} (required); --include-agents; --min-user-words N.
If the script fails: report the error, retry once with --include-agents, else stop. Never fall back to hand-computation.
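The lookup order, flag handling, and single retry described above can be sketched in Python. This is only an illustration under stated assumptions: the helper names (resolve_skill_dir, build_command, run_with_one_retry) are hypothetical and not part of the skill, and the sketch assumes a non-zero exit status is how a script failure shows up.

```python
from pathlib import Path
from typing import List
import subprocess

ALLOWED_WINDOWS = {"7d", "14d", "30d", "60d", "90d", "180d"}
SKILL_SUBPATH = Path(".claude/skills/cc-canary-html")


def resolve_skill_dir(cwd: Path, home: Path) -> Path:
    """Prefer the project-local skill directory, then fall back to the home one."""
    for base in (cwd, home):
        candidate = base / SKILL_SUBPATH
        if (candidate / "scripts" / "compute_stats.py").is_file():
            return candidate
    raise FileNotFoundError("compute_stats.py not found in either skill directory")


def build_command(skill_dir: Path, window: str, include_agents: bool = False) -> List[str]:
    """Build the argv for compute_stats.py using only the flags the text lists."""
    if window not in ALLOWED_WINDOWS:
        raise ValueError(f"unsupported window: {window}")
    cmd = [
        "python3",
        str(skill_dir / "scripts" / "compute_stats.py"),
        "--window", window,
        "--render-html", f"/tmp/cc-canary-skeleton-{window}.html",
    ]
    if include_agents:
        cmd.append("--include-agents")
    return cmd


def run_with_one_retry(skill_dir: Path, window: str, runner=subprocess.run) -> bool:
    """Run once; on failure report it and retry once with --include-agents."""
    for include_agents in (False, True):
        result = runner(
            build_command(skill_dir, window, include_agents),
            stdout=subprocess.DEVNULL,  # mirrors the `> /dev/null 2>&1` redirect
            stderr=subprocess.DEVNULL,
        )
        if result.returncode == 0:
            return True
        print(f"compute_stats.py exited with status {result.returncode}")
    return False  # caller stops here; no hand-computed fallback
```

Passing the runner as a parameter keeps the retry logic checkable without executing the real script.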

2. Read the HTML skeleton

Read /tmp/cc-canary-skeleton-{window}.html

3. Fill every <!-- C: ... --> placeholder and save

Replace each placeholder comment with HTML-safe narrative text. Escape <, >, & in anything you write. Keep all tables, <pre> bar charts, and existing HTML untouched. Save as:
Write ./cc-canary-{YYYY-MM-DD}.html
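A minimal Python sketch of the substitution step follows. It assumes the text after "C:" in each comment is a slot name such as summary, which the source does not confirm; the function name fill_placeholders is hypothetical. Only the comment itself is replaced, so tables and <pre> blocks pass through unchanged.

```python
import html
import re

# Matches <!-- C: slot-name --> and captures the (assumed) slot name.
PLACEHOLDER = re.compile(r"<!--\s*C:\s*(.*?)\s*-->", re.DOTALL)


def fill_placeholders(skeleton: str, narratives: dict) -> str:
    """Replace each placeholder comment with escaped narrative text."""
    def substitute(match):
        slot = match.group(1)
        if slot not in narratives:
            return match.group(0)  # leave unknown slots visible for a later check
        # quote=False escapes only <, > and &, the three characters named above.
        return html.escape(narratives[slot], quote=False)

    return PLACEHOLDER.sub(substitute, skeleton)
```

For example, fill_placeholders("<p><!-- C: summary --></p>", {"summary": "p50 < 3 & rising"}) yields "<p>p50 &lt; 3 &amp; rising</p>".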

4. Open in the browser

Detect platform via Bash(uname):
  • Darwin → Bash(open ./cc-canary-{YYYY-MM-DD}.html)
  • Linux → Bash(xdg-open ./cc-canary-{YYYY-MM-DD}.html)
  • anything else → Bash(start ./cc-canary-{YYYY-MM-DD}.html)
If the open command fails, just print the absolute path; don't error.
End your message with:
Wrote /Users/.../cc-canary-{date}.html — opened in browser.
(or … open manually. if open failed).
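The platform dispatch above could look like this in Python. It is a sketch, not the skill's implementation: open_command and try_open are hypothetical names, and platform.system() stands in for uname.

```python
import platform
import subprocess


def open_command(path: str, system: str = "") -> list:
    """Pick the opener for the platform, mirroring the three cases above."""
    system = system or platform.system()
    if system == "Darwin":
        return ["open", path]
    if system == "Linux":
        return ["xdg-open", path]
    return ["start", path]


def try_open(path: str, system: str = "", runner=subprocess.run) -> bool:
    """Return True if the opener ran cleanly; never raise on failure."""
    try:
        return runner(open_command(path, system)).returncode == 0
    except OSError:
        # Covers a missing opener binary. Note that `start` is a cmd.exe
        # builtin, so on Windows it would need a shell to run at all.
        return False
```

When try_open returns False, the caller prints the absolute path and ends with the "open manually" form of the closing line.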

Narrative placeholders

Same set as /cc-canary:
  • verdict-line: HOLDING / SUSPECTED REGRESSION / CONFIRMED REGRESSION / INCONCLUSIVE + brief justification
  • summary: 1–2 sentences, terse: what moved and by how much
  • timeline: 1–2 paragraphs
  • xv-para: 1 paragraph on cross-version (if §2 is present)
  • finding-N-class × up to 5: inline classification, one of model-side | user-side | ambiguous
  • finding-N-reason × up to 5: 2–3 sentences max, evidence-first
  • root-cause: 3–5 paragraphs
  • what-would-help: 2–4 concrete bullets
  • appendix-a1…a4, b, c, d, e, f, g, h: 1 paragraph each
  • meta-note: 2–5 sentences, first person, honest

Verdict calibration

  • HOLDING: ≤1 model-side signal
  • SUSPECTED REGRESSION: 2–3 model-side signals
  • CONFIRMED REGRESSION: ≥3 model-side signals + non-empty cross-version showing decline + session_count ≥ 15 + ≥2 models + inflection.gap_sigma ≥ 1.0
  • INCONCLUSIVE: session_count < 15 OR inflection.method == "fallback_split_half" with overlapping confounds
Cap at SUSPECTED when: only one model; <15 sessions; single-project with project starting mid-window; inflection coincides with a visible user-side event.
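One possible reading of these rules as a decision function is sketched below. The rules do not state their precedence, so this sketch assumes INCONCLUSIVE is checked first (which makes the "<15 sessions" cap moot) and that signal counts above 3 without the full checklist stay at SUSPECTED. The Stats fields are hypothetical names, not the script's actual output schema.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    model_side_signals: int
    session_count: int
    model_count: int
    gap_sigma: float
    cross_version_decline: bool  # non-empty cross-version data showing decline
    fallback_split_half: bool = False
    overlapping_confounds: bool = False
    single_project_mid_window: bool = False
    user_side_event_at_inflection: bool = False


def verdict(s: Stats) -> str:
    # Assumed precedence: too little data or a confounded fallback split wins.
    if s.session_count < 15 or (s.fallback_split_half and s.overlapping_confounds):
        return "INCONCLUSIVE"
    if s.model_side_signals <= 1:
        return "HOLDING"
    full_checklist = (
        s.model_side_signals >= 3
        and s.cross_version_decline
        and s.model_count >= 2
        and s.gap_sigma >= 1.0
    )
    capped = (
        s.model_count < 2
        or s.single_project_mid_window
        or s.user_side_event_at_inflection
        or s.fallback_split_half
    )
    if full_checklist and not capped:
        return "CONFIRMED REGRESSION"
    return "SUSPECTED REGRESSION"
```

Keeping the cap conditions in one expression makes it easy to see why a run with three signals can still land on SUSPECTED.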

Hard rules

  • Never read, grep, or glob ~/.claude/projects/**/*.jsonl. Never run jq / awk / wc on session files. The script owns all of that.
  • Never touch existing HTML, tables, or <pre> blocks; they came from real data.
  • HTML-escape any narrative text you insert (<, >, & → entities).
  • Every finding gets a classification label.
  • Hedge when cross-version is empty or session_count < 15.
  • Do not issue a CONFIRMED REGRESSION verdict without the full checklist.
  • Do not save the skeleton as-is; replace every <!-- C: ... --> first.
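The last rule lends itself to a mechanical pre-save check. The sketch below is an assumption about how one might enforce it, with hypothetical helper names; it only looks for comments that still begin with "C:".

```python
import re

# Any comment that still starts with "C:" is an unfilled narrative slot.
UNFILLED = re.compile(r"<!--\s*C:.*?-->", re.DOTALL)


def unfilled_placeholders(html_text: str) -> list:
    """Return every placeholder comment still present in the document."""
    return UNFILLED.findall(html_text)


def ensure_ready_to_save(html_text: str) -> None:
    """Raise if the report would be saved with narrative slots still empty."""
    leftover = unfilled_placeholders(html_text)
    if leftover:
        raise ValueError(f"{len(leftover)} placeholder(s) still unfilled: {leftover[:3]}")
```

Ordinary HTML comments without the "C:" prefix are ignored, so existing markup does not trigger the check.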

Failure modes

  • Script import error → check python3 -V ≥ 3.8; retry once with --include-agents; else stop.
  • Skeleton < 5KB → likely no sessions in window.
  • inflection.method == fallback_split_half → state it; cap at SUSPECTED.
  • Cross-version Δ None → div-by-zero when model-A value is 0; note the confound.
  • Browser-open command fails → don't error; print the path and move on.
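Two of these failure modes can be illustrated with a short Python sketch. The relative-change formula is an assumption about how the cross-version Δ is computed, 5KB is read here as 5 × 1024 bytes, and all names are hypothetical.

```python
import os

MIN_SKELETON_BYTES = 5 * 1024  # assumed meaning of "5KB"


def skeleton_looks_empty(path: str) -> bool:
    """A very small skeleton suggests no sessions fell inside the window."""
    return os.path.getsize(path) < MIN_SKELETON_BYTES


def relative_delta(model_a: float, model_b: float):
    """Relative change from model A to model B; None when A is 0."""
    if model_a == 0:
        return None  # dividing by a zero baseline is undefined
    return (model_b - model_a) / model_a


def describe_delta(delta) -> str:
    """Render a delta for narrative text, flagging the zero-baseline case."""
    if delta is None:
        return "n/a (model-A baseline is 0)"
    return f"{delta:+.1%}"
```

A None delta should be mentioned in the narrative as a confound rather than silently dropped.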