printing-press-retro

/printing-press-retro


Analyze a Printing Press session to find ways to improve the system that produces CLIs — the Go binary, templates, skills, and catalog. Not fixes to the specific CLI that was just printed, but improvements so the next CLI comes out stronger.
It is a non-goal for the Printing Press to produce flawless CLIs without manual tweaks. That's the nature of the system. We expect agents to reason over the generated CLI, customize for the specific API, build novel features, and iterate. Some hand-built work in every run is normal.
The retro's job is to find the subset of manual work where the machine could have realistically raised the floor — given the agent a better starting point, prevented the issue entirely, or eliminated friction that would recur on the next CLI. Two clear cases qualify:
  1. The machine could have completely prevented the issue, and the pattern is generalizable across many printed CLIs. File it.
  2. The machine could have raised the floor meaningfully — better default, partial scaffold, helper that absorbs the boilerplate — across multiple CLIs you can name with evidence. File it.
Otherwise, the manual work is normal iteration and should not generate a finding. Some items make it back as machine fixes; not all. The retro is the filter that distinguishes the two.
The retro creates a GitHub issue on the printing-press repo with the findings that survive triage and the adversarial check, plus artifacts, so maintainers (or an AI agent) can fix the Printing Press.

Terminology


  • The Printing Press: The whole system that produces CLIs. Use this name in all user-facing output (issues, retros, prompts). It has four subsystems:
    • Generator — templates that emit Go code (internal/generator/)
    • Scorer — tools that grade the output: verify, dogfood, scorecard
    • Skills — SKILL.md instructions that guide Claude during generation
    • Binary — the Go CLI itself: commands, flags, parsers (cmd/printing-press/)
  • Printed CLI: A CLI produced by the Printing Press for a specific API (e.g., notion-pp-cli). Printed-CLI fixes only help that one CLI.
Use "the Printing Press" when talking about the system. Use the subsystem name when pointing a developer at what to fix — "fix the scorer" and "fix the generator" are different PRs.

Cardinal rules


  • Default is "don't change the machine." The Printing Press is mature — 30+ CLIs printed, most templates exercised across many shapes. The burden of proof is on the finding, not on the Skip path. Most things you encountered while printing one CLI are that CLI's quirks, iteration noise, or upstream API behavior — not generator gaps. Propose a machine change only when cross-CLI evidence is concrete and the finding survives the Phase 3 adversarial check (Step G).
  • A retro of three sharp findings is more valuable than ten mixed-quality findings. Each filed finding spends maintainer attention. If you find yourself writing "every finding warrants action" or producing zero drops and zero skips, stop and re-triage — that outcome is the failure mode this skill exists to prevent.
  • The retro proposes Printing Press changes that help multiple printed CLIs. Don't propose direct edits to the one CLI that just shipped, and don't propose machine changes whose value is unique to this CLI's quirks — those are printed-CLI fixes wearing a generator costume.
  • Never upload un-scrubbed artifacts. All artifacts go through the secrets scrub before upload.
  • Never modify source directories. Manuscripts and library directories are read-only. Scrub operations work on temporary copies.
  • Never skip the secrets scrub, even if the generation pipeline already ran one. Defense in depth.
  • Never work around a scorer bug in the Printing Press. If a scoring tool penalizes something incorrectly, the fix goes in the scoring tool.

Setup


<!-- RETRO_SETUP_START -->
Path-only setup — no binary detection required.

The retro skill reads manuscripts and runs gh/curl. It does not invoke the printing-press binary. This avoids aborting for users who installed the plugin but not the Go binary.

```bash
_scope_dir="$(git rev-parse --show-toplevel 2>/dev/null || echo "$PWD")"
_scope_dir="$(cd "$_scope_dir" && pwd -P)"
PRESS_HOME="$HOME/printing-press"
PRESS_MANUSCRIPTS="$PRESS_HOME/manuscripts"
PRESS_LIBRARY="$PRESS_HOME/library"
RETRO_SCRATCH_DIR="/tmp/printing-press/retro"
mkdir -p "$PRESS_MANUSCRIPTS" "$PRESS_LIBRARY" "$RETRO_SCRATCH_DIR"

# Detect whether we're inside the printing-press repo
IN_REPO=false
if [ -f "$_scope_dir/cmd/printing-press/main.go" ]; then
  IN_REPO=true
  REPO_ROOT="$_scope_dir"
  echo "Running from printing-press repo: $REPO_ROOT"
fi
```
<!-- RETRO_SETUP_END -->

Guard rails


Nothing to retro


```bash
if [ ! -d "$PRESS_MANUSCRIPTS" ] || [ -z "$(ls -A "$PRESS_MANUSCRIPTS" 2>/dev/null)" ]; then
  echo "No manuscripts found. Run /printing-press first to generate a CLI."
  exit 1
fi
```

Resolve which API


If the user passed an API name as an argument, use that. Validate for path traversal:

```bash
# Reject names with /, \, or ..
if echo "$USER_API_NAME" | grep -qE '[/\\]|\.\.'; then
  echo "Invalid API name: '$USER_API_NAME'. Names cannot contain path separators or '..'."
  exit 1
fi

# Verify resolved path stays under PRESS_MANUSCRIPTS
RESOLVED="$(cd "$PRESS_MANUSCRIPTS/$USER_API_NAME" 2>/dev/null && pwd -P)"
case "$RESOLVED" in
  "$PRESS_MANUSCRIPTS"/*) ;;  # OK
  *) echo "Invalid API name: path resolves outside manuscripts directory."; exit 1 ;;
esac
```
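The traversal check can also be factored into a reusable predicate and exercised against sample names. A minimal sketch; the function name is illustrative, not part of the skill:

```shell
# Return 0 for a safe API name, 1 for anything containing
# path separators, "..", or nothing at all.
is_safe_api_name() {
  case "$1" in
    *..*|*/*|*\\*|"") return 1 ;;  # reject .., /, \, and empty names
    *) return 0 ;;
  esac
}

is_safe_api_name "notion" && echo "notion: ok"
is_safe_api_name "../etc" || echo "../etc: rejected"
is_safe_api_name "a/b"    || echo "a/b: rejected"
```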

If no API name was provided and multiple APIs exist, list them with their most recent run dates and ask the user to choose:

```bash
echo "Multiple APIs found in manuscripts:"
for api_dir in "$PRESS_MANUSCRIPTS"/*/; do
  api_name=$(basename "$api_dir")
  latest=$(ls -t "$api_dir" 2>/dev/null | head -1)
  echo "  - $api_name (latest run: $latest)"
done
```

Use AskUserQuestion to let the user pick.

Resolve which run


If the API has multiple runs, default to the most recent. If the user specified a run ID, use that. Otherwise:

```bash
API_DIR="$PRESS_MANUSCRIPTS/$API_NAME"
RUN_ID=$(ls -t "$API_DIR" 2>/dev/null | head -1)
RUN_DIR="$API_DIR/$RUN_ID"

echo "Retro for: $API_NAME (run $RUN_ID)"
echo "Manuscripts: $RUN_DIR"
```

Resolve CLI directory


```bash
API_SLUG="$API_NAME"
CLI_NAME="${API_SLUG}-pp-cli"
CLI_DIR="$PRESS_LIBRARY/$CLI_NAME"

if [ ! -d "$CLI_DIR" ]; then
  # Try without -pp-cli suffix (legacy naming)
  CLI_DIR="$PRESS_LIBRARY/$API_NAME"
fi

if [ ! -d "$CLI_DIR" ]; then
  echo "WARNING: CLI directory not found at $PRESS_LIBRARY/$CLI_NAME"
  echo "Proceeding with manuscripts only — CLI source will not be included in artifacts."
  CLI_DIR=""
fi
```
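The same fallback logic can be written as a function and checked against a throwaway library tree, which makes the three outcomes (modern name, legacy name, missing) easy to see. The function name and temp layout here are illustrative:

```shell
# Resolve a printed CLI's directory: prefer <api>-pp-cli, fall back to
# the legacy bare name, and print nothing when neither exists.
resolve_cli_dir() {  # args: library_dir api_name
  local lib="$1" api="$2"
  if [ -d "$lib/${api}-pp-cli" ]; then
    printf '%s\n' "$lib/${api}-pp-cli"
  elif [ -d "$lib/$api" ]; then
    printf '%s\n' "$lib/$api"
  else
    printf '\n'
  fi
}

lib="$(mktemp -d)"
mkdir -p "$lib/notion-pp-cli" "$lib/legacy"
resolve_cli_dir "$lib" "notion"   # modern naming
resolve_cli_dir "$lib" "legacy"   # legacy naming
resolve_cli_dir "$lib" "missing"  # empty: manuscripts-only mode
```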

When to run


Best results come from running in the same conversation where the CLI was generated (post-shipcheck) — the retro can mine the full conversation history for errors, retries, manual edits, and discoveries.
If running in a fresh conversation, the retro proceeds with manuscript evidence only. Phase 2 marks session-dependent findings as "evidence: manuscripts only."

Phase 1: Gather evidence


Read all artifacts from the run:
  1. Research brief — $RUN_DIR/research/*brief*
  2. Absorb manifest — $RUN_DIR/research/*absorb*
  3. Shipcheck proof — $RUN_DIR/proofs/*shipcheck*
  4. Build log — $RUN_DIR/proofs/*build-log* (if it exists)
  5. Live smoke log — $RUN_DIR/proofs/*live-smoke* (if it exists)
  6. The generated CLI — $CLI_DIR/ (if available)
Also gather the scorecard, verify pass rate, and dogfood report (from the shipcheck proof, or by re-running the tools if IN_REPO is true and the binary is available).
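Gathering the artifact list above can be sketched as a small helper that tolerates the optional artifacts. The helper name and the fixture files are illustrative only:

```shell
# Print every Phase 1 artifact that actually exists under a run directory,
# silently skipping the optional ones (build log, live smoke log).
list_run_artifacts() {  # arg: run_dir
  local run="$1" f
  for f in "$run"/research/*brief* "$run"/research/*absorb* \
           "$run"/proofs/*shipcheck* "$run"/proofs/*build-log* \
           "$run"/proofs/*live-smoke*; do
    if [ -e "$f" ]; then printf '%s\n' "$f"; fi
  done
}

run="$(mktemp -d)"
mkdir -p "$run/research" "$run/proofs"
touch "$run/research/api-brief.md" "$run/proofs/shipcheck.md"
list_run_artifacts "$run"  # prints the brief and shipcheck paths only
```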

Phase 2: Mine the session


Scan the conversation history for six categories of signal and produce a candidate list. The candidate list is not the finding list — Phase 2.5 triage will cull it and Phase 3 will further drop weak survivors. Most candidates will not survive.
While collecting, distinguish:
  • Iteration noise — one-off retries, typos, normal trial-and-error during a long generation. Skip these even at the candidate stage; they don't survive triage.
  • Per-CLI quirks — behavior tied to this API's shape (auth oddity, undocumented endpoint, vendor-specific envelope) that wouldn't recur on another spec. Add to the candidate list with a "looks per-CLI" tag — most will be dropped at triage.
  • Systemic friction — patterns that would plausibly recur on the next CLI (template gap, default that needs to change, skill instruction that misled you). These are what the retro exists to surface.
If running in a fresh conversation without generation history: Note this and proceed with manuscript evidence only. Focus on what the manuscripts reveal — scorecard gaps, verify failures, dogfood issues, and obvious template patterns in the CLI source. Mark session-dependent findings as "evidence: manuscripts only."

2a. Errors and retries


Any time a command failed and was re-run, a build broke, or the Printing Press produced code that didn't compile. What broke and what fixed it?

2b. Manual code edits


Manual edits during iteration are normal — agents reason over the generated CLI and tweak. A single edit to handle this CLI's quirk is the workflow.
For each manual edit, ask: could the machine have raised the floor here?
  • Could the machine have completely prevented this edit? Default was wrong for most APIs, template emitted broken code, parser missed a common pattern. If yes AND the same edit would be needed on multiple CLIs you can name with evidence → candidate.
  • Could the machine have given a better starting point that made the edit smaller, simpler, or skippable in common cases? Even if you'd still tweak, raising the floor compounds across future CLIs. If yes AND generalizable → candidate.
  • Was this just per-API customization the agent was expected to do? Drop.
  • Was this iteration noise (typo, retry, transient confusion)? Drop.
The triage question is whether the machine raising the floor would compound across future CLIs — not whether this one CLI would have shipped a few lines lighter.

2c. Features built from scratch


Hand-built features (transcendence commands, novel commands, helper packages for secondary APIs) are part of the workflow — agents build the domain-specific value layer on top of the API surface the machine emits. Building features by hand is not by itself a finding.
For each hand-built feature, ask: could the machine have raised the floor for this kind of feature?
  • Could the machine have emitted a working default version, even if you'd still customize it? (E.g., every list+detail API benefits from a summary aggregation that the machine could scaffold from the spec.) Candidate, if generalizable across multiple named APIs.
  • Could the machine have emitted scaffolding, types, or helpers that would have cut the build effort meaningfully? (E.g., a typed secondary-client template for combo CLIs, a fanout-aggregation helper.) Candidate, if generalizable.
  • Is this genuinely custom domain logic the machine couldn't realistically generate from a spec? (E.g., booking a slot is custom orchestration; the machine can emit the underlying endpoints but not the choreography.) Drop — the SKILL is the right place to share the recipe, not the generator.
The "raises the floor" test separates "machine fix" from "SKILL recipe": if the machine's contribution would still leave significant per-CLI work, the recipe belongs in the SKILL so the next agent knows the pattern; if the machine could absorb the boilerplate cleanly, it's a generator template.

2d. Recurring friction


Work that happens on every generation, not just this one. For each: is this inherent to the approach, or can the Printing Press eliminate it?
Propose at least two possible fixes at different levels (generator templates, binary post-processing, skill instruction) and assess which is most durable.

2e. Discovered optimizations


Improvements noticed during the session — UX ideas, performance improvements, new command patterns, output format improvements. Could this optimization be detected automatically and applied by the Printing Press?

2f. Scorer accuracy audit


Before proposing Printing Press fixes to improve scores, check whether the scoring itself is correct. Changing the Printing Press to satisfy a broken scorer is worse than doing nothing.
For each score penalty from dogfood, verify, and scorecard:
  1. Trace the scorer's logic. Read the scoring tool's source code to understand exactly what it checks. Don't guess.
  2. Test the scorer's assumption against reality. Does the CLI actually have the problem the scorer claims?
  3. Classify the penalty:
    • Scorer is correct — the CLI genuinely has this problem.
    • Scorer is wrong — the CLI is fine; the scoring tool has a bug.
    • Scorer is partially right — both could be better.
Common scorer bugs: name derivation mismatches, grep-based detection missing patterns, file exclusions too broad, section-counting heuristics.
The scorer audit is not optional. Every finding from a score penalty must have a "Scorer correct?" assessment before proposing a fix direction.
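Step 2 of the audit can often be as simple as re-running the scorer's own check by hand against the shipped CLI. A toy sketch, assuming a grep-based scorer that penalizes a "missing --limit" flag; the check and the help-text fixture are invented for illustration, not the real scorer:

```shell
# Reproduce a grep-based scorer check against the CLI's actual help text
# to classify the penalty as correct or a false positive.
cli_help="$(mktemp)"
cat > "$cli_help" <<'EOF'
Usage: notion-pp-cli pages list [--limit N] [--json]
EOF

if grep -q -- '--limit' "$cli_help"; then
  echo "CLI has --limit: the penalty is a false positive (scorer bug)"
else
  echo "CLI lacks --limit: the scorer is correct"
fi
```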

2g. Combo CLI priority audit


Only runs when the briefing named 2+ sources. Check $RUN_DIR/source-priority.json (from the Multi-Source Priority Gate in the main skill). If it doesn't exist but the briefing or user command clearly listed multiple services, that's itself a finding: the priority gate didn't fire when it should have.
For runs with a source-priority.json, cross-reference it against the absorb manifest and the shipped CLI:
  1. Command count per source. Count commands attributed to each named source in the manifest. The primary should have at least as many as any secondary. If it has fewer, that's a priority inversion and becomes a finding — even if the user approved the manifest, it means the skill's discovery path for the primary failed silently.
  2. Auth scoping. If the primary was declared free in the priority gate but the shipped CLI requires a paid key for the primary's headline commands, that's a finding — the economics check either didn't run or didn't route the paid key correctly to secondary-only scope.
  3. README leadership. The primary should lead the README and --help. If a secondary is the first thing the user sees, flag it.
Each of these is a skill instruction gap category finding. The durable fix lives in skills/printing-press/SKILL.md (the Multi-Source Priority Gate, the Priority inversion check before Phase Gate 1.5, and the brief's ## Source Priority section) or in the generator if README ordering is template-driven.
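Check 1 can be sketched against a toy manifest. The one-source-label-per-line format used here is an assumption for the sketch, not the real manifest schema:

```shell
# Count commands attributed to each source and flag a priority inversion
# (primary having fewer commands than a secondary).
manifest="$(mktemp)"
cat > "$manifest" <<'EOF'
primary: events list
primary: events get
secondary: odds list
EOF

primary_n="$(grep -c '^primary:' "$manifest")"
secondary_n="$(grep -c '^secondary:' "$manifest")"

if [ "$primary_n" -lt "$secondary_n" ]; then
  echo "priority inversion: primary=$primary_n < secondary=$secondary_n"
else
  echo "ok: primary=$primary_n >= secondary=$secondary_n"
fi
```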

Phase 2.5: Triage candidates


Before Phase 3 spends deep analysis on each candidate, run a fast triage to drop candidates that don't justify the deeper look. Most candidates should die here. The retro is a filter, not a funnel — if everything from Phase 2 makes it to Phase 3 unchanged, triage isn't doing its job.
For each candidate, ask in order:
  1. Was this iteration noise? Normal trial-and-error during generation — one-off retry, typo recovery, agent forgetting a flag, transient network blip. Drop.
  2. Is this a printed-CLI fix? The fix lives in ~/printing-press/library/<api>/ and helps only this one CLI. If the proposed change is "edit this command in this CLI" or "regenerate after fixing the spec," it's not a retro finding — it's a polish pass on that CLI. Drop.
  3. Is this an upstream API quirk? The vendor returns null instead of 404, or ignores a query param the docs claim to honor, or has rate limits the spec doesn't declare. The Printing Press doesn't fix vendors. If the only fix is "work around this in the generator for every CLI," that's almost always wrong; if it's "let one CLI work around it," that's a printed-CLI fix. Drop.
  4. Is the only evidence "I noticed this once"? A one-time observation that you can't connect to a recurring pattern across other CLIs is a candidate for Drop, not a P3. P3 means "low priority systemic finding," not "I want to record this somewhere."
  5. Does the same finding appear in 2+ prior retros without being implemented? Don't re-raise at the same priority. Either drop it (the cost-benefit math has been "no" twice and the retro is becoming a wishlist), or reframe as a smaller incremental fix that addresses part of the friction. Search:
    grep -l "<finding keywords>" ~/printing-press/manuscripts/*/proofs/*-retro-*.md
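Question 5's search can be wrapped so the hit count drives the drop-or-reframe decision. A sketch over a fixture tree; the directory argument stands in for ~/printing-press/manuscripts and the helper name is illustrative:

```shell
# Count prior retro files that mention a finding keyword.
prior_retro_hits() {  # args: keyword manuscripts_dir
  grep -l "$1" "$2"/*/proofs/*-retro-*.md 2>/dev/null | wc -l | tr -d ' '
}

ms="$(mktemp -d)"
mkdir -p "$ms/notion/proofs" "$ms/stripe/proofs"
echo "finding: flat pagination default" > "$ms/notion/proofs/20240101-retro-a.md"
echo "finding: flat pagination default" > "$ms/stripe/proofs/20240102-retro-b.md"

hits="$(prior_retro_hits 'pagination default' "$ms")"
echo "$hits"  # 2: raised twice before, so drop or reframe per question 5
```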
Survivors of these five questions go to Phase 3. Dropped candidates are recorded as one-line entries in the retro's "Dropped at triage" section — they exist for your own discipline check and for the maintainer to see triage actually ran.
Anti-pattern to avoid. A recent Pagliacci retro produced "Skip: None. Every finding warrants action." That sentence is the failure mode this triage exists to prevent. Two of those findings (snake_case in Use:, and a root.go Short: rewrite that the SKILL already documents as a manual step) were classic per-CLI / instructional candidates that should have been dropped here. If you find yourself writing "every finding warrants action," stop and re-run triage.

Phase 3: Classify findings


For each candidate that survived Phase 2.5 triage, answer these seven questions. Question 5 has seven sub-steps (A through G); Step G is the adversarial check. Findings that fail Step G drop out — they don't get a priority, they don't go in the Do/Skip tables, they go on the dropped-candidates list with the reason.
1. What happened? One sentence — the symptom, not the fix.
2. Is the scorer correct? (mandatory for score-penalty findings)
  • Scorer correct → fix the Printing Press (templates, binary, or skill)
  • Scorer wrong → fix the scoring tool, not the Printing Press
  • Both → fix both, label which is primary
3. What category?
| Category | Description |
| --- | --- |
| Bug | Generated code is wrong |
| Scorer bug | Scoring tool reports a false positive |
| Template gap | No template for a common pattern |
| Assumption mismatch | Printing Press assumes X but API uses Y |
| Recurring friction | Happens every generation, might be inherent |
| Missing scaffolding | Feature class the Printing Press could emit but doesn't |
| Default gap | Printing Press emits a wrong or placeholder default |
| Discovered optimization | Improvement found during use |
| Skill instruction gap | Skill told Claude wrong thing or missed a step |
4. Where in the Printing Press does this originate?
Pick exactly one component. The slug column drives the comp:<slug> label applied to the issue when filed (Phase 6), which is how agents filter related work across retros (gh issue list --label comp:<slug>).
| Component | Slug | Path |
| --- | --- | --- |
| Generator templates | generator | internal/generator/ |
| Spec parser | spec-parser | internal/spec/ |
| OpenAPI parser | openapi-parser | internal/openapi/ |
| Catalog | catalog | catalog/ |
| Main skill | skill | skills/printing-press/SKILL.md |
| Verify/dogfood/scorecard | scorer | CLI commands |
If a finding genuinely spans two components, pick the one where the durable fix lands. Don't multi-label.
5. Blast radius and fallback cost — should the Printing Press handle this?
Step A: Cross-API stress test. Test across API shapes (standard REST, proxy-envelope, RPC-style) and input methods (OpenAPI, crowd-sniffed, HAR-sniffed, no spec).
Step B: Name three concrete APIs from the catalog with direct evidence. Not "every API with multi-word resources" or "any browser-sniffed CLI." Name three specific APIs already in ~/printing-press/library/ (or the embedded catalog/ directory) where you can point to evidence the pattern exists: a path in their spec, a known endpoint shape, a header the vendor documents, an output you can reproduce. "Stripe, Notion, GitHub probably have this" is hand-waving; "Stripe (Stripe-Version header in spec line N), GitHub (X-GitHub-Api-Version on the issues endpoints), Linear (api-version on /v2/*)" is evidence. If you can name only two with evidence — or three with hand-waving — the finding drops to P3 max with a subclass:<name> annotation, or moves to Drop.
Step C: Counter-check question. Ask explicitly: "If I implemented this fix, would it actively hurt any API that doesn't have this pattern?" If yes, the fix needs a guard or condition before being P1/P2 — not a default change. Example: turning on client-side ?limit=N truncation by default would hurt APIs that need server-side pagination for correctness; it stays P2 only because it's gated on profiler-detected absence of a paginator. Without that guard the same finding is unsafe to land.
Step D: Recurrence-cost check. Search prior retros under ~/printing-press/manuscripts/*/proofs/*-retro-*.md for the same finding. If the same finding has been raised in 2+ prior retros without being implemented, the prior cost-benefit math has been "no" twice. Don't re-raise it at the same priority — either move to P3 with a "raised N times, still not justified" annotation, or reframe the finding into a smaller incremental fix that addresses part of the friction. Recurrence at the same priority is a triage failure, not stronger evidence.
Capture matched prior retros. When the search returns hits, record each as a structured tuple — retro CLI name, retro file path (or GitHub issue number if the retro file's frontmatter contains one), and a one-word classification:
  • aligned — the prior retro proposed the same fix direction. Strengthens the case; reference it in Step F.
  • contradicts — the prior retro proposed an opposing fix or chose a different default. Surface this explicitly: a maintainer reading the new finding must see the disagreement. State in one sentence why this retro reaches a different conclusion (e.g., "prior retro saw single-paginator APIs; this one saw an always-paginated API where the prior default would break").
  • extends — the prior retro raised an adjacent finding in the same component area but a different specific fix. Useful context, doesn't change the case.
These tuples flow forward into the per-finding template ("Related prior retros") in the retro doc and merge into the issue body's "Related issues" block alongside the Step 2.5 dedup scan's related-area outputs. GitHub auto-cross-links any #N issue number you write, so contradictions and alignments show up in both retro timelines without further action.
Step E: Assess fallback cost. How reliably will Claude catch and fix this across every future API? A "simple" edit Claude forgets 30% of the time means 30% ship with the defect.
Step F: Make the tradeoff. Default is don't change the machine. The burden of proof is on the finding to justify a machine change. Continue to Step G only when all three of these are true:
(a) Step B named three concrete APIs with evidence (not speculation). (b) Step D's recurrence-cost check didn't disqualify the finding. (c) Step C's counter-check didn't surface a hurts-other-APIs concern that lacks a guard.
If a finding can't clear all three, it doesn't get a priority — it goes to Drop with the specific reason ("only named 2 APIs with evidence" / "raised 3 times, still not justified" / "fix would hurt single-paginator APIs without a guard").
When the finding applies to an API subclass, include: Condition (when to activate), Guard (when to skip), Frequency estimate.
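The Step F gate reduces to a three-way conjunction over (a), (b), and (c). A sketch with illustrative names:

```shell
# A finding continues to Step G only when all three Step F conditions hold:
# (a) three APIs named with evidence, (b) recurrence check passed,
# (c) counter-check found no unguarded harm to other APIs.
survives_step_f() {  # args: three_apis_ok recurrence_ok counter_check_ok (true/false)
  [ "$1" = true ] && [ "$2" = true ] && [ "$3" = true ]
}

survives_step_f true true true   && echo "all three hold: continue to Step G"
survives_step_f true false true  || echo "Drop: raised N times, still not justified"
```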
Step G: Construct the case against filing. Before recording the finding, write 1-2 sentences arguing the opposite — what makes this look like a printed-CLI fix, an iteration artifact, or a wishlist item. Why might a maintainer close this as "works as designed" or "too narrow for a machine fix"? What's the strongest version of "this shouldn't be filed"?
If the case-against is stronger than the case-for, drop the finding. If they're roughly even, drop the finding (default direction is don't-file). Only when the case-for is clearly stronger does the finding survive to Phase 4.
This step is not a formality. It is the explicit place where weak findings die. A finding that survives Step G should be able to state, in one sentence, why the case-against fails — and that sentence is worth quoting in the retro entry.
6. Is this inherent or fixable? Push hard on whether smarter templates, a post-processing step, or better spec analysis could eliminate the friction. If inherent, propose the cheapest mitigation.
7. What is the durable fix? Prefer: template fix > binary post-processing > skill instruction.
Mark uncertainty explicitly. If you can't confidently isolate one root cause or one fix, say so — list the candidate causes (or candidate fixes) and how an implementer could disambiguate before committing. The issue body surfaces this uncertainty so the agent picking up the work doesn't lock in a wrong-but-plausible diagnosis. Confidence isn't a virtue when it's manufactured; an honest "either A or B; verify by X" is more useful than a wrong prescription.
Strip API-specific details from the proposed fix. The durable fix must work across APIs, not just the one that surfaced the finding. If the fix includes hardcoded param names (e.g., `--sport`, `--league`), date formats (e.g., `YYYYMMDD`), chunking strategies (e.g., monthly), or domain-specific logic, those are printed-CLI details leaking into the machine recommendation. The machine fix should be parameterized — driven by what the profiler detects in the spec, not by what one API happens to need.
Example of the anti-pattern:
  • Finding: "ESPN sync needs `--dates` for historical data"
  • Bad fix: "Add `--dates` with `YYYYMMDD-YYYYMMDD` format, `--sport` / `--league` flags, and monthly chunking to the sync template"
  • Good fix: "When the profiler detects a date-range query param, emit a `--dates` flag that passes the value through to the API"
The bad fix bakes ESPN's date format, scope params, and chunking strategy into the machine. The good fix lets the profiler drive behavior from the spec.
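The detection-driven shape of the good fix can be sketched in shell against a toy spec fragment (the JSON shape and the `date-range` type name here are illustrative, not the profiler's real schema):

```shell
# Toy spec fragment standing in for what the profiler reads from the OpenAPI spec.
spec='{"params":[{"name":"dates","type":"date-range"}]}'

# Emit the flag only when the spec declares a date-range param; no date format,
# scope params, or chunking strategy is baked in.
if printf '%s' "$spec" | grep -q '"type":"date-range"'; then
  echo "emit: --dates (pass-through)"
fi
```

The check keys off what the spec declares, so an API with no date-range param generates no flag at all.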

Phase 4: Prioritize


Sort survivors of Phase 3 into three buckets:
  • Do — survived Phase 3 Step G with a clear case-for. Assign a priority (P1, P2, P3) based on frequency, fallback reliability, and complexity. Scorer bugs are just findings like any other — rank them by impact alongside template gaps and parser issues.
  • Skip — survived Phase 2.5 triage but didn't clear Phase 3 (Step B couldn't name 3 APIs with evidence, Step D recurrence-cost disqualified, or Step G's case-against was stronger). State the specific step that failed. These are listed in the retro so the maintainer can see what was considered and rejected.
  • Drop — rejected at Phase 2.5 triage as iteration noise, printed-CLI fix, upstream API quirk, unproven one-off, or recurring-not-implemented. Listed as one-liners only — they don't need full analysis, they need a record so triage is auditable.
No numerical scoring formulas. State the priority reasoning in words.
Sanity check before moving to Phase 5. Look at the bucket distribution. Almost every retro should have some drops and some skips. A retro with "all Do, no Skip, no Drop" is the failure mode — re-run triage and Step G on the weakest findings. Likewise, if every Do is P1, you're not prioritizing, you're inflating; force yourself to identify the weakest "Do" and ask whether it really beats the Skip bar.

Phase 5: Write the retro


The retro document is the durable audit trail — keep all fields below. The GitHub issue body in Phase 6 will use a slim subset (action-shaped fields only); the full triage rationale lives here, in the doc that gets uploaded as an artifact and linked from the issue. See `references/issue-template.md` for the issue-body shape.
Write the full retro document using this template:

Printing Press Retro: <API name>


Session Stats


  • API: <name>
  • Spec source: <catalog/browser-sniffed/docs/HAR>
  • Scorecard: <score>/100 (<grade>)
  • Verify pass rate: <X>%
  • Fix loops: <N>
  • Manual code edits: <N>
  • Features built from scratch: <N>

Findings


1. <Title> (<category>)


  • What happened: ...
  • Scorer correct? Yes / No / Partially. [details]
  • Root cause: Component + what's specifically wrong
  • Cross-API check: Would this recur?
  • Frequency: every API / most / subclass:<name> / this API only
  • Fallback if the Printing Press doesn't fix it: ...
  • Worth a Printing Press fix? ...
  • Inherent or fixable: ...
  • Durable fix: ...
  • Test: How to verify (positive + negative)
  • Evidence: Session moment that surfaced this
  • Related prior retros: (from Phase 3 Step D; "None" if no matches)
    • `<api-slug>` retro #<issue-num-if-known> — `aligned` / `contradicts` / `extends`. <one-sentence note on what changed or what's shared>
    • ...

Prioritized Improvements


P1 — High priority


| Finding | Title | Component | Frequency | Fallback Reliability | Complexity | Guards |
| --- | --- | --- | --- | --- | --- | --- |

P2 — Medium priority


| Finding | Title | Component | Frequency | Fallback Reliability | Complexity | Guards |
| --- | --- | --- | --- | --- | --- | --- |

P3 — Low priority


| Finding | Title | Component | Frequency | Fallback Reliability | Complexity | Guards |
| --- | --- | --- | --- | --- | --- | --- |
Omit empty priority sections.

Skip


| Finding | Title | Why it didn't make it (Step B / Step D / Step G) |
| --- | --- | --- |
Findings that survived Phase 2.5 triage but failed Phase 3 — name the specific step that failed (e.g., "Step B: only 2 APIs with evidence" / "Step G: case-against stronger; mostly per-CLI"). Empty if every Phase 3 candidate filed.

Dropped at triage


| Candidate | One-liner | Drop reason |
| --- | --- | --- |
Candidates rejected at Phase 2.5. One line each. Reasons: `iteration-noise` / `printed-CLI` / `API-quirk` / `unproven-one-off` / `raised-N-times`. If this section is empty, re-check Phase 2.5 — almost every retro has some.

Work Units


(see Phase 5.5)

Anti-patterns


  • ...
  • ...

What the Printing Press Got Right


  • ...

Save the retro to manuscript proofs (always) and to the temp retro scratch
directory (always). Do not save retro documents under the source repo's
`docs/retros/` directory; the skill must work the same way for users who do not
have the repo checked out, and retro documents are issue artifacts rather than
durable repo docs.

```bash
RETRO_STAMP="$(date +%Y%m%d-%H%M%S)"
RETRO_PROOF_PATH="$PRESS_MANUSCRIPTS/$API_NAME/$RUN_ID/proofs/$RETRO_STAMP-retro-$CLI_NAME.md"
RETRO_SCRATCH_DIR="/tmp/printing-press/retro"
RETRO_SCRATCH_PATH="$RETRO_SCRATCH_DIR/$RETRO_STAMP-$API_NAME-retro.md"
mkdir -p "$(dirname "$RETRO_PROOF_PATH")" "$RETRO_SCRATCH_DIR"
```

Write the full retro document to `$RETRO_PROOF_PATH`, then copy that file to `$RETRO_SCRATCH_PATH`. This must complete before Phase 6 Step 1 copies the manuscripts directory to staging.



Phase 5.5: Plannable work units


Group related findings into coherent work units a planner could pick up directly.
For each "Do" finding or group of related findings:

WU-1: <Title> (from F1, F3, ...)


  • Priority: P1 / P2 / P3 (max priority among absorbed findings — P1 if any absorbed finding is P1, else P2 if any is P2, else P3)
  • Component: generator / openapi-parser / spec-parser / scorer / skill / catalog (must match one of the six fixed component slugs; drives the `comp:*` label applied to the issue when filed)
  • Goal: One sentence describing the outcome
  • Target: <component and area, e.g., "Generator templates in internal/generator/">
  • Acceptance criteria:
    • positive test: ...
    • negative test: ...
  • Scope boundary: What this does NOT include
  • Dependencies: Other work units that must complete first
  • Complexity: small / medium / large
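The max-priority rule above can be sketched as a small shell helper (the function name is hypothetical; priorities compare by their numeric suffix, lower number wins):

```shell
# Return the highest priority (lowest P-number) among the absorbed findings.
wu_priority() {
  min=3
  for p in "$@"; do
    n=${p#P}                                  # strip the leading "P"
    if [ "$n" -lt "$min" ]; then min=$n; fi   # keep the smallest number seen
  done
  echo "P$min"
}

wu_priority P3 P1 P2   # → P1
```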

The six fixed component slugs are: `generator` (`internal/generator/`),
`openapi-parser` (`internal/openapi/`), `spec-parser` (`internal/spec/`),
`scorer` (verify / dogfood / scorecard), `skill` (`skills/printing-press/SKILL.md`),
`catalog` (`catalog/`). If a WU genuinely spans two, pick the **primary** one — the
component where the durable fix will land. Pick exactly one; don't multi-label.
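A minimal sketch of enforcing the single-slug rule, assuming slugs are validated before the `comp:` label is applied (the helper name is illustrative):

```shell
# The six fixed component slugs from the table above.
valid_slugs="generator openapi-parser spec-parser scorer skill catalog"

check_slug() {
  case " $valid_slugs " in
    *" $1 "*) echo "comp:$1" ;;                       # known slug: emit the label
    *) echo "unknown slug: $1" >&2; return 1 ;;       # anything else is rejected
  esac
}

check_slug openapi-parser   # → comp:openapi-parser
```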

**If running from inside the printing-press repo (`IN_REPO=true`):**
Resolve target file paths using Glob and Grep tool invocations on `$REPO_ROOT` to
make work units more precise. E.g., use Glob to find `internal/generator/*.go` files,
Grep to find where sync code is generated.
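Reduced to plain shell over a throwaway fixture, the in-repo resolution step might look like this (the paths and the "sync command" marker are invented for the demo; the real skill uses the Glob and Grep tools against `$REPO_ROOT`):

```shell
# Fixture repo standing in for a printing-press checkout.
REPO_ROOT=/tmp/pp-demo-repo
mkdir -p "$REPO_ROOT/internal/generator"
printf 'package generator // emits the sync command template\n' \
  > "$REPO_ROOT/internal/generator/sync.go"

# Glob equivalent: list generator source files.
find "$REPO_ROOT/internal/generator" -name '*.go'

# Grep equivalent: locate where sync code is generated.
grep -rln "sync command" "$REPO_ROOT/internal/generator"
```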

**If running externally (`IN_REPO=false`):**
Describe target components by name (e.g., "Generator templates in `internal/generator/`")
and acceptance criteria without resolved file paths. The fixer will resolve paths when
they pick up the work.

Phase 5.6: Issue gate — are there Printing Press improvements?


After prioritization and work units are written, decide whether GitHub issues are warranted. Each WU becomes one flat top-level issue (no parent, no sub-issue hierarchy). The purpose of filing is to give someone (human or agent) something to fix in the Printing Press. If every finding is specific to this one printed CLI with nothing to change in the Printing Press, filing is noise — there's nothing to act on.
Skip filing if:
  • Every finding landed in "Skip"
  • All findings are printed-CLI-specific (manual edits that only apply to this one API and wouldn't recur across other CLIs)
  • The "Do" table is empty
File issues (one per WU) if:
  • There is at least one "Do" finding — i.e., something a maintainer or agent could act on in the Printing Press (templates, binary, skills, or scoring tools)
Use judgment. A retro that found three things but all three are "this API has a weird auth scheme no other API uses" is not worth filing. A retro that found one small template gap that would help every future CLI is worth filing.
If filing is skipped, still save the retro locally (manuscript proofs + `/tmp/printing-press/retro/`), present the findings to the user, then jump directly to Phase 6 Step 6 (present results — adjusted to show local-only paths).

Phase 6: Package, upload, and present


Step 1: Package artifacts into staging folder


Read and apply references/artifact-packaging.md through Step 4 only (create staging dir, copy, scrub, zip). Do not upload or clean up yet — the staging folder stays alive until the end of Phase 6.
The staging folder (`$STAGING_DIR`) now contains the scrubbed copies and the zips. This is both the review target and the upload source.

Step 2: Compute filing plan + confirm before publishing


This step only runs if the Phase 5.6 issue gate passed (there are Printing Press findings to act on).
Before showing the confirm prompt, run `references/issue-template.md` Steps 1, 2, and 2.5 to ensure labels exist, sort the work units, and compute the per-WU filing plan via the dedup scan against open retro-tagged issues. Each WU ends up classified as either:
  • File new — no matching open issue
  • Comment on #N — Step 2.5 found a `same` match; the new evidence will be added as a comment instead of filing a duplicate
  • File new with related issues — Step 2.5 found one or more `related-area` matches; the new issue's body will reference them via `#N` in the Related issues block
The dedup scan does not need to be bulletproof. Bias toward "file new" when uncertain — duplicates are recoverable, miscomments on the wrong issue are uglier.
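A sketch of the per-WU classification, assuming one dedup result string per WU (the `same:`/`related:` encoding is illustrative, not the actual `WU_DEDUP` format):

```shell
# Map a dedup scan result to a filing-plan action.
plan_for() {
  case "$1" in
    same:*)    echo "Comment on ${1#same:}" ;;            # duplicate found
    related:*) echo "File new + reference ${1#related:}" ;;  # adjacent issue
    *)         echo "File new" ;;                          # no match (default)
  esac
}

plan_for "same:#234"   # → Comment on #234
```

Note the default branch is "File new", matching the bias stated above: uncertainty resolves toward filing, not commenting.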
Then show the user a summary including the filing plan and ask for confirmation via `AskUserQuestion`.
Ready to submit your retro.
Here's what will happen on mvanhorn/cli-printing-press:
Filing plan:
| # | Title | Plan | Notes |
| --- | --- | --- | --- |
| 1 | <wu-1 title> | File new (P1, comp:<slug>) | No match |
| 2 | <wu-2 title> | Comment on #234 | Matches "<existing title>" |
| 3 | <wu-3 title> | File new + reference #189 | Adjacent open issue |
Each new issue carries `retro`, `priority:P<n>`, and `comp:<slug>` labels — agents filter related work across retros with `gh issue list --label comp:<slug>` or `gh issue list --label priority:P1`.
Scrubbed artifact zips uploaded to catbox.moe and linked from each new issue:
  • Retro document — full triage rationale, drops, skips, what went right
  • Manuscripts (<size>) — research brief, shipcheck proof, build logs
  • CLI source (<size>) — the generated Go code (no binary, no vendor/) (omit if not available)
Everything is staged at `<$STAGING_DIR>` if you'd like to inspect the files first.
Options:
  1. Submit — execute the filing plan
  2. Let me review the files first — I'll check the staging folder, then come back
  3. Save locally only — skip filing, keep the manuscript proof and temp copy
If the user picks "Let me review the files first," acknowledge and wait. When they come back, re-ask with Submit / Save locally only.
If the user picks "Save locally only," skip Steps 3 and 4 — the retro is already saved to manuscript proofs and `/tmp/printing-press/retro/`. Clean up the staging folder, then jump to Step 6.
If the user wants to override a dedup decision before submitting (e.g., "file new for WU-2 instead of commenting"), accept the override: clear `WU_DEDUP[i]` for that WU and proceed.

Step 3: Upload artifacts


Run artifact-packaging.md Step 5 (the catbox upload) using the zips already in `$STAGING_DIR`. This produces `$MANUSCRIPTS_URL` and `$CLI_SOURCE_URL`.
Step 4: Execute the filing plan


Steps 1, 2, and 2.5 of references/issue-template.md already ran during Step 2 (filing plan + confirm), so labels exist, WUs are sorted, and `$WU_DEDUP` and `$WU_RELATED` are populated. This step runs Step 3 of the reference: build bodies and execute the plan in parallel.
The "Execution principles" block at the top of `issue-template.md` is mandatory: build issue bodies inline (heredocs into shell variables, not the Write tool), run the whole step in one Bash invocation, and parallelize the per-WU `gh issue create` / `gh issue comment` calls. Skipping these costs real wall-clock latency — an N-WU retro should finish in a single round trip's worth of network time, not a serialized stack of them.
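The heredoc-plus-parallel shape can be sketched like this (the WU titles are hypothetical, and the `gh issue create` call is stubbed with `printf` so the sketch runs without authentication):

```shell
# Build an issue body inline via a heredoc into a shell variable.
BODY=$(cat <<'EOF'
## Goal
One-sentence outcome.
EOF
)

file_wu() {
  # Stand-in for: gh issue create --title "$1" --body "$BODY" \
  #   --label retro --label "priority:$2" --label "comp:$3"
  printf '%s|%s|%s\n' "$1" "$2" "$3"
}

# Fire the per-WU calls in parallel, then wait for all of them.
file_wu "Emit --dates for date-range params" P2 generator > /tmp/wu1.plan &
file_wu "Guard limit truncation" P1 spec-parser > /tmp/wu2.plan &
wait
cat /tmp/wu1.plan /tmp/wu2.plan
```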
Each WU is independent: WUs marked `comment:#N` get a comment on the existing issue; WUs marked file-new create a new flat top-level issue. No parent, no sub-issue REST linking — every new issue stands alone in GitHub's issue list with its own open/close lifecycle.
Each new issue carries its own `priority:P<n>` and `comp:<slug>` labels. This is what enables `gh issue list --label comp:openapi-parser` to surface every retro WU in that area across every retro — labels are the cross-retro discovery surface, not auto-cross-links inside issue bodies.
Each new issue body's Related issues block combines:
  • Prior-retro references from Phase 3 Step D (alignments, contradictions, extensions across retros)
  • `related-area` issue references from Step 2.5 (open issues in adjacent territory)
Both reach across separately filed work, where the `#N` auto-cross-link is real signal. The body does not auto-cross-link to sibling WUs in the same retro; that linkage is noise unless one is genuinely a prerequisite (captured as free-text `Dependencies:` instead).
If `gh` is not authenticated or every per-WU action fails, follow the graceful degradation path in the issue-template reference: save locally and print manual filing instructions. Per-WU partial failures (some succeed, some don't) are surfaced through `$FAILED_ISSUES` in Step 6.

Step 5: Local scratch copy


Ensure the temp scratch copy exists. This is the human-friendly local path for reviewing or manually filing the retro when upload or issue creation fails.
```bash
if [ -f "$RETRO_PROOF_PATH" ]; then
  mkdir -p "$RETRO_SCRATCH_DIR"
  cp "$RETRO_PROOF_PATH" "$RETRO_SCRATCH_PATH"
fi
```

Step 6: Present results


After issues are created and comments posted, show the user a summary in priority order. Group `created` and `commented` outcomes — both are real filed work, but the shape differs.
Retro submitted!
Filed <C> new issue<s>, added <E> comment<s> on existing issues (P1 → P3 order):
New issues:
  • [P1] <title> — <full $OUTCOME_URL[i]>
  • [P2] <title> — <full $OUTCOME_URL[i]>
  • ...
Comments on existing issues:
  • [P1] <title> → comment on #234 — <comment URL>
  • ...
<N> findings across <M> work units. New issues are tagged with `comp:<slug>` and `priority:P<n>` labels — agents can filter related work across retros with `gh issue list --label comp:<slug>` or `gh issue list --label priority:P1`.
(if artifacts uploaded) Artifacts: retro doc · manuscripts · CLI source
Local copy: <$RETRO_SCRATCH_PATH>
The `[P<n>]` annotation here is presentation-only — the issue titles themselves do not carry a priority prefix (priority lives on the label). Showing it in the user-facing summary helps the user scan filed work in priority order without opening each issue.
Omit either subsection (`New issues:` or `Comments on existing issues:`) when empty. A retro that produced only comments (every WU matched an existing open issue) is a good outcome — it means the issue tracker already covered the findings and the new evidence reinforces them.
If `$FAILED_ISSUES` is non-empty (set by `references/issue-template.md` Step 3), append a warning block before the closing line:
⚠️ Some actions need attention:
  • <title> — issue creation failed
  • <title> — comment on #234 failed
  • ...
File the missing issue(s) or comment(s) manually using the retro doc at <$RETRO_SCRATCH_PATH>.
If filing wasn't completed (user chose local-only, or gh failed entirely), show the local save paths and the manual filing instructions printed by the issue-template fallback path.

Step 7: Clean up staging folder


Run artifact-packaging.md Step 7 to delete `$STAGING_DIR`.
Rules


  • Prefer automatic fixes (templates, binary) over instructional fixes (skill).
  • For recurring friction, always answer "inherent or fixable?" honestly.
  • Be honest about what went well. Protecting good patterns matters.
  • Default is don't-file. Bias toward filing only when Phase 3 Step B gave you three concrete cross-API examples with evidence (not speculation), and the Step G case-against was clearly weaker than the case-for. "20% of catalog" without named APIs is optimism. "Every API has multi-word resources" is hand-waving. The retro is a filter, not a wishlist; an issue overloaded with weak findings wastes maintainer attention.
  • When in doubt, drop. A finding you're uncertain about almost certainly shouldn't be filed. The next CLI's retro will surface it again with stronger evidence if it's real; if it doesn't, it wasn't.
  • Look for broader patterns. When something does clear the bar, check whether this is the first sighting of a behavior you'd encounter again.
  • When a fix applies to an API subclass, include the condition AND the guard.
  • No time estimates. Use complexity sizing (small/medium/large).
  • Be thorough on the findings that survive. Include enough detail that someone reading months later can understand the finding, the reasoning, and the proposed fix without the original conversation.
  • Do not add more phases, documents, or gates to the main printing-press skill. Propose making existing phases smarter or the Printing Press emit better defaults.