benchmark-models

<!-- AUTO-GENERATED from SKILL.md.tmpl — do not edit directly --> <!-- Regenerate: bun run gen:skill-docs -->

Preamble (run first)

bash
_UPD=$(~/.claude/skills/gstack/bin/gstack-update-check 2>/dev/null || .claude/skills/gstack/bin/gstack-update-check 2>/dev/null || true)
[ -n "$_UPD" ] && echo "$_UPD" || true
mkdir -p ~/.gstack/sessions
touch ~/.gstack/sessions/"$PPID"
_SESSIONS=$(find ~/.gstack/sessions -mmin -120 -type f 2>/dev/null | wc -l | tr -d ' ')
find ~/.gstack/sessions -mmin +120 -type f -exec rm {} + 2>/dev/null || true
_PROACTIVE=$(~/.claude/skills/gstack/bin/gstack-config get proactive 2>/dev/null || echo "true")
_PROACTIVE_PROMPTED=$([ -f ~/.gstack/.proactive-prompted ] && echo "yes" || echo "no")
_BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown")
echo "BRANCH: $_BRANCH"
_SKILL_PREFIX=$(~/.claude/skills/gstack/bin/gstack-config get skill_prefix 2>/dev/null || echo "false")
echo "PROACTIVE: $_PROACTIVE"
echo "PROACTIVE_PROMPTED: $_PROACTIVE_PROMPTED"
echo "SKILL_PREFIX: $_SKILL_PREFIX"
source <(~/.claude/skills/gstack/bin/gstack-repo-mode 2>/dev/null) || true
REPO_MODE=${REPO_MODE:-unknown}
echo "REPO_MODE: $REPO_MODE"
_LAKE_SEEN=$([ -f ~/.gstack/.completeness-intro-seen ] && echo "yes" || echo "no")
echo "LAKE_INTRO: $_LAKE_SEEN"
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || true)
_TEL_PROMPTED=$([ -f ~/.gstack/.telemetry-prompted ] && echo "yes" || echo "no")
_TEL_START=$(date +%s)
_SESSION_ID="$$-$(date +%s)"
echo "TELEMETRY: ${_TEL:-off}"
echo "TEL_PROMPTED: $_TEL_PROMPTED"
_EXPLAIN_LEVEL=$(~/.claude/skills/gstack/bin/gstack-config get explain_level 2>/dev/null || echo "default")
if [ "$_EXPLAIN_LEVEL" != "default" ] && [ "$_EXPLAIN_LEVEL" != "terse" ]; then _EXPLAIN_LEVEL="default"; fi
echo "EXPLAIN_LEVEL: $_EXPLAIN_LEVEL"
_QUESTION_TUNING=$(~/.claude/skills/gstack/bin/gstack-config get question_tuning 2>/dev/null || echo "false")
echo "QUESTION_TUNING: $_QUESTION_TUNING"
mkdir -p ~/.gstack/analytics
if [ "$_TEL" != "off" ]; then
echo '{"skill":"benchmark-models","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","repo":"'$(basename "$(git rev-parse --show-toplevel 2>/dev/null)" 2>/dev/null || echo "unknown")'"}'  >> ~/.gstack/analytics/skill-usage.jsonl 2>/dev/null || true
fi
for _PF in $(find ~/.gstack/analytics -maxdepth 1 -name '.pending-*' 2>/dev/null); do
  if [ -f "$_PF" ]; then
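    # Use $HOME here: a tilde inside double quotes is not expanded, so the -x test would always fail.
    if [ "$_TEL" != "off" ] && [ -x "$HOME/.claude/skills/gstack/bin/gstack-telemetry-log" ]; then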
      ~/.claude/skills/gstack/bin/gstack-telemetry-log --event-type skill_run --skill _pending_finalize --outcome unknown --session-id "$_SESSION_ID" 2>/dev/null || true
    fi
    rm -f "$_PF" 2>/dev/null || true
  fi
  break
done
eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" 2>/dev/null || true
_LEARN_FILE="${GSTACK_HOME:-$HOME/.gstack}/projects/${SLUG:-unknown}/learnings.jsonl"
if [ -f "$_LEARN_FILE" ]; then
  _LEARN_COUNT=$(wc -l < "$_LEARN_FILE" 2>/dev/null | tr -d ' ')
  echo "LEARNINGS: $_LEARN_COUNT entries loaded"
  if [ "$_LEARN_COUNT" -gt 5 ] 2>/dev/null; then
    ~/.claude/skills/gstack/bin/gstack-learnings-search --limit 3 2>/dev/null || true
  fi
else
  echo "LEARNINGS: 0"
fi
~/.claude/skills/gstack/bin/gstack-timeline-log '{"skill":"benchmark-models","event":"started","branch":"'"$_BRANCH"'","session":"'"$_SESSION_ID"'"}' 2>/dev/null &
_HAS_ROUTING="no"
if [ -f CLAUDE.md ] && grep -q "## Skill routing" CLAUDE.md 2>/dev/null; then
  _HAS_ROUTING="yes"
fi
_ROUTING_DECLINED=$(~/.claude/skills/gstack/bin/gstack-config get routing_declined 2>/dev/null || echo "false")
echo "HAS_ROUTING: $_HAS_ROUTING"
echo "ROUTING_DECLINED: $_ROUTING_DECLINED"
_VENDORED="no"
if [ -d ".claude/skills/gstack" ] && [ ! -L ".claude/skills/gstack" ]; then
  if [ -f ".claude/skills/gstack/VERSION" ] || [ -d ".claude/skills/gstack/.git" ]; then
    _VENDORED="yes"
  fi
fi
echo "VENDORED_GSTACK: $_VENDORED"
echo "MODEL_OVERLAY: claude"
_CHECKPOINT_MODE=$(~/.claude/skills/gstack/bin/gstack-config get checkpoint_mode 2>/dev/null || echo "explicit")
_CHECKPOINT_PUSH=$(~/.claude/skills/gstack/bin/gstack-config get checkpoint_push 2>/dev/null || echo "false")
echo "CHECKPOINT_MODE: $_CHECKPOINT_MODE"
echo "CHECKPOINT_PUSH: $_CHECKPOINT_PUSH"
[ -n "$OPENCLAW_SESSION" ] && echo "SPAWNED_SESSION: true" || true
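
The preamble reports its state as `KEY: value` lines (`BRANCH: ...`, `PROACTIVE: ...`, and so on), and the later steps branch on those values. As a minimal sketch, assuming the output has been captured as text, a lookup could work like this. `preamble_get` is a hypothetical helper name, not part of gstack.

```shell
# Hypothetical helper: print the value for one key from "KEY: value" lines on stdin.
preamble_get() {
  # $1 = key name, e.g. BRANCH. Matching on "KEY: " at the start of the line
  # keeps PROACTIVE from also matching PROACTIVE_PROMPTED.
  awk -v key="$1" '
    index($0, key ": ") == 1 { print substr($0, length(key) + 3); exit }
  '
}

# Sample lines in the same shape the preamble echoes (values are illustrative).
sample_output='BRANCH: main
PROACTIVE_PROMPTED: no
PROACTIVE: true
TELEMETRY: off'

printf '%s\n' "$sample_output" | preamble_get PROACTIVE   # prints: true
```

The prefix match is deliberate: several keys share a stem, so a plain substring search would return the wrong line.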

Plan Mode Safe Operations

In plan mode, the following are allowed because they inform the plan: `$B`, `$D`, `codex exec`/`codex review`, writes to `~/.gstack/`, writes to the plan file, and `open` for generated artifacts.

Skill Invocation During Plan Mode

If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. Treat the skill file as executable instructions, not reference. Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant, `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED: stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode.
If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?"
If `SKILL_PREFIX` is `"true"`, suggest/invoke `/gstack-*` names. Disk paths stay `~/.claude/skills/gstack/[skill-name]/SKILL.md`.
If output shows `UPGRADE_AVAILABLE <old> <new>`: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (auto-upgrade if configured, otherwise AskUserQuestion with 4 options, write snooze state if declined).
If output shows `JUST_UPGRADED <from> <to>`: print "Running gstack v{to} (just updated!)". If `SPAWNED_SESSION` is true, skip feature discovery.
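
The two update-check markers can be told apart by their first word. The sketch below is illustrative only: `classify_update_line` is a hypothetical name, the version numbers are made up, and the real upgrade flow lives in the gstack-upgrade skill file, not here.

```shell
# Hypothetical helper: decide what one line of update-check output calls for.
classify_update_line() {
  # Word-split the line into positional parameters: marker, first version, second version.
  set -- $1
  case "$1" in
    UPGRADE_AVAILABLE) echo "upgrade available: $2 -> $3" ;;
    JUST_UPGRADED)     echo "Running gstack v$3 (just updated!)" ;;
    *)                 echo "no update action" ;;
  esac
}

classify_update_line "JUST_UPGRADED 1.0.0 1.1.0"   # prints: Running gstack v1.1.0 (just updated!)
```

An empty line, which is what the preamble produces when the update check prints nothing, falls through to the default branch.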
Feature discovery, max one prompt per session:
  • Missing `~/.claude/skills/gstack/.feature-prompted-continuous-checkpoint`: AskUserQuestion for Continuous checkpoint auto-commits. If accepted, run `~/.claude/skills/gstack/bin/gstack-config set checkpoint_mode continuous`. Always touch marker.
  • Missing `~/.claude/skills/gstack/.feature-prompted-model-overlay`: inform "Model overlays are active. MODEL_OVERLAY shows the patch." Always touch marker.
After upgrade prompts, continue workflow.
If `WRITING_STYLE_PENDING` is `yes`: ask once about writing style:
> v1 prompts are simpler: first-use jargon glosses, outcome-framed questions, shorter prose. Keep default or restore terse?
Options:
  • A) Keep the new default (recommended — good writing helps everyone)
  • B) Restore V0 prose — set `explain_level: terse`
If A: leave `explain_level` unset (defaults to `default`). If B: run `~/.claude/skills/gstack/bin/gstack-config set explain_level terse`.
Always run (regardless of choice):
bash
rm -f ~/.gstack/.writing-style-prompt-pending
touch ~/.gstack/.writing-style-prompted
Skip if `WRITING_STYLE_PENDING` is `no`.
If `LAKE_INTRO` is `no`: say "gstack follows the Boil the Lake principle — do the complete thing when AI makes marginal cost near-zero. Read more: https://garryslist.org/posts/boil-the-ocean" Offer to open:
bash
open https://garryslist.org/posts/boil-the-ocean
touch ~/.gstack/.completeness-intro-seen
Only run `open` if yes. Always run `touch`.
If `TEL_PROMPTED` is `no` AND `LAKE_INTRO` is `yes`: ask telemetry once via AskUserQuestion:
> Help gstack get better. Share usage data only: skill, duration, crashes, stable device ID. No code, file paths, or repo names.
Options:
  • A) Help gstack get better! (recommended)
  • B) No thanks
If A: run `~/.claude/skills/gstack/bin/gstack-config set telemetry community`
If B: ask follow-up:
> Anonymous mode sends only aggregate usage, no unique ID.
Options:
  • A) Sure, anonymous is fine
  • B) No thanks, fully off
If B→A: run `~/.claude/skills/gstack/bin/gstack-config set telemetry anonymous`
If B→B: run `~/.claude/skills/gstack/bin/gstack-config set telemetry off`
Always run:
bash
touch ~/.gstack/.telemetry-prompted
Skip if `TEL_PROMPTED` is `yes`.
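
The two-step telemetry question reduces to three possible config values. A minimal sketch of that mapping follows; `telemetry_mode_for` is a hypothetical name, and treating a missing follow-up answer as `off` is an assumption of this sketch, not something the skill text states.

```shell
# Hypothetical helper: map the answers to the telemetry value to store.
telemetry_mode_for() {
  # $1 = first answer (A or B); $2 = follow-up answer (A or B), read only when $1 is B.
  case "$1" in
    A) echo "community" ;;
    B) case "$2" in
         A) echo "anonymous" ;;
         *) echo "off" ;;   # assumption: anything other than A means fully off
       esac ;;
    *) return 1 ;;          # unknown first answer: let the caller decide
  esac
}

telemetry_mode_for B A   # prints: anonymous
```

The printed value is what would be passed to `gstack-config set telemetry`; the sketch itself never calls that binary.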
If `PROACTIVE_PROMPTED` is `no` AND `TEL_PROMPTED` is `yes`: ask once:
> Let gstack proactively suggest skills, like /qa for "does this work?" or /investigate for bugs?
Options:
  • A) Keep it on (recommended)
  • B) Turn it off — I'll type /commands myself
If A: run `~/.claude/skills/gstack/bin/gstack-config set proactive true`
If B: run `~/.claude/skills/gstack/bin/gstack-config set proactive false`
Always run:
bash
touch ~/.gstack/.proactive-prompted
Skip if `PROACTIVE_PROMPTED` is `yes`.
If `HAS_ROUTING` is `no` AND `ROUTING_DECLINED` is `false` AND `PROACTIVE_PROMPTED` is `yes`: Check if a CLAUDE.md file exists in the project root. If it does not exist, create it.
Use AskUserQuestion:
> gstack works best when your project's CLAUDE.md includes skill routing rules.
Options:
  • A) Add routing rules to CLAUDE.md (recommended)
  • B) No thanks, I'll invoke skills manually
If A: Append this section to the end of CLAUDE.md:

## Skill routing

When the user's request matches an available skill, invoke it via the Skill tool. When in doubt, invoke the skill.
Key routing rules:
  • Product ideas/brainstorming → invoke /office-hours
  • Strategy/scope → invoke /plan-ceo-review
  • Architecture → invoke /plan-eng-review
  • Design system/plan review → invoke /design-consultation or /plan-design-review
  • Full review pipeline → invoke /autoplan
  • Bugs/errors → invoke /investigate
  • QA/testing site behavior → invoke /qa or /qa-only
  • Code review/diff check → invoke /review
  • Visual polish → invoke /design-review
  • Ship/deploy/PR → invoke /ship or /land-and-deploy
  • Save progress → invoke /context-save
  • Resume context → invoke /context-restore

Then commit the change: `git add CLAUDE.md && git commit -m "chore: add gstack skill routing rules to CLAUDE.md"`

If B: run `~/.claude/skills/gstack/bin/gstack-config set routing_declined true` and say they can re-enable with `gstack-config set routing_declined false`.

This only happens once per project. Skip if `HAS_ROUTING` is `yes` or `ROUTING_DECLINED` is `true`.
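
The lake intro, telemetry, proactive, and routing prompts are gated on one another, so at most one of them is due for a given set of preamble values. The sketch below shows that ordering under one assumption: the flags are the values the preamble echoed at skill start and are not re-read mid-run. `next_onboarding_prompt` and its output labels are hypothetical, and the writing-style and vendoring prompts are left out.

```shell
# Hypothetical helper: name the one onboarding prompt that is due, if any.
next_onboarding_prompt() {
  # Args: LAKE_INTRO TEL_PROMPTED PROACTIVE_PROMPTED HAS_ROUTING ROUTING_DECLINED
  lake=$1 tel=$2 proactive=$3 routing=$4 declined=$5
  if [ "$lake" = "no" ]; then
    echo "lake-intro"
  elif [ "$tel" = "no" ]; then
    echo "telemetry"      # only reached once LAKE_INTRO is yes
  elif [ "$proactive" = "no" ]; then
    echo "proactive"      # only reached once TEL_PROMPTED is yes
  elif [ "$routing" = "no" ] && [ "$declined" = "false" ]; then
    echo "routing"        # only reached once PROACTIVE_PROMPTED is yes
  else
    echo "none"
  fi
}

next_onboarding_prompt yes yes no no false   # prints: proactive
```

Because each gate requires the previous marker to be `yes`, a new install sees these prompts spread over successive skill runs instead of all at once.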

If `VENDORED_GSTACK` is `yes`, warn once via AskUserQuestion unless `~/.gstack/.vendoring-warned-$SLUG` exists:

> This project has gstack vendored in `.claude/skills/gstack/`. Vendoring is deprecated.
> Migrate to team mode?

Options:
- A) Yes, migrate to team mode now
- B) No, I'll handle it myself

If A:
1. Run `git rm -r .claude/skills/gstack/`
2. Run `echo '.claude/skills/gstack/' >> .gitignore`
3. Run `~/.claude/skills/gstack/bin/gstack-team-init required` (or `optional`)
4. Run `git add .claude/ .gitignore CLAUDE.md && git commit -m "chore: migrate gstack from vendored to team mode"`
5. Tell the user: "Done. Each developer now runs: `cd ~/.claude/skills/gstack && ./setup --team`"

If B: say "OK, you're on your own to keep the vendored copy up to date."

Always run (regardless of choice):
```bash
eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)" 2>/dev/null || true
touch ~/.gstack/.vendoring-warned-${SLUG:-unknown}
```

If marker exists, skip.

If `SPAWNED_SESSION` is `"true"`, you are running inside a session spawned by an AI orchestrator (e.g., OpenClaw). In spawned sessions:
- Do NOT use AskUserQuestion for interactive prompts. Auto-choose the recommended option.
- Do NOT run upgrade checks, telemetry prompts, routing injection, or lake intro.
- Focus on completing the task and reporting results via prose output.
- End with a completion report: what shipped, decisions made, anything uncertain.

Artifacts Sync (skill start)

bash
_GSTACK_HOME="${GSTACK_HOME:-$HOME/.gstack}"
# Prefer the v1.27.0.0 artifacts file; fall back to brain file for users
# upgrading mid-stream before the migration script runs.
if [ -f "$HOME/.gstack-artifacts-remote.txt" ]; then
  _BRAIN_REMOTE_FILE="$HOME/.gstack-artifacts-remote.txt"
else
  _BRAIN_REMOTE_FILE="$HOME/.gstack-brain-remote.txt"
fi
_BRAIN_SYNC_BIN="$HOME/.claude/skills/gstack/bin/gstack-brain-sync"
_BRAIN_CONFIG_BIN="$HOME/.claude/skills/gstack/bin/gstack-config"
# /sync-gbrain context-load: teach the agent to use gbrain when it's available.
# Per-worktree pin: post-spike redesign uses kubectl-style `.gbrain-source` in the
# git toplevel to scope queries. Look for the pin in the worktree (not a global
# state file) so that opening worktree B without a pin doesn't claim "indexed"
# just because worktree A was synced. Empty string when gbrain is not
# configured (zero context cost for non-gbrain users).
_GBRAIN_CONFIG="$HOME/.gbrain/config.json"
if [ -f "$_GBRAIN_CONFIG" ] && command -v gbrain >/dev/null 2>&1; then
  _GBRAIN_VERSION_OK=$(gbrain --version 2>/dev/null | grep -c '^gbrain ' || echo 0)
  if [ "$_GBRAIN_VERSION_OK" -gt 0 ] 2>/dev/null; then
    _GBRAIN_PIN_PATH=""
    _REPO_TOP=$(git rev-parse --show-toplevel 2>/dev/null || echo "")
    if [ -n "$_REPO_TOP" ] && [ -f "$_REPO_TOP/.gbrain-source" ]; then
      _GBRAIN_PIN_PATH="$_REPO_TOP/.gbrain-source"
    fi
    if [ -n "$_GBRAIN_PIN_PATH" ]; then
      echo "GBrain configured. Prefer \`gbrain search\`/\`gbrain query\` over Grep for"
      echo "semantic questions; use \`gbrain code-def\`/\`code-refs\`/\`code-callers\` for"
      echo "symbol-aware code lookup. See \"## GBrain Search Guidance\" in CLAUDE.md."
      echo "Run /sync-gbrain to refresh."
    else
      echo "GBrain configured but this worktree isn't pinned yet. Run \`/sync-gbrain --full\`"
      echo "before relying on \`gbrain search\` for code questions in this worktree."
      echo "Falls back to Grep until pinned."
    fi
  fi
fi
_BRAIN_SYNC_MODE=$("$_BRAIN_CONFIG_BIN" get artifacts_sync_mode 2>/dev/null || echo off)
# Detect remote-MCP mode (Path 4 of /setup-gbrain). Local artifacts sync is
# a no-op in remote mode; the brain server pulls from GitHub/GitLab on its
# own cadence. Read claude.json directly to keep this preamble fast (no
# subprocess to claude CLI on every skill start).
_GBRAIN_MCP_MODE="none"
if command -v jq >/dev/null 2>&1 && [ -f "$HOME/.claude.json" ]; then
  _GBRAIN_MCP_TYPE=$(jq -r '.mcpServers.gbrain.type // .mcpServers.gbrain.transport // empty' "$HOME/.claude.json" 2>/dev/null)
  case "$_GBRAIN_MCP_TYPE" in
    url|http|sse) _GBRAIN_MCP_MODE="remote-http" ;;
    stdio) _GBRAIN_MCP_MODE="local-stdio" ;;
  esac
fi
if [ -f "$_BRAIN_REMOTE_FILE" ] && [ ! -d "$_GSTACK_HOME/.git" ] && [ "$_BRAIN_SYNC_MODE" = "off" ]; then
  _BRAIN_NEW_URL=$(head -1 "$_BRAIN_REMOTE_FILE" 2>/dev/null | tr -d '[:space:]')
  if [ -n "$_BRAIN_NEW_URL" ]; then
    echo "ARTIFACTS_SYNC: artifacts repo detected: $_BRAIN_NEW_URL"
    echo "ARTIFACTS_SYNC: run 'gstack-brain-restore' to pull your cross-machine artifacts (or 'gstack-config set artifacts_sync_mode off' to dismiss forever)"
  fi
fi
if [ -d "$_GSTACK_HOME/.git" ] && [ "$_BRAIN_SYNC_MODE" != "off" ]; then
  _BRAIN_LAST_PULL_FILE="$_GSTACK_HOME/.brain-last-pull"
  _BRAIN_NOW=$(date +%s)
  _BRAIN_DO_PULL=1
  if [ -f "$_BRAIN_LAST_PULL_FILE" ]; then
    _BRAIN_LAST=$(cat "$_BRAIN_LAST_PULL_FILE" 2>/dev/null || echo 0)
    _BRAIN_AGE=$(( _BRAIN_NOW - _BRAIN_LAST ))
    [ "$_BRAIN_AGE" -lt 86400 ] && _BRAIN_DO_PULL=0
  fi
  if [ "$_BRAIN_DO_PULL" = "1" ]; then
    ( cd "$_GSTACK_HOME" && git fetch origin >/dev/null 2>&1 && git merge --ff-only "origin/$(git rev-parse --abbrev-ref HEAD)" >/dev/null 2>&1 ) || true
    echo "$_BRAIN_NOW" > "$_BRAIN_LAST_PULL_FILE"
  fi
  "$_BRAIN_SYNC_BIN" --once 2>/dev/null || true
fi
if [ "$_GBRAIN_MCP_MODE" = "remote-http" ]; then
  # Remote-MCP mode: local artifacts sync is a no-op (brain admin's server
  # pulls from GitHub/GitLab). Show the user this is by design, not broken.
  _GBRAIN_HOST=$(jq -r '.mcpServers.gbrain.url // empty' "$HOME/.claude.json" 2>/dev/null | sed -E 's|^https?://([^/:]+).*|\1|')
  echo "ARTIFACTS_SYNC: remote-mode (managed by brain server ${_GBRAIN_HOST:-remote})"
elif [ -d "$_GSTACK_HOME/.git" ] && [ "$_BRAIN_SYNC_MODE" != "off" ]; then
  _BRAIN_QUEUE_DEPTH=0
  [ -f "$_GSTACK_HOME/.brain-queue.jsonl" ] && _BRAIN_QUEUE_DEPTH=$(wc -l < "$_GSTACK_HOME/.brain-queue.jsonl" | tr -d ' ')
  _BRAIN_LAST_PUSH="never"
  [ -f "$_GSTACK_HOME/.brain-last-push" ] && _BRAIN_LAST_PUSH=$(cat "$_GSTACK_HOME/.brain-last-push" 2>/dev/null || echo never)
  echo "ARTIFACTS_SYNC: mode=$_BRAIN_SYNC_MODE | last_push=$_BRAIN_LAST_PUSH | queue=$_BRAIN_QUEUE_DEPTH"
else
  echo "ARTIFACTS_SYNC: off"
fi
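
Two pieces of the sync script are easy to try in isolation: the once-per-day pull throttle (86400 seconds) and the `sed` expression that reduces the MCP URL to a host name. The sketch below restates both as small functions. `pull_is_due` and `gbrain_host_from_url` are hypothetical names, and the example URL is made up.

```shell
# Hypothetical helper: succeed when at least 24 hours have passed since the last pull.
pull_is_due() {
  # $1 = epoch seconds of the last pull (empty means never); $2 = current epoch seconds.
  last=${1:-0}
  now=$2
  [ $(( now - last )) -ge 86400 ]
}

# Hypothetical helper: strip scheme, port, and path from a URL, leaving the host.
gbrain_host_from_url() {
  printf '%s\n' "$1" | sed -E 's|^https?://([^/:]+).*|\1|'
}

if pull_is_due "" "$(date +%s)"; then
  echo "pull due"   # a missing timestamp is treated as "never pulled"
fi
gbrain_host_from_url "https://brain.example.com:8443/mcp"   # prints: brain.example.com
```

The throttle mirrors the script's own comparison: an age below 86400 seconds skips the fetch, anything else triggers it.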



Privacy stop-gate: if output shows `ARTIFACTS_SYNC: off`, `artifacts_sync_mode_prompted` is `false`, and gbrain is on PATH or `gbrain doctor --fast --json` works, ask once:

> gstack can publish your artifacts (CEO plans, designs, reports) to a private GitHub repo that GBrain indexes across machines. How much should sync?

Options:
- A) Everything allowlisted (recommended)
- B) Only artifacts
- C) Decline, keep everything local

After answer:

```bash
# Chosen mode: full | artifacts-only | off
"$_BRAIN_CONFIG_BIN" set artifacts_sync_mode <choice>
"$_BRAIN_CONFIG_BIN" set artifacts_sync_mode_prompted true
```

If A/B and `~/.gstack/.git` is missing, ask whether to run `gstack-artifacts-init`. Do not block the skill.

At skill END before telemetry:

```bash
"$HOME/.claude/skills/gstack/bin/gstack-brain-sync" --discover-new 2>/dev/null || true
"$HOME/.claude/skills/gstack/bin/gstack-brain-sync" --once 2>/dev/null || true
```

Model-Specific Behavioral Patch (claude)

The following nudges are tuned for the claude model family. They are subordinate to skill workflow, STOP points, AskUserQuestion gates, plan-mode safety, and /ship review gates. If a nudge below conflicts with skill instructions, the skill wins. Treat these as preferences, not rules.
Todo-list discipline. When working through a multi-step plan, mark each task complete individually as you finish it. Do not batch-complete at the end. If a task turns out to be unnecessary, mark it skipped with a one-line reason.
Think before heavy actions. For complex operations (refactors, migrations, non-trivial new features), briefly state your approach before executing. This lets the user course-correct cheaply instead of mid-flight.
Dedicated tools over Bash. Prefer Read, Edit, Write, Glob, Grep over shell equivalents (cat, sed, find, grep). The dedicated tools are cheaper and clearer.

Voice

Direct, concrete, builder-to-builder. Name the file, function, command, and user-visible impact. No filler.
No em dashes. No AI vocabulary: delve, crucial, robust, comprehensive, nuanced, multifaceted. Never corporate or academic. Short paragraphs. End with what to do.
The user has context you do not. Cross-model agreement is a recommendation, not a decision. The user decides.

Completion Status Protocol

When completing a skill workflow, report status using one of:
  • DONE — completed with evidence.
  • DONE_WITH_CONCERNS — completed, but list concerns.
  • BLOCKED — cannot proceed; state blocker and what was tried.
  • NEEDS_CONTEXT — missing info; state exactly what is needed.
Escalate after 3 failed attempts, uncertain security-sensitive changes, or scope you cannot verify. Format: `STATUS`, `REASON`, `ATTEMPTED`, `RECOMMENDATION`.

Operational Self-Improvement

Before completing, if you discovered a durable project quirk or command fix that would save 5+ minutes next time, log it:
bash
~/.claude/skills/gstack/bin/gstack-learnings-log '{"skill":"SKILL_NAME","type":"operational","key":"SHORT_KEY","insight":"DESCRIPTION","confidence":N,"source":"observed"}'
Do not log obvious facts or one-time transient errors.

Telemetry (run last)

After workflow completion, log telemetry. Use the skill `name:` from frontmatter. `OUTCOME` is success/error/abort/unknown.
PLAN MODE EXCEPTION (ALWAYS RUN): This command writes telemetry to `~/.gstack/analytics/`, matching preamble analytics writes.
Run this bash:
bash
_TEL_END=$(date +%s)
_TEL_DUR=$(( _TEL_END - _TEL_START ))
rm -f ~/.gstack/analytics/.pending-"$_SESSION_ID" 2>/dev/null || true

# Session timeline: record skill completion (local-only, never sent anywhere)
~/.claude/skills/gstack/bin/gstack-timeline-log '{"skill":"SKILL_NAME","event":"completed","branch":"'$(git branch --show-current 2>/dev/null || echo unknown)'","outcome":"OUTCOME","duration_s":"'"$_TEL_DUR"'","session":"'"$_SESSION_ID"'"}' 2>/dev/null || true
# Local analytics (gated on telemetry setting)
if [ "$_TEL" != "off" ]; then
  echo '{"skill":"SKILL_NAME","duration_s":"'"$_TEL_DUR"'","outcome":"OUTCOME","browse":"USED_BROWSE","session":"'"$_SESSION_ID"'","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'"}' >> ~/.gstack/analytics/skill-usage.jsonl 2>/dev/null || true
fi
# Remote telemetry (opt-in, requires binary)
if [ "$_TEL" != "off" ] && [ -x ~/.claude/skills/gstack/bin/gstack-telemetry-log ]; then
  ~/.claude/skills/gstack/bin/gstack-telemetry-log \
    --skill "SKILL_NAME" --duration "$_TEL_DUR" --outcome "OUTCOME" \
    --used-browse "USED_BROWSE" --session-id "$_SESSION_ID" 2>/dev/null &
fi

Replace `SKILL_NAME`, `OUTCOME`, and `USED_BROWSE` before running.

Plan Status Footer

In plan mode before ExitPlanMode: if the plan file lacks `## GSTACK REVIEW REPORT`, run `~/.claude/skills/gstack/bin/gstack-review-read` and append the standard runs/status/findings table. With `NO_REVIEWS` or empty, append a 5-row placeholder with verdict "NO REVIEWS YET — run `/autoplan`". If a richer report exists, skip.
PLAN MODE EXCEPTION — always allowed (it's the plan file).

/benchmark-models — Cross-Model Skill Benchmark

You are running the `/benchmark-models` workflow. It wraps the `gstack-model-benchmark` binary with an interactive flow that picks a prompt, confirms providers, previews auth, and runs the benchmark.
Different from `/benchmark`: that skill measures web page performance (Core Web Vitals, load times). This skill measures AI model performance on gstack skills or arbitrary prompts.

Step 0: Locate the binary

bash
BIN="$HOME/.claude/skills/gstack/bin/gstack-model-benchmark"
[ -x "$BIN" ] || BIN=".claude/skills/gstack/bin/gstack-model-benchmark"
[ -x "$BIN" ] || { echo "ERROR: gstack-model-benchmark not found. Run ./setup in the gstack install dir." >&2; exit 1; }
echo "BIN: $BIN"
If not found, stop and tell the user to reinstall gstack.

Step 1: Choose a prompt

Use AskUserQuestion with the preamble format:
  • Re-ground: current project + branch.
  • Simplify: "A cross-model benchmark runs the same prompt through 2-3 AI models and shows you how they compare on speed, cost, and output quality. What prompt should we use?"
  • RECOMMENDATION: A because benchmarking against a real skill exposes tool-use differences, not just raw generation.
  • Options:
    • A) Benchmark one of my gstack skills (we'll pick which skill next). Completeness: 10/10.
    • B) Use an inline prompt — type it on the next turn. Completeness: 8/10.
    • C) Point at a prompt file on disk — specify path on the next turn. Completeness: 8/10.
If A: list top-level gstack skills that have SKILL.md files (from `find . -maxdepth 2 -name SKILL.md -not -path './.*'`), ask the user to pick one via a second AskUserQuestion. Use the picked SKILL.md path as the prompt file.
If B: ask the user for the inline prompt. Use it verbatim via `--prompt "<text>"`.
If C: ask for the path. Verify it exists. Use as positional argument.
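
The three branches above come down to one decision: which arguments to hand the binary. A minimal sketch, assuming a hypothetical helper `prompt_spec` that takes the user's choice (A, B, or C) plus the follow-up answer. The helper name, its argument order, and the `PROMPT_SPEC` array are illustrative, not part of gstack.

```bash
# Hypothetical helper: turn the Step 1 answer into benchmark arguments.
# $1 = choice (A, B, or C); $2 = SKILL.md path, inline text, or file path.
# Fills the global array PROMPT_SPEC; returns 1 on a bad choice or missing file.
prompt_spec() {
  PROMPT_SPEC=()
  case "$1" in
    A|C)
      # File-based prompt: verify it exists, then pass it positionally.
      [ -f "$2" ] || { echo "ERROR: prompt file not found: $2" >&2; return 1; }
      PROMPT_SPEC=("$2")
      ;;
    B)
      # Inline prompt: pass the text verbatim as one argument.
      [ -n "$2" ] || { echo "ERROR: empty inline prompt" >&2; return 1; }
      PROMPT_SPEC=(--prompt "$2")
      ;;
    *)
      echo "ERROR: unknown choice: $1" >&2
      return 1
      ;;
  esac
}
```

An array keeps an inline prompt with spaces or quotes as a single argument when it is later expanded as `"${PROMPT_SPEC[@]}"`.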

Step 2: Choose providers

bash
"$BIN" --prompt "unused, dry-run" --models claude,gpt,gemini --dry-run
Show the dry-run output. The "Adapter availability" section tells the user which providers will actually run (OK) vs skip (NOT READY — remediation hint included).
If ALL three show NOT READY: stop with a clear message: the benchmark can't run without at least one authed provider. Suggest `claude login`, `codex login`, or `gemini login` / `export GOOGLE_API_KEY`.
If at least one is OK: AskUserQuestion:
  • Simplify: "Which models should we include? The dry-run above showed which are authed. Unauthed ones will be skipped cleanly — they won't abort the batch."
  • RECOMMENDATION: A (all authed providers) because running as many as possible gives the richest comparison.
  • Options:
    • A) All authed providers. Completeness: 10/10.
    • B) Only Claude. Completeness: 6/10 (no cross-model signal — use /ship's review for solo claude benchmarks instead).
    • C) Pick two — specify on next turn. Completeness: 8/10.

Step 3: Decide on judge

bash
[ -n "$ANTHROPIC_API_KEY" ] || grep -q 'ANTHROPIC' "$HOME/.claude/.credentials.json" 2>/dev/null && echo "JUDGE_AVAILABLE" || echo "JUDGE_UNAVAILABLE"
If judge is available, AskUserQuestion:
  • Simplify: "The quality judge scores each model's output on a 0-10 scale using Anthropic's Claude as a tiebreaker. Adds ~$0.05/run. Recommended if you care about output quality, not just latency and cost."
  • RECOMMENDATION: A — the whole point is comparing quality, not just speed.
  • Options:
    • A) Enable judge (adds ~$0.05). Completeness: 10/10.
    • B) Skip judge — speed/cost/tokens only. Completeness: 7/10.
If judge is NOT available, skip this question and omit the `--judge` flag.
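
The availability one-liner above works because the shell evaluates `||` and `&&` left to right with equal precedence, which is easy to misread. A sketch of the same check written as an explicit function. `judge_available` and its optional credentials-path argument are illustrative names, not part of gstack.

```bash
# Hypothetical helper mirroring the Step 3 check.
# $1 = credentials file (defaults to the path used in the one-liner).
judge_available() {
  local creds="${1:-$HOME/.claude/.credentials.json}"
  # An exported API key is enough on its own.
  if [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    return 0
  fi
  # Otherwise look for the ANTHROPIC marker in the credentials file.
  grep -q 'ANTHROPIC' "$creds" 2>/dev/null
}

if judge_available; then
  echo "JUDGE_AVAILABLE"
else
  echo "JUDGE_UNAVAILABLE"
fi
```

The optional path argument only exists so the check can be exercised against a scratch file instead of the real credentials.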

Step 4: Run the benchmark

Construct the command from Step 1, 2, 3 decisions:
bash
"$BIN" <prompt-spec> --models <picked-models> [--judge] --output table
Where `<prompt-spec>` is either `--prompt "<text>"` (Step 1B) or a file path (Step 1A or 1C), and `<picked-models>` is the comma-separated list from Step 2.
Stream the output as it arrives. This is slow: each provider runs the prompt fully. Expect 30s-5min depending on prompt complexity and whether `--judge` is on.
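
Assembling the command from the three decisions is safer with a bash array than with string concatenation, since an inline prompt can contain spaces and quotes. A sketch under assumptions: `PROMPT_SPEC`, `MODELS`, and `USE_JUDGE` are hypothetical variables holding the Step 1, 2, and 3 answers, and the values shown are placeholders. Only flags that appear in the command template above are used.

```bash
# Hypothetical inputs standing in for the Step 1-3 decisions.
PROMPT_SPEC=(--prompt "Explain this repo's test setup")  # or a single file path
MODELS=(claude gemini)                                     # providers confirmed in Step 2
USE_JUDGE="yes"                                            # "yes" only if the user opted in

# Join the providers with commas for --models.
MODELS_CSV=$(IFS=,; echo "${MODELS[*]}")

# Build the argument list; each array element stays one argument.
CMD=("${PROMPT_SPEC[@]}" --models "$MODELS_CSV")
if [ "$USE_JUDGE" = "yes" ]; then
  CMD+=(--judge)
fi
CMD+=(--output table)

# Print the command for review; run it with: "$BIN" "${CMD[@]}"
printf '%q ' "${BIN:-gstack-model-benchmark}" "${CMD[@]}"; echo
```

Printing the quoted command first gives the user a chance to confirm it before any API calls are spent.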

Step 5: Interpret results

After the table prints, summarize for the user:
  • Fastest: provider with lowest latency.
  • Cheapest: provider with lowest cost.
  • Highest quality (if `--judge` ran): provider with highest score.
  • Best overall: use judgment. If judge ran: quality-weighted. Otherwise: note the tradeoff the user needs to make.
If any provider hit an error (auth/timeout/rate_limit), call it out with the remediation path.
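
The fastest/cheapest/highest-quality picks are plain min and max lookups over the result rows. A sketch with made-up numbers: the binary's real table and JSON layout are not shown in this document, so the tab-separated rows, the column order, and the `pick` helper below are all assumptions for illustration.

```bash
# Hypothetical rows: provider, latency in seconds, cost in USD, judge score.
# These values are invented for the example, not real benchmark results.
RESULTS=$(printf '%s\t%s\t%s\t%s\n' \
  claude 42.1 0.031 8 \
  gpt    35.7 0.044 7 \
  gemini 51.3 0.012 7)

# pick <column> <min|max>: print the provider with the extreme value.
# On a tie the first row seen wins.
pick() {
  printf '%s\n' "$RESULTS" | awk -F'\t' -v col="$1" -v mode="$2" '
    { v = $col + 0 }   # force a numeric comparison
    NR == 1 || (mode == "min" && v < best) || (mode == "max" && v > best) {
      best = v; name = $1
    }
    END { print name }'
}

echo "Fastest: $(pick 2 min)"
echo "Cheapest: $(pick 3 min)"
echo "Highest quality: $(pick 4 max)"
```

"Best overall" stays a judgment call and is deliberately left out of the sketch.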

Step 6: Offer to save results

AskUserQuestion:
  • Simplify: "Save this benchmark as JSON so you can compare future runs against it?"
  • RECOMMENDATION: A — skill performance drifts as providers update their models; a saved baseline catches quality regressions.
  • Options:
    • A) Save to `~/.gstack/benchmarks/<date>-<skill-or-prompt-slug>.json`. Completeness: 10/10.
    • B) Just print, don't save. Completeness: 5/10 (loses trend data).
If A: re-run with `--output json` and tee to the dated file. Print the path so the user can diff future runs against it.
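
One way to build the dated file name for option A. This is a sketch, not the binary's behavior: the `slugify` helper, the `BENCH_DIR` override, and the `YYYY-MM-DD` date format are assumptions, since the document only gives the `<date>-<skill-or-prompt-slug>.json` pattern.

```bash
# Hypothetical helper: lower-case a label and collapse anything that is not
# a letter or digit into single hyphens, trimming hyphens at both ends.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//'
}

BENCH_DIR="${BENCH_DIR:-$HOME/.gstack/benchmarks}"   # directory named in option A
SLUG=$(slugify "benchmark-models SKILL.md")          # example label
OUT="$BENCH_DIR/$(date +%Y-%m-%d)-$SLUG.json"        # assumed <date> format

echo "Would save to: $OUT"
# To actually save, create the directory, re-run, and tee:
#   mkdir -p "$BENCH_DIR"
#   "$BIN" <prompt-spec> --models <picked-models> --output json | tee "$OUT"
```

A date-first name sorts chronologically in `ls`, which makes it easy to find the previous baseline to diff against.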

Important Rules

  • Never run a real benchmark without Step 2's dry-run first. Users need to see auth status before spending API calls.
  • Never hardcode model names. Always pass providers from user's Step 2 choice — the binary handles the rest.
  • Never auto-include `--judge`. It adds real cost; user must opt in.
  • If zero providers are authed, STOP. Don't attempt the benchmark — it produces no useful output.
  • Cost is visible. Every run shows per-provider cost in the table. Users should see it before the next run.