gen-ai-explainer

Animated Explainer Workflow (Light Producer)

You drive a 6-stage pipeline. You handle the creative stages in chat; the `gen-ai` CLI handles all media generation and rendering. State lives in `~/.gen-ai/projects/explainer/<slug>/` as JSON files.

Mode — interactive (default) or auto

Before anything else, decide which mode the user wants:

Interactive mode (default — pick this unless the user opts out)

The 4 creative stages each end with a hard STOP. You present; the user reviews, then types `continue`, requests edits, or picks an option. The full Rule One below applies. Right for first-time use, premium production, and anyone who hasn't told you otherwise.

Auto mode (opt-in)

You execute all 6 stages end-to-end without STOPping. You still present each artifact briefly so the user can interrupt if they want, but you do NOT wait. You make the picks: best concept of the 3, best playbook for the concept, script as you'd write it. The user is signing up for "trust your judgment, go end-to-end."
Detect auto mode when the user's request includes one of:
  • `auto`, `auto mode`, `auto-approve`, `auto approve`
  • `no approvals`, `skip approvals`, `don't ask`, `no questions`
  • `just do it`, `yolo`, `full auto`, `end to end`
  • `run it through`, `run all stages`, `no checks`
If you see ANY of those, set mode = auto. Otherwise, mode = interactive.
If the user's wording is ambiguous, ask ONCE at the start: "Interactive (I pause at each stage for your review) or auto (I run end to end and you get the final video)?" — then proceed.
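The detection rule above can be sketched as a small function. This is a minimal illustration, not the skill's actual implementation: the names `AUTO_TRIGGERS` and `detect_mode` are hypothetical, and matching on whole words (so that "autofocus" does not count as `auto`) is an assumption the text does not spell out.

```python
import re

# Trigger phrases copied from the list above.
AUTO_TRIGGERS = [
    "auto", "auto mode", "auto-approve", "auto approve",
    "no approvals", "skip approvals", "don't ask", "no questions",
    "just do it", "yolo", "full auto", "end to end",
    "run it through", "run all stages", "no checks",
]


def detect_mode(request: str) -> str:
    """Return "auto" if any trigger phrase appears, else "interactive"."""
    # Normalise case and curly apostrophes so "Don’t ask" still matches.
    text = request.lower().replace("\u2019", "'")
    for phrase in AUTO_TRIGGERS:
        # Assumption: require non-word characters on both sides of the
        # phrase, so words such as "autofocus" do not trigger auto mode.
        pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
        if re.search(pattern, text):
            return "auto"
    return "interactive"
```

A literal matcher like this still misfires on requests such as "an explainer about auto insurance", which is the kind of ambiguous wording where asking once at the start is the safer path.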

Even in auto mode, you MUST still:

  • Announce the credit estimate before stage 5 (assets). One line: "Spending ~1850 credits on assets now. Balance: 12,500 → ~10,650 after." Don't ask for permission, just announce. The user can Ctrl-C if they disagree.
  • Read each director skill before its stage. Rule Zero is non-negotiable regardless of mode.
  • Surface real errors; never fail silently. If the CLI returns `{ "status": "error", "hint": "..." }`, stop and tell the user.
  • Stop on genuine ambiguity — if the user said "30s explainer" but the topic obviously needs 3+ minutes to cover well, ask once before guessing.
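Two of the obligations above are mechanical enough to sketch: the one-line credit announcement and the error check. The function names are hypothetical, and the sketch assumes the CLI result arrives as a single JSON object in a string; the text only shows the shape of the error object.

```python
import json


def credit_announcement(estimate: int, balance: int) -> str:
    """Format the one-line spend announcement shown in the example above."""
    remaining = balance - estimate
    return (
        f"Spending ~{estimate} credits on assets now. "
        f"Balance: {balance:,} \u2192 ~{remaining:,} after."
    )


def check_cli_result(raw: str) -> dict:
    """Parse a CLI result and raise if it reports an error.

    Assumption: `raw` is one JSON object. Only the "status" and "hint"
    keys are taken from the text; any other keys are passed through.
    """
    result = json.loads(raw)
    if result.get("status") == "error":
        hint = result.get("hint", "no hint provided")
        raise RuntimeError(f"CLI error: {hint}")
    return result
```

Raising instead of returning a flag makes it hard to continue past a failed call by accident, which matches the "stop and tell the user" requirement.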

Stage-by-stage behavior table

| Stage | Interactive | Auto |
| --- | --- | --- |
| research | Present findings, STOP | Present findings briefly, continue |
| proposal | Show 3 concepts + estimate, STOP | Show 3 concepts + estimate, pick best yourself, announce pick, continue |
| script | Show full script, STOP | Show script, continue |
| scene_plan | Show scene table, STOP | Show scene table, continue |
| assets | Announce + confirm spending | Announce spending (no confirmation), fire |
| render | Run | Run |
| metadata + upload | Draft + run | Draft + run |

The 6 stages — FOUR are hard approval gates

In interactive mode, every creative stage is a hard stop. You do the work, present it, then STOP and wait for the user. Do NOT chain stages without explicit user approval.
  1. research: you, in chat. Read `references/research-director.md`. STOP after presenting findings.
  2. proposal: you, in chat. Read `references/proposal-director.md`. Pick a playbook with each concept. STOP until the user picks A / B / C.
  3. script: you, in chat. Read `references/script-director.md`. Read the chosen playbook's `audio.voice_style` and reflect it in `speaker_directions`. STOP after showing the script.
  4. scene_plan: you, in chat. Read `references/scene-plan-director.md`. Include the `playbook` field at the top of `scene-plan.json`. STOP after showing the scene table; this is the last gate before money is spent.
  5. assets: CLI (~1500-3000 credits, 5-25 min wall time). Read `references/asset-director.md`. Write `scene-plan.json` (with the `playbook` field) and `script.json` into the project dir. Then run `gen-ai explainer:assets <slug>`; the CLI auto-applies the playbook's `image_prompt_prefix`, `image_negative_prompt`, and `music_mood`.
  6. render: CLI (~1-3 min). Read `references/render-director.md`. Run `gen-ai explainer:render <slug>`; the playbook auto-flows from the asset manifest, and ffmpeg uses its `music_volume_db` and `narration_to_music_weight_ratio`.
After stage 6: draft title / description / chapters / hashtags in chat. Then run `gen-ai upload-to-drive <slug>/explainer.mp4 --name "<title>"`. Share the URL.
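The hand-off from the chat stages to the CLI stages can be sketched as follows. Everything here is illustrative: the helper names are hypothetical, the `scenes` key in the usage example is a placeholder because the real scene-plan schema lives in the director file, and the sketch only builds the argument lists for the commands quoted above without running them.

```python
import json
from pathlib import Path

# Project root as described at the top of this document.
PROJECTS_ROOT = Path.home() / ".gen-ai" / "projects" / "explainer"


def write_scene_plan(slug, playbook, plan, root=PROJECTS_ROOT):
    """Write scene-plan.json with the `playbook` field as the first key."""
    project_dir = Path(root) / slug
    project_dir.mkdir(parents=True, exist_ok=True)
    # Rebuild the dict so "playbook" comes first; other keys keep their order.
    rest = {key: value for key, value in plan.items() if key != "playbook"}
    ordered = {"playbook": playbook, **rest}
    path = project_dir / "scene-plan.json"
    path.write_text(json.dumps(ordered, indent=2), encoding="utf-8")
    return path


def stage_command(stage, slug):
    """Build the argument list for the two CLI-driven stages."""
    if stage not in ("assets", "render"):
        raise ValueError(f"{stage!r} is handled in chat, not by the CLI")
    return ["gen-ai", f"explainer:{stage}", slug]


def upload_command(slug, title):
    """Build the argument list for the upload step after stage 6."""
    return ["gen-ai", "upload-to-drive", f"{slug}/explainer.mp4", "--name", title]
```

For example, `write_scene_plan("my-slug", "some-playbook", {"scenes": []})` would create the file under the project directory, after which `stage_command("assets", "my-slug")` gives the argument list for the asset run.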

Rule Zero — Read the director skill before EVERY stage

Each of the 6 stages has a dedicated director skill at `~/.claude/skills/gen-ai-explainer/references/<stage>-director.md`. You MUST read the director skill BEFORE executing each stage. Not after. Not skimmed. Read.
The director files are not "background reading" — they contain the exact process, query templates, schema shapes, self-evaluation rubrics, common pitfalls, and STOP gates for that stage. Skipping them produces lower-quality output that wastes the user's credits.

Skill-loading protocol (apply at the START of every stage)

  1. Announce in chat: "Loading `references/<stage>-director.md`." One line, so the user sees you're following the protocol.
  2. Read the file with the Read tool. The full file.
  3. Follow its Process steps exactly. Don't improvise — the directors were written precisely so you don't have to invent the workflow each time.
  4. When the director's STOP gate fires, STOP. Don't pre-load the next director.

Stage → director mapping (memorize this)

| Stage | Director skill to read first |
| --- | --- |
| research | `references/research-director.md` (5 search batches, ~12-15 web searches) |
| proposal | `references/proposal-director.md` (3 concepts + credit estimate via `gen-ai credits` + `gen-ai pricing --json`) |
| script | `references/script-director.md` (narrative arc, word budget, eleven-v3 directions) |
| scene_plan | `references/scene-plan-director.md` (5-aspect checklist, technique library) |
| assets | `references/asset-director.md` (calls `gen-ai explainer:assets <slug>`) |
| render | `references/render-director.md` (calls `gen-ai explainer:render <slug>`) |
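The mapping above can be kept as an explicit lookup rather than derived from the stage name. That matters because two file names do not follow the `<stage>-director.md` template literally: `scene_plan` maps to `scene-plan-director.md` and `assets` maps to `asset-director.md`. A minimal sketch, with hypothetical names `DIRECTOR_FILES` and `director_file`:

```python
from pathlib import Path

# Skill directory as given under Rule Zero.
SKILL_DIR = Path.home() / ".claude" / "skills" / "gen-ai-explainer"

# Explicit stage -> file mapping, copied from the table above.
DIRECTOR_FILES = {
    "research": "research-director.md",
    "proposal": "proposal-director.md",
    "script": "script-director.md",
    "scene_plan": "scene-plan-director.md",
    "assets": "asset-director.md",
    "render": "render-director.md",
}


def director_file(stage: str) -> Path:
    """Return the director skill path to read before running `stage`."""
    if stage not in DIRECTOR_FILES:
        raise ValueError(f"unknown stage: {stage!r}")
    return SKILL_DIR / "references" / DIRECTOR_FILES[stage]
```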

Do NOT (skill-loading violations)

  • Skip reading a director "because you remember it from the last conversation."
  • Read directors in batches "to save round-trips" — fresh context per stage.
  • Improvise a stage from your general knowledge instead of following the director's specific process.
  • Carry stale director content from a previous stage into the current one (e.g., applying scene-plan rules to the script stage).
  • Silently drop director-mandated steps (web searches, self-evaluation, pronunciation guides) "to be faster."
If you skip director-reading, the user will catch it: research will lack sourced URLs, the script will miss the narrative arc, scene plans will fail the 5-aspect checklist. Sub-quality output betrays the protocol.

Rule One — Approval gates in interactive mode

This rule applies in interactive mode only. Auto mode replaces STOP with "announce and continue" per the Mode section above.
In interactive mode, the four creative stages (research / proposal / script / scene_plan) each end with a hard STOP. After presenting your output:
  • WAIT for the user to reply.
  • If they say "continue" / "looks good" / "approve" / "go" — proceed to the next stage.
  • If they say "edit X" / "rewrite Y" / "swap N" — revise, present again, STOP again.
  • Iterate until they explicitly approve.
In interactive mode, do NOT:
  • Auto-advance to the next stage without user reply.
  • Pre-draft the next stage "to save time" while waiting.
  • Assume approval from silence.
  • Skip showing the artifact to the user.
  • Collapse multiple stages into one message.
If you skip a gate in interactive mode, the user pays for visuals they didn't sign off on. The whole point of interactive mode is human-in-the-loop control.
In auto mode the user has explicitly opted out of approvals — you go through all 6 stages and produce the video. You still announce each stage's output (so the user sees what you picked) and announce the credit estimate, but you don't wait for input.
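The gate logic in this rule can be sketched as a decision function over the mode, the stage, and the user's latest reply. This is a simplified illustration: the function name and the return values `"advance"`, `"wait"`, and `"revise"` are hypothetical, and treating every reply that is not an approval as an edit request is an assumption, since real replies will be more varied than exact phrases.

```python
CREATIVE_STAGES = ("research", "proposal", "script", "scene_plan")
APPROVALS = {"continue", "looks good", "approve", "go"}
PICKS = {"a", "b", "c"}  # the proposal stage ends with an A / B / C pick


def next_action(mode, stage, reply):
    """Decide what to do after presenting a creative stage's artifact."""
    if stage not in CREATIVE_STAGES:
        raise ValueError(f"{stage!r} is not an approval-gated creative stage")
    if mode == "auto":
        return "advance"  # announce and continue, no waiting
    normalized = (reply or "").strip().lower().rstrip(".!")
    if not normalized:
        return "wait"  # silence is never approval
    if stage == "proposal":
        if normalized in PICKS:
            return "advance"
        if normalized in APPROVALS:
            # Assumption: approval without a concept pick is not enough.
            return "wait"
        return "revise"
    return "advance" if normalized in APPROVALS else "revise"
```

After a `"revise"` result the artifact is presented again and the same check runs on the next reply, which gives the iterate-until-approved loop described above.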

Other rules

  • The asset stage is expensive (~1500-3000 credits for a 30-90s video). Announce the credit estimate before running; in interactive mode, also confirm one more time at the start of stage 5.
  • If a CLI call fails, read the error JSON's `hint` field and decide whether to retry, re-plan, or surface to the user.
  • Never override the default models unless the user asks. Defaults are:
    • image: `gemini-3.1-flash-image` (Nano Banana 2)
    • video: `seedance-2.0`
    • voice: `eleven-v3`
    • music: `minimax-music`
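The "defaults unless the user asks" rule can be sketched as a merge of explicit user overrides onto a fixed table. The names `DEFAULT_MODELS` and `resolve_models` are hypothetical; the model identifiers are the ones listed above.

```python
# Default models, copied from the list above.
DEFAULT_MODELS = {
    "image": "gemini-3.1-flash-image",
    "video": "seedance-2.0",
    "voice": "eleven-v3",
    "music": "minimax-music",
}


def resolve_models(user_overrides=None):
    """Return the models to use: defaults, plus only what the user asked for."""
    overrides = user_overrides or {}
    unknown = set(overrides) - set(DEFAULT_MODELS)
    if unknown:
        raise ValueError(f"unknown model slots: {sorted(unknown)}")
    # A new dict is returned so the defaults are never mutated.
    return {**DEFAULT_MODELS, **overrides}
```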

Resume protocol

If the user references an existing project, run `ls ~/.gen-ai/projects/explainer/<slug>/` and decide which stage to resume from by which JSON files exist:

| Files present | Resume from |
| --- | --- |
| only `manifest.json` | stage 3 (script) |
| `script.json` | stage 4 (scene_plan) |
| `scene-plan.json` | stage 5 (assets) |
| `asset-manifest.json` | stage 6 (render) |
| `render-report.json` | upload step |

See `pipeline.yaml` for the machine-readable manifest.
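The resume table can be sketched as a lookup that checks for the furthest-along file first. Two points are assumptions rather than something the table states: when several files are present the latest one in the pipeline wins, and a directory with none of these files restarts from research. The names `RESUME_ORDER` and `resume_stage` are hypothetical.

```python
# Ordered from the latest pipeline artifact to the earliest.
RESUME_ORDER = [
    ("render-report.json", "upload"),
    ("asset-manifest.json", "render"),
    ("scene-plan.json", "assets"),
    ("script.json", "scene_plan"),
    ("manifest.json", "script"),
]


def resume_stage(files):
    """Pick the stage to resume from, given the file names in the project dir."""
    present = set(files)
    for filename, stage in RESUME_ORDER:
        if filename in present:
            return stage
    # Assumption: nothing usable on disk means starting over.
    return "research"
```

The `files` argument would be the names listed by the `ls` command above, for example `resume_stage(["manifest.json", "script.json"])`.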