video-podcast-maker
REQUIRED: Load Remotion Best Practices First
This skill depends on `remotion-best-practices`. You MUST invoke it before proceeding:
Invoke the skill/tool named: remotion-best-practices
Video Podcast Maker
Automated pipeline for 4K Bilibili horizontal knowledge videos from a topic. Coding agent + TTS backend + Remotion + FFmpeg.
Contents
- Bootstrap: update check + prerequisites (run before Step 1)
- Execution Modes: Auto vs Interactive, default decisions
- Workflow: the 15 steps + phase-file pointers + mandatory stops
- Hard Rules: non-negotiable production constraints + output specs
- Per-Video Layout: directory structure, `--public-dir`, naming
- Additional Resources: when to load each `references/` file
- User Preferences
- Troubleshooting
Bootstrap
Resolve `SKILL_DIR` to the directory containing this `SKILL.md`. If your agent exposes a built-in skill directory variable (e.g. `${CLAUDE_SKILL_DIR}`), map it to `SKILL_DIR`.

```bash
SKILL_DIR="${SKILL_DIR:-${CLAUDE_SKILL_DIR}}"
```

1. Update check (notify-only, throttled to 24h)

```bash
"${SKILL_DIR}/scripts/check_update.sh"
```

2. Prerequisites (CLIs + backend env vars)

```bash
python3 "${SKILL_DIR}/scripts/check_prereqs.py"
```

**`check_update.sh` output**:
- `UPDATE_AVAILABLE vX.Y.Z -> vA.B.C` — tell the user the version delta and ask before running `git -C "${SKILL_DIR}" pull --ff-only`. **Notify-only by design — never pull without consent (the skill directory belongs to the user).**
- `UP_TO_DATE` / `SKIPPED_RECENT_CHECK` / `MANUAL_INSTALL` — continue silently.
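
As a rough illustration of the branching above, the following sketch classifies a single line of `check_update.sh` output. The status tokens come from the list above; the function name, the return shape, and the assumption that the script prints exactly one status line are hypothetical.

```python
import re

# Status tokens listed above that require no user interaction.
SILENT_STATUSES = {"UP_TO_DATE", "SKIPPED_RECENT_CHECK", "MANUAL_INSTALL"}
# Assumed line shape: "UPDATE_AVAILABLE <current> -> <latest>".
UPDATE_RE = re.compile(r"^UPDATE_AVAILABLE\s+(\S+)\s+->\s+(\S+)$")


def interpret_update_check(stdout: str) -> dict:
    """Classify one status line; returns a dict with an "action" key."""
    line = stdout.strip()
    match = UPDATE_RE.match(line)
    if match:
        current, latest = match.groups()
        # Never pull here: the caller must ask the user for consent first.
        return {"action": "ask_user", "current": current, "latest": latest}
    if line in SILENT_STATUSES:
        return {"action": "continue"}
    # Anything else is surfaced to the caller instead of being guessed at.
    return {"action": "unknown", "raw": line}
```

Only the `ask_user` result should ever lead to `git pull`, and only after the user agrees.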
**Prereqs failures** — see README.md for setup. The check is backend-aware (resolves `TTS_BACKEND` env → `user_prefs.json` `global.tts.backend` → `edge` default), so only env vars required by the active backend are validated.
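
The resolution order described above can be sketched as follows. This is not the code of `check_prereqs.py`; the function name is hypothetical, `prefs` is assumed to be the parsed contents of `user_prefs.json`, and `"azure"` in the usage note is an assumed backend identifier.

```python
import os


def resolve_tts_backend(env=None, prefs=None) -> str:
    """Resolve the active TTS backend: env var, then user prefs, then 'edge'."""
    env = os.environ if env is None else env
    prefs = prefs or {}
    # 1. The TTS_BACKEND environment variable wins when set and non-empty.
    from_env = env.get("TTS_BACKEND")
    if from_env:
        return from_env
    # 2. Fall back to global.tts.backend from the parsed user_prefs.json.
    from_prefs = prefs.get("global", {}).get("tts", {}).get("backend")
    if from_prefs:
        return from_prefs
    # 3. Default backend.
    return "edge"
```

For example, `resolve_tts_backend({}, {"global": {"tts": {"backend": "azure"}}})` returns `"azure"`, while an empty environment and empty prefs fall through to `"edge"`.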
> **Design Learning shortcut**: If the user provides a reference video/image or asks to save/list/delete style profiles, see [references/design-learning.md](references/design-learning.md) instead of running the workflow below.
---

Execution Modes
Detect at workflow start:
- "Make a video about..." / no special instructions → Auto Mode (default)
- "I want to control each step" / "interactive" → Interactive Mode
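
A minimal sketch of this detection, assuming simple substring matching on the two example phrases above. The hint list and function name are hypothetical; an agent would normally judge intent from the whole request instead of matching keywords.

```python
# Assumed trigger phrases, taken from the examples above.
INTERACTIVE_HINTS = ("interactive", "control each step")


def detect_mode(request: str) -> str:
    """Return "interactive" when the request asks for step-by-step control, else "auto"."""
    text = request.lower()
    if any(hint in text for hint in INTERACTIVE_HINTS):
        return "interactive"
    # No special instructions: Auto Mode is the default.
    return "auto"
```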
Auto Mode defaults
Full pipeline with sensible defaults. Mandatory stop at Step 9 (Studio review); Step 10 (4K render) only fires when the user says "render 4K" / "render final".
| Step | Decision | Auto Default |
|---|---|---|
| 3 | Title position | top-center |
| 5 | Media assets | Skip (text-only animations) |
| 7 | Thumbnail method | Remotion-generated (16:9 + 4:3) |
| 9 | Outro animation | Pre-made MP4 (white/black by theme) |
| 12 | Subtitle method | Remotion-native (skip legacy FFmpeg burn) |
| 14 | Cleanup | Auto-clean temp files |
Override any default in the initial request:
- "make a video about AI, burn subtitles" → auto + subtitles on
- "use dark theme, AI thumbnails" → auto + dark + imagen
- "need screenshots" → auto + media collection enabled
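
One way to picture the override behavior is as request-specific choices merged over the Auto defaults from the table above. The key names and value spellings below are hypothetical; only the decisions and their defaults come from the table.

```python
# Auto Mode defaults from the table above (key and value names are assumed).
AUTO_DEFAULTS = {
    "title_position": "top-center",        # Step 3
    "media_assets": "skip",                # Step 5
    "thumbnail_method": "remotion",        # Step 7
    "outro_animation": "premade-mp4",      # Step 9
    "subtitle_method": "remotion-native",  # Step 12
    "cleanup": "auto",                     # Step 14
}


def apply_overrides(overrides: dict) -> dict:
    """Return the Auto defaults with the user's explicit choices applied on top."""
    unknown = set(overrides) - set(AUTO_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown decision(s): {sorted(unknown)}")
    # Later entries win, so overrides replace defaults key by key.
    return {**AUTO_DEFAULTS, **overrides}
```

Under these assumptions, "need screenshots" would map to `apply_overrides({"media_assets": "collect"})`, leaving every other decision at its default.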
Interactive Mode
Prompts at each decision point.
Workflow
At Step 1 start, create one task per step in your agent's tracker (Claude Code `TaskCreate` / Codex todo list / equivalent). Mark `in_progress` on start, `completed` on finish. Files in `videos/{name}/` are the durable record; if interrupted, inspect the directory to determine where to resume.

| # | Step | Output | Phase file |
|---|---|---|---|
| 1 | Define topic direction | `topic_definition.md` | workflow-script.md |
| 2 | Research topic | `topic_research.md` | workflow-script.md |
| 3 | Design 5-7 sections | (in-memory) | workflow-script.md |
| 4 | Write narration script | `podcast.txt` | workflow-script.md |
| 4.5 | Pronunciation pre-flight (zh-CN) | `phonemes.json` | workflow-script.md |
| 5 | Collect media (Auto: skip) | | workflow-production.md |
| 6 | Generate publish info (Part 1) | | workflow-production.md |
| 7 | Generate thumbnails (16:9 + 4:3) | `thumbnail_*.png` | workflow-production.md |
| 8 | Generate TTS audio | `podcast_audio.wav`, `podcast_audio.srt`, `timing.json` | workflow-production.md |
| **9** | **Remotion composition + Studio preview** | - | workflow-production.md |
| 10 | Render 4K video (only on user request) | `output.mp4` | workflow-production.md |
| 11 | Mix background music | `video_with_bgm.mp4` | workflow-production.md |
| 12 | Finalize (optional legacy subtitle burn) | `final_video.mp4` | workflow-publish.md |
| 13 | Complete publish info (Part 2) | chapter timestamps | workflow-publish.md |
| **14** | **Verify output (`verify_output.py`)** | - | workflow-publish.md |
| 15 | Generate vertical shorts (optional) | | workflow-publish.md |
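
Resuming after an interruption could look roughly like the sketch below, which reports the furthest step whose output file is present. The step-to-file pairs are taken from the Per-Video Layout section; the function name and the choice to pass in a set of filenames are assumptions, and steps without a single well-known file (such as thumbnails) are left out.

```python
# (step, output file) pairs from the Per-Video Layout, in pipeline order.
STEP_OUTPUTS = [
    ("1", "topic_definition.md"),
    ("2", "topic_research.md"),
    ("4", "podcast.txt"),
    ("4.5", "phonemes.json"),
    ("8", "timing.json"),
    ("10", "output.mp4"),
    ("11", "video_with_bgm.mp4"),
    ("12", "final_video.mp4"),
]


def last_completed_step(existing_files):
    """Return the furthest step whose output file exists, or None if none do."""
    done = None
    for step, filename in STEP_OUTPUTS:
        if filename in existing_files:
            # Keep scanning so the latest step in pipeline order wins.
            done = step
    return done
```

A caller might build the set with `{p.name for p in pathlib.Path("videos/my-video").iterdir()}` and resume from the step after the one returned.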
Mandatory stops (bold rows above):
- Step 9: Studio review. MUST launch `npx remotion studio` and wait for user feedback before rendering. NEVER render 4K until the user explicitly confirms ("render 4K" / "render final").
- Step 14: `verify_output.py`. MUST pass before declaring the video done. Exit 0 = green; exit 2 = warnings, still publishable. Auto-fixes common omissions (creates `final_video.mp4` if missing). For machine-readable output add `--format json` (auto when piped).
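
The exit-code contract for Step 14 can be captured in a small helper such as the sketch below. Only codes 0 and 2 are documented above; treating every other code as a failure is an assumption, as are the function name and the returned labels.

```python
def interpret_verify_exit(code: int) -> str:
    """Map a verify_output.py exit code to a publish decision."""
    if code == 0:
        return "pass"                # green: safe to declare the video done
    if code == 2:
        return "pass_with_warnings"  # warnings, but still publishable
    # Assumption: any other exit code means verification failed.
    return "fail"
```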
Pre-render audit (recommended), before Step 9:

```bash
python3 ${SKILL_DIR}/scripts/audit_beat_sync.py <Video.tsx> <timing.json>
```

Flags beats that drift > 1.5s from narration. Especially important for kinetic-typography videos.
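
To make the drift rule concrete, here is a simplified sketch of the comparison. It does not reproduce `audit_beat_sync.py`: the two input mappings (beat name to animation time, and beat name to narration time, both in seconds) are hypothetical shapes, and the real `timing.json` schema may differ.

```python
DRIFT_THRESHOLD_S = 1.5  # threshold stated above


def find_drifting_beats(beats, narration, threshold=DRIFT_THRESHOLD_S):
    """Return (name, drift_seconds) for beats further than `threshold` from narration."""
    flagged = []
    for name, beat_time in beats.items():
        if name not in narration:
            continue  # no narration anchor to compare against
        drift = abs(beat_time - narration[name])
        if drift > threshold:
            flagged.append((name, round(drift, 3)))
    return flagged
```

With `beats = {"intro": 0.0, "point_1": 14.0}` and `narration = {"intro": 0.5, "point_1": 12.0}`, only `point_1` is flagged, with a drift of 2.0 seconds.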
Validation Checkpoints
| After Step | Check |
|---|---|
| 8 (TTS) | |
| 10 (Render) | |
| 14 (Verify) | |
Hard Rules
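| Rule | Requirement |
|---|---|
| Single Project | All videos under `videos/{name}/` in the user's Remotion project |
| 4K Output | 3840×2160 (or 2160×3840 vertical), scaled up from the 1920×1080 design space |
| Audio Sync | All animations driven by `timing.json` |
| Thumbnail | MUST generate both 16:9 (1920×1080) AND 4:3 (1200×900); see design-guide.md |
| Studio Before Render | MUST launch `npx remotion studio` and wait for user feedback before rendering |
| `--public-dir` | Every Remotion command uses `--public-dir videos/{name}/` |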
Visual minimums (text sizes, content width, safe zones, animation safety) live in references/design-guide.md. MUST load before Step 9.
Output Specs
| Parameter | Horizontal (16:9) | Vertical (9:16) |
|---|---|---|
| Resolution | 3840×2160 (4K) | 2160×3840 (4K) |
| Frame rate | 30 fps | 30 fps |
| Encoding | H.264, 16Mbps | H.264, 16Mbps |
| Audio | AAC, 192kbps | AAC, 192kbps |
| Duration | 1-15 min | 60-90s (highlight) |
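
For planning disk space and upload time, the bitrates in the table give a quick size estimate. The sketch below assumes constant bitrates, decimal units (1 Mbps = 1,000,000 bits per second), and no container overhead, so real files will differ somewhat; the function names are hypothetical.

```python
VIDEO_BITRATE_BPS = 16_000_000  # H.264 at 16 Mbps
AUDIO_BITRATE_BPS = 192_000     # AAC at 192 kbps
FPS = 30


def estimate_size_bytes(duration_s: float) -> int:
    """Approximate output size: total bits per second times duration, in bytes."""
    return int((VIDEO_BITRATE_BPS + AUDIO_BITRATE_BPS) * duration_s / 8)


def frame_count(duration_s: float, fps: int = FPS) -> int:
    """Number of frames Remotion has to render for the given duration."""
    return round(duration_s * fps)
```

A 10-minute horizontal video comes to `estimate_size_bytes(600)`, about 1.21 GB, and a 90-second short needs `frame_count(90)` = 2700 frames.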
Per-Video Layout
```
project-root/                  # Remotion project root
├── src/remotion/              # Remotion source (Root.tsx, compositions, index.ts)
├── videos/{video-name}/       # Per-video assets (the agent's working dir)
│   ├── topic_definition.md    # Step 1
│   ├── topic_research.md      # Step 2
│   ├── podcast.txt            # Step 4: narration script
│   ├── phonemes.json          # Step 4.5: zh-CN pronunciation overrides
│   ├── podcast_audio.wav      # Step 8: TTS audio
│   ├── podcast_audio.srt      # Step 8: subtitles
│   ├── timing.json            # Step 8: timeline (drives animations)
│   ├── thumbnail_*.png        # Step 7
│   ├── output.mp4             # Step 10: 4K render (no BGM)
│   ├── video_with_bgm.mp4     # Step 11
│   ├── final_video.mp4        # Step 12: final output
│   └── bgm.mp3                # Background music
└── remotion.config.ts
```
`--public-dir` per video
Remotion commands MUST use `--public-dir videos/{name}/`: each video's assets stay in its own directory, with no copy to `public/`. This enables parallel renders.

```bash
npx remotion studio src/remotion/index.ts --public-dir videos/{name}/
npx remotion render src/remotion/index.ts CompositionId videos/{name}/output.mp4 --public-dir videos/{name}/ --video-bitrate 16M
npx remotion still src/remotion/index.ts Thumbnail16x9 videos/{name}/thumbnail.png --public-dir videos/{name}/
```
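
When these commands are launched from a script, building them in one place makes it harder to forget `--public-dir`. The helper below is a hypothetical sketch that only reproduces the three commands shown in this section as argument lists; `CompositionId` is the placeholder used above and must be replaced with a real composition ID.

```python
def remotion_commands(name, composition_id="CompositionId",
                      entry="src/remotion/index.ts"):
    """Build the studio, render, and still commands for one video directory."""
    video_dir = f"videos/{name}/"
    # Every command carries the same per-video public dir.
    public_dir = ["--public-dir", video_dir]
    return {
        "studio": ["npx", "remotion", "studio", entry, *public_dir],
        "render": ["npx", "remotion", "render", entry, composition_id,
                   f"{video_dir}output.mp4", *public_dir,
                   "--video-bitrate", "16M"],
        "still": ["npx", "remotion", "still", entry, "Thumbnail16x9",
                  f"{video_dir}thumbnail.png", *public_dir],
    }
```

Each list can be passed to `subprocess.run(cmd, check=True)`. Because nothing is shared between video directories, two renders for different `name` values can run side by side.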
Naming
- Video name `{video-name}`: lowercase English, hyphen-separated (e.g. `reference-manager-comparison`)
- Section name `{section}`: lowercase English, underscore-separated, matches `[SECTION:xxx]`
- Thumbnail naming (16:9 AND 4:3 both required):
| Type | 16:9 | 4:3 |
|---|---|---|
| Remotion | | |
| AI | | |
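
The video and section naming rules lend themselves to a quick check before any files are created. The patterns below are one possible reading of those rules: allowing digits is an assumption, as are the function names. The marker pattern simply extracts whatever follows `SECTION:` inside square brackets.

```python
import re

# Lowercase words joined by hyphens, e.g. reference-manager-comparison.
VIDEO_NAME_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
# Lowercase words joined by underscores, e.g. key_points.
SECTION_NAME_RE = re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)*")
# Section markers as they appear in the narration script.
SECTION_MARKER_RE = re.compile(r"\[SECTION:([^\]]+)\]")


def is_valid_video_name(name: str) -> bool:
    return VIDEO_NAME_RE.fullmatch(name) is not None


def is_valid_section_name(name: str) -> bool:
    return SECTION_NAME_RE.fullmatch(name) is not None


def section_names(script: str) -> list:
    """Return section names in the order their markers appear in the script."""
    return SECTION_MARKER_RE.findall(script)
```

Running `section_names` over `podcast.txt` and validating each result catches a mismatched section name before TTS generation.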
Additional Resources
Load on demand — do NOT load all at once:
| File | Load when |
|---|---|
| references/workflow-script.md | Steps 1-4 (topic → script) |
| references/workflow-production.md | Steps 5-11 (media → TTS → Remotion → render → BGM) |
| references/workflow-publish.md | Steps 12-15 (subtitles, publish, cleanup, shorts) |
| references/design-guide.md | MUST load before Step 9 — visual minimums, typography, animation safety |
| references/design-learning.md | User provides a reference video/image, or manages style profiles |
| references/azure-tts-pitfalls.md | Choosing Azure voice/style, debugging hoarse/glitchy audio |
| references/troubleshooting.md | On error, or user asks about preferences/BGM |
| templates/presets/kinetic-typography/ | Bold type-driven preset (opinion / argument / declaration videos) |
| examples/ | Reference for composition structure and |
Script suite dispatcher
All scripts under `${SKILL_DIR}/scripts/` are reachable through one hierarchical entry point:

```bash
python3 ${SKILL_DIR}/scripts/cli.py --help                      # list resources
python3 ${SKILL_DIR}/scripts/cli.py <resource> --help           # list actions
python3 ${SKILL_DIR}/scripts/cli.py <resource> <action> --help  # forwards to underlying script
python3 ${SKILL_DIR}/scripts/cli.py schema [<method>]           # JSON parameter schema
```

Routes: `tts run|validate`, `verify`, `audit beats`, `shorts gen`, `design list|show|delete|add`, `prereqs`, `prefs get|migrate|backend|bgm-path`, `schema [<method>]`. Direct script invocation (`python3 scripts/<name>.py ...`) keeps working; the dispatcher is additive.
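
The resource/action hierarchy can be illustrated with a small route table. This is a sketch of the idea only, not the contents of `cli.py`: the table mirrors the route list above, while the function name and error handling are assumptions.

```python
# Resource -> allowed actions, mirroring the route list above.
# An empty list means the resource takes no action word.
ROUTES = {
    "tts": ["run", "validate"],
    "verify": [],
    "audit": ["beats"],
    "shorts": ["gen"],
    "design": ["list", "show", "delete", "add"],
    "prereqs": [],
    "prefs": ["get", "migrate", "backend", "bgm-path"],
    "schema": [],
}


def split_route(argv):
    """Split argv into (resource, action, remaining_args), validating both levels."""
    if not argv:
        raise ValueError("missing resource")
    resource, rest = argv[0], list(argv[1:])
    if resource not in ROUTES:
        raise ValueError(f"unknown resource: {resource}")
    actions = ROUTES[resource]
    if not actions:
        # Action-less resources forward everything after the resource name.
        return resource, None, rest
    if not rest or rest[0] not in actions:
        raise ValueError(f"{resource} expects one of: {', '.join(actions)}")
    return resource, rest[0], rest[1:]
```

The remaining arguments would then be handed to the underlying script unchanged, which is what keeps the dispatcher additive.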
User Preferences
Skill auto-learns and applies preferences. Full commands and learning details: references/troubleshooting.md.
- Storage: `user_prefs.json` (auto-created from `user_prefs.template.json`, schema in `prefs_schema.json`).
- Priority: `Root.tsx defaults < global < topic_patterns[type] < current instructions`.
- User commands: "show preferences" · "reset preferences" · "save as X default".
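
The priority chain reads as a layered merge in which later layers win. The sketch below assumes flat dictionaries and a shallow merge; the function name, the preference keys in the usage note, and the way the topic type is selected are all hypothetical.

```python
def resolve_prefs(root_defaults, global_prefs, topic_patterns, topic_type, current):
    """Merge preference layers from lowest to highest priority."""
    merged = {}
    layers = (
        root_defaults,                       # Root.tsx defaults (lowest)
        global_prefs,                        # global
        topic_patterns.get(topic_type, {}),  # topic_patterns[type]
        current,                             # current instructions (highest)
    )
    for layer in layers:
        merged.update(layer)  # later layers overwrite earlier keys
    return merged
```

For instance, with `root_defaults = {"theme": "light", "speed": 1.0}`, `global_prefs = {"theme": "dark"}`, a `tech` pattern of `{"speed": 1.1}`, and a current instruction of `{"theme": "light"}`, the result is `{"theme": "light", "speed": 1.1}`.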
Troubleshooting
See references/troubleshooting.md on errors, BGM options, preference learning, design-learning issues.