video-podcast-maker

REQUIRED: Load Remotion Best Practices First
This skill depends on `remotion-best-practices`. You MUST invoke it before proceeding:
Invoke the skill/tool named: remotion-best-practices

Video Podcast Maker

Automated pipeline for 4K Bilibili horizontal knowledge videos from a topic. Coding agent + TTS backend + Remotion + FFmpeg.

Contents

Bootstrap: update check + prerequisites (run before Step 1)
Execution Modes: Auto vs Interactive, default decisions
Workflow: the 15 steps + phase-file pointers + mandatory stops
Hard Rules: non-negotiable production constraints + output specs
Per-Video Layout: directory structure, `--public-dir`, naming
Additional Resources: when to load each `references/` file
User Preferences
Troubleshooting

Bootstrap

Resolve `SKILL_DIR` to the directory containing this `SKILL.md`. If your agent exposes a built-in skill directory variable (e.g. `${CLAUDE_SKILL_DIR}`), map it to `SKILL_DIR`:

```bash
SKILL_DIR="${SKILL_DIR:-${CLAUDE_SKILL_DIR}}"
```

1. Update check (notify-only, throttled to 24h)

```bash
"${SKILL_DIR}/scripts/check_update.sh"
```

2. Prerequisites (CLIs + backend env vars)

```bash
python3 "${SKILL_DIR}/scripts/check_prereqs.py"
```


**`check_update.sh` output**:
- `UPDATE_AVAILABLE vX.Y.Z -> vA.B.C` — tell the user the version delta and ask before running `git -C "${SKILL_DIR}" pull --ff-only`. **Notify-only by design — never pull without consent (the skill directory belongs to the user).**
- `UP_TO_DATE` / `SKIPPED_RECENT_CHECK` / `MANUAL_INSTALL` — continue silently.
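
The handling of those status lines can be sketched in Python. This is a minimal illustration, not the skill's own code: the function name `interpret_update_output` and the `"ask_user"` / `"continue"` / `"unknown"` labels are hypothetical, and treating unrecognized output as `"unknown"` is an assumption of the sketch.

```python
import re

# Statuses the text says can be passed over silently.
SILENT_STATUSES = {"UP_TO_DATE", "SKIPPED_RECENT_CHECK", "MANUAL_INSTALL"}
UPDATE_RE = re.compile(r"^UPDATE_AVAILABLE\s+(\S+)\s*->\s*(\S+)$")


def interpret_update_output(line):
    """Map one line of check_update.sh output to an (action, detail) pair."""
    text = line.strip()
    match = UPDATE_RE.match(text)
    if match:
        current, latest = match.groups()
        # Never pull here: the caller must ask the user for consent first.
        return ("ask_user", f"{current} -> {latest}")
    if text in SILENT_STATUSES:
        return ("continue", text)
    # Unrecognized output is surfaced rather than guessed at (sketch assumption).
    return ("unknown", text)
```

Only the `"ask_user"` branch should ever lead to `git pull --ff-only`, and only after the user agrees.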

**Prereqs failures** — see README.md for setup. The check is backend-aware (resolves `TTS_BACKEND` env → `user_prefs.json` `global.tts.backend` → `edge` default), so only env vars required by the active backend are validated.
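
The resolution order described above (environment variable, then preferences file, then the `edge` default) might look roughly like the following. It is a sketch under stated assumptions: the function name `resolve_tts_backend` and its parameters are hypothetical, and the real `check_prereqs.py` may locate and read `user_prefs.json` differently.

```python
import json
import os
from pathlib import Path


def resolve_tts_backend(prefs_path="user_prefs.json", env=None):
    """Resolve the active TTS backend: env var, then prefs file, then 'edge'."""
    env = os.environ if env is None else env
    backend = env.get("TTS_BACKEND")
    if backend:
        return backend
    path = Path(prefs_path)
    if path.is_file():
        prefs = json.loads(path.read_text(encoding="utf-8"))
        # Walk global -> tts -> backend, tolerating missing keys.
        backend = prefs.get("global", {}).get("tts", {}).get("backend")
        if backend:
            return backend
    return "edge"
```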

> **Design Learning shortcut**: If the user provides a reference video/image or asks to save/list/delete style profiles, see [references/design-learning.md](references/design-learning.md) instead of running the workflow below.

---

Execution Modes

Detect at workflow start:

"Make a video about..." / no special instructions → Auto Mode (default)
"I want to control each step" / "interactive" → Interactive Mode

Auto Mode defaults

Full pipeline with sensible defaults. Mandatory stop at Step 9 (Studio review); Step 10 (4K render) only fires when the user says "render 4K" / "render final".

| Step | Decision | Auto Default |
|------|----------|--------------|
| 3 | Title position | top-center |
| 5 | Media assets | Skip (text-only animations) |
| 7 | Thumbnail method | Remotion-generated (16:9 + 4:3) |
| 9 | Outro animation | Pre-made MP4 (white/black by theme) |
| 12 | Subtitle method | Remotion-native (skip legacy FFmpeg burn) |
| 14 | Cleanup | Auto-clean temp files |

Override any default in the initial request:

"make a video about AI, burn subtitles" → auto + subtitles on
"use dark theme, AI thumbnails" → auto + dark + imagen
"need screenshots" → auto + media collection enabled

Interactive Mode

Prompts at each decision point.

Workflow

At Step 1 start, create one task per step in your agent's tracker (Claude Code `TaskCreate` / Codex todo list / equivalent). Mark `in_progress` on start, `completed` on finish. Files in `videos/{name}/` are the durable record: if interrupted, inspect the directory to determine where to resume.

| # | Step | Output | Phase file |
|---|------|--------|------------|
| 1 | Define topic direction | `topic_definition.md` | workflow-script.md |
| 2 | Research topic | `topic_research.md` | workflow-script.md |
| 3 | Design 5-7 sections | (in-memory) | workflow-script.md |
| 4 | Write narration script | `podcast.txt` | workflow-script.md |
| 4.5 | Pronunciation pre-flight (zh-CN) | `phonemes.json` | workflow-script.md |
| 5 | Collect media (Auto: skip) | `media_manifest.json` | workflow-production.md |
| 6 | Generate publish info (Part 1) | `publish_info.md` | workflow-production.md |
| 7 | Generate thumbnails (16:9 + 4:3) | `thumbnail_*.png` | workflow-production.md |
| 8 | Generate TTS audio | `podcast_audio.wav`, `timing.json` | workflow-production.md |
| **9** | **Remotion composition + Studio preview** | (none) | workflow-production.md |
| 10 | Render 4K video (only on user request) | `output.mp4` | workflow-production.md |
| 11 | Mix background music | `video_with_bgm.mp4` | workflow-production.md |
| 12 | Finalize (optional legacy subtitle burn) | `final_video.mp4` | workflow-publish.md |
| 13 | Complete publish info (Part 2) | chapter timestamps | workflow-publish.md |
| **14** | **Verify output (`scripts/verify_output.py`)** | (none) | workflow-publish.md |
| 15 | Generate vertical shorts (optional) | `shorts/` | workflow-publish.md |
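
Because the files in `videos/{name}/` are the durable record, finding where to resume can be sketched as a scan for the first missing output. This is an illustrative sketch, not part of the skill: the function name `first_incomplete_step` is hypothetical, and it deliberately leaves out steps with no file output (3, 9, 13, 14) and steps whose output is optional or conditional (4.5, 5, 6, 7, 15), which would need separate handling.

```python
from pathlib import Path

# (step, output filename) pairs taken from the workflow table.
# Steps without a fixed, always-present output file are omitted (sketch assumption).
STEP_OUTPUTS = [
    ("1", "topic_definition.md"),
    ("2", "topic_research.md"),
    ("4", "podcast.txt"),
    ("8", "podcast_audio.wav"),
    ("8", "timing.json"),
    ("10", "output.mp4"),
    ("11", "video_with_bgm.mp4"),
    ("12", "final_video.mp4"),
]


def first_incomplete_step(video_dir):
    """Return the first step whose output file is missing, or None if all exist."""
    root = Path(video_dir)
    for step, filename in STEP_OUTPUTS:
        if not (root / filename).is_file():
            return step
    return None
```

A result of `"10"` still means the Studio review at Step 9 must be confirmed with the user before rendering; file presence alone cannot show that the review happened.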

Mandatory stops (bold rows above):

Step 9: Studio review. MUST launch `npx remotion studio` and wait for user feedback before rendering. NEVER render 4K until the user explicitly confirms ("render 4K" / "render final").
Step 14: `verify_output.py`. MUST pass before declaring the video done. Exit 0 = green; exit 2 = warnings, still publishable. Auto-fixes common omissions (creates `final_video.mp4` if missing). For machine-readable output add `--format json` (auto when piped).
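
The exit-code rule for Step 14 can be expressed as a small decision function. This is a sketch only: the name `may_declare_done` and the `warnings_reviewed` flag are hypothetical, and treating every code other than 0 and 2 as a failure is an assumption, since only those two codes are defined here.

```python
def may_declare_done(exit_code, warnings_reviewed=False):
    """Decide from verify_output.py's exit code whether the video can be called done."""
    if exit_code == 0:
        return True
    if exit_code == 2:
        # Warnings are publishable, but only once they have been reviewed.
        return warnings_reviewed
    # Any other code is treated as a failure (sketch assumption).
    return False
```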

Pre-render audit (recommended) — before Step 9:

```bash
python3 ${SKILL_DIR}/scripts/audit_beat_sync.py <Video.tsx> <timing.json>
```

Flags beats that drift > 1.5s from narration. Especially important for kinetic-typography videos.
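
To make the 1.5s threshold concrete, here is a minimal sketch of a drift check. How `audit_beat_sync.py` actually extracts beats from the composition and pairs them with narration is not stated here, so the inputs (two plain lists of times in seconds), the nearest-cue pairing, and the name `flag_drifting_beats` are all assumptions.

```python
def flag_drifting_beats(beat_times, cue_times, max_drift=1.5):
    """Return (beat, nearest_cue, drift) for beats more than max_drift seconds from any cue.

    cue_times must be non-empty.
    """
    flagged = []
    for beat in beat_times:
        # Pair each visual beat with the closest narration cue.
        nearest = min(cue_times, key=lambda cue: abs(cue - beat))
        drift = abs(nearest - beat)
        if drift > max_drift:
            flagged.append((beat, nearest, drift))
    return flagged
```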

Validation Checkpoints

| After Step | Check |
|------------|-------|
| 8 (TTS) | `podcast_audio.wav` plays · `timing.json` covers all sections · SRT is UTF-8 |
| 10 (Render) | `output.mp4` is 3840×2160 · audio-video sync · no black frames |
| 14 (Verify) | `verify_output.py` exits 0 (or 2 with reviewed warnings) |

Hard Rules

| Rule | Requirement |
|------|-------------|
| Single Project | All videos under `videos/{name}/` in user's Remotion project. NEVER create a new project per video. |
| 4K Output | 3840×2160 (or 2160×3840 vertical), use `scale(2)` wrapper over 1920×1080 design space |
| Audio Sync | All animations driven by `timing.json` timestamps |
| Thumbnail | MUST generate both 16:9 (1920×1080) AND 4:3 (1200×900); see design-guide.md |
| Studio Before Render | MUST launch `remotion studio` for review. NEVER render 4K until user explicitly confirms. |
| `--public-dir` | Every Remotion command uses `--public-dir videos/{name}/` |

Visual minimums (text sizes, content width, safe zones, animation safety) live in references/design-guide.md. MUST load before Step 9.

Output Specs

| Parameter | Horizontal (16:9) | Vertical (9:16) |
|-----------|-------------------|-----------------|
| Resolution | 3840×2160 (4K) | 2160×3840 (4K) |
| Frame rate | 30 fps | 30 fps |
| Encoding | H.264, 16Mbps | H.264, 16Mbps |
| Audio | AAC, 192kbps | AAC, 192kbps |
| Duration | 1-15 min | 60-90s (highlight) |
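
These bitrates give a quick way to estimate file size before rendering. The sketch below is a rough estimate under stated assumptions: the bitrates are treated as constant averages, 1 Mbps is taken as 10^6 bits per second, and container overhead is ignored. The function name `estimated_size_bytes` is hypothetical.

```python
def estimated_size_bytes(duration_s, video_bps=16_000_000, audio_bps=192_000):
    """Approximate output size: (video + audio bitrate) * duration / 8 bits per byte."""
    return (video_bps + audio_bps) * duration_s / 8


# A 15-minute horizontal video: about 1.82 GB.
fifteen_minutes = estimated_size_bytes(15 * 60)
# A 90-second vertical highlight: about 182 MB.
ninety_seconds = estimated_size_bytes(90)
```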

Per-Video Layout

project-root/                           # Remotion project root
├── src/remotion/                       # Remotion source (Root.tsx, compositions, index.ts)
├── videos/{video-name}/                # Per-video assets (the agent's working dir)
│   ├── topic_definition.md             # Step 1
│   ├── topic_research.md               # Step 2
│   ├── podcast.txt                     # Step 4: narration script
│   ├── phonemes.json                   # Step 4.5: zh-CN pronunciation overrides
│   ├── podcast_audio.wav               # Step 8: TTS audio
│   ├── podcast_audio.srt               # Step 8: subtitles
│   ├── timing.json                     # Step 8: timeline (drives animations)
│   ├── thumbnail_*.png                 # Step 7
│   ├── output.mp4                      # Step 10: 4K render (no BGM)
│   ├── video_with_bgm.mp4              # Step 11
│   ├── final_video.mp4                 # Step 12: final output
│   └── bgm.mp3                         # Background music
└── remotion.config.ts

`--public-dir` per video

Remotion commands MUST use `--public-dir videos/{name}/`: each video's assets stay in its own directory, with no copy to `public/`. This enables parallel renders.

```bash
npx remotion studio src/remotion/index.ts --public-dir videos/{name}/
npx remotion render src/remotion/index.ts CompositionId videos/{name}/output.mp4 --public-dir videos/{name}/ --video-bitrate 16M
npx remotion still src/remotion/index.ts Thumbnail16x9 videos/{name}/thumbnail.png --public-dir videos/{name}/
```
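
One way to make sure no command forgets the flag is to build all three from a single helper. The sketch below only reassembles the commands shown above as argument lists; the helper name `remotion_commands` and its parameters are hypothetical, and `CompositionId` remains a placeholder for the real composition ID.

```python
def remotion_commands(name, composition_id="CompositionId", entry="src/remotion/index.ts"):
    """Build argv lists for studio, render, and still, all sharing one --public-dir."""
    public_dir = f"videos/{name}/"
    shared = ["--public-dir", public_dir]
    return {
        "studio": ["npx", "remotion", "studio", entry, *shared],
        "render": ["npx", "remotion", "render", entry, composition_id,
                   f"{public_dir}output.mp4", *shared, "--video-bitrate", "16M"],
        "still": ["npx", "remotion", "still", entry, "Thumbnail16x9",
                  f"{public_dir}thumbnail.png", *shared],
    }
```

Each list can be passed to `subprocess.run` as-is. Because every video has its own public directory, two videos can be rendered side by side without their assets colliding.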

Naming

Video name `{video-name}`: lowercase English, hyphen-separated (e.g. `reference-manager-comparison`)
Section name `{section}`: lowercase English, underscore-separated, matches `[SECTION:xxx]`
Thumbnail naming (16:9 AND 4:3 both required):

| Type | 16:9 | 4:3 |
|------|------|-----|
| Remotion | `thumbnail_remotion_16x9.png` | `thumbnail_remotion_4x3.png` |
| AI | `thumbnail_ai_16x9.png` | `thumbnail_ai_4x3.png` |
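
The naming rules above can be checked mechanically. This is a minimal sketch with hypothetical helper names; allowing digits in names is an assumption, since the rules only say "lowercase English".

```python
import re

# Lowercase words joined by hyphens (video) or underscores (section).
VIDEO_NAME_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
SECTION_NAME_RE = re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)*")


def is_valid_video_name(name):
    """True for names like 'reference-manager-comparison'."""
    return VIDEO_NAME_RE.fullmatch(name) is not None


def is_valid_section_name(name):
    """True for underscore-separated lowercase section names."""
    return SECTION_NAME_RE.fullmatch(name) is not None


def thumbnail_names(kind):
    """Return the required 16:9 and 4:3 filenames for 'remotion' or 'ai' thumbnails."""
    if kind not in ("remotion", "ai"):
        raise ValueError(f"unknown thumbnail type: {kind}")
    return [f"thumbnail_{kind}_16x9.png", f"thumbnail_{kind}_4x3.png"]
```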

Additional Resources

Load on demand — do NOT load all at once:

| File | Load when |
|------|-----------|
| references/workflow-script.md | Steps 1-4 (topic → script) |
| references/workflow-production.md | Steps 5-11 (media → TTS → Remotion → render → BGM) |
| references/workflow-publish.md | Steps 12-15 (subtitles, publish, cleanup, shorts) |
| references/design-guide.md | MUST load before Step 9: visual minimums, typography, animation safety |
| references/design-learning.md | User provides a reference video/image, or manages style profiles |
| references/azure-tts-pitfalls.md | Choosing Azure voice/style, debugging hoarse/glitchy audio |
| references/troubleshooting.md | On error, or user asks about preferences/BGM |
| templates/presets/kinetic-typography/ | Bold type-driven preset (opinion / argument / declaration videos) |
| examples/ | Reference for composition structure and `timing.json` format |

Script suite dispatcher

All scripts under `${SKILL_DIR}/scripts/` are reachable through one hierarchical entry point:

```bash
python3 ${SKILL_DIR}/scripts/cli.py --help                  # list resources
python3 ${SKILL_DIR}/scripts/cli.py <resource> --help       # list actions
python3 ${SKILL_DIR}/scripts/cli.py <resource> <action> --help    # forwards to underlying script
python3 ${SKILL_DIR}/scripts/cli.py schema [<method>]       # JSON parameter schema
```

Routes: `tts run|validate`, `verify`, `audit beats`, `shorts gen`, `design list|show|delete|add`, `prereqs`, `prefs get|migrate|backend|bgm-path`, `schema [<method>]`. Direct script invocation (`python3 scripts/<name>.py ...`) keeps working; the dispatcher is additive.
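
To illustrate what "hierarchical" and "additive" mean here, the sketch below resolves a resource (and optional action) to a script and forwards the remaining arguments. It is not the real `cli.py`: the function `resolve_route` is hypothetical, and the route-to-script pairings are guesses based on the three script names that appear in this document.

```python
# Hypothetical route table; the mapping of routes to scripts is an assumption.
ROUTES = {
    ("verify", None): "verify_output.py",
    ("audit", "beats"): "audit_beat_sync.py",
    ("prereqs", None): "check_prereqs.py",
}


def resolve_route(argv):
    """Return (script, remaining_args) for a dispatcher-style argv, or raise KeyError."""
    if not argv:
        raise KeyError("no resource given")
    resource, rest = argv[0], argv[1:]
    # Prefer a two-level (resource, action) match, then fall back to resource-only.
    if rest and (resource, rest[0]) in ROUTES:
        return ROUTES[(resource, rest[0])], rest[1:]
    if (resource, None) in ROUTES:
        return ROUTES[(resource, None)], rest
    raise KeyError(f"unknown route: {' '.join(argv)}")
```

Since a dispatcher like this only maps names to existing scripts and forwards arguments, calling the scripts directly stays equivalent.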

User Preferences

Skill auto-learns and applies preferences. Full commands and learning details: references/troubleshooting.md.

Storage: `user_prefs.json` (auto-created from `user_prefs.template.json`, schema in `prefs_schema.json`).

Priority: `Root.tsx defaults < global < topic_patterns[type] < current instructions`

User commands: "show preferences" · "reset preferences" · "save as X default".
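
The priority order (Root.tsx defaults, then global, then `topic_patterns[type]`, then current instructions) amounts to a layered merge in which later layers win. The sketch below assumes each layer is a flat dictionary and merges shallowly; the function name `effective_settings` and the example keys are hypothetical, and the real preference structure may be nested.

```python
def effective_settings(root_defaults, global_prefs, topic_prefs, current):
    """Merge four preference layers; later layers override earlier ones key by key."""
    merged = {}
    for layer in (root_defaults, global_prefs, topic_prefs, current):
        # A missing layer (None) contributes nothing.
        merged.update(layer or {})
    return merged


settings = effective_settings(
    {"theme": "light", "title_position": "top-center"},  # Root.tsx defaults
    {"theme": "dark"},                                   # global
    None,                                                # no topic pattern matched
    {"title_position": "bottom-left"},                   # current instructions
)
# settings == {"theme": "dark", "title_position": "bottom-left"}
```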

Troubleshooting

See references/troubleshooting.md on errors, BGM options, preference learning, design-learning issues.
