beat-sync-video-editing

Beat-Sync Video Editing

Purpose

Provide domain expertise for creating beat-synced video edits: taking a source video and an audio track, selecting clips from the video that align with the music's rhythm, and rendering the final output with FFmpeg.

Core Concept: The EditPlan

Every edit starts as an EditPlan — a JSON structure that describes which video clips to use and where in the audio to place them:
json
{
  "audio_start": "00:13",
  "audio_duration": 6.5,
  "clips": [
    { "video_start": "00:08", "duration": 2.0, "description": "Opening shot" },
    { "video_start": "00:45", "duration": 1.5, "description": "Action moment" },
    { "video_start": "01:22", "duration": 3.0, "description": "Build-up" }
  ],
  "reasoning": "Matches rising intensity with beat drops"
}
Timestamp format: `audio_start` and `video_start` use MM:SS strings (e.g. `"01:15"` for 1 minute 15 seconds). `audio_duration` and clip `duration` use numbers in seconds.
Critical constraints:
  • `audio_start` must be a valid, non-negative MM:SS string
  • `audio_duration` must be positive (seconds)
  • Every clip must have a valid MM:SS `video_start` and a positive `duration` (seconds)
  • The sum of all clip durations must equal `audio_duration` (within a 0.5s tolerance)
  • Clip order is intentional and not necessarily chronological. Non-linear ordering creates dynamic edits.
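The constraints above can be checked mechanically. The following is a minimal Python sketch, not the logic of `validate-plan.sh` (whose internals are not shown here). The function names `mmss_to_seconds` and `validate_plan`, the error message wording, and the reading of MM:SS as "any number of minute digits, then seconds 00-59" are all assumptions made for illustration.

```python
import re

# Assumption: MM:SS means one or more minute digits, a colon, then seconds 00-59.
_MMSS = re.compile(r"([0-9]+):([0-5][0-9])")


def mmss_to_seconds(ts):
    """Convert an MM:SS string to whole seconds; raise ValueError if malformed."""
    match = _MMSS.fullmatch(ts) if isinstance(ts, str) else None
    if match is None:
        raise ValueError(f"invalid MM:SS timestamp: {ts!r}")
    return int(match.group(1)) * 60 + int(match.group(2))


def _is_positive_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool) and value > 0


def validate_plan(plan, tolerance=0.5):
    """Return a list of error strings; an empty list means every check passed."""
    errors = []
    try:
        mmss_to_seconds(plan.get("audio_start"))
    except ValueError as exc:
        errors.append(f"audio_start: {exc}")

    audio_duration = plan.get("audio_duration")
    if not _is_positive_number(audio_duration):
        errors.append("audio_duration: must be a positive number of seconds")

    total = 0.0
    durations_ok = True
    for index, clip in enumerate(plan.get("clips", [])):
        try:
            mmss_to_seconds(clip.get("video_start"))
        except ValueError as exc:
            errors.append(f"clips[{index}].video_start: {exc}")
        if _is_positive_number(clip.get("duration")):
            total += clip["duration"]
        else:
            durations_ok = False
            errors.append(f"clips[{index}].duration: must be a positive number of seconds")

    # Only compare sums when every individual duration was usable.
    if durations_ok and _is_positive_number(audio_duration):
        if abs(total - audio_duration) > tolerance:
            errors.append(
                f"clip durations sum to {total:.3f}s but audio_duration is {audio_duration:.3f}s"
            )
    return errors
```

For the example plan shown earlier, the clip durations 2.0, 1.5, and 3.0 sum to exactly 6.5, so this sketch reports no errors.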

Workflow: From Files to Final Video

Step 1: Generate EditPlan via Gemini

Run the Gemini script to analyze video + audio and produce a plan:

Fresh upload (files auto-deleted after):

bash ${CLAUDE_PLUGIN_ROOT}/scripts/gemini-edit-plan.sh \
  --video <video_path> \
  --audio <audio_path> \
  --prompt "<user's edit description>"

Keep files for reuse (outputs ECLIPTIC_FILES JSON to stderr):

bash ${CLAUDE_PLUGIN_ROOT}/scripts/gemini-edit-plan.sh \
  --video <video_path> \
  --audio <audio_path> \
  --prompt "<description>" --no-cleanup

Reuse previously uploaded files (skips upload, much faster):

bash ${CLAUDE_PLUGIN_ROOT}/scripts/gemini-edit-plan.sh \
  --video-uri <uri> --video-mime <mime> \
  --audio-uri <uri> --audio-mime <mime> \
  --prompt "<different description>"

- Outputs EditPlan JSON to stdout, progress to stderr
- Requires `GEMINI_API_KEY`, `curl`, and `jq`
- Gemini watches the video and listens to the audio simultaneously
- Use `--no-cleanup` + reuse mode for fast iteration with different prompts
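
When the script is driven from another program rather than typed by hand, the two invocation modes can be assembled as argument lists. This is a small Python sketch under stated assumptions: the helper names `fresh_upload_args` and `reuse_args` are hypothetical, and only the flags documented above are used.

```python
def _script_path(plugin_root):
    # Location of the script relative to the plugin root, as given in this document.
    return f"{plugin_root}/scripts/gemini-edit-plan.sh"


def fresh_upload_args(plugin_root, video_path, audio_path, prompt, keep_files=False):
    """Arguments for a fresh upload; keep_files=True appends --no-cleanup."""
    args = [
        "bash", _script_path(plugin_root),
        "--video", video_path,
        "--audio", audio_path,
        "--prompt", prompt,
    ]
    if keep_files:
        args.append("--no-cleanup")
    return args


def reuse_args(plugin_root, video_uri, video_mime, audio_uri, audio_mime, prompt):
    """Arguments for reuse mode, which skips the upload step."""
    return [
        "bash", _script_path(plugin_root),
        "--video-uri", video_uri, "--video-mime", video_mime,
        "--audio-uri", audio_uri, "--audio-mime", audio_mime,
        "--prompt", prompt,
    ]
```

Either list can be passed to `subprocess.run(args, capture_output=True, text=True)`. Because the EditPlan goes to stdout and progress goes to stderr, `json.loads(result.stdout)` should then yield the plan without having to filter out progress lines.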

Step 2: Validate the Plan

bash
echo '<plan_json>' | bash ${CLAUDE_PLUGIN_ROOT}/scripts/validate-plan.sh
  • Outputs `{"valid": true, "errors": []}` or `{"valid": false, "errors": [...]}`
  • Exit code 0 = valid, 1 = invalid
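
A caller can combine the JSON report with the exit code before moving on. The sketch below assumes the report has exactly the shape shown above. The function name `read_validation` is hypothetical, and the format of the individual error entries is not specified in this document, so they are passed through untouched.

```python
import json


def read_validation(stdout_text, returncode):
    """Return (is_valid, errors) from the validator's stdout and exit code."""
    report = json.loads(stdout_text)
    # Treat the plan as valid only when the JSON report and the exit code agree.
    is_valid = report.get("valid") is True and returncode == 0
    errors = list(report.get("errors", []))
    return is_valid, errors
```

Requiring both signals is a deliberately conservative choice: if the script exits non-zero for an unrelated reason, the plan is not treated as validated.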

Step 3: Build FFmpeg Filters

bash
echo '<plan_json>' | bash ${CLAUDE_PLUGIN_ROOT}/scripts/build-filter.sh
  • Outputs `{"videoFilter": "...", "audioFilter": "...", "fullFilter": "..."}`
  • The `fullFilter` field is what goes into FFmpeg's `-filter_complex` argument
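
The filter string contains brackets, semicolons, and commas, which are awkward to quote in a shell. One way around this is to read `fullFilter` from the JSON and build an argument list instead of a command string. The following Python sketch assumes the JSON shape shown above. The function name `ffmpeg_args_from_filter_json` is hypothetical, and the encoding flags simply mirror the render command in Step 4.

```python
import json


def ffmpeg_args_from_filter_json(filter_json_text, video_path, audio_path, output_path):
    """Build an FFmpeg argument list from build-filter.sh's JSON output."""
    full_filter = json.loads(filter_json_text)["fullFilter"]
    return [
        "ffmpeg", "-y",
        "-i", video_path,            # input 0: source video
        "-i", audio_path,            # input 1: audio track
        "-filter_complex", full_filter,
        "-map", "[outv]", "-map", "[outa]",
        "-c:v", "libx264", "-preset", "fast", "-crf", "23",
        "-c:a", "aac", "-shortest",
        output_path,
    ]
```

Passing this list to `subprocess.run` without `shell=True` hands the filter to FFmpeg as a single argument, so no escaping of its contents is needed.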

Step 4: Render with FFmpeg

bash
ffmpeg -y -i "<video_path>" -i "<audio_path>" \
  -filter_complex "<fullFilter>" \
  -map "[outv]" -map "[outa]" \
  -c:v libx264 -preset fast -crf 23 \
  -c:a aac -shortest \
  "<output_path>"

FFmpeg Filter Anatomy

For a 3-clip edit, the `fullFilter` looks like:
[0:v]trim=start=8.000:duration=2.000,setpts=PTS-STARTPTS[v0];
[0:v]trim=start=45.000:duration=1.500,setpts=PTS-STARTPTS[v1];
[0:v]trim=start=82.000:duration=3.000,setpts=PTS-STARTPTS[v2];
[v0][v1][v2]concat=n=3:v=1:a=0[outv];
[1:a]atrim=start=13.000:duration=6.500,asetpts=PTS-STARTPTS[outa]
  • `[0:v]` = first input (video), `[1:a]` = second input (audio)
  • `trim` extracts a segment, `setpts=PTS-STARTPTS` resets timestamps
  • `concat` joins all video segments in order
  • `atrim` extracts the audio section
For the full FFmpeg filter reference, see `references/ffmpeg-filters.md`.
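
To make the mapping from an EditPlan to this filter concrete, here is a minimal Python sketch of the construction. It is an illustration only, not the code of `build-filter.sh`. The function names are hypothetical, and the plan is assumed to have passed validation already. Note that an MM:SS value such as `"01:22"` becomes 82 seconds in the `trim` start.

```python
def mmss_to_seconds(ts):
    """Convert a validated MM:SS string to whole seconds."""
    minutes, seconds = ts.split(":")
    return int(minutes) * 60 + int(seconds)


def build_full_filter(plan):
    """Build a filter_complex string: one trim per clip, a concat, one atrim."""
    parts = []
    labels = []
    for index, clip in enumerate(plan["clips"]):
        start = mmss_to_seconds(clip["video_start"])
        duration = clip["duration"]
        label = f"[v{index}]"
        # trim cuts the segment; setpts rebases its timestamps to start at zero.
        parts.append(
            f"[0:v]trim=start={start:.3f}:duration={duration:.3f},setpts=PTS-STARTPTS{label}"
        )
        labels.append(label)

    # concat joins the segments in plan order, which need not be chronological.
    joined = "".join(labels)
    parts.append(f"{joined}concat=n={len(labels)}:v=1:a=0[outv]")

    audio_start = mmss_to_seconds(plan["audio_start"])
    audio_duration = plan["audio_duration"]
    parts.append(
        f"[1:a]atrim=start={audio_start:.3f}:duration={audio_duration:.3f},"
        "asetpts=PTS-STARTPTS[outa]"
    )
    return ";".join(parts)
```

The sketch joins the chains with a bare `;` rather than putting each one on its own line. FFmpeg accepts either layout in a filtergraph description.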

Troubleshooting

Duration mismatch error: Clip durations don't sum to `audio_duration`. Fix by adjusting the last clip's duration to absorb the difference, or re-run Gemini with a stricter prompt.
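
The first fix can be sketched in a few lines of Python. The function name `absorb_into_last_clip` is hypothetical. The sketch assumes the plan has at least one clip and refuses to produce a zero or negative duration, in which case re-running Gemini is the remaining option.

```python
import copy


def absorb_into_last_clip(plan):
    """Return a copy of the plan whose last clip makes the durations sum to audio_duration."""
    fixed = copy.deepcopy(plan)  # leave the caller's plan untouched
    clips = fixed["clips"]
    if not clips:
        raise ValueError("plan has no clips to adjust")

    earlier_total = sum(clip["duration"] for clip in clips[:-1])
    new_last = round(fixed["audio_duration"] - earlier_total, 3)
    if new_last <= 0:
        raise ValueError("earlier clips already fill or exceed audio_duration")

    clips[-1]["duration"] = new_last
    return fixed
```

Lengthening the last clip can run past the end of the source video, so the adjusted plan is worth re-validating and spot-checking before rendering.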
FFmpeg "Error" in stderr: FFmpeg writes progress and warnings to stderr. Only treat it as a real error if the output file wasn't created. Check for actual error patterns like `No such file`, `Invalid data`, or `Conversion failed`.
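
That rule of thumb can be written down as a small check. This Python sketch uses only the three patterns named above. The function name `diagnose_render` and the fallback message are assumptions for illustration.

```python
import os

# The error patterns named in this section; extend as needed.
ERROR_PATTERNS = ("No such file", "Invalid data", "Conversion failed")


def diagnose_render(output_path, stderr_text):
    """Return [] if the output file exists, else the stderr lines that look like real errors."""
    if os.path.exists(output_path):
        # FFmpeg's stderr is noisy even on success, so it is ignored in this case.
        return []

    matches = [
        line.strip()
        for line in stderr_text.splitlines()
        if any(pattern in line for pattern in ERROR_PATTERNS)
    ]
    # Fall back to a generic message when no known pattern appears.
    return matches or ["output file was not created; inspect the full stderr"]
```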
Gemini returns poor clips: Add specificity to the prompt. Instead of "make an edit", say "make a fast 30-second action edit, cut every 1-2 seconds on the beat drops, start from the chorus".

Additional Resources

Reference Files

  • `references/ffmpeg-filters.md`: Detailed FFmpeg filter_complex syntax, encoding options, common flags
  • `references/edit-plan-schema.md`: Full EditPlan JSON schema, validation rules, edge cases

Scripts

  • `${CLAUDE_PLUGIN_ROOT}/scripts/gemini-edit-plan.sh`: Upload to Gemini via REST API, get EditPlan (supports `--no-cleanup` and file reuse)
  • `${CLAUDE_PLUGIN_ROOT}/scripts/validate-plan.sh`: Validate EditPlan JSON
  • `${CLAUDE_PLUGIN_ROOT}/scripts/build-filter.sh`: Convert EditPlan to FFmpeg filters
  • `${CLAUDE_PLUGIN_ROOT}/scripts/cleanup-gemini-files.sh`: Delete uploaded files from Gemini when done iterating