youtube-clip-extractor
YouTube Clip Extractor
Overview
This skill downloads YouTube videos, analyzes transcripts for compelling clip moments, extracts clips using ffmpeg, and generates platform-ready on-screen text and captions. It integrates with the existing caption and social content skills to deliver complete, publishable assets.
When to Use This Skill
- You have a YouTube URL and want to extract the best clips
- You want automated clip identification based on hook/coda criteria
- You need clips cut and ready for Descript or other editors
- You want on-screen text hooks and platform-specific captions for each clip
Do NOT use for:
- Full podcast production workflow (use the `podcast-production` skill instead)
- Text-only social posts (use the `social-content-creation` skill)
- Already-downloaded videos (skip to Phase 2)
Prerequisites
Required Tools (install via Homebrew)
```bash
brew install yt-dlp ffmpeg
```
File Location
All downloads go to: `Content/YouTube Transcripts/`
Structure:
```
Content/YouTube Transcripts/
├── {video_id}.mp4                         # Full video (H.264 encoded)
├── {video_id}.en.vtt                      # Timestamped subtitles
└── clips/
    └── {video_id}/
        ├── clip_01_{name}.mp4             # Individual clips
        ├── clip_02_{name}.mp4
        └── {video_id}_Clip_Assets.md      # Captions & hooks
```
The 4-Phase Workflow
Phase 1: Download Video & Transcript
Goal: Get the video and subtitles from a YouTube URL
Step 1: Download with H.264 Encoding
Use H.264 format for Descript compatibility (NOT AV1):
```bash
# Download video in H.264 format (Descript-compatible)
yt-dlp -f "bestvideo[vcodec^=avc]+bestaudio[ext=m4a]/best[vcodec^=avc]" \
  --merge-output-format mp4 \
  -o "Content/YouTube Transcripts/{video_id}.mp4" \
  "YOUTUBE_URL"

# If H.264 unavailable, download best quality then re-encode:
yt-dlp -f "bestvideo+bestaudio" --merge-output-format mp4 \
  -o "Content/YouTube Transcripts/{video_id}_temp.mp4" \
  "YOUTUBE_URL"

# Re-encode to H.264 for Descript compatibility
# (file names are relative: run this from Content/YouTube Transcripts/)
ffmpeg -i "{video_id}_temp.mp4" -c:v libx264 -preset fast -crf 22 \
  -c:a aac -b:a 128k "{video_id}.mp4"
```
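Whether the re-encode step is needed can be checked instead of guessed. The following is a minimal sketch, assuming `ffprobe` (installed alongside ffmpeg) is on the PATH; the helper names `video_codec` and `needs_reencode` are illustrative and not part of the skill.

```bash
# Print the codec name of the first video stream (for example h264, av1, vp9).
video_codec() {
  ffprobe -v error -select_streams v:0 \
    -show_entries stream=codec_name \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Succeed (exit status 0) when the codec is anything other than H.264.
needs_reencode() {
  [ "$1" != "h264" ]
}

# Usage (hypothetical file name):
#   codec=$(video_codec "{video_id}.mp4")
#   if needs_reencode "$codec"; then echo "re-encode required ($codec)"; fi
```

Keeping the codec lookup separate from the decision makes the decision easy to test without a real video file.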
Step 2: Download Subtitles
```bash
yt-dlp --write-auto-sub --sub-lang en --skip-download \
  -o "Content/YouTube Transcripts/{video_id}.%(ext)s" \
  "YOUTUBE_URL"
```
Phase 1 Output:
- `{video_id}.mp4`: Full video (H.264)
- `{video_id}.en.vtt`: Timestamped subtitles
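Auto-generated VTT files tend to repeat each line across overlapping cues and embed inline timing tags, which makes them hard to read for clip selection. The sketch below flattens a VTT file into `HH:MM:SS text` lines. It assumes cue timestamps in `HH:MM:SS.mmm` form and that repeated lines appear consecutively; `vtt_to_text` is a hypothetical helper name.

```bash
# Flatten a .vtt file into "HH:MM:SS text" lines, dropping tags and repeats.
vtt_to_text() {
  awk '
    /-->/ { ts = substr($1, 1, 8); next }            # remember the cue start time
    /^WEBVTT/ || /^Kind:/ || /^Language:/ { next }   # skip header lines
    {
      gsub(/<[^>]*>/, "")                            # strip inline timing/style tags
      gsub(/^[ \t]+|[ \t]+$/, "")                    # trim surrounding whitespace
      if ($0 == "" || $0 == last) next               # skip blanks and repeated lines
      print ts " " $0
      last = $0
    }
  ' "$1"
}

# Usage (hypothetical file name):
#   vtt_to_text "{video_id}.en.vtt" > "{video_id}.txt"
```

The timestamp printed is the start of the cue in which a line first appears, so it is a starting point for choosing cut points rather than an exact word boundary.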
Phase 2: Analyze Transcript for Clips
Goal: Identify 5-8 compelling clip moments with strong hooks and codas
Clip Selection Criteria
A good clip has:
- Strong Hook (First 3 Seconds)
  - Polarizing statement ("Your kid's addiction is actually genius")
  - Counter-intuitive reveal ("My son's first job sucked. Perfect.")
  - Direct challenge ("Never give up on the weird kid")
  - Curiosity gap ("Then everything changed...")
- Complete Arc (30-90 seconds)
  - Clear beginning, middle, end
  - Not just a "good quote" but a complete thought
  - Setup → Tension → Resolution OR Setup → Tension → Cliffhanger
- Stakes
  - Why does this matter?
  - Who cares?
  - What's at risk?
- Strong Coda/Ending
  - Insight or surprising conclusion
  - Cuts right before the answer (cliffhanger)
  - Quotable final line
Scan Transcript For:
Inflection Points:
- "Then everything changed..."
- "I realized..."
- "That's when I knew..."
- "The moment I..."
Vulnerability Moments:
- Personal stakes, failures, struggles
- "I was terrified..."
- "I almost gave up..."
- "Nobody believed..."
Contradiction Moments:
- "We thought X but actually..."
- "Everyone says... but the truth is..."
- "The opposite happened..."
Surprising Insights:
- Research, data, unexpected findings
- Counter-intuitive conclusions
- "What we found was..."
Character in Action:
- Showing, not telling
- Doing, not describing
- Specific moments, not abstractions
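A first pass over the transcript for the signal phrases above can be automated before the manual read. This is a rough sketch: the phrase list is copied from the categories above and is easy to extend, `scan_transcript` is a hypothetical helper name, and a match only flags a candidate line, it does not judge the clip.

```bash
# Print every line (with its line number) containing one of the signal phrases.
# "that.s" uses a regex dot so it matches the apostrophe in "that's".
scan_transcript() {
  grep -n -i -E \
    'then everything changed|i realized|that.s when i knew|the moment i|i was terrified|i almost gave up|nobody believed|but actually|the truth is|the opposite happened|what we found' \
    "$1"
}

# Usage (hypothetical file name):
#   scan_transcript "{video_id}.en.vtt"
```

`grep` exits with a non-zero status when nothing matches, which a calling script can use to detect a transcript with no obvious candidates.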
Quality Tests (Pass 4/5):
- Stranger Test: Would someone with zero context care?
- Itch Test: Creates need to know more?
- Stakes Test: Clear why it matters?
- Tease Test: Hints without giving away?
- Emotion Test: Feel something in first 5 seconds?
Phase 2 Output Format:
Create analysis document with clip recommendations:
```markdown
# {Video Title} - Clip Analysis

## Video Details
- URL: [YouTube URL]
- Duration: [Total length]
- Speaker(s): [Names]
- Topic: [Primary subject]

## Recommended Clips

### CLIP 1: "{Descriptive Name}"
Timestamp: MM:SS - MM:SS (XX seconds)
Hook: [First line or opening moment]
Arc: [Setup → Middle → Ending summary]
Coda: [How it ends / final line]

Key Quotes:
- "[Verbatim quote 1]"
- "[Verbatim quote 2]"
- "[Verbatim quote 3]"

Quality Tests: Stranger ✅ | Itch ✅ | Stakes ✅ | Tease ✅ | Emotion ✅
Why It Works: [1-2 sentence rationale]
Priority: HIGH / MEDIUM / LOW

### CLIP 2: "{Descriptive Name}"
[Repeat structure...]

## Summary Table
| # | Clip Name | Timestamp | Length | Hook | Coda | Priority |
|---|---|---|---|---|---|---|
| 1 | [Name] | MM:SS-MM:SS | XXs | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | HIGH |
| 2 | [Name] | MM:SS-MM:SS | XXs | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | HIGH |
| 3 | [Name] | MM:SS-MM:SS | XXs | ⭐⭐⭐ | ⭐⭐⭐⭐ | MEDIUM |
```
---
Phase 3: Cut Clips with FFmpeg
Goal: Extract approved clips as separate video files
Cutting Commands
Basic clip extraction (fast, uses keyframes):
```bash
ffmpeg -i "{video_id}.mp4" -ss MM:SS -to MM:SS -c copy \
  "clips/{video_id}/clip_01_{name}.mp4"
```
Precise cutting with re-encoding (slower but frame-accurate):
```bash
ffmpeg -ss MM:SS -i "{video_id}.mp4" -t DURATION \
  -c:v libx264 -preset fast -crf 22 -c:a aac -b:a 128k \
  "clips/{video_id}/clip_01_{name}.mp4"
```
Notes:
- `-ss` before `-i` = faster seeking (recommended)
- `-c copy` = no re-encoding (fast but may have keyframe issues)
- `-c:v libx264` = re-encode to H.264 (slower but precise)
- Use H.264 output for Descript compatibility
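The precise-cut form takes `-t DURATION`, the clip length in seconds, while the clip analysis records start and end timestamps. The sketch below converts the timestamps and prints the resulting command without running it. The helper names `ts_to_seconds` and `precise_cut_cmd` are illustrative assumptions, not part of the skill.

```bash
# Convert MM:SS or HH:MM:SS to whole seconds.
ts_to_seconds() {
  echo "$1" | awk -F: '{ s = 0; for (i = 1; i <= NF; i++) s = s * 60 + $i; print s }'
}

# Print (do not run) a re-encoding cut command.
# Arguments: source file, start timestamp, end timestamp, output file.
precise_cut_cmd() {
  duration=$(( $(ts_to_seconds "$3") - $(ts_to_seconds "$2") ))
  printf 'ffmpeg -ss %s -i "%s" -t %s -c:v libx264 -preset fast -crf 22 -c:a aac -b:a 128k "%s"\n' \
    "$2" "$1" "$duration" "$4"
}

# Usage:
#   precise_cut_cmd "{video_id}.mp4" 06:59 08:10 "clips/{video_id}/clip_01_{name}.mp4"
```

Printing the command first gives a chance to confirm the duration falls in the 30-90 second range before spending time on a re-encode.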
Batch Cutting Example
```bash
# Create clips directory
mkdir -p "Content/YouTube Transcripts/clips/{video_id}"

# Cut each clip (the paths below are relative to the download folder)
cd "Content/YouTube Transcripts"
ffmpeg -i "{video_id}.mp4" -ss 06:59 -to 08:10 -c copy "clips/{video_id}/clip_01_covid_revelation.mp4"
ffmpeg -i "{video_id}.mp4" -ss 20:50 -to 21:54 -c copy "clips/{video_id}/clip_02_whiteboard_teen.mp4"
ffmpeg -i "{video_id}.mp4" -ss 25:00 -to 26:10 -c copy "clips/{video_id}/clip_03_college_loans.mp4"
```
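For more than a handful of clips, the per-clip commands can be generated from a short list instead of typed by hand. This sketch reads `START END NAME` lines from standard input and prints one stream-copy command per clip, numbering the files in order. The list format and the `batch_cut_cmds` name are assumptions made for illustration.

```bash
# Read "START END NAME" lines from stdin and print one ffmpeg command per clip.
# Argument: the video ID used in the file names.
batch_cut_cmds() {
  video_id=$1
  n=0
  while read -r start end name; do
    [ -z "$start" ] && continue            # skip blank lines
    n=$((n + 1))
    printf 'ffmpeg -i "%s.mp4" -ss %s -to %s -c copy "clips/%s/clip_%02d_%s.mp4"\n' \
      "$video_id" "$start" "$end" "$video_id" "$n" "$name"
  done
}

# Usage (hypothetical clips.txt with lines such as "06:59 08:10 covid_revelation"):
#   batch_cut_cmds "{video_id}" < clips.txt > cuts.sh
#   sh cuts.sh
```

Writing the commands to a file first allows a review pass, and running that file directly avoids piping the commands through ffmpeg's own standard input.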
Phase 3 Output:
- Individual MP4 files for each clip
- All files in `clips/{video_id}/` directory
- H.264 encoded for Descript compatibility
Phase 4: Generate On-Screen Text & Captions
Goal: Create platform-optimized hooks and captions for each clip
This phase uses the `video-caption-creation` skill methodology.
For Each Clip, Generate:
1. On-Screen Text Hook (3-5 options)
The text that appears in the first 3 seconds of the video. Must be:
- 2-4 words maximum (mobile readable)
- Stops the scroll
- Passes McDonald's Test (accessible language)
- Complements (not duplicates) audio
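The word limit is the one requirement in this list that can be checked mechanically. A small sketch, treating anything over four whitespace-separated words as too long; `hook_word_count` and `hook_too_long` are hypothetical helper names.

```bash
# Count whitespace-separated words in a hook.
hook_word_count() {
  printf '%s\n' "$1" | wc -w | tr -d '[:space:]'
}

# Succeed (exit status 0) when the hook has more than four words.
hook_too_long() {
  [ "$(hook_word_count "$1")" -gt 4 ]
}

# Usage:
#   if hook_too_long "Never give up on the weird kid"; then echo "shorten this hook"; fi
```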
Hook Categories:
- Polarizing: "Your kid's [negative] is actually genius"
- Counter-Intuitive: "My son's first job sucked. Perfect."
- Direct Challenge: "Never give up on the weird kid"
- Curiosity Gap: "Then everything changed..."
2. Platform-Specific Captions
| Platform | On-Screen Text | Caption Style | Hashtags |
|---|---|---|---|
| Instagram | Same | Short, emoji OK, accessible | 5-10 |
| TikTok | Same | Short, emoji OK, accessible | 3-5 |
| YouTube Shorts | Same | Short, minimal emoji | 3-5 + #Shorts |
| Facebook | Same | Slightly longer, conversational, NO external links | 0-2 |
Facebook Difference: Caption can be longer and more conversational. NO hashtags or external links (kills reach).
3. Algorithm Optimization
Per the Triple Word Score system:
- Audio: Topic words spoken in first 10 seconds
- On-Screen Text: Reinforces (not competes with) audio
- Caption: Topic-relevant keywords in first sentence
- Hashtags: Broad → Mid → Specific → Niche (10-12 total)
Phase 4 Output Format:
Create file: `clips/{video_id}/{video_id}_CLIP_PACKAGE.md`
```markdown
# {Video Title} - Clip Package

## Source Video
- URL: [YouTube URL]
- Title: [Video title]
- Duration: [Total length]
- Downloaded File: `{video_id}.mp4`

## Context
[2-3 sentences explaining the backstory needed to understand the clip. Who is the speaker? What's their situation? What happened before/after the moments in the clip? This context ensures on-screen text and captions are coherent with the actual story.]

## Editing Instructions
SEQUENCE (Rearranged from original - NOT linear):
| Order | Timestamp | Speaker | Line |
|---|---|---|---|
| 1 | MM:SS-MM:SS | [Name] | "[Verbatim quote]" |
| 2 | MM:SS-MM:SS | [Name] | "[Verbatim quote]" |
| 3 | MM:SS-MM:SS | [Name] | "[Verbatim quote]" |

OPTIONAL EXTENSION:
| Order | Timestamp | Speaker | Line |
|---|---|---|---|
| 4 | MM:SS-MM:SS | [Name] | "[Verbatim quote]" |

## On-Screen Text Hook Options
1. [Hook text] - [Category]
2. [Hook text] - [Category]
3. [Hook text] - [Category]
4. [Hook text] - [Category]
5. [Hook text] - [Category]

## Platform Captions
### TikTok / Instagram Reels / YouTube Shorts
[Caption text]
[Hashtags: 3-5]

### Facebook
[Longer caption, conversational, NO hashtags]

### LinkedIn
[Professional tone caption]
[Hashtags: 3-5]
```
On-Screen Text Hook Categories
- Story Setup - Provides context that makes the clip make sense (e.g., "Homeschooler tries public school")
- Polarizing - Bold statement that divides opinion (e.g., "Most schools are awful")
- Contrast - Juxtaposition that creates tension (e.g., "First in class. Zero joy.")
- Curiosity Gap - Teases without revealing (e.g., "#1 out of 1,200 students")
- Story Tease - Hints at narrative arc (e.g., "She went back to homeschool after this")
- Pattern Interrupt - Subverts expectations (e.g., "This isn't anti-public school")
Context-Caption Coherence
Critical: On-screen text and captions must be coherent with the actual story in the transcript. Before writing hooks:
- Understand the full context (who, what, when, why)
- Identify what viewers need to know for the clip to make sense
- Choose hooks that accurately represent the story
- Avoid hooks that would confuse viewers when they hear the audio
Example: If the clip shows someone criticizing public school, but they were actually a homeschooler who tried public school once, hooks like "Homeschooler tries public school" or "She tried public school for one year" provide necessary context that makes the story coherent.
Complete Workflow Example
```bash
# PHASE 1: Download
yt-dlp -f "bestvideo[vcodec^=avc]+bestaudio" --merge-output-format mp4 \
  -o "Content/YouTube Transcripts/cvGtVmI4jTQ.mp4" \
  "https://www.youtube.com/watch?v=cvGtVmI4jTQ"

yt-dlp --write-auto-sub --sub-lang en --skip-download \
  -o "Content/YouTube Transcripts/cvGtVmI4jTQ.%(ext)s" \
  "https://www.youtube.com/watch?v=cvGtVmI4jTQ"

# PHASE 2: Analyze transcript (manual review)
# Read VTT file, identify clips using criteria above

# PHASE 3: Cut clips (the ffmpeg paths are relative to the download folder)
mkdir -p "Content/YouTube Transcripts/clips/cvGtVmI4jTQ"
cd "Content/YouTube Transcripts"
ffmpeg -i "cvGtVmI4jTQ.mp4" -ss 06:59 -to 08:10 \
  -c:v libx264 -preset fast -crf 22 -c:a aac \
  "clips/cvGtVmI4jTQ/clip_01_covid_revelation.mp4"

# PHASE 4: Generate assets (create markdown file with hooks/captions)
```
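The example hardcodes the video ID in every path. For standard `watch?v=` URLs the ID can be pulled from the URL so the same commands work for any video. This is a sketch only: `video_id_from_url` is a hypothetical helper, and short `youtu.be` links are not handled.

```bash
# Extract the v= parameter from a standard YouTube watch URL.
# Prints nothing when the URL has no v= parameter.
video_id_from_url() {
  printf '%s\n' "$1" | sed -n 's/.*[?&]v=\([A-Za-z0-9_-]*\).*/\1/p'
}

# Usage:
#   video_id=$(video_id_from_url "https://www.youtube.com/watch?v=cvGtVmI4jTQ")
#   mkdir -p "Content/YouTube Transcripts/clips/$video_id"
```

An empty result is a cue to stop and check the URL before any download starts.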
---
Related Skills
This skill integrates with:
| Skill | When to Use | What It Provides |
|---|---|---|
| video-caption-creation | Phase 4 | On-screen text hook categories, Triple Word Score system, platform caption guidelines |
| youtube-downloader | Phase 1 (alternative) | Detailed yt-dlp installation checks, error handling, transcript-only workflow |
| text-content | After clips ready | Framework fitting for text posts about clips |
| podcast-production | Full episode workflow | Complete 4-checkpoint production system |
Skill Cross-References
From video-caption-creation:
- Hook categories (Polarizing, Counter-Intuitive, Direct Challenge, Curiosity Gap)
- Triple Word Score system (Audio + On-Screen + Caption + Hashtags)
- Platform-specific hashtag counts
- McDonald's Test for accessibility
From text-content:
- Platform voice guidelines (LinkedIn vs Facebook vs Instagram)
- Framework fitting method
- 360+ templates in references/
Common Mistakes to Avoid
Download Issues
- ❌ Downloading AV1 codec (Descript can't import)
- ❌ Not re-encoding to H.264 when needed
- ❌ Forgetting to download subtitles
Clip Selection Issues
- ❌ Choosing "good quotes" instead of complete arcs
- ❌ Clips too long (>90 seconds) or too short (<30 seconds)
- ❌ No clear hook in first 3 seconds
- ❌ Giving away the punchline in the hook
Cutting Issues
- ❌ Cutting at non-keyframes (use re-encode for precision)
- ❌ Starting mid-sentence
- ❌ Ending before natural conclusion
Caption Issues
- ❌ On-screen text too long (>4 words)
- ❌ Same caption for Facebook as other platforms
- ❌ External links in Facebook caption
- ❌ Hashtags in Facebook caption
Quality Checklist
Before delivering clips:
Video Files:
- All clips are H.264 encoded
- Each clip is 30-90 seconds
- Audio and video are synced
- Clean start/end points (no mid-word cuts)
Clip Selection:
- Each clip passes 4/5 quality tests
- Strong hook in first 3 seconds
- Complete arc (not just a quote)
- Clear stakes (why it matters)
Captions & Hooks:
- 3-5 on-screen text options per clip
- On-screen text is 2-4 words max
- Platform-specific captions created
- Facebook caption is different (longer, no hashtags)
- Hashtag strategy spans broad to niche
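The clip-length item in the checklist can be verified from the files themselves. A minimal sketch, assuming `ffprobe` is on the PATH; `clip_duration` and `duration_in_range` are illustrative helper names.

```bash
# Print a file's container duration in seconds (may be fractional).
clip_duration() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Succeed (exit status 0) when a duration in seconds is within 30-90 seconds.
duration_in_range() {
  awk -v d="$1" 'BEGIN { exit !(d >= 30 && d <= 90) }'
}

# Usage: report every clip outside the target range.
#   for f in clips/{video_id}/*.mp4; do
#     duration_in_range "$(clip_duration "$f")" || echo "check length: $f"
#   done
```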
Version History
- v1.1 (2025-12-20): Streamlined output format
  - Removed "Target Length" and "Concept" sections from output
  - Removed "Why This Edit Works" section
  - Added "Context" section for backstory coherence
  - Simplified on-screen text hooks to numbered list with categories only
  - Added "Story Setup" hook category for context-providing hooks
  - Added "Context-Caption Coherence" guidance
  - Updated output template to match streamlined format
- v1.0 (2025-12-02): Initial skill creation
  - 4-phase workflow: Download → Analyze → Cut → Caption
  - Integration with video-caption-creation skill
  - H.264 encoding for Descript compatibility
  - Platform-specific caption guidelines
  - Quality tests from podcast-production skill