search-youtube

YouTube Research

Multi-platform video research toolkit. Operates in two modes: toolkit (individual operations) and research (autonomous search-to-synthesis pipeline). All operations use a single CLI at `${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py`.
Run `python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py <subcommand> --help` for the full flag reference on any subcommand.

Toolkit Mode

Invoke individual subcommands for targeted operations. Default mode when the user requests a specific action (transcript, search, metadata, audio, channel scan).

Search

Find videos matching a query. Returns structured results with metadata.
bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py search "<query>" --count 10
Add filters to narrow results: `--min-duration 600` (seconds), `--after 20250101` (YYYYMMDD), `--min-views 50000`. Filters are applied client-side after fetching, so the tool over-fetches automatically to compensate. Output is JSON by default; add `-f text` for human-readable output.
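As a rough illustration of the client-side filtering described above, the sketch below filters an over-fetched result list and trims it to the requested count. It is not the script's actual implementation: the field names `duration`, `upload_date`, and `view_count` are assumed (they follow yt-dlp's usual naming), the function names are hypothetical, and treating `--after` as an inclusive bound is also an assumption.

```python
def passes_filters(video, min_duration=None, after=None, min_views=None):
    """Return True if a result dict satisfies every filter that is set.

    Field names are assumptions; missing values are treated as failing
    the corresponding filter.
    """
    if min_duration is not None and (video.get("duration") or 0) < min_duration:
        return False
    # YYYYMMDD strings sort chronologically, so plain string comparison works.
    if after is not None and (video.get("upload_date") or "") < after:
        return False
    if min_views is not None and (video.get("view_count") or 0) < min_views:
        return False
    return True


def filter_results(fetched, count, **filters):
    """Filter an over-fetched list and keep at most `count` matches."""
    kept = [video for video in fetched if passes_filters(video, **filters)]
    return kept[:count]
```

Because filtering happens after the fetch, a strict filter can leave fewer than `count` results, which is why over-fetching is needed in the first place.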

Transcript

Download subtitles and clean them into LLM-ready text.
bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py transcript "<url>"
Outputs clean prose to stdout by default. Add `--timestamps` for SRT with timing cues. Add `--save -t <topic>` to persist to `~/youtube-research/<topic>/`. Use `--lang all` to list available subtitle languages before downloading. Fallback chain: manual subtitles first, then auto-generated. Exits with code 4 if no subtitles exist in the requested language.
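The fallback chain can be pictured with the minimal sketch below. It assumes a yt-dlp-style info dict in which `subtitles` holds manually uploaded tracks and `automatic_captions` holds auto-generated ones, each keyed by language code; whether the script consults these fields in exactly this way is an assumption, and `pick_subtitle_track` is a hypothetical name.

```python
def pick_subtitle_track(info, lang):
    """Prefer manual subtitles, fall back to auto-generated ones.

    Returns a (source, tracks) tuple, or None when the language is
    unavailable (the case the CLI reports with exit code 4).
    """
    manual = info.get("subtitles") or {}
    auto = info.get("automatic_captions") or {}
    if lang in manual:
        return ("manual", manual[lang])
    if lang in auto:
        return ("auto", auto[lang])
    return None
```

A caller would treat a `None` result as the "no subtitles in the requested language" condition and suggest `--lang all`.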

Metadata

Extract full video information without downloading.
bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py metadata "<url>"
Returns: title, description, channel, duration, chapters, view/like counts, tags, available subtitle languages, thumbnail URL. Add `--playlist` for playlist entry listings.

Audio

Download audio in the requested format.
bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py audio "<url>" --audio-format mp3 -t <topic>
Saves to `~/youtube-research/<topic>/audio/`. Supported formats: mp3, m4a, opus, wav. Always saves to disk (audio cannot go to stdout). Prints the file path on success.

Channel

Scan a channel's content.
bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py channel "<url-or-@handle>" --limit 20
Supports tabs: `--tab videos` (default), `shorts`, `streams`, `playlists`. Filter with `--after` / `--before` (YYYYMMDD). Sort with `--sort views` for most-viewed-first.

Batch Processing

Any subcommand except `search` accepts `--batch <file>` (or `--batch -` for stdin) to process multiple URLs. One URL per line; lines starting with `#` or `;` are skipped.
bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py transcript --batch urls.txt --save -t <topic>
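The batch file rules above are simple enough to sketch in a few lines. This is an illustrative parser, not the script's own code: `parse_batch_lines` is a hypothetical name, and skipping blank lines is an assumption the text does not state.

```python
def parse_batch_lines(lines):
    """Return the URLs from a batch file, one per line.

    Lines starting with '#' or ';' are comments and are skipped.
    Blank lines are also skipped (an assumption, not stated in the docs).
    """
    urls = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        urls.append(line)
    return urls
```

For example, `parse_batch_lines(open("urls.txt"))` would yield the list of URLs a `--batch urls.txt` run is expected to process under these rules.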

Research Mode

Activate when the user asks to "research", "investigate", or "find out about" a topic using YouTube as a source. This is an adaptive multi-round discovery pipeline designed for niche and emerging topics, where the most popular videos often under-serve the subject.

Round 1: Divergent Search

Generate 4-6 query variants that cover different angles of the topic:
  • Exact tool/concept name (e.g., `"openclaw"`)
  • Tool + ecosystem context (e.g., `"openclaw claude code"`)
  • Problem the tool solves (e.g., `"claude code documentation framework"`)
  • Workflow/demo framing (e.g., `"openclaw walkthrough demo"`)
  • Alternative names, abbreviations, or common misspellings if applicable
Spawn one `Task` agent per query variant simultaneously, each running:
bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py search "<query-variant>" --count 15
Collect all results and deduplicate by video ID. Aim for 30-50 unique candidates across all threads.
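Deduplicating by video ID might look like the sketch below. It assumes each search returns a list of dicts carrying the video ID under an `id` key; the key name and the `merge_unique` helper are hypothetical, since the search output schema is not documented here.

```python
def merge_unique(result_lists, id_key="id"):
    """Merge several search result lists, keeping the first hit per video ID.

    Order of first appearance is preserved; entries without an ID are dropped.
    """
    seen = {}
    for results in result_lists:
        for video in results:
            video_id = video.get(id_key)
            if video_id is not None and video_id not in seen:
                seen[video_id] = video
    return list(seen.values())
```

With 4-6 queries at `--count 15` each, heavy overlap between variants is a sign the queries are too similar, while a merged pool in the 30-50 range suggests the variants cover distinct angles.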

Round 1 Evaluation: Niche-First Heuristics

For niche, emerging, or edge-of-tech topics, these signals predict quality:
Positive signals (use these):
  • Title specifically names the tool or concept (not "top AI tools 2025")
  • Small channel (< 50K subscribers) — for new tech, practitioners publish before educators discover the topic
  • Technical, specific description (mentions code, config, architecture, or commands)
  • Structured content: chapters, timestamps, or detailed description
  • Recent upload date
Negative signals (treat as red flags on niche topics):
  • View count > 100K — on a narrow topic this usually means beginner-level or clickbait
  • "Tutorial for beginners" / "complete guide" in the title for brand-new tools
  • Large generalist channel covering many unrelated topics
Select 6-10 candidates from the combined pool. Note which channels produced the strongest results — those are targets for Round 2.
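One way to apply these heuristics consistently is a small scoring function, sketched below. The weights are arbitrary illustrations, the field names (`title`, `description`, `channel_follower_count`, `chapters`, `view_count`) are assumed to follow yt-dlp naming, and recency and channel breadth are left to human judgment because they are harder to score mechanically.

```python
def niche_score(video, topic_terms):
    """Score a candidate with the niche-first signals; higher is better.

    Weights are illustrative only and should be tuned per topic.
    """
    score = 0
    title = (video.get("title") or "").lower()
    description = (video.get("description") or "").lower()

    # Positive: title names the tool or concept.
    if any(term.lower() in title for term in topic_terms):
        score += 2
    # Positive: small channel (< 50K subscribers).
    subscribers = video.get("channel_follower_count")
    if subscribers is not None and subscribers < 50_000:
        score += 1
    # Positive: technical, specific description.
    if any(word in description for word in ("code", "config", "architecture", "command")):
        score += 1
    # Positive: structured content such as chapters.
    if video.get("chapters"):
        score += 1
    # Negative: very high view count on a narrow topic.
    views = video.get("view_count")
    if views is not None and views > 100_000:
        score -= 2
    # Negative: beginner or "complete guide" framing.
    if "for beginners" in title or "complete guide" in title:
        score -= 1
    return score
```

Sorting the deduplicated pool by this score and reviewing the top entries by hand keeps the final selection of 6-10 candidates a judgment call rather than a purely mechanical cut.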

Round 2: Channel Discovery and Refinement

For each channel that surfaced a strong Round 1 result, scan its recent videos:
bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py channel "<channel-url>" --limit 20
Also run 1-2 refined search queries using specific terminology that appeared in strong Round 1 titles or descriptions. Add any new candidates to the pool.
Quality gate: If Round 1 candidates are weak (generic titles, all high-view generalist content, nothing specifically about the topic), surface this to the user and run another search round with reformulated queries before proceeding to transcripts.

Round 3: Confirm and Transcribe

From the enriched candidate pool, select 4-7 videos using the Round 1 criteria. Extract metadata to confirm relevance before committing to downloads:
bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py metadata "<url>"
Download transcripts in parallel: always spawn one `Task` agent per video, even when there are only 2 videos:
bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/search-youtube/scripts/yt_research.py transcript "<url>" --save -t <topic>
Read each transcript and synthesize findings into a report. Use `WebSearch` or brave-cli to cross-reference claims or fill gaps when YouTube sources disagree or leave questions unanswered.

Research Report Format

Present the synthesis as a structured markdown report:
  • Title and one-paragraph summary of the research question
  • Key findings (3-7 bullet points of the most important takeaways across all sources)
  • Points of agreement between sources (what multiple videos confirm)
  • Points of disagreement (where sources contradict, with attribution)
  • Unique insights (notable points from individual videos not repeated elsewhere)
  • Gaps in coverage (what the sources collectively missed)
  • Sources table: video title, channel, duration, and URL for each video used
Attribute specific claims to their source video. Include timestamps when the transcript preserves them. Save all transcripts under the topic directory for future reference. See `examples/research-report.md` for a sample report structure.
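The sources table from the list above can be generated from collected metadata, as in the sketch below. The dict keys (`title`, `channel`, `duration` in seconds, `url`) and both helper names are assumptions for illustration; the sample report in `examples/research-report.md` remains the reference for the actual layout.

```python
def format_duration(seconds):
    """Render a duration in seconds as M:SS or H:MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def sources_table(videos):
    """Build a Markdown sources table: title, channel, duration, URL."""
    rows = [
        "| Title | Channel | Duration | URL |",
        "| --- | --- | --- | --- |",
    ]
    for video in videos:
        # Escape pipes so a title cannot break the table layout.
        title = video["title"].replace("|", "\\|")
        duration = format_duration(video["duration"])
        rows.append(f"| {title} | {video['channel']} | {duration} | {video['url']} |")
    return "\n".join(rows)
```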

Research Composability

See `references/cli-reference.md` for pipeline patterns that chain subcommands with standard Unix tools (search → jq → batch transcript).

Error Recovery

| Exit Code | Meaning | Recovery Action |
| --- | --- | --- |
| 0 | Success | |
| 1 | Usage error | Check `--help` for correct syntax |
| 2 | yt-dlp not found | Tell user to install: `pip install yt-dlp` |
| 3 | Network/download error | Check URL validity; try `--cookies <browser>` for private/restricted content |
| 4 | No results | For transcripts: try `--lang all` to list available languages. For search: broaden query or remove filters |
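A caller that wraps the CLI could map exit codes to these recovery actions, roughly as sketched below. The helper names `recovery_hint` and `run_cli` are hypothetical, the fallback message for unlisted codes is an assumption, and `run_cli` assumes `CLAUDE_PLUGIN_ROOT` is set in the environment.

```python
import os
import subprocess
import sys

# Recovery actions keyed by exit code, taken from the table above.
RECOVERY = {
    0: None,
    1: "Check --help for correct syntax",
    2: "Install yt-dlp: pip install yt-dlp",
    3: "Check URL validity; try --cookies <browser> for private/restricted content",
    4: "Transcripts: try --lang all. Search: broaden query or remove filters",
}


def recovery_hint(exit_code):
    """Return the suggested recovery action, or None on success."""
    return RECOVERY.get(exit_code, "Unexpected exit code; inspect stderr")


def run_cli(*args):
    """Run a yt_research.py subcommand and return (code, stdout, hint)."""
    script = os.path.join(
        os.environ["CLAUDE_PLUGIN_ROOT"],
        "skills", "search-youtube", "scripts", "yt_research.py",
    )
    proc = subprocess.run(
        [sys.executable, script, *args], capture_output=True, text=True
    )
    return proc.returncode, proc.stdout, recovery_hint(proc.returncode)
```

For instance, `run_cli("transcript", "<url>")` returning code 4 would carry the hint to retry with `--lang all`.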

Platform Notes

Load `references/platforms.md` when processing a non-YouTube URL or when a yt-dlp command fails with exit code 3 on an unfamiliar platform. YouTube is the primary platform, but any yt-dlp-supported URL works (Vimeo, Twitter, Twitch, etc.).
After extracting a transcript, read the output and summarize key points for the user unless they asked for raw output only.