Use when transcribing audio/video to text with timestamps, speaker labels, and chapters. Supports YouTube URLs and local files. Produces structured markdown output.
```bash
npx skill4agent add lattifai/omni-captions-skills omnicaptions-transcribe
```

1. Check captions: `yt-dlp --list-subs "URL"`
2. Has captions → use `/omnicaptions:download` to get the existing captions (better quality)
3. No captions → transcribe directly from the URL (don't download first!)

```bash
# YouTube URL (recommended, no download needed)
omnicaptions transcribe "https://www.youtube.com/watch?v=VIDEO_ID"
# Local files
omnicaptions transcribe video.mp4
```

| Method | Description |
|---|---|
| | Transcribe file or URL (sync) |
| | Translate captions |
| | Save text to file |
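
The caption check in the workflow above can be sketched in Python. This is a minimal sketch, not part of the skill: `has_manual_captions` and `next_step` are hypothetical helper names, and the match on the phrase "Available subtitles for" is an assumption about how `yt-dlp --list-subs` reports uploader-provided captions. Whether automatic captions should also count is left open here.

```python
import subprocess


def has_manual_captions(listing: str) -> bool:
    """Return True if the --list-subs text reports uploaded subtitles.

    Assumption: yt-dlp prints a line containing "Available subtitles for"
    when the video has uploader-provided captions.
    """
    return any("Available subtitles for" in line for line in listing.splitlines())


def next_step(listing: str) -> str:
    """Map the listing to the workflow step: 'download' or 'transcribe'."""
    # Existing captions are preferred; otherwise transcribe straight from the URL.
    return "download" if has_manual_captions(listing) else "transcribe"


def list_subs(url: str) -> str:
    """Run `yt-dlp --list-subs URL` and return its combined text output."""
    result = subprocess.run(
        ["yt-dlp", "--list-subs", url],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout + result.stderr
```

With `yt-dlp` installed, `next_step(list_subs(url))` gives the step to take for a URL.
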
```bash
pip install omni-captions-skills
```

API key sources: `GEMINI_API_KEY`, `.env`, `~/.config/omnicaptions/config.json`

Prompt when no key is found: `Please enter your Gemini API key (get from https://aistudio.google.com/apikey):`

Option: `-k <key>`

Commands: `transcribe`, `translate`, `convert`

```bash
# Transcribe (auto-output to same directory)
omnicaptions transcribe video.mp4                    # → ./video_GeminiUnd.md
omnicaptions transcribe "https://youtu.be/abc"       # → ./abc_GeminiUnd.md
# Specify output file or directory
omnicaptions transcribe video.mp4 -o output/         # → output/video_GeminiUnd.md
omnicaptions transcribe video.mp4 -o my.md           # → my.md
# Options
omnicaptions transcribe -m gemini-3-pro-preview video.mp4
omnicaptions transcribe -l zh video.mp4              # Force Chinese
```

| Option | Description |
|---|---|
| `-k <key>` | Gemini API key (auto-prompted if missing) |
| `-o` | Output file or directory (default: auto) |
| `-m` | Model (default: gemini-3-flash-preview) |
| `-l` | Force language (zh, en, ja) |
| `-t <lang>` | Translate to language (one-step) |
| `--bilingual` | Bilingual output (with -t) |
| | Verbose output |
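
The output naming shown in the examples (input name plus `_GeminiUnd.md`, redirected by `-o`) can be mirrored in a few lines of Python. This is a sketch under assumptions, not the tool's implementation: `source_stem` and `output_path` are hypothetical names, treating a trailing slash or an existing directory as "directory" is a guess at how `-o` is resolved, and using the `v=` query value for `watch?v=` URLs is inferred from the `youtu.be/abc` example.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import parse_qs, urlparse

SUFFIX = "_GeminiUnd.md"


def source_stem(source: str) -> str:
    """Return the base name used for the output file."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        # Assumption: watch?v=ID URLs are named after the video ID.
        video_id = parse_qs(parsed.query).get("v")
        if video_id:
            return video_id[0]
        # Short URLs such as https://youtu.be/abc use the last path segment.
        return parsed.path.rstrip("/").split("/")[-1]
    return Path(source).stem


def output_path(source: str, output: Optional[str] = None) -> Path:
    """Predict where the transcript lands for a given source and -o value."""
    name = source_stem(source) + SUFFIX
    if output is None:
        return Path(name)
    # Assumption: a trailing slash or an existing directory means "directory".
    if output.endswith("/") or Path(output).is_dir():
        return Path(output) / name
    return Path(output)
```

A helper like this is handy when a script needs to know the transcript path before passing it to a later step.
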
With `-t <lang>` and `--bilingual`:

```bash
omnicaptions transcribe video.mp4 -t zh --bilingual
```

```markdown
## Table of Contents
* [00:00:00] Introduction
* [00:02:15] Main Topic
## [00:00:00] Introduction
**Host:** Welcome to the show. [00:00:01]
**Guest:** Thanks for having me. [00:00:05]
[Applause] [00:00:08]
```

Elements: `## [HH:MM:SS] Title`, `**Speaker:**`, `[HH:MM:SS]`, `[Event]`

| Mistake | Fix |
|---|---|
| No API key error | Use `-k <key>` or set `GEMINI_API_KEY` |
| Empty response | Check file format (mp3/mp4/wav/m4a supported) |
| Upload timeout | File too large (>2GB); split first |
| Wrong language | Use `-l` to force the language (e.g. `-l zh`) |
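
Two of the mistakes above (unsupported format, oversized upload) can be caught before any API call. The following is a minimal pre-flight sketch; `preflight` is a hypothetical helper, and reading ">2GB" as 2 GiB is an assumption about the exact limit.

```python
from pathlib import Path

SUPPORTED = {".mp3", ".mp4", ".wav", ".m4a"}
MAX_BYTES = 2 * 1024**3  # assumption: the ">2GB" limit read as 2 GiB


def preflight(name: str, size_bytes: int) -> list:
    """Return a list of problems found for a local media file."""
    problems = []
    if Path(name).suffix.lower() not in SUPPORTED:
        problems.append("unsupported format (mp3/mp4/wav/m4a expected)")
    if size_bytes > MAX_BYTES:
        problems.append("file too large; split first")
    return problems
```

For a real file, call it as `preflight(path.name, path.stat().st_size)` and skip the transcription when the list is non-empty.
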

| Skill | Use When |
|---|---|
| | Convert output to SRT/VTT/ASS |
| `/omnicaptions:translate` | Translate (Gemini API or Claude native) |
| `/omnicaptions:download` | Download video/audio first |
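
The related skills above work from the Markdown transcript that `transcribe` produces. To show how that format (chapter headings, speaker lines, bracketed events, trailing timestamps) can be read programmatically, here is a minimal parser sketch. The regular expressions and the tuple layout are assumptions based only on the sample output shown earlier, not the format's specification.

```python
import re

TS = r"\[(\d{2}:\d{2}:\d{2})\]"
CHAPTER = re.compile(r"^## " + TS + r" (.+)$")
SPEECH = re.compile(r"^\*\*(.+?):\*\* (.*?) " + TS + r"$")
EVENT = re.compile(r"^\[([^\]]+)\] " + TS + r"$")


def to_seconds(ts: str) -> int:
    """Convert HH:MM:SS to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in ts.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse(transcript: str) -> list:
    """Return (kind, seconds, label, text) tuples for recognised lines."""
    entries = []
    for raw in transcript.splitlines():
        line = raw.strip()
        match = CHAPTER.match(line)
        if match:
            entries.append(("chapter", to_seconds(match.group(1)), match.group(2), ""))
            continue
        match = SPEECH.match(line)
        if match:
            speaker, text, ts = match.groups()
            entries.append(("speech", to_seconds(ts), speaker, text))
            continue
        match = EVENT.match(line)
        if match:
            entries.append(("event", to_seconds(match.group(2)), match.group(1), ""))
    # Table-of-contents bullets and other lines are skipped.
    return entries
```

Lines that match none of the three patterns, such as the table of contents, are ignored rather than raising an error, which keeps the sketch tolerant of variations in the output.
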

```bash
# Basic transcription
omnicaptions transcribe video.mp4
# → video_GeminiUnd.md
# Precise timing needed: transcribe → LaiCut align → convert
omnicaptions transcribe video.mp4
omnicaptions LaiCut video.mp4 video_GeminiUnd.md
# → video_GeminiUnd_LaiCut.json
omnicaptions convert video_GeminiUnd_LaiCut.json -o video_GeminiUnd_LaiCut.srt
```

Note: For translation, use `/omnicaptions:translate` (default: Claude, optional: Gemini API).
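
The precise-timing workflow (transcribe → LaiCut align → convert) chains three commands whose file names follow from the input. A small Python sketch can build and run that chain; `precise_timing_commands` and `run_all` are hypothetical helpers, and the intermediate file names are taken from the example rather than from any documented guarantee.

```python
import subprocess
from pathlib import Path


def precise_timing_commands(media: str) -> list:
    """Build the three CLI invocations for the precise-timing workflow."""
    path = Path(media)
    transcript = path.parent / f"{path.stem}_GeminiUnd.md"
    aligned = path.parent / f"{path.stem}_GeminiUnd_LaiCut.json"
    srt = path.parent / f"{path.stem}_GeminiUnd_LaiCut.srt"
    return [
        ["omnicaptions", "transcribe", media],
        ["omnicaptions", "LaiCut", media, str(transcript)],
        ["omnicaptions", "convert", str(aligned), "-o", str(srt)],
    ]


def run_all(media: str) -> None:
    """Run each step in order; check=True stops the chain on the first failure."""
    for command in precise_timing_commands(media):
        subprocess.run(command, check=True)
```

Stopping at the first failing step avoids running the aligner or converter against a transcript that was never written.
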