Search Results: transcription

Found 102 Skills

trx

Transcribe audio/video using trx CLI and post-process results with agent corrections. Use when: (1) user wants to transcribe a video or audio file, (2) user shares a YouTube/Twitter/Instagram URL for transcription, (3) user says "transcribe", "subtitles", "srt", "transcript", (4) user wants to fix/clean up a whisper transcription, (5) user asks to extract text from a video.

Document Processing | textops/textops-skills

hebrew-tech-lecture-summary

Summarize Hebrew tech lectures and meetings from transcription files into structured Markdown. Use when the user asks to summarize a transcription, create meeting notes, summarize a lecture/presentation in Hebrew, or mentions סיכום/תמלול/הרצאה.

1 script/Attention

AI & Machine Learning | anthropics/knowledge-work...

scribe

Reference skill for Zoom AI Services Scribe. Use after routing to a transcription workflow when handling uploaded or stored media, Build-platform JWT auth, fast mode transcription, batch jobs, or transcript pipeline design.

AI & Machine Learning | tool-belt/skills

elevenlabs-stt

ElevenLabs speech-to-text with Scribe models and forced alignment via inference.sh CLI. Models: Scribe v1/v2 (98%+ accuracy, 90+ languages). Capabilities: transcription, speaker diarization, audio event tagging, word-level timestamps, forced alignment, subtitle generation. Use for: meeting transcription, subtitles, podcast transcripts, lip-sync timing, karaoke. Triggers: elevenlabs stt, elevenlabs transcription, scribe, elevenlabs speech to text, forced alignment, word alignment, subtitle timing, diarization, speaker identification, audio event detection, eleven labs transcribe

AI & Machine Learning | copilotkit/skills

copilotkit-debug

Use when diagnosing CopilotKit issues -- runtime connectivity failures, agent not responding, streaming errors, tool execution problems, transcription failures, version mismatches, and AG-UI event tracing.

AI & Machine Learning | elevenlabs/skills

voice-isolator

Remove background noise and isolate vocals/speech from audio using ElevenLabs Voice Isolator (audio isolation) API. Use when cleaning up noisy recordings, removing music or background ambience from dialogue, isolating speech from field recordings, preparing audio for transcription, extracting vocals, or any "denoise / clean up / isolate voice" task.

AI & Machine Learning | badlogic/pi-skills

transcribe

Speech-to-text transcription using Groq Whisper API. Supports m4a, mp3, wav, ogg, flac, webm.

1 script/Checked

Tools & Utilities | video-db/skills

python

Process videos with the VideoDB Python SDK. Handles trimming, combining clips, audio overlays, background music, subtitles, transcription, voiceover, text/image overlays, transcoding, resolution change, aspect-ratio fix, resizing for social platforms, media generation, search, and real-time capture — all server-side with no ffmpeg or local encoding tools needed.

11 scripts/Attention

Tools & Utilities | video-db/skills

videodb

12 scripts/Attention

AI & Machine Learning | kouko/monkey-knowledge-yo...

mk-youtube-audio-transcribe

Transcribe audio to text using local whisper.cpp. Use when user wants to convert audio/video to text, get transcription, or speech-to-text.

13 scripts/Attention

Automation | psycho-baller/ai-agents-c...

letterly-automation

Comprehensive automation for Letterly transcriptions. This skill exports the latest CSV from Letterly, processes "magic" notes into Obsidian markdown with custom metadata, semantically links them using a vector database, and moves them to the final Transcriptions directory. Use when the user asks to "process new letterly transcriptions", "sync letterly", or "import magic notes from letterly".

4 scripts/Checked

AI & Machine Learning | zainhas/togetherai-skills

together-audio

Text-to-speech (TTS) and speech-to-text (STT) via Together AI. TTS models include Orpheus, Kokoro, Cartesia Sonic, Rime, MiniMax with REST, streaming, and WebSocket support. STT models include Whisper and Voxtral. Use when users need voice synthesis, audio generation, speech recognition, transcription, TTS, STT, or real-time voice applications.

2 scripts/Checked