Found 87 Skills
Transcribe non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`). Use when converting recorded audio files to text, generating transcripts with timestamps, or documenting DashScope/OpenAI-compatible ASR request and response fields.
Fast ASR CLI tool for transcribing audio/video files. Use when user wants to transcribe audio/video, generate subtitles (VTT), convert speech to text with timestamps (JSON), or optimize transcription for low memory.
Access Telnyx LLM inference APIs, embeddings, and AI analytics for call insights and summaries. This skill provides REST API (curl) examples.
Transcribe video files directly into timed transcripts and subtitle-ready artifacts using hosted Whisper video-to-text. Use this when the input is a video and the goal is speech extraction, caption generation, or edit-prep timing.
Find working Deepgram integration examples with third-party platforms and frameworks. Use whenever someone wants to integrate Deepgram with Twilio, LiveKit, LangChain, Vercel AI SDK, Discord, Vonage, Pipecat, Expo, FastAPI, Cloudflare Workers, Slack, Telegram, LlamaIndex, Zoom, Next.js, Nuxt, Django, SvelteKit, NestJS, Spring Boot, CrewAI, Riverside, SignalWire, and more. Examples are full runnable integration demos, not minimal feature snippets.
Transcribe audio files to text using OpenAI Whisper.
Install and configure the official Gladia SDKs (@gladiaio/sdk for JS/TS, gladiaio-sdk for Python). Use when the user asks about SDK setup, client initialization, API key configuration, choosing between JS and Python, browser usage, retry/timeout settings, error handling, or SDK vs raw API decisions. The SDK is the recommended default for all Gladia integrations.
Architecting real-time Voice AI agents.
Install and configure Deepgram SDK/CLI authentication. Use when setting up a new Deepgram integration, configuring API keys, or initializing Deepgram in your project. Trigger with phrases like "install deepgram", "setup deepgram", "deepgram auth", "configure deepgram API key".
Deepgram integration. Manage Projects. Use when the user wants to interact with Deepgram data.
Transcribe audio/video files to text using Whisper via OpenKBS AI proxy. Supports MP4, MP3, WAV, OGG, MKV and other ffmpeg-compatible formats. Splits large files into chunks automatically.
Use when the user has audio or video and wants a timestamped transcript (SRT) in the source language. Routes by source language — Chinese defaults to Volcano (豆包) ASR; other languages (Spanish, English, Portuguese, French, Italian, Japanese, Korean, etc.) use OpenAI Whisper API with word-level timestamps and self-assembled cues. Outputs SRT with punctuation-bounded cues capped for on-screen reading. Triggers — "转写", "转成字幕", "做 SRT", "transcribe", "make subtitles", "speech to text", "出字幕".
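The last entry above mentions assembling SRT cues from word-level timestamps, with cues bounded by punctuation and capped in length. As a rough illustration only, a minimal sketch of that idea might look like the following. The `Word` type, the `build_cues` and `to_srt` helpers, the 42-character cap, and the choice of `.!?` as cue-ending punctuation are all assumptions made for this example; they are not taken from the skill itself.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One recognized word with start/end times in seconds (hypothetical shape)."""
    text: str
    start: float
    end: float


def fmt_ts(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def build_cues(words, max_chars=42, enders=".!?"):
    """Group words into cues.

    A cue closes when a word ends with sentence punctuation, or earlier
    if adding the next word would push the cue text past max_chars.
    Both thresholds are illustrative assumptions.
    """
    cues, current = [], []
    for w in words:
        candidate = " ".join(x.text for x in current + [w])
        if current and len(candidate) > max_chars:
            cues.append(current)  # length cap reached: close before this word
            current = []
        current.append(w)
        if w.text and w.text[-1] in enders:
            cues.append(current)  # punctuation boundary: close after this word
            current = []
    if current:
        cues.append(current)
    return cues


def to_srt(cues) -> str:
    """Render cues as SRT text: index, time range, text, blank line between blocks."""
    blocks = []
    for i, cue in enumerate(cues, 1):
        text = " ".join(w.text for w in cue)
        blocks.append(f"{i}\n{fmt_ts(cue[0].start)} --> {fmt_ts(cue[-1].end)}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = [
        Word("Hello,", 0.0, 0.4),
        Word("world.", 0.5, 1.0),
        Word("This", 1.2, 1.4),
        Word("is", 1.4, 1.5),
        Word("fine", 1.5, 2.0),
    ]
    print(to_srt(build_cues(sample)))
```

Each cue takes its start time from its first word and its end time from its last word, so cue timing follows the recognizer's word boundaries rather than fixed intervals.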