Search Results: speech-to-text

Found 87 Skills

AI & Machine Learning | cinience/alicloud-skills

alicloud-ai-audio-asr

Transcribe non-realtime speech with Alibaba Cloud Model Studio Qwen ASR models (`qwen3-asr-flash`, `qwen-audio-asr`, `qwen3-asr-flash-filetrans`). Use when converting recorded audio files to text, generating transcripts with timestamps, or documenting DashScope/OpenAI-compatible ASR request and response fields.

1 script/Checked

Tools & Utilities | hyperpuncher/dotagents

chough

Fast ASR CLI tool for transcribing audio/video files. Use when user wants to transcribe audio/video, generate subtitles (VTT), convert speech to text with timestamps (JSON), or optimize transcription for low memory.

AI & Machine Learning | team-telnyx/skills

telnyx-ai-inference-curl

Access Telnyx LLM inference APIs, embeddings, and AI analytics for call insights and summaries. This skill provides REST API (curl) examples.

AI & Machine Learning | postplusai/postplus-skill...

video-transcription

Transcribe video files directly into timed transcripts and subtitle-ready artifacts using hosted Whisper video-to-text. Use this when the input is a video and the goal is speech extraction, caption generation, or edit-prep timing.

12 scripts/Attention

AI & Machine Learning | deepgram/skills

examples

Find working Deepgram integration examples with third-party platforms and frameworks. Use whenever someone wants to integrate Deepgram with Twilio, LiveKit, LangChain, Vercel AI SDK, Discord, Vonage, Pipecat, Expo, FastAPI, Cloudflare Workers, Slack, Telegram, LlamaIndex, Zoom, Next.js, Nuxt, Django, SvelteKit, NestJS, Spring Boot, CrewAI, Riverside, SignalWire, and more. Examples are full runnable integration demos, not minimal feature snippets.

AI & Machine Learning | trpc-group/trpc-agent-go

whisper

Transcribe audio files to text using OpenAI Whisper.

1 script/Checked

AI & Machine Learning | karamouche/skills

sdk-integration

Install and configure the official Gladia SDKs (@gladiaio/sdk for JS/TS, gladiaio-sdk for Python). Use when the user asks about SDK setup, client initialization, API key configuration, choosing between JS and Python, browser usage, retry/timeout settings, error handling, or SDK vs raw API decisions. The SDK is the recommended default for all Gladia integrations.

AI & Machine Learning | dokhacgiakhoa/antigravity...

voice-ai-engine-development

Architecting real-time Voice AI agents.

AI & Machine Learning | jeremylongshore/claude-co...

deepgram-install-auth

Install and configure Deepgram SDK/CLI authentication. Use when setting up a new Deepgram integration, configuring API keys, or initializing Deepgram in your project. Trigger with phrases like "install deepgram", "setup deepgram", "deepgram auth", "configure deepgram API key".

AI & Machine Learning | membranedev/application-s...

deepgram

Deepgram integration. Manage Projects. Use when the user wants to interact with Deepgram data.

AI & Machine Learning | open-kbs/skills-file-tran...

file-transcribe

Transcribe audio/video files to text using Whisper via OpenKBS AI proxy. Supports MP4, MP3, WAV, OGG, MKV and other ffmpeg-compatible formats. Splits large files into chunks automatically.

1 script/Attention

AI & Machine Learning | jianshuo/claude-skills

wjs-transcribing-audio

Use when the user has audio or video and wants a timestamped transcript (SRT) in the source language. Routes by source language: Chinese defaults to Volcano (豆包, Doubao) ASR; other languages (Spanish, English, Portuguese, French, Italian, Japanese, Korean, etc.) use the OpenAI Whisper API with word-level timestamps and self-assembled cues. Outputs SRT with punctuation-bounded cues capped for on-screen reading. Triggers: "转写" (transcribe), "转成字幕" (convert to subtitles), "做 SRT" (make an SRT), "transcribe", "make subtitles", "speech to text", "出字幕" (produce subtitles).
