Search Results: text-to-speech

Found 121 Skills

AI & Machine Learningcnemri/google-genai-skill...

speech-build

Generate and transcribe speech using Google's Gemini-TTS and Chirp 3 models. Supports Text-to-Speech (Single/Multi-speaker), Instant Custom Voice, and Speech-to-Text (Transcription/Diarization).

🇺🇸|EnglishTranslated

Mobile Developmentjchaselubitz/drill-app

expo-audio

Guide for using expo-audio to implement audio playback and recording in React Native apps. Apply when working with audio features, sound playback, recording, or text-to-speech functionality.

🇺🇸|EnglishTranslated

Backend Developmentsinch/skills

sinch-voice-api

Build voice apps with Sinch Voice REST API. Use for phone calls, text-to-speech (TTS), IVR menus, DTMF input, conference calling, call recording, call forwarding, answering machine detection (AMD), SIP routing, WebSocket audio streaming, and SVAML call control.

🇺🇸|EnglishTranslated

5 scripts/Attention

AI & Machine Learningxiaomimimo/mimo-skills

mimo-v2-5-tts

MiMo V2.5 TTS Text-to-Speech. Generate speech using Xiaomi MiMo V2.5 TTS series models. This skill is activated when text needs to be converted to speech, voice messages need to be sent, content needs to be read aloud, or when users request 'speak it out' or 'voice reply'. It supports three modes: preset voice, voice design, and voice cloning, as well as natural language control and director mode. It also supports style tag control for tone, emotion, and dialect, and preset voices support singing.

🇨🇳|ChineseTranslated

4 scripts/Attention

Tools & Utilitiesconversiontools/agent-ski...

conversiontools

Convert files between 140+ formats using the ConversionTools MCP server. Use when the user needs to convert documents (Word, PDF, Excel, PowerPoint), data formats (JSON, CSV, XML, YAML, Parquet), images (PNG, JPG, WebP, AVIF, HEIC, JXL, SVG), audio (MP3, WAV, FLAC), video (MOV, MKV, AVI to MP4), e-books (EPUB, MOBI, AZW), OCR text extraction, AI-powered data extraction, AI text-to-speech (TTS), AI speech-to-text transcription (STT), subtitle conversion (SRT, VTT, ASS), or website screenshots.

🇺🇸|EnglishTranslated

AI & Machine Learningjackspace/claudeskillz

elevenlabs-agents

Use this skill when building AI voice agents with the ElevenLabs Agents Platform. This skill covers the complete platform including agent configuration (system prompts, turn-taking, workflows), voice & language features (multi-voice, pronunciation, speed control), knowledge base (RAG), tools (client/server/MCP/system), SDKs (React, JavaScript, React Native, Swift, Widget), Scribe (real-time STT), WebRTC/WebSocket connections, testing & evaluation, analytics, privacy/compliance (GDPR/HIPAA/SOC 2), cost optimization, CLI workflows ("agents as code"), and DevOps integration. Prevents 17+ common errors including package deprecation, Android audio cutoff, CSP violations, missing dynamic variables, case-sensitive tool names, webhook authentication failures, and WebRTC configuration issues. Provides production-tested templates for React, Next.js, React Native, Swift, and Cloudflare Workers. Token savings: ~73% (22k → 6k tokens). Production tested. Keywords: ElevenLabs Agents, ElevenLabs voice agents, AI voice agents, conversational AI, @elevenlabs/react, @elevenlabs/client, @elevenlabs/react-native, @elevenlabs/elevenlabs-js, @elevenlabs/agents-cli, elevenlabs SDK, voice AI, TTS, text-to-speech, ASR, speech recognition, turn-taking model, WebRTC voice, WebSocket voice, ElevenLabs conversation, agent system prompt, agent tools, agent knowledge base, RAG voice agents, multi-voice agents, pronunciation dictionary, voice speed control, elevenlabs scribe, @11labs deprecated, Android audio cutoff, CSP violation elevenlabs, dynamic variables elevenlabs, case-sensitive tool names, webhook authentication

🇺🇸|EnglishTranslated

AI & Machine Learningsweetretry/skills

fal-ai-model-search

Search and integrate Fal AI models from fal.ai platform. Use when the user wants to (1) search for models on Fal AI platform, (2) get detailed information about a specific Fal AI model, (3) integrate a Fal AI model into the project, (4) explore available AI models on fal.ai, or mentions "fal.ai", "图像生成", "AI video model", "text to image", "text-to-speech".

🇺🇸|EnglishTranslated

AI & Machine Learninginference-skills/skills

ai-avatar-video

Create AI avatar and talking head videos via inference.sh CLI. Recommended: P-Video-Avatar (fastest, cheapest, built-in TTS). Also: OmniHuman, Fabric, PixVerse. Capabilities: audio-driven avatars, text-to-avatar, lipsync videos, talking head generation, virtual presenters. Use for: AI presenters, explainer videos, virtual influencers, dubbing, marketing videos. Triggers: ai avatar, talking head, lipsync, avatar video, virtual presenter, ai spokesperson, audio driven video, heygen alternative, synthesia alternative, talking avatar, lip sync, video avatar, ai presenter, digital human

🇺🇸|EnglishTranslated

177.2k

AI & Machine Learningdigitalsamba/claude-code-...

elevenlabs

Generate AI voiceovers, sound effects, and music using ElevenLabs APIs. Use when creating audio content for videos, podcasts, or games. Triggers include generating voiceovers, narration, dialogue, sound effects from descriptions, background music, soundtrack generation, voice cloning, or any audio synthesis task.

🇺🇸|EnglishTranslated

AI & Machine Learningmichaelboeding/skills

audio-producer-agent

Use this skill to create single-voice audio content like audiobooks, voiceovers, narrations, jingles, and audio ads. Triggers: "create audiobook", "generate voiceover", "narration", "audio ad", "radio ad", "jingle", "brand audio", "sonic logo", "text to audio", "read this aloud", "audio guide", "meditation audio", "soundscape" Orchestrates: narration/TTS, background music, and audio assembly. NOTE: For conversations/dialogues, use podcast-producer instead.

🇺🇸|EnglishTranslated

AI & Machine Learningnoizai/skills

tts

Convert text into speech with Kokoro or Noiz, including simple and timeline-aligned modes.

🇺🇸|EnglishTranslated

4 scripts/Attention

Backend Developmentrunwayml/skills

integrate-audio

Help users integrate Runway audio APIs (TTS, sound effects, voice isolation, dubbing)

🇺🇸|EnglishTranslated