Found 121 Skills
AI audio generation powered by CellCog. Text-to-speech, voice synthesis, voiceovers, podcast audio, narration, music generation, background music, sound design. Professional audio creation with AI.
Deepgram API reference for speech-to-text, text-to-speech, voice agents, audio intelligence, and account management. Use whenever building with Deepgram APIs — REST or WebSocket. Covers authentication, all endpoints, query parameters, request/response schemas, and WebSocket message formats. Reference files are organized by domain: listen (STT), speak (TTS), agent (voice agents), read (text/audio intelligence), models, projects, auth, and self-hosted.
Text-to-speech and speech-to-text using fal.ai audio models. Use when the user requests "Convert text to speech", "Transcribe audio", "Generate voice", "Speech to text", "TTS", "STT", or similar audio tasks.
Text-to-speech synthesis with ElevenLabs and system voices
ElevenLabs text-to-speech with a macOS-style say command UX.
Guide to audio generation and understanding in MassGen. Covers text-to-speech, music, sound effects, and audio understanding across ElevenLabs and OpenAI backends.
Text-to-speech models, voices, formats, and streaming via Venice.ai. Useful for narration, voiceover, and conversational agent voices.
Text-to-speech synthesis via Google Cloud Text-to-Speech API — MP3 output, configurable language and voice, voice listing.
Expert in building voice interfaces, speech recognition, and text-to-speech systems
Generate speech from text using Google Gemini TTS models via scripts/. Use for text-to-speech, audio generation, voice synthesis, multi-speaker conversations, and creating audio content. Supports multiple voices and streaming. Triggers on "text to speech", "TTS", "generate audio", "voice synthesis", "speak this text".
Build text-to-speech applications using Qwen3-TTS, a speech generation system supporting voice cloning, voice design, and custom voice synthesis. Use when creating TTS applications, generating speech from text, cloning voices from audio samples, designing new voices via natural language descriptions, or fine-tuning TTS models. Supports 10 languages (Chinese, English, Japanese, Korean, German, French, Russian, Portuguese, Spanish, Italian).
Inworld TTS API. Covers voice cloning, audio markups, timestamps. Keywords: text-to-speech, visemes.