Search Results: text-to-speech

Found 121 Skills

AI & Machine Learninginference-sh/skills

text-to-speech

Convert text to natural speech with DIA TTS, Kokoro, Chatterbox, and more via inference.sh CLI. Models: DIA TTS (conversational), Kokoro TTS, Chatterbox, Higgs Audio, VibeVoice (podcasts). Capabilities: text-to-speech, voice cloning, multi-speaker dialogue, podcast generation, expressive speech. Use for: voiceovers, audiobooks, podcasts, accessibility, video narration, IVR, voice assistants. Triggers: text to speech, tts, voice generation, ai voice, speech synthesis, voice over, generate speech, ai narrator, voice cloning, text to audio, elevenlabs alternative, voice ai, ai voiceover, speech generator, natural voice

🇺🇸|EnglishTranslated

872

AI & Machine Learningmartinholovsky/claude-ski...

text-to-speech

Expert skill for implementing text-to-speech with Kokoro TTS. Covers voice synthesis, audio generation, performance optimization, and secure handling of generated audio for JARVIS voice assistant.

🇺🇸|EnglishTranslated

AI & Machine Learningelevenlabs/skills

text-to-speech

Convert text to speech using ElevenLabs voice AI. Use when generating audio from text, creating voiceovers, building voice apps, or synthesizing speech in 70+ languages.

🇺🇸|EnglishTranslated

AI & Machine Learningheygen-com/skills

text-to-speech

Generate speech audio from text using HeyGen's Starfish TTS model. Use when: (1) Generating standalone speech audio files from text, (2) Converting text to speech with voice selection, speed, and pitch control, (3) Creating audio for voiceovers, narration, or podcasts, (4) Working with HeyGen's /v1/audio endpoints, (5) Listing available TTS voices by language or gender.

🇺🇸|EnglishTranslated

AI & Machine Learningsarvamai/skills

text-to-speech

Convert text to natural speech using Sarvam AI's Bulbul v3 model. Handles audio generation, voiceovers, and voice interfaces for 11 Indian languages with 30+ voices. Supports REST, HTTP streaming, WebSocket, and pronunciation dictionaries. Use when generating spoken audio from text.

🇺🇸|EnglishTranslated

Tools & Utilitieswlzh/skills

text-to-speech

Text-to-Speech Tool - Supports script parsing, emotion tagging, and post-processing, based on Edge TTS

🇨🇳|ChineseTranslated

1 scripts/Checked

AI & Machine Learningbytedance/agentkit-sample...

byted-text-to-speech

Convert text to speech (TTS). Powered by the VolcEngine Doubao Text-to-Speech API, it supports streaming synthesis, multiple voice timbres, adjustments to speech rate/pitch/loudness, Markdown syntax filtering, and LaTeX formula broadcasting. Use this skill when users need to convert text to speech, generate reading audio, dubbing, narration, broadcasts, or mention terms like 'text-to-speech', 'TTS', 'speech synthesis', 'reading aloud', or 'dubbing'.

🇨🇳|ChineseTranslated

2 scripts/Checked

Tools & Utilitiesheygen-com/hyperframes

hyperframes-media

Asset preprocessing for HyperFrames compositions — text-to-speech narration (Kokoro), audio/video transcription (Whisper), and background removal for transparent overlays (u2net). Use when generating voiceover from text, transcribing speech for captions, removing the background from a video or image to use as a transparent overlay, choosing a TTS voice or whisper model, or chaining these (TTS → transcribe → captions). Each command downloads its own model on first run.

🇺🇸|EnglishTranslated

76.6k

AI & Machine Learninginferen-sh/skills

elevenlabs-tts

ElevenLabs text-to-speech with 22+ premium voices, multilingual support, and voice tuning via inference.sh CLI. Models: eleven_multilingual_v2 (highest quality), eleven_turbo_v2_5 (low latency), eleven_flash_v2_5 (ultra-fast). Capabilities: text-to-speech, voice selection, stability/style control, 32 languages. Use for: voiceovers, audiobooks, video narration, podcasts, accessibility, IVR. Triggers: elevenlabs, eleven labs, elevenlabs tts, premium tts, professional voice, ai voice, high quality tts, multilingual tts, eleven labs voice, voice generation, natural speech, realistic voice, voice over, speech synthesis

🇺🇸|EnglishTranslated

52.0k

AI & Machine Learningskill-zero/s

ai-voice-cloning

AI voice generation, text-to-speech, and voice synthesis via inference.sh CLI. Models: Kokoro TTS, DIA, Chatterbox, Higgs, VibeVoice for natural speech. Capabilities: multiple voices, emotions, accents, long-form narration, conversation. Use for: voiceovers, audiobooks, podcasts, video narration, accessibility. Triggers: voice cloning, tts, text to speech, ai voice, voice generation, voice synthesis, voice over, narration, speech synthesis, ai narrator, elevenlabs alternative, natural voice, realistic speech, voice ai

🇺🇸|EnglishTranslated

829

AI & Machine Learningskill-zero/s

ai-podcast-creation

Create AI-powered podcasts with text-to-speech, music, and audio editing. Tools: Kokoro TTS, DIA TTS, Chatterbox, AI music generation, media merger. Capabilities: multi-voice conversations, background music, intro/outro, full episodes. Use for: podcast production, audiobooks, voice content, audio newsletters. Triggers: podcast, ai podcast, text to speech podcast, audio content, voice over, ai audiobook, multi voice, conversation ai, notebooklm alternative, audio generation, podcast automation, ai narrator, voice content, audio newsletter, podcast maker

🇺🇸|EnglishTranslated

688

AI & Machine Learningmicrosoft/agent-skills

podcast-generation

Generate AI-powered podcast-style audio narratives using Azure OpenAI's GPT Realtime Mini model via WebSocket. Use when building text-to-speech features, audio narrative generation, podcast creation from content, or integrating with Azure OpenAI Realtime API for real audio output. Covers full-stack implementation from React frontend to Python FastAPI backend with WebSocket streaming.

🇺🇸|EnglishTranslated

1 scripts/Checked