Search Results: tts

Found 153 Skills

AI & Machine Learning | tavus-engineering/tavus-s...

tavus-cvi-persona

Configure Tavus CVI personas with custom LLMs, TTS engines, perception, and turn-taking. Use when customizing AI behavior, bringing your own LLM, configuring voice/TTS, enabling vision with Raven, or tuning conversation flow with Sparrow.

🇺🇸 | English | Translated

AI & Machine Learning | yonatangross/orchestkit

elevenlabs-narration

ElevenLabs TTS integration for video narration. Use when generating voiceover audio, selecting voices, or building script-to-audio pipelines.

🇺🇸 | English | Translated

Tools & Utilities | wlzh/skills

text-to-speech

Text-to-speech tool based on Edge TTS. Supports script parsing, emotion tagging, and post-processing.

🇨🇳 | Chinese | Translated

1 script/Checked

AI & Machine Learning | codestackr/livekit-skills

agents-py

Build LiveKit Agent backends in Python. Use this skill when creating voice AI agents, voice assistants, or any realtime AI application using LiveKit's Python Agents SDK (livekit-agents). Covers AgentSession, Agent class, function tools, STT/LLM/TTS models, turn detection, and multi-agent workflows.

🇺🇸 | English | Translated

AI & Machine Learning | mbailey/voicemode

voicemode-connect

Remote voice via VoiceMode Connect. Use when users want to add voice to Claude Code using their phone or web app, without local STT/TTS setup.

🇺🇸 | English | Translated

Automation | nodnarbnitram/claude-code...

ha-voice

Configure Home Assistant Assist voice control with pipelines, intents, wake words, and speech processing. Use when setting up voice control, creating custom intents, configuring TTS/STT, or building voice satellites. Activates on keywords: Assist, voice control, wake word, intent, sentence, TTS, STT, Piper, Whisper, Wyoming.

🇺🇸 | English | Translated

AI & Machine Learning | giggle-official/skills

giggle-generation-speech

Use when the user wants to generate speech, voiceover, or text-to-audio. Converts text to AI voice via Giggle.pro TTS API. Triggers: generate speech, text-to-speech, TTS, voiceover, read this text aloud, synthesize speech.

🇺🇸 | English | Translated

1 script/Checked

AI & Machine Learning | minimax-ai/skills

minimax-multimodal-toolkit

MiniMax multimodal model skill for speech, music, video, and image generation with MiniMax AI: TTS (text-to-speech, voice cloning, voice design, multi-segment), music (songs, instrumentals), video (text-to-video, image-to-video, start-end frame, subject reference, templates, long-form multi-scene), image (text-to-image, image-to-image with character reference), and media processing (convert, concat, trim, extract). Use when the user mentions MiniMax, multimodal generation, or MiniMax APIs, wants speech/music/video/image AI, or needs FFmpeg workflows alongside MiniMax outputs.

🇺🇸 | English | Translated

9 scripts/Attention

AI & Machine Learning | bytedance/agentkit-sample...

byted-text-to-speech

Convert text to speech (TTS). Powered by the VolcEngine Doubao Text-to-Speech API, it supports streaming synthesis, multiple voice timbres, adjustable speech rate, pitch, and loudness, Markdown syntax filtering, and reading LaTeX formulas aloud. Use this skill when users need to convert text to speech or generate read-aloud audio, dubbing, narration, or broadcasts, or when they mention terms like 'text-to-speech', 'TTS', 'speech synthesis', 'reading aloud', or 'dubbing'.

🇨🇳 | Chinese | Translated

2 scripts/Checked

Tools & Utilities | winsorllc/upgraded-carniv...

voice-output

Speak text aloud using system TTS (say command on macOS/Linux) or browser TTS via Chrome DevTools Protocol. Use when: (1) job completes and you want to announce results, (2) user asks to hear something spoken, (3) notifications that need audio alerts, (4) accessibility - reading content aloud.

🇺🇸 | English | Translated

5 scripts/Attention

AI & Machine Learning | ovachiever/droid-tings

openai-api

Build with OpenAI's stateless APIs - Chat Completions (GPT-5, GPT-4o), Embeddings, Images (DALL-E 3), Audio (Whisper + TTS), and Moderation. Includes Node.js SDK and fetch-based approaches for Cloudflare Workers. Use when: implementing chat completions with GPT-5/GPT-4o, streaming responses with SSE, using function calling/tools, creating structured outputs with JSON schemas, generating embeddings for RAG (text-embedding-3-small/large), generating images with DALL-E 3, editing images with GPT-Image-1, transcribing audio with Whisper, synthesizing speech with TTS (11 voices), moderating content (11 safety categories), or troubleshooting rate limits (429), invalid API keys (401), function calling failures, streaming parse errors, embeddings dimension mismatches, or token limit exceeded.

🇺🇸 | English | Translated

16 scripts/Attention

AI & Machine Learning | cnemri/google-genai-skill...

speech-use

Generate (TTS), Transcribe (STT), and Clone voices using Google's GenAI and Cloud Speech SDKs. Supports Gemini-TTS, Chirp 3, and Instant Custom Voice.

🇺🇸 | English | Translated

3 scripts/Checked