Configure Home Assistant Assist voice control with pipelines, intents, wake words, and speech processing. Use when setting up voice control, creating custom intents, configuring TTS/STT, or building voice satellites. Activates on keywords: Assist, voice control, wake word, intent, sentence, TTS, STT, Piper, Whisper, Wyoming.
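A minimal sketch of driving Assist programmatically through Home Assistant's REST conversation endpoint; the host, token, and utterance are placeholders for your own install.

```python
import requests

# Send an utterance through Home Assistant's Assist conversation API.
# HA_URL and HA_TOKEN are placeholders for your instance and access token.
HA_URL = "http://homeassistant.local:8123"
HA_TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"

resp = requests.post(
    f"{HA_URL}/api/conversation/process",
    headers={"Authorization": f"Bearer {HA_TOKEN}"},
    json={"text": "turn on the kitchen lights", "language": "en"},
    timeout=10,
)
resp.raise_for_status()
# The response carries the matched intent's spoken reply.
print(resp.json()["response"]["speech"]["plain"]["speech"])
```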
Use when the user wants to generate speech, voiceover, or text-to-audio. Converts text to AI voice via Giggle.pro TTS API. Triggers: generate speech, text-to-speech, TTS, voiceover, read this text aloud, synthesize speech.
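A hypothetical sketch of the call shape only: the endpoint, parameter names, and response format below are assumptions, not the documented Giggle.pro API.

```python
import requests

API_KEY = "YOUR_GIGGLE_PRO_KEY"  # placeholder

resp = requests.post(
    "https://api.giggle.pro/v1/tts",  # assumed endpoint, not from docs
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"text": "Hello from the build pipeline.", "voice": "default"},
    timeout=30,
)
resp.raise_for_status()
with open("speech.mp3", "wb") as f:
    f.write(resp.content)  # assumes the API returns raw audio bytes
```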
MiniMax multimodal model skill. Create voice, music, video, and images with MiniMax Multi-Modal AI models: TTS (text-to-speech, voice cloning, voice design, multi-segment), music (songs, instrumentals), video (text-to-video, image-to-video, start-end frame, subject reference, templates, long-form multi-scene), image (text-to-image, image-to-image with character reference), and media processing (convert, concat, trim, extract). Use when the user mentions MiniMax, multimodal generation, or wants speech/music/video/image AI, MiniMax APIs, or FFmpeg workflows alongside MiniMax outputs.
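A sketch of a MiniMax text-to-speech call. The endpoint, model name, and response fields reflect MiniMax's T2A API as commonly documented; treat them as assumptions and verify against the current MiniMax docs.

```python
import requests

API_KEY = "YOUR_MINIMAX_KEY"  # placeholder

resp = requests.post(
    "https://api.minimax.chat/v1/t2a_v2",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "speech-01-turbo",  # assumed model id
        "text": "MiniMax multimodal demo.",
        "voice_setting": {"voice_id": "male-qn-qingse", "speed": 1.0},
    },
    timeout=60,
)
resp.raise_for_status()
# Assumes the audio comes back hex-encoded in data.audio.
audio_hex = resp.json()["data"]["audio"]
with open("minimax.mp3", "wb") as f:
    f.write(bytes.fromhex(audio_hex))
```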
Speak text aloud using system TTS (say command on macOS/Linux) or browser TTS via Chrome DevTools Protocol. Use when: (1) job completes and you want to announce results, (2) user asks to hear something spoken, (3) notifications that need audio alerts, (4) accessibility - reading content aloud.
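A minimal sketch of the system-TTS path, picking whichever speech binary is available; the fallbacks beyond macOS `say` are common Linux options, not a fixed requirement of the skill.

```python
import platform
import shutil
import subprocess

def speak(text: str) -> None:
    """Speak text with whatever system TTS command is available."""
    if platform.system() == "Darwin":
        subprocess.run(["say", text], check=True)        # macOS built-in
    elif shutil.which("espeak"):
        subprocess.run(["espeak", text], check=True)     # common on Linux
    elif shutil.which("spd-say"):
        subprocess.run(["spd-say", text], check=True)    # speech-dispatcher
    else:
        raise RuntimeError("no system TTS command found")

speak("Build finished successfully.")
```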
Automatically announces plans, issues, and summaries out loud using TTS. Use this skill PROACTIVELY after completing major tasks like finalizing a plan, resolving an issue, or generating a summary. Each project gets a unique voice so users can identify which project is speaking from another room. Providers fall back in order (google, openai, elevenlabs, say) on rate limits.
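A sketch of that fallback pattern; speak_google, speak_openai, speak_elevenlabs, speak_system, and RateLimitError are hypothetical stand-ins for real provider clients, shown only to illustrate the control flow.

```python
class RateLimitError(Exception):
    pass

# Stubs standing in for real provider clients.
def speak_google(text): raise RateLimitError
def speak_openai(text): raise RateLimitError
def speak_elevenlabs(text): raise RateLimitError
def speak_system(text): print(f"[say] {text}")  # last resort, local TTS

PROVIDERS = [speak_google, speak_openai, speak_elevenlabs, speak_system]

def announce(text: str) -> None:
    # Try providers in priority order, falling through on rate limits.
    for provider in PROVIDERS:
        try:
            provider(text)
            return
        except RateLimitError:
            continue
    raise RuntimeError("all TTS providers exhausted")

announce("Plan finalized: three tasks remaining.")
```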
Voice cloning workflows with Alibaba Cloud Model Studio Qwen TTS VC models. Use when creating cloned voices from sample audio and synthesizing text with cloned timbre.
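A two-step voice-clone sketch: enroll a sample, then synthesize with the returned voice id. The endpoints and field names are illustrative assumptions, not the documented Model Studio API; consult the DashScope docs for the real calls.

```python
import requests

API_KEY = "YOUR_DASHSCOPE_KEY"  # placeholder
HEADERS = {"Authorization": f"Bearer {API_KEY}"}
BASE = "https://dashscope.aliyuncs.com/api/v1"  # assumed base URL

# 1) Enroll: register sample audio, get a reusable voice id (hypothetical).
enroll = requests.post(
    f"{BASE}/voices",
    headers=HEADERS,
    json={"model": "qwen-tts-vc", "sample_url": "https://example.com/me.wav"},
    timeout=60,
).json()
voice_id = enroll["voice_id"]  # assumed response field

# 2) Synthesize: speak new text with the cloned timbre (hypothetical).
audio = requests.post(
    f"{BASE}/tts",
    headers=HEADERS,
    json={"model": "qwen-tts-vc", "voice": voice_id, "text": "Hello again."},
    timeout=60,
)
with open("cloned.wav", "wb") as f:
    f.write(audio.content)
```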
Voice design workflows with Alibaba Cloud Model Studio Qwen TTS VD models. Use when creating custom synthetic voices from text descriptions and using them for speech synthesis.
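Voice design starts from a text description rather than sample audio; this sketch contrasts with the cloning flow above, and its endpoint and fields are again assumptions, not the documented API.

```python
import requests

resp = requests.post(
    "https://dashscope.aliyuncs.com/api/v1/voice-design",  # hypothetical
    headers={"Authorization": "Bearer YOUR_DASHSCOPE_KEY"},
    json={
        "model": "qwen-tts-vd",
        "description": "warm, low-pitched narrator with a slow cadence",
    },
    timeout=60,
)
voice_id = resp.json()["voice_id"]  # assumed field; reuse it for synthesis
```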
Translate and dub videos from one language to another, replacing the original audio with TTS while keeping the video intact.
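Once transcription, translation, and TTS have produced the dubbed track, the final mux is a plain ffmpeg step; a sketch with placeholder paths, assuming ffmpeg is on PATH.

```python
import subprocess

# Replace the original audio with dub.wav while copying the video stream.
subprocess.run(
    [
        "ffmpeg", "-y",
        "-i", "input.mp4",   # original video
        "-i", "dub.wav",     # translated TTS audio
        "-map", "0:v",       # keep video from the first input
        "-map", "1:a",       # take audio from the second input
        "-c:v", "copy",      # no re-encode of the video stream
        "-shortest",         # stop at the shorter of the two streams
        "dubbed.mp4",
    ],
    check=True,
)
```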
Tired of juggling multiple audio APIs? This skill gives you one-command access to TTS, music generation, sound effects, and voice cloning. Use when you want to generate any audio without managing multiple API keys.
Real-time speech synthesis with Alibaba Cloud Model Studio Qwen TTS Realtime models. Use when low-latency interactive speech is required, including instruction-controlled realtime synthesis.
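A streaming sketch only: the URL and message schema are assumptions meant to show the low-latency pattern (send text, play audio chunks as they arrive), not the documented Qwen TTS Realtime protocol.

```python
import asyncio
import json
import websockets  # pip install websockets

URI = "wss://dashscope.aliyuncs.com/api/v1/tts-realtime"  # hypothetical

def play_chunk(chunk: bytes) -> None:
    pass  # feed bytes to your audio device here

async def stream_tts(text: str) -> None:
    async with websockets.connect(URI) as ws:
        await ws.send(json.dumps({"type": "text", "text": text}))
        await ws.send(json.dumps({"type": "end"}))
        async for message in ws:
            if isinstance(message, bytes):
                play_chunk(message)   # binary frames = audio chunks (assumed)
            else:
                break                 # assume a JSON frame marks completion

asyncio.run(stream_tts("Realtime synthesis keeps first-byte latency low."))
```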
MiMo V2.5 text-to-speech. Generate speech using Xiaomi MiMo V2.5 TTS series models. Activates when text needs to be converted to speech, voice messages need to be sent, content needs to be read aloud, or the user requests 'speak it out' or a 'voice reply'. Supports three modes (preset voice, voice design, voice cloning) plus natural-language control and director mode; style tags control tone, emotion, and dialect, and preset voices support singing.
Generate videos using Flyworks (a.k.a. HiFly) Digital Humans. Create talking-photo videos from images, use public avatars with TTS, or clone voices for custom audio.