Found 196 Skills
Generate multi-person talking head podcast videos from scratch using AI — character creation, TTS, avatar animation, and video stitching. Use when the user wants to create a podcast, talking head video, or multi-speaker conversation video.
Use when the user wants to teach or learn an English word as a video: turns a single English word into a self-contained HLS "supercut" lesson built from the mira video base. Stitches every season2 clip where the word is spoken (via the search-app API) into one .m3u8, prepended with a Claude-written bilingual word-intro card (word + IPA + Chinese gloss + usage, Volcano TTS) and appended with a "关注王建硕" (follow Wang Jianshuo) CTA card. No MP4 burn. Triggers: "teach <word>", "讲讲 <word>", "学英语 <word>", "把 <word> 做成视频", "/wjs-teaching-english <word>".
Control audio generation requests before execution. Use this when the user asks for TTS, persona voice, voice change, translated dub, cloned voice take, podcast audio, or lip-sync audio handoff and the skill must classify the request before handing execution to voice-batch-runner or a video workflow.
Routes NVIDIA Nemotron Speech (Riva) NIM tasks — deploys, runs, and tests ASR, TTS, and NMT NIMs on build.nvidia.com or self-hosted.
Generate digital-human short videos with Luma / 拾光 / 拾光智能体 / 拾光工具 by composing voice clone, TTS, avatar, lip-sync, subtitle, and enhancement tools.
ElevenLabs TTS integration for video narration. Use when generating voiceover audio, selecting voices, or building script-to-audio pipelines.
Configure Home Assistant Assist voice control with pipelines, intents, wake words, and speech processing. Use when setting up voice control, creating custom intents, configuring TTS/STT, or building voice satellites. Activates on keywords: Assist, voice control, wake word, intent, sentence, TTS, STT, Piper, Whisper, Wyoming.
Send voice messages (TTS) to the user via Telegram. Use when replying to voice messages or when a voice reply feels natural.
Complete ElevenLabs AI audio platform: text-to-speech (TTS), speech-to-text (STT/Scribe), voice cloning, voice design, sound effects, music generation, dubbing, voice changer, voice isolator, and conversational voice agents. Use when working with audio generation, voice synthesis, transcription, audio processing, or building voice-enabled applications. Triggers: generate speech, clone voice, transcribe audio, create sound effects, compose music, dub video, change voice, isolate vocals, build voice agent, ElevenLabs API/SDK/CLI/MCP.
Diagnose and resolve TTS and Telegram bot issues. TRIGGERS - tts not working, bot not responding, kokoro error, audio not playing, lock stuck, telegram bot troubleshoot, diagnose issue.
Diagnose Kokoro TTS issues. TRIGGERS - kokoro not working, tts diagnose, kokoro error, tts troubleshoot.
Translate and dub videos from one language to another, replacing the original audio with TTS while keeping the video intact.