Search Results: speech-synthesis

Found 13 Skills

dialogue-audio

Multi-speaker dialogue audio creation with Dia TTS. Covers speaker tags, emotion control, pacing, conversation flow, and post-production. Use for: podcasts, audiobooks, explainers, character dialogue, conversational content. Triggers: dialogue audio, multi speaker, conversation audio, dia tts, two speakers, podcast audio, character voices, voice acting, dialogue generation, conversation tts, multi voice, speaker tags, dialogue recording

🇺🇸 | English | Translated

732

AI & Machine Learning | nvidia/skills

digital-health-clinical-asr-setup

Stage 1 of Clinical ASR Flywheel. Use when bootstrapping a cycle: NVCF+MW disclosure, NVIDIA_API_KEY check, deps install, TTS+ASR smoke test.

🇺🇸 | English | Translated

AI & Machine Learning | framersai/agentos-skills

streaming-tts-openai

Low-latency streaming text-to-speech via OpenAI TTS API — adaptive sentence chunking, concurrent fetch pipelining, six voices.

🇺🇸 | English | Translated

AI & Machine Learning | qianwen-ai/qianwen-ai

qianwen-audio-tts

[QianWen] Synthesize speech from text with Qwen TTS models. TRIGGER when: user wants to convert text to speech, create voiceovers, generate audio narration, read text aloud, build TTS applications, mentions speech synthesis/voice generation/audio output from text, or explicitly invokes this skill by name (e.g. use qianwen-audio-tts). DO NOT TRIGGER when: user wants speech recognition/ASR, text generation without audio, non-Qwen audio tasks.

🇺🇸 | English | Translated

4 scripts/Checked

AI & Machine Learning | cinience/alicloud-skills

alicloud-ai-audio-tts-voice-design

Voice design workflows with Alibaba Cloud Model Studio Qwen TTS VD models. Use when creating custom synthetic voices from text descriptions and using them for speech synthesis.

🇺🇸 | English | Translated

1 scripts/Checked

AI & Machine Learning | qwencloud/qwencloud-ai

qwencloud-audio-tts

[QwenCloud] Synthesize speech from text with Qwen TTS models. TRIGGER when: user wants to convert text to speech, create voiceovers, generate audio narration, read text aloud, build TTS applications, mentions speech synthesis/voice generation/audio output from text, or explicitly invokes this skill by name (e.g. use qwencloud-audio-tts). DO NOT TRIGGER when: user wants speech recognition/ASR, text generation without audio, non-Qwen audio tasks.

🇺🇸 | English | Translated

4 scripts/Checked

AI & Machine Learning | chanjing-ai/chan-skills

chanjing-tts-voice-clone

Use the Chanjing TTS API to synthesize speech from text, using a user-provided voice

🇺🇸 | English | Translated

AI & Machine Learning | smallnest/goal-workflow

listenhub-tts

Convert text to speech (TTS) using the ListenHub API. Three modes are supported: Quick Synthesis (/v1/tts), Multi-role Script (/v1/speech), and Long Text Streaming Synthesis (/v1/flow-speech/episodes). If no voice is specified, automatically retrieve the voice list for user selection, with chat-girl-105-cn (Xiaoman) as the default. Use when user says: "tts", "text to speech", "语音合成", "文字转语音", "朗读", "生成语音", "生成音频", "转音频", "text to audio"

🇨🇳 | Chinese | Translated

AI & Machine Learning | chanjing-ai/chan-skills

chanjing-tts

Use the Chanjing TTS API to convert text to speech

🇺🇸 | English | Translated

AI & Machine Learning | aliyun/alibabacloud-aiops...

alibabacloud-avatar-video

Use Alibaba Cloud DashScope API and LingMou to generate AI video and speech. Seven capabilities — (1) LivePortrait talking-head (image + audio → video, two-step), (2) EMO talking-head, (3) AA/AnimateAnyone full-body animation (three-step), (4) T2I text-to-image (Wan 2.x, default wan2.2-t2i-flash), (5) I2V image-to-video (Wan 2.x, default wan2.7-i2v-flash, supports T2I→I2V pipeline), (6) Qwen TTS (auto model/voice by scene, default qwen3-tts-vd-realtime-2026-01-15), (7) LingMou digital-human template video with random template, public-template copy, and script confirmation. Trigger when the user needs talking-head, portrait, full-body animation, text-to-image, text-to-video, or speech synthesis.

🇺🇸 | English | Translated

9 scripts/Attention

AI & Machine Learning | marswaveai/skills

tts

Text-to-speech and voice narration. Triggers on: "朗读这段", "配音", "TTS", "语音合成", "text to speech", "read this aloud", "convert to speech", "voice narration", "read aloud".

🇺🇸 | English | Translated

AI & Machine Learning | daymade/claude-code-skill...

stepfun-tts

Generate Chinese / Japanese speech with StepFun's stepaudio-2.5-tts, a Contextual TTS model that replaces step-tts-2's `voice_label` with a natural-language `instruction` (≤200 chars) plus inline `()` parentheses for intra-sentence prosody. Use when the user wants emotional / prosody control over voice synthesis (whisper, pause, stress, mood pivot mid-sentence), batch-generates game / app voice lines, migrates from `step-tts-2` (the `voice_label → instruction` breaking change), or hits StepFun's stricter 2.5-era censorship (死 "death", 消失 "disappear", political terms). Triggers on 阶跃 TTS, StepAudio 合成, 语音合成, 配音, 文本转语音, TTS 升级, 迁移 step-tts-2. For transcription with the sibling stepaudio-2.5-asr model, use the stepfun-asr skill instead.

🇺🇸 | English | Translated

2 scripts/Attention