Search Results: text-to-speech

Found 89 Skills

speech-engine

Add real-time voice conversations to a custom LLM, OpenClaw, or similar agent runtime with ElevenLabs Speech Engine. Use when building Speech Engine servers, WebSocket handlers, WebRTC browser clients, conversation token endpoints, interruption-aware streaming responses, or voice-enabled chat agents that connect a developer-owned LLM to ElevenLabs speech-to-text and text-to-speech.

🇺🇸 | English | Translated

AI & Machine Learning | runwayml/skills

runwayml

Generate AI videos, images, and audio with Runway API. Use when generating video from images, text-to-video, video-to-video, character performance, text-to-image, text-to-speech, sound effects, or voice processing with Runway.

🇺🇸 | English | Translated

AI & Machine Learning | cnemri/google-genai-skill...

speech-build

Generate and transcribe speech using Google's Gemini-TTS and Chirp 3 models. Supports Text-to-Speech (Single/Multi-speaker), Instant Custom Voice, and Speech-to-Text (Transcription/Diarization).

🇺🇸 | English | Translated

AI & Machine Learning | jarmen423/skills

qwen3-tts

Build text-to-speech applications using Qwen3-TTS, a powerful speech generation system supporting voice cloning, voice design, and custom voice synthesis. Use when creating TTS applications, generating speech from text, cloning voices from audio samples, designing new voices via natural language descriptions, or fine-tuning TTS models. Supports 10 languages (Chinese, English, Japanese, Korean, German, French, Russian, Portuguese, Spanish, Italian).

🇺🇸 | English | Translated

Backend Development | team-telnyx/skills

telnyx-voice-media-python

Play audio files, use text-to-speech, and record calls. Use when building IVR systems, playing announcements, or recording conversations. This skill provides Python SDK examples.

🇺🇸 | English | Translated

Backend Development | sinch/skills

sinch-voice-api

Build voice apps with Sinch Voice REST API. Use for phone calls, text-to-speech (TTS), IVR menus, DTMF input, conference calling, call recording, call forwarding, answering machine detection (AMD), SIP routing, WebSocket audio streaming, and SVAML call control.

🇺🇸 | English | Translated

5 scripts/Attention

AI & Machine Learning | xiaomimimo/mimo-skills

mimo-v2-5-tts

MiMo V2.5 TTS Text-to-Speech. Generate speech using Xiaomi MiMo V2.5 TTS series models. This skill is activated when text needs to be converted to speech, voice messages need to be sent, content needs to be read aloud, or when users request 'speak it out' or 'voice reply'. It supports three modes: preset voice, voice design, and voice cloning, as well as natural language control and director mode. It also supports style tag control for tone, emotion, and dialect, and preset voices support singing.

🇨🇳 | Chinese | Translated

4 scripts/Attention

Tools & Utilities | conversiontools/agent-ski...

conversiontools

Convert files between 140+ formats using the ConversionTools MCP server. Use when the user needs to convert documents (Word, PDF, Excel, PowerPoint), data formats (JSON, CSV, XML, YAML, Parquet), images (PNG, JPG, WebP, AVIF, HEIC, JXL, SVG), audio (MP3, WAV, FLAC), video (MOV, MKV, AVI to MP4), e-books (EPUB, MOBI, AZW), OCR text extraction, AI-powered data extraction, AI text-to-speech (TTS), AI speech-to-text transcription (STT), subtitle conversion (SRT, VTT, ASS), or website screenshots.

🇺🇸 | English | Translated

AI & Machine Learning | jackspace/claudeskillz

elevenlabs-agents

Use this skill when building AI voice agents with the ElevenLabs Agents Platform. This skill covers the complete platform including agent configuration (system prompts, turn-taking, workflows), voice & language features (multi-voice, pronunciation, speed control), knowledge base (RAG), tools (client/server/MCP/system), SDKs (React, JavaScript, React Native, Swift, Widget), Scribe (real-time STT), WebRTC/WebSocket connections, testing & evaluation, analytics, privacy/compliance (GDPR/HIPAA/SOC 2), cost optimization, CLI workflows ("agents as code"), and DevOps integration. Prevents 17+ common errors including package deprecation, Android audio cutoff, CSP violations, missing dynamic variables, case-sensitive tool names, webhook authentication failures, and WebRTC configuration issues. Provides production-tested templates for React, Next.js, React Native, Swift, and Cloudflare Workers. Token savings: ~73% (22k → 6k tokens). Production tested. Keywords: ElevenLabs Agents, ElevenLabs voice agents, AI voice agents, conversational AI, @elevenlabs/react, @elevenlabs/client, @elevenlabs/react-native, @elevenlabs/elevenlabs-js, @elevenlabs/agents-cli, elevenlabs SDK, voice AI, TTS, text-to-speech, ASR, speech recognition, turn-taking model, WebRTC voice, WebSocket voice, ElevenLabs conversation, agent system prompt, agent tools, agent knowledge base, RAG voice agents, multi-voice agents, pronunciation dictionary, voice speed control, elevenlabs scribe, @11labs deprecated, Android audio cutoff, CSP violation elevenlabs, dynamic variables elevenlabs, case-sensitive tool names, webhook authentication

🇺🇸 | English | Translated

Mobile Development | jchaselubitz/drill-app

expo-audio

Guide for using expo-audio to implement audio playback and recording in React Native apps. Apply when working with audio features, sound playback, recording, or text-to-speech functionality.

🇺🇸 | English | Translated

AI & Machine Learning | gaelic-ghost/productivity...

speak-with-profile

Profile-aware speech workflow for narrated notes, spoken drafts, audio summaries, accessibility reads, and other text-to-speech tasks. Use when one front-door workflow should resolve voice profiles, enforce disclosure, and apply manifest tracking before delegating to built-in `$speech` or a deterministic local CLI path.

🇺🇸 | English | Translated

3 scripts/Attention

AI & Machine Learning | bytedance/agentkit-sample...

byted-text-to-speech

Convert text to speech (TTS). Powered by the VolcEngine Doubao Text-to-Speech API, it supports streaming synthesis, multiple voice timbres, adjustments to speech rate/pitch/loudness, Markdown syntax filtering, and reading LaTeX formulas aloud. Use this skill when users need to convert text to speech or generate read-aloud audio, dubbing, narration, or broadcasts, or when they mention terms like 'text-to-speech', 'TTS', 'speech synthesis', 'reading aloud', or 'dubbing'.

🇨🇳 | Chinese | Translated

2 scripts/Checked