
# EachLabs Voice & Audio

Text-to-speech, speech-to-text transcription, voice conversion, and audio utilities via the EachLabs Predictions API.

## Authentication

Header: `X-API-Key: <your-api-key>`

Set the `EACHLABS_API_KEY` environment variable. Get your key at eachlabs.ai.
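For interactive use, the key can be exported once per shell session; a minimal sketch (the placeholder value is an assumption — substitute your real key from eachlabs.ai):

```bash
# Export the key once; every curl call in this document reads it from the environment.
export EACHLABS_API_KEY="your-api-key"

# All EachLabs requests then authenticate via the X-API-Key header, e.g.:
#   curl -H "X-API-Key: $EACHLABS_API_KEY" "https://api.eachlabs.ai/v1/model?slug=<slug>"
```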

## Available Models

### Text-to-Speech

| Model | Slug | Best For |
| --- | --- | --- |
| ElevenLabs TTS | `elevenlabs-text-to-speech` | High-quality TTS |
| ElevenLabs TTS w/ Timestamps | `elevenlabs-text-to-speech-with-timestamp` | TTS with word timing |
| ElevenLabs Text to Dialogue | `elevenlabs-text-to-dialogue` | Multi-speaker dialogue |
| ElevenLabs Sound Effects | `elevenlabs-sound-effects` | Sound effect generation |
| ElevenLabs Voice Design v2 | `elevenlabs-voice-design-v2` | Custom voice design |
| Kling V1 TTS | `kling-v1-tts` | Kling text-to-speech |
| Kokoro 82M | `kokoro-82m` | Lightweight TTS |
| Play AI Dialog | `play-ai-text-to-speech-dialog` | Dialog TTS |
| Stable Audio 2.5 | `stable-audio-2-5-text-to-audio` | Text to audio |

### Speech-to-Text

| Model | Slug | Best For |
| --- | --- | --- |
| ElevenLabs Scribe v2 | `elevenlabs-speech-to-text-scribe-v2` | Best-quality transcription |
| ElevenLabs STT | `elevenlabs-speech-to-text` | Standard transcription |
| Wizper with Timestamp | `wizper-with-timestamp` | Timestamped transcription |
| Wizper | `wizper` | Basic transcription |
| Whisper | `whisper` | Open-source transcription |
| Whisper Diarization | `whisper-diarization` | Speaker identification |
| Incredibly Fast Whisper | `incredibly-fast-whisper` | Fastest transcription |

### Voice Conversion & Cloning

| Model | Slug | Best For |
| --- | --- | --- |
| RVC v2 | `rvc-v2` | Voice conversion |
| Train RVC | `train-rvc` | Train a custom voice model |
| ElevenLabs Voice Clone | `elevenlabs-voice-clone` | Voice cloning |
| ElevenLabs Voice Changer | `elevenlabs-voice-changer` | Voice transformation |
| ElevenLabs Voice Design v3 | `elevenlabs-voice-design-v3` | Advanced voice design |
| ElevenLabs Dubbing | `elevenlabs-dubbing` | Video dubbing |
| Chatterbox S2S | `chatterbox-speech-to-speech` | Speech to speech |
| Open Voice | `openvoice` | Open-source voice cloning |
| XTTS v2 | `xtts-v2` | Multi-language voice cloning |
| Stable Audio 2.5 Inpaint | `stable-audio-2-5-inpaint` | Audio inpainting |
| Stable Audio 2.5 A2A | `stable-audio-2-5-audio-to-audio` | Audio transformation |
| Audio Trimmer | `audio-trimmer-with-fade` | Audio trimming with fade |

### Audio Utilities

| Model | Slug | Best For |
| --- | --- | --- |
| FFmpeg Merge Audio Video | `ffmpeg-api-merge-audio-video` | Merge audio with video |
| Toolkit Video Convert | `toolkit` | Video/audio conversion |

## Prediction Flow

1. **Check the model.** `GET https://api.eachlabs.ai/v1/model?slug=<slug>` validates that the model exists and returns the `request_schema` with the exact input parameters. Always do this before creating a prediction to ensure correct inputs.
2. **Create the prediction.** `POST https://api.eachlabs.ai/v1/prediction` with the model slug, version `"0.0.1"`, and an input matching the schema.
3. **Poll.** `GET https://api.eachlabs.ai/v1/prediction/{id}` until the status is `"success"` or `"failed"`.
4. **Extract the output** from the response.
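The full flow can be wrapped in a single shell helper. This is a minimal sketch, assuming `curl` and `jq` are installed; the response field names `id`, `status`, and `output` are assumptions to verify against a real prediction payload.

```bash
# Minimal end-to-end helper for the prediction flow (assumes curl + jq).
# NOTE: the response fields "id", "status", and "output" are assumptions.
each_predict() {
  slug=$1
  input_json=$2
  base="https://api.eachlabs.ai/v1"

  # 1. Check the model and print its request_schema to stderr for inspection
  curl -s "$base/model?slug=$slug" -H "X-API-Key: $EACHLABS_API_KEY" \
    | jq .request_schema >&2

  # 2. Create the prediction and capture its id
  id=$(curl -s -X POST "$base/prediction" \
    -H "Content-Type: application/json" \
    -H "X-API-Key: $EACHLABS_API_KEY" \
    -d "{\"model\":\"$slug\",\"version\":\"0.0.1\",\"input\":$input_json}" \
    | jq -r .id)

  # 3. Poll until the status is terminal
  while :; do
    status=$(curl -s "$base/prediction/$id" \
      -H "X-API-Key: $EACHLABS_API_KEY" | jq -r .status)
    case $status in success|failed) break ;; esac
    sleep 3
  done

  # 4. Print the output from the final response
  curl -s "$base/prediction/$id" -H "X-API-Key: $EACHLABS_API_KEY" | jq .output
}

# Usage (not run here):
#   each_predict elevenlabs-text-to-speech '{"text":"Hello","voice_id":"EXAVITQu4vr4xnSDxMaL"}'
```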

## Examples

### Text-to-Speech with ElevenLabs

```bash
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "elevenlabs-text-to-speech",
    "version": "0.0.1",
    "input": {
      "text": "Welcome to our product demo. Today we will walk through the key features.",
      "voice_id": "EXAVITQu4vr4xnSDxMaL",
      "model_id": "eleven_v3",
      "stability": 0.5,
      "similarity_boost": 0.7
    }
  }'
```

### Transcription with ElevenLabs Scribe

```bash
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "elevenlabs-speech-to-text-scribe-v2",
    "version": "0.0.1",
    "input": {
      "media_url": "https://example.com/recording.mp3",
      "diarize": true,
      "timestamps_granularity": "word"
    }
  }'
```

### Transcription with Wizper (Whisper)

```bash
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "wizper-with-timestamp",
    "version": "0.0.1",
    "input": {
      "audio_url": "https://example.com/audio.mp3",
      "language": "en",
      "task": "transcribe",
      "chunk_level": "segment"
    }
  }'
```

### Speaker Diarization with Whisper

```bash
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "whisper-diarization",
    "version": "0.0.1",
    "input": {
      "file_url": "https://example.com/meeting.mp3",
      "num_speakers": 3,
      "language": "en",
      "group_segments": true
    }
  }'
```

### Voice Conversion with RVC v2

```bash
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "rvc-v2",
    "version": "0.0.1",
    "input": {
      "input_audio": "https://example.com/vocals.wav",
      "rvc_model": "CUSTOM",
      "custom_rvc_model_download_url": "https://example.com/my-voice-model.zip",
      "pitch_change": 0,
      "output_format": "wav"
    }
  }'
```

### Merge Audio with Video

```bash
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "ffmpeg-api-merge-audio-video",
    "version": "0.0.1",
    "input": {
      "video_url": "https://example.com/video.mp4",
      "audio_url": "https://example.com/narration.mp3",
      "start_offset": 0
    }
  }'
```

## ElevenLabs Voice IDs

The `elevenlabs-text-to-speech` model supports these voice IDs. Pass the raw ID string:

| Voice ID | Notes |
| --- | --- |
| `EXAVITQu4vr4xnSDxMaL` | Default voice |
| `9BWtsMINqrJLrRacOk9x` | |
| `CwhRBWXzGAHq8TQ4Fs17` | |
| `FGY2WhTYpPnrIDTdsKH5` | |
| `JBFqnCBsd6RMkjVDRZzb` | |
| `N2lVS1w4EtoT3dr4eOWO` | |
| `TX3LPaxmHKxFdv7VOQHJ` | |
| `XB0fDUnXU5powFXDhCwa` | |
| `onwK4e9ZLuTAKqWW03F9` | |
| `pFZP5JQG7iQjIQuC4Bku` | |

## Parameter Reference

See `references/MODELS.md` for complete parameter details for each model.