
# EachLabs Voice & Audio

Text-to-speech, speech-to-text transcription, voice conversion, and audio utilities via the EachLabs Predictions API.

## Authentication

Header: `X-API-Key: <your-api-key>`

Set the `EACHLABS_API_KEY` environment variable. Get your key at eachlabs.ai.
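For interactive use, the key can be exported once per shell session; a minimal sketch (the placeholder value is an assumption — substitute your real key from eachlabs.ai):

```bash
# Export the key once; every curl call in this document reads it from the environment.
export EACHLABS_API_KEY="your-api-key"

# All EachLabs requests then authenticate via the X-API-Key header, e.g.:
#   curl -H "X-API-Key: $EACHLABS_API_KEY" "https://api.eachlabs.ai/v1/model?slug=<slug>"
```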

## Available Models

### Text-to-Speech

| Model | Slug | Best For |
| --- | --- | --- |
| ElevenLabs TTS | `elevenlabs-text-to-speech` | High-quality TTS |
| ElevenLabs TTS w/ Timestamps | `elevenlabs-text-to-speech-with-timestamp` | TTS with word timing |
| ElevenLabs Text to Dialogue | `elevenlabs-text-to-dialogue` | Multi-speaker dialogue |
| ElevenLabs Sound Effects | `elevenlabs-sound-effects` | Sound effect generation |
| ElevenLabs Voice Design v2 | `elevenlabs-voice-design-v2` | Custom voice design |
| Kling V1 TTS | `kling-v1-tts` | Kling text-to-speech |
| Kokoro 82M | `kokoro-82m` | Lightweight TTS |
| Play AI Dialog | `play-ai-text-to-speech-dialog` | Dialog TTS |
| Stable Audio 2.5 | `stable-audio-2-5-text-to-audio` | Text to audio |

### Speech-to-Text

| Model | Slug | Best For |
| --- | --- | --- |
| ElevenLabs Scribe v2 | `elevenlabs-speech-to-text-scribe-v2` | Best-quality transcription |
| ElevenLabs STT | `elevenlabs-speech-to-text` | Standard transcription |
| Wizper with Timestamp | `wizper-with-timestamp` | Timestamped transcription |
| Wizper | `wizper` | Basic transcription |
| Whisper | `whisper` | Open-source transcription |
| Whisper Diarization | `whisper-diarization` | Speaker identification |
| Incredibly Fast Whisper | `incredibly-fast-whisper` | Fastest transcription |

### Voice Conversion & Cloning

| Model | Slug | Best For |
| --- | --- | --- |
| RVC v2 | `rvc-v2` | Voice conversion |
| Train RVC | `train-rvc` | Train a custom voice model |
| ElevenLabs Voice Clone | `elevenlabs-voice-clone` | Voice cloning |
| ElevenLabs Voice Changer | `elevenlabs-voice-changer` | Voice transformation |
| ElevenLabs Voice Design v3 | `elevenlabs-voice-design-v3` | Advanced voice design |
| ElevenLabs Dubbing | `elevenlabs-dubbing` | Video dubbing |
| Chatterbox S2S | `chatterbox-speech-to-speech` | Speech to speech |
| Open Voice | `openvoice` | Open-source voice cloning |
| XTTS v2 | `xtts-v2` | Multi-language voice cloning |
| Stable Audio 2.5 Inpaint | `stable-audio-2-5-inpaint` | Audio inpainting |
| Stable Audio 2.5 A2A | `stable-audio-2-5-audio-to-audio` | Audio transformation |
| Audio Trimmer | `audio-trimmer-with-fade` | Audio trimming with fade |

### Audio Utilities

| Model | Slug | Best For |
| --- | --- | --- |
| FFmpeg Merge Audio Video | `ffmpeg-api-merge-audio-video` | Merge audio with video |
| Toolkit Video Convert | `toolkit` | Video/audio conversion |

## Prediction Flow

1. **Check the model.** `GET https://api.eachlabs.ai/v1/model?slug=<slug>` validates that the model exists and returns the `request_schema` with the exact input parameters. Always do this before creating a prediction to ensure correct inputs.
2. **Create the prediction.** `POST https://api.eachlabs.ai/v1/prediction` with the model slug, version `"0.0.1"`, and an input matching the schema.
3. **Poll.** `GET https://api.eachlabs.ai/v1/prediction/{id}` until the status is `"success"` or `"failed"`.
4. **Extract the output** from the response.
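The full flow can be wrapped in a single shell helper. This is a minimal sketch, assuming `curl` and `jq` are installed; the response field names `id`, `status`, and `output` are assumptions to verify against a real prediction payload.

```bash
# Minimal end-to-end helper for the prediction flow (assumes curl + jq).
# NOTE: the response fields "id", "status", and "output" are assumptions.
each_predict() {
  slug=$1
  input_json=$2
  base="https://api.eachlabs.ai/v1"

  # 1. Check the model and print its request_schema to stderr for inspection
  curl -s "$base/model?slug=$slug" -H "X-API-Key: $EACHLABS_API_KEY" \
    | jq .request_schema >&2

  # 2. Create the prediction and capture its id
  id=$(curl -s -X POST "$base/prediction" \
    -H "Content-Type: application/json" \
    -H "X-API-Key: $EACHLABS_API_KEY" \
    -d "{\"model\":\"$slug\",\"version\":\"0.0.1\",\"input\":$input_json}" \
    | jq -r .id)

  # 3. Poll until the status is terminal
  while :; do
    status=$(curl -s "$base/prediction/$id" \
      -H "X-API-Key: $EACHLABS_API_KEY" | jq -r .status)
    case $status in success|failed) break ;; esac
    sleep 3
  done

  # 4. Print the output from the final response
  curl -s "$base/prediction/$id" -H "X-API-Key: $EACHLABS_API_KEY" | jq .output
}

# Usage (not run here):
#   each_predict elevenlabs-text-to-speech '{"text":"Hello","voice_id":"EXAVITQu4vr4xnSDxMaL"}'
```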

## Examples

### Text-to-Speech with ElevenLabs

```bash
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "elevenlabs-text-to-speech",
    "version": "0.0.1",
    "input": {
      "text": "Welcome to our product demo. Today we will walk through the key features.",
      "voice_id": "EXAVITQu4vr4xnSDxMaL",
      "model_id": "eleven_v3",
      "stability": 0.5,
      "similarity_boost": 0.7
    }
  }'
```

### Transcription with ElevenLabs Scribe

```bash
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "elevenlabs-speech-to-text-scribe-v2",
    "version": "0.0.1",
    "input": {
      "media_url": "https://example.com/recording.mp3",
      "diarize": true,
      "timestamps_granularity": "word"
    }
  }'
```

### Transcription with Wizper (Whisper)

```bash
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "wizper-with-timestamp",
    "version": "0.0.1",
    "input": {
      "audio_url": "https://example.com/audio.mp3",
      "language": "en",
      "task": "transcribe",
      "chunk_level": "segment"
    }
  }'
```

### Speaker Diarization with Whisper

```bash
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "whisper-diarization",
    "version": "0.0.1",
    "input": {
      "file_url": "https://example.com/meeting.mp3",
      "num_speakers": 3,
      "language": "en",
      "group_segments": true
    }
  }'
```

### Voice Conversion with RVC v2

```bash
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "rvc-v2",
    "version": "0.0.1",
    "input": {
      "input_audio": "https://example.com/vocals.wav",
      "rvc_model": "CUSTOM",
      "custom_rvc_model_download_url": "https://example.com/my-voice-model.zip",
      "pitch_change": 0,
      "output_format": "wav"
    }
  }'
```

### Merge Audio with Video

```bash
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "ffmpeg-api-merge-audio-video",
    "version": "0.0.1",
    "input": {
      "video_url": "https://example.com/video.mp4",
      "audio_url": "https://example.com/narration.mp3",
      "start_offset": 0
    }
  }'
```

## ElevenLabs Voice IDs

The `elevenlabs-text-to-speech` model supports these voice IDs. Pass the raw ID string:

| Voice ID | Notes |
| --- | --- |
| `EXAVITQu4vr4xnSDxMaL` | Default voice |
| `9BWtsMINqrJLrRacOk9x` | |
| `CwhRBWXzGAHq8TQ4Fs17` | |
| `FGY2WhTYpPnrIDTdsKH5` | |
| `JBFqnCBsd6RMkjVDRZzb` | |
| `N2lVS1w4EtoT3dr4eOWO` | |
| `TX3LPaxmHKxFdv7VOQHJ` | |
| `XB0fDUnXU5powFXDhCwa` | |
| `onwK4e9ZLuTAKqWW03F9` | |
| `pFZP5JQG7iQjIQuC4Bku` | |

## Parameter Reference

See `references/MODELS.md` for complete parameter details for each model.