voice-isolator

ElevenLabs Voice Isolator

Removes background noise from audio and isolates vocals/speech — useful for cleaning up noisy recordings, prepping audio for transcription, or pulling dialogue out of a mixed track.
Setup: See Installation Guide. For JavaScript, use `@elevenlabs/*` packages only.

Quick Start

Python

python
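from elevenlabs import ElevenLabs

# With no arguments, the client reads the API key from ELEVENLABS_API_KEY.
client = ElevenLabs()

# Keep the input file open while the response is written: the SDK may read it
# lazily, when the returned stream is first iterated.
with open("noisy.mp3", "rb") as audio_file:
    audio_stream = client.audio_isolation.convert(audio=audio_file)
    with open("clean.mp3", "wb") as f:
        for chunk in audio_stream:
            f.write(chunk)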

JavaScript

javascript
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";
import { createReadStream, createWriteStream } from "fs";

const client = new ElevenLabsClient();

const audioStream = await client.audioIsolation.convert({
  audio: createReadStream("noisy.mp3"),
});

audioStream.pipe(createWriteStream("clean.mp3"));

cURL

bash
curl -X POST "https://api.elevenlabs.io/v1/audio-isolation" \
  -H "xi-api-key: $ELEVENLABS_API_KEY" \
  -F "audio=@noisy.mp3" \
  --output clean.mp3

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `audio` | file (required) | - | Audio file with vocals/speech to isolate |
| `file_format` | string | `other` | `other` for any encoded audio, or `pcm_s16le_16` for 16-bit PCM mono @ 16kHz little-endian (lower latency) |

Isolating from a URL

python
import requests
from io import BytesIO
from elevenlabs import ElevenLabs

client = ElevenLabs()

# Download the audio into memory; fail early on HTTP errors or a stalled request.
audio_url = "https://example.com/noisy.mp3"
response = requests.get(audio_url, timeout=30)
response.raise_for_status()
audio_data = BytesIO(response.content)

audio_stream = client.audio_isolation.convert(audio=audio_data)

with open("clean.mp3", "wb") as f:
    for chunk in audio_stream:
        f.write(chunk)

Low-Latency PCM Input

If you already have raw 16-bit PCM mono @ 16kHz, passing `file_format="pcm_s16le_16"` skips decoding and reduces latency:
python
audio_stream = client.audio_isolation.convert(
    audio=pcm_bytes,
    file_format="pcm_s16le_16",
)
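
One way to obtain bytes in that layout is to read them from a WAV file that is already 16-bit mono at 16 kHz. The sketch below uses only the standard-library `wave` module; the helper name `load_pcm_s16le_16` is hypothetical and not part of the ElevenLabs SDK. It validates the format rather than converting, so audio in any other layout is rejected.

```python
import wave


def load_pcm_s16le_16(path: str) -> bytes:
    """Return raw PCM frames from a WAV file, checking it is 16-bit mono @ 16 kHz."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getframerate() != 16000:
            raise ValueError("expected a 16 kHz sample rate")
        # WAV stores PCM samples little-endian, so the frames are already s16le.
        return wav.readframes(wav.getnframes())
```

The returned bytes could then be passed as `pcm_bytes` in the call above, for example `pcm_bytes = load_pcm_s16le_16("speech.wav")` (the file name is a placeholder).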

Supported Formats

Any common encoded audio/video container works as input (MP3, WAV, M4A, FLAC, OGG, WebM, MP4, etc.). Response is a streamed MP3 by default.
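
Because the response arrives as a stream of byte chunks, it can help to write it to disk in one place and sanity-check the result. The following is a minimal sketch; `save_stream` and `looks_like_mp3` are hypothetical helper names, and the MP3 check is only a heuristic based on the file's first bytes (an ID3 tag or an MPEG frame sync), not a full validation.

```python
from typing import Iterable


def save_stream(chunks: Iterable[bytes], path: str) -> int:
    """Write streamed chunks to `path` and return the number of bytes written."""
    written = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            written += len(chunk)
    return written


def looks_like_mp3(path: str) -> bool:
    """Heuristic: the file starts with an ID3 tag or an MPEG frame sync (11 set bits)."""
    with open(path, "rb") as f:
        head = f.read(3)
    if head.startswith(b"ID3"):
        return True
    return len(head) >= 2 and head[0] == 0xFF and (head[1] & 0xE0) == 0xE0
```

With the Quick Start client, this could be used as `save_stream(audio_stream, "clean.mp3")` followed by `looks_like_mp3("clean.mp3")`.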

Common Workflows

  • Clean up interview/podcast recordings: strip room tone, HVAC, and traffic noise before editing.
  • Prep noisy audio for Speech-to-Text: isolate the voice first, then pass the result to `speech_to_text.convert()` for better transcription accuracy.
  • Extract dialogue from mixed tracks: pull vocals out of a track with music/SFX.
  • Pre-process for Voice Changer: isolate the source voice before applying voice transformation.
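
The Speech-to-Text workflow above can be sketched as a single helper. This is an illustration under stated assumptions, not a definitive implementation: the function name `isolate_then_transcribe` is hypothetical, and the `file` and `model_id` arguments and the `"scribe_v1"` model name are assumptions about the Speech-to-Text API that this page does not confirm.

```python
from io import BytesIO


def isolate_then_transcribe(client, path: str, model_id: str = "scribe_v1"):
    """Isolate the voice in `path`, then transcribe the cleaned audio."""
    with open(path, "rb") as audio_file:
        # Consume the isolation stream while the input file is still open.
        clean_audio = b"".join(client.audio_isolation.convert(audio=audio_file))
    # Assumption: speech_to_text.convert() accepts `file` and `model_id`.
    return client.speech_to_text.convert(file=BytesIO(clean_audio), model_id=model_id)
```

With the Quick Start client, a call would look like `isolate_then_transcribe(client, "noisy.mp3")`. The cleaned audio is held in memory, which suits short recordings; long files may be better written to disk first.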

Error Handling

python
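try:
    audio_stream = client.audio_isolation.convert(audio=audio_file)
    # Read the stream inside the try block: the SDK may send the request only
    # when the stream is iterated, so API errors can surface here.
    audio_bytes = b"".join(audio_stream)
except Exception as e:
    print(f"Voice isolation failed: {e}")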
Common errors:
  • 401: Invalid API key
  • 422: Invalid parameters (e.g. wrong `file_format` for the supplied audio)
  • 429: Rate limit exceeded
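
Of the errors listed, only 429 is usually worth retrying; 401 and 422 will fail again until the key or parameters are fixed. A minimal retry sketch with exponential backoff follows. The helper `with_retry` is hypothetical, and it assumes the raised exception exposes the HTTP status as a `status_code` attribute, which should be checked against the SDK version in use.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retry(
    call: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying with exponential backoff when it fails with HTTP 429."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            # Assumption: the exception carries the HTTP status as `status_code`.
            status = getattr(exc, "status_code", None)
            if status != 429 or attempt == max_attempts - 1:
                raise  # not a rate limit, or no attempts left
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

The callable passed in should re-open the input file and consume the stream itself, so that each retry starts from a fresh request.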

References

  • Installation Guide