voice-isolator

ElevenLabs Voice Isolator

Removes background noise from audio and isolates vocals/speech — useful for cleaning up noisy recordings, prepping audio for transcription, or pulling dialogue out of a mixed track.

Setup: See the Installation Guide. For JavaScript, use `@elevenlabs/*` packages only.

Quick Start

Python

python

from elevenlabs import ElevenLabs

client = ElevenLabs()

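with open("noisy.mp3", "rb") as audio_file:
    audio_stream = client.audio_isolation.convert(audio=audio_file)

    # Write the result while the input file is still open, in case the SDK
    # reads it lazily as the stream is consumed.
    with open("clean.mp3", "wb") as f:
        for chunk in audio_stream:
            f.write(chunk)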

JavaScript

javascript

import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";
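import { createReadStream, createWriteStream } from "fs";
import { Readable } from "stream";

const client = new ElevenLabsClient();

const audioStream = await client.audioIsolation.convert({
  audio: createReadStream("noisy.mp3"),
});

// Readable.from() accepts both Node streams and web ReadableStreams (Node 18+),
// so this works whichever stream type the installed SDK version returns.
Readable.from(audioStream).pipe(createWriteStream("clean.mp3"));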

cURL

bash

curl -X POST "https://api.elevenlabs.io/v1/audio-isolation" \
  -H "xi-api-key: $ELEVENLABS_API_KEY" \
  -F "audio=@noisy.mp3" \
  --output clean.mp3

Parameters

Parameter	Type	Default	Description
`audio`	file (required)	—	Audio file with vocals/speech to isolate
`file_format`	string	`other`	`other` for any encoded audio, or `pcm_s16le_16` for 16-bit PCM mono @ 16kHz little-endian (lower latency)

Isolating from a URL

python

import requests
from io import BytesIO
from elevenlabs import ElevenLabs

client = ElevenLabs()

audio_url = "https://example.com/noisy.mp3"
response = requests.get(audio_url, timeout=30)
response.raise_for_status()  # fail early on a 4xx/5xx download error
audio_data = BytesIO(response.content)

audio_stream = client.audio_isolation.convert(audio=audio_data)

with open("clean.mp3", "wb") as f:
    for chunk in audio_stream:
        f.write(chunk)

Low-Latency PCM Input

If you already have raw 16-bit PCM mono @ 16kHz, passing `file_format="pcm_s16le_16"` skips decoding and reduces latency:

python

audio_stream = client.audio_isolation.convert(
    audio=pcm_bytes,
    file_format="pcm_s16le_16",
)
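
One way to obtain `pcm_bytes` is to read them from a WAV file that is already in the required layout. The sketch below uses only the standard-library `wave` module. The helper name `load_pcm_s16le_16` is hypothetical and not part of the ElevenLabs SDK, and it assumes an uncompressed PCM WAV, which is the only kind `wave` reads. It rejects anything that is not mono, 16-bit, 16 kHz rather than trying to resample.

```python
import wave


def load_pcm_s16le_16(path: str) -> bytes:
    """Return raw frames from a WAV file, checking they are 16-bit mono @ 16 kHz.

    Hypothetical helper: not part of the ElevenLabs SDK.
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError(f"expected mono, got {wav.getnchannels()} channels")
        if wav.getsampwidth() != 2:
            raise ValueError(f"expected 16-bit samples, got {8 * wav.getsampwidth()}-bit")
        if wav.getframerate() != 16000:
            raise ValueError(f"expected 16000 Hz, got {wav.getframerate()} Hz")
        # WAV stores PCM samples little-endian, so the frames are already s16le.
        return wav.readframes(wav.getnframes())
```

The returned bytes can then be passed as `audio=pcm_bytes` together with `file_format="pcm_s16le_16"`.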

Supported Formats

Any common encoded audio/video container works as input (MP3, WAV, M4A, FLAC, OGG, WebM, MP4, etc.). Response is a streamed MP3 by default.

Common Workflows

Clean up interview/podcast recordings: strip room tone, HVAC, and traffic noise before editing.
Prep noisy audio for Speech-to-Text: isolate the voice first, then pass the result to `speech_to_text.convert()` for better transcription accuracy.
Extract dialogue from mixed tracks: pull vocals out of a track with music/SFX.
Pre-processing for Voice Changer: isolate the source voice before applying voice transformation.
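
The Speech-to-Text workflow above could be chained roughly as follows. This is a sketch under assumptions rather than a confirmed recipe: the function name `isolate_then_transcribe` is hypothetical, and the `file` and `model_id` keyword arguments and the `"scribe_v1"` model name are assumptions about `speech_to_text.convert()` that should be checked against the Speech-to-Text reference.

```python
from io import BytesIO


def isolate_then_transcribe(client, audio_file, model_id="scribe_v1"):
    """Clean the audio with Voice Isolator, then transcribe the result.

    Assumptions (not from the text above): speech_to_text.convert() takes
    `file` and `model_id` keyword arguments, and "scribe_v1" is a valid model.
    """
    # Buffer the isolated audio in memory so it can be re-uploaded as a file.
    chunks = client.audio_isolation.convert(audio=audio_file)
    cleaned = BytesIO(b"".join(chunks))
    return client.speech_to_text.convert(file=cleaned, model_id=model_id)
```

Call it with an `ElevenLabs()` client and a file opened in binary mode, keeping the call inside the `with open(...)` block so the input is still open while it is uploaded. Buffering keeps the sketch simple; for long recordings, writing the isolated audio to a temporary file may be preferable.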

Error Handling

python

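try:
    audio_stream = client.audio_isolation.convert(audio=audio_file)
    # Read the stream inside the try block so errors raised mid-stream are caught too
    audio_bytes = b"".join(audio_stream)
except Exception as e:
    print(f"Voice isolation failed: {e}")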

Common errors:

401: Invalid API key
422: Invalid parameters (e.g. wrong `file_format` for the supplied audio)
429: Rate limit exceeded
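
Of the errors listed, only 429 is usually worth retrying. A minimal sketch of a retry with exponential backoff follows. The helper `isolate_with_retry` is hypothetical, and it assumes the exception raised by the SDK exposes the HTTP status as a `status_code` attribute; adjust the check if your SDK version reports it differently.

```python
import time


def isolate_with_retry(convert, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `convert()` and return the isolated audio as bytes, retrying on 429.

    `convert` is a zero-argument callable wrapping the SDK call. Reading the
    status from a `status_code` attribute is an assumption about the SDK's
    error type.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # Join the chunks inside the try so errors raised mid-stream are caught.
            return b"".join(convert())
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            if status != 429 or attempt == max_attempts:
                raise  # only 429 is retried, and only up to max_attempts
            sleep(base_delay * 2 ** (attempt - 1))  # waits 1s, 2s, 4s, ...
```

For example, `isolate_with_retry(lambda: client.audio_isolation.convert(audio=BytesIO(data)))`, where `data` holds the input bytes. Building a fresh `BytesIO` inside the lambda matters because a file object that was partly read by a failed attempt would not start from the beginning on the next one.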

References

Installation Guide
