| Task | Model | Method |
|---|---|---|
| Custom voice with preset speakers | CustomVoice | `generate_custom_voice` |
| Design new voice via description | VoiceDesign | `generate_voice_design` |
| Clone voice from audio sample | Base | |
| Encode/decode audio | Tokenizer | |
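
The table above can be read as a lookup from task to checkpoint. The sketch below encodes it as a plain dictionary. `TASKS` and `pick_model` are hypothetical names introduced for illustration, the checkpoint IDs are the ones used in the examples in this document, and the method entry is left as `None` wherever this document does not show the call.

```python
# Hypothetical helper: maps a task name to (checkpoint ID, generation method).
# Method names are only filled in where this document shows the call.
TASKS = {
    "custom_voice": ("Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice", "generate_custom_voice"),
    "voice_design": ("Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign", "generate_voice_design"),
    "voice_clone": ("Qwen/Qwen3-TTS-12Hz-1.7B-Base", None),
    "tokenizer": ("Qwen/Qwen3-TTS-Tokenizer-12Hz", None),
}


def pick_model(task):
    """Return the (checkpoint, method) pair for a task, or raise ValueError."""
    try:
        return TASKS[task]
    except KeyError:
        known = ", ".join(sorted(TASKS))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}") from None


checkpoint, method = pick_model("voice_design")
print(checkpoint, method)
```

Keeping the mapping in one place means the choice of checkpoint is made by task name rather than by copying model IDs around.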

| Model | Features |
|---|---|
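| `Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice` | 9 preset speakers, instruction control |
| `Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign` | Create voices from natural language descriptions |
| `Qwen/Qwen3-TTS-12Hz-1.7B-Base` | Voice cloning, fine-tuning base |
| | Smaller custom voice model |
| | Smaller base model for cloning/fine-tuning |
| `Qwen/Qwen3-TTS-Tokenizer-12Hz` | Audio encoder/decoder |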

```python
import torch
import soundfile as sf
from qwen_tts import Qwen3TTSModel

# Load the 1.7B CustomVoice checkpoint on the first GPU in bfloat16.
# attn_implementation="flash_attention_2" requires the flash-attn package.
model = Qwen3TTSModel.from_pretrained(
    "Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice",
    device_map="cuda:0",
    dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)
```

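
The loading call above hard-codes `attn_implementation="flash_attention_2"`, which fails if the `flash-attn` package is not installed. A minimal sketch of a fallback follows. `pick_attn_implementation` and `build_load_kwargs` are hypothetical helpers, and it is an assumption that `Qwen3TTSModel.from_pretrained` accepts `"sdpa"` the way Hugging Face Transformers models usually do.

```python
import importlib.util


def pick_attn_implementation():
    """Prefer FlashAttention 2 when the flash_attn module is importable.

    Assumption: "sdpa" is an accepted fallback value for the loader.
    """
    if importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    return "sdpa"


def build_load_kwargs(device="cuda:0"):
    """Collect keyword arguments for from_pretrained (dtype is passed separately)."""
    return {
        "device_map": device,
        "attn_implementation": pick_attn_implementation(),
    }


print(build_load_kwargs())
```

The result could then be passed as `Qwen3TTSModel.from_pretrained(model_id, dtype=torch.bfloat16, **build_load_kwargs())`, again assuming the loader accepts the fallback value.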
**Available Speakers:**
| Speaker | Description | Native Language |
|---------|-------------|-----------------|
| Vivian | Bright, edgy young female | Chinese |
| Serena | Warm, gentle young female | Chinese |
| Uncle_Fu | Low, mellow mature male | Chinese |
| Dylan | Youthful Beijing male | Chinese (Beijing) |
| Eric | Lively Chengdu male | Chinese (Sichuan) |
| Ryan | Dynamic male with rhythmic drive | English |
| Aiden | Sunny American male | English |
| Ono_Anna | Playful Japanese female | Japanese |
| Sohee | Warm Korean female | Korean |
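
For picking a speaker programmatically, the speaker table can be mirrored as a small dictionary. This is only a sketch: `SPEAKERS` and `speakers_for` are hypothetical names, and the data is copied from the table above rather than queried from the library.

```python
# Preset speakers and their native language, copied from the table above.
SPEAKERS = {
    "Vivian": "Chinese",
    "Serena": "Chinese",
    "Uncle_Fu": "Chinese",
    "Dylan": "Chinese (Beijing)",
    "Eric": "Chinese (Sichuan)",
    "Ryan": "English",
    "Aiden": "English",
    "Ono_Anna": "Japanese",
    "Sohee": "Korean",
}


def speakers_for(language):
    """Return speaker names whose native language starts with `language`."""
    wanted = language.casefold()
    return [
        name
        for name, native in SPEAKERS.items()
        if native.casefold().startswith(wanted)
    ]


print(speakers_for("English"))
print(speakers_for("Chinese"))
```

Prefix matching means `"Chinese"` also returns the Beijing and Sichuan dialect speakers, while `"Chinese (Beijing)"` returns only Dylan.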

```python
import torch
import soundfile as sf
from qwen_tts import Qwen3TTSModel

# Load the VoiceDesign checkpoint, which builds a voice from a text description.
model = Qwen3TTSModel.from_pretrained(
    "Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign",
    device_map="cuda:0",
    dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)

# `instruct` carries the natural-language description of the target voice.
wavs, sr = model.generate_voice_design(
    text="Welcome to our presentation today.",
    language="English",
    instruct="Professional male voice, warm baritone, confident and clear",
)
# Save the first generated waveform at the returned sample rate.
sf.write("designed_voice.wav", wavs[0], sr)
```

```python
import torch
from qwen_tts import Qwen3TTSModel

# Load the Base checkpoint used for voice cloning and fine-tuning.
model = Qwen3TTSModel.from_pretrained(
    "Qwen/Qwen3-TTS-12Hz-1.7B-Base",
    device_map="cuda:0",
    dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
)
```

```python
from qwen_tts import Qwen3TTSTokenizer
import soundfile as sf

# Load the audio tokenizer (encoder/decoder) checkpoint.
tokenizer = Qwen3TTSTokenizer.from_pretrained(
    "Qwen/Qwen3-TTS-Tokenizer-12Hz",
    device_map="cuda:0",
)
```

**Generation parameters (`generate_*` methods):**

```python
wavs, sr = model.generate_custom_voice(
    text="...",
    language="Auto",
    speaker="Ryan",  # one of the preset speakers listed above
    max_new_tokens=2048,
    do_sample=True,
    top_k=50,
    top_p=1.0,
    temperature=0.9,
    repetition_penalty=1.05,
)
```
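
The sampling arguments in the call above can be kept in one place and overridden per request. The sketch below is a hypothetical helper, not part of `qwen_tts`: `SAMPLING_PRESET` copies the values from that call (whether they are also the library defaults is not stated here), and the range checks are assumptions based on the usual meaning of these parameters.

```python
# Values copied from the generate_custom_voice call shown above.
SAMPLING_PRESET = {
    "max_new_tokens": 2048,
    "do_sample": True,
    "top_k": 50,
    "top_p": 1.0,
    "temperature": 0.9,
    "repetition_penalty": 1.05,
}


def generation_kwargs(**overrides):
    """Merge overrides into the preset and reject obviously invalid values."""
    unknown = set(overrides) - set(SAMPLING_PRESET)
    if unknown:
        raise TypeError(f"unknown generation parameters: {sorted(unknown)}")
    kwargs = {**SAMPLING_PRESET, **overrides}
    # Assumed ranges: top_p is a probability mass, temperature must be positive.
    if not 0.0 < kwargs["top_p"] <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    if kwargs["temperature"] <= 0:
        raise ValueError("temperature must be positive")
    return kwargs


print(generation_kwargs(temperature=0.7))
```

A call would then look like `model.generate_custom_voice(text="...", language="Auto", speaker="Ryan", **generation_kwargs(temperature=0.7))`.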