Text-to-Speech: Bulbul
[!IMPORTANT] Auth: `Authorization: Bearer` header, NOT `api-subscription-key`. Base URL: `https://api.sarvam.ai/v1`
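For callers that build requests by hand rather than through the SDK, the note above can be sketched as a small header helper. This is only an illustration: `auth_headers` and the `SARVAM_API_KEY` environment variable name are assumptions, not names taken from this document, and no endpoint path is shown because none is given here.

```python
import os

# Base URL as stated in the note above
BASE_URL = "https://api.sarvam.ai/v1"


def auth_headers(api_key: str) -> dict[str, str]:
    """Build Bearer-style auth headers (hypothetical helper)."""
    if not api_key:
        raise ValueError("API key is empty")
    return {"Authorization": f"Bearer {api_key}"}


# SARVAM_API_KEY is an assumed variable name; the fallback is a placeholder
headers = auth_headers(os.environ.get("SARVAM_API_KEY", "YOUR_SARVAM_API_KEY"))
```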
Model
`bulbul:v3` (the examples below use the speaker `shubh`)

Quick Start (Python)
```python
from sarvamai import SarvamAI
from sarvamai.play import save

# Credentials are not passed explicitly in this example
client = SarvamAI()

# REST call: synthesize Hindi text with the bulbul:v3 model
response = client.text_to_speech.convert(
    text="नमस्ते, आप कैसे हैं?",
    target_language_code="hi-IN",
    model="bulbul:v3",
    speaker="shubh",
)

# Write the returned audio to disk
save(response, "output.wav")
```
HTTP Stream (lower latency, binary audio)
```python
# Reuses `client` from the Quick Start above
chunks = []
for chunk in client.text_to_speech.convert_stream(
    text="Hello from Sarvam AI",
    target_language_code="en-IN",
    speaker="shubh",
    model="bulbul:v3",
):
    chunks.append(chunk)  # each chunk is raw binary audio

audio = b"".join(chunks)
```
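Collecting every chunk in a list gives up part of the latency benefit for long inputs. A minimal sketch of writing chunks to disk as they arrive is shown below; `write_audio_stream` is a hypothetical helper, and it assumes only that the stream yields `bytes` objects, as the loop above suggests.

```python
from pathlib import Path
from typing import Iterable


def write_audio_stream(chunks: Iterable[bytes], path: str) -> int:
    """Write binary audio chunks to `path` in arrival order.

    Returns the total number of bytes written.
    """
    total = 0
    with Path(path).open("wb") as f:
        for chunk in chunks:
            if not chunk:
                continue  # skip empty chunks
            f.write(chunk)
            total += len(chunk)
    return total
```

It could be called with the iterator from the streaming example, for instance `write_audio_stream(client.text_to_speech.convert_stream(...), "output.wav")`.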
Quick Start (JavaScript/TypeScript)
```typescript
import { SarvamAIClient } from "sarvamai";
import { writeFile } from "fs/promises";

const client = new SarvamAIClient({ apiSubscriptionKey: "YOUR_SARVAM_API_KEY" });

// REST
const response = await client.textToSpeech.convert({
  text: "नमस्ते, आप कैसे हैं?",
  target_language_code: "hi-IN",
  model: "bulbul:v3",
  speaker: "shubh"
});

// HTTP Stream (lower latency, returns BinaryResponse)
const streamResponse = await client.textToSpeech.convertStream({
  text: "Hello from Sarvam AI",
  target_language_code: "en-IN",
  speaker: "shubh",
  model: "bulbul:v3"
});

// Read the full binary body and write it to disk
const bytes = await streamResponse.bytes();
await writeFile("output.wav", bytes);
```
WebSocket Streaming
```python
import asyncio
from sarvamai import AsyncSarvamAI


async def tts_stream():
    client = AsyncSarvamAI()
    async with client.text_to_speech_streaming.connect(model="bulbul:v3") as ws:
        # Set language and voice before sending text
        await ws.configure(target_language_code="hi-IN", speaker="shubh")
        await ws.convert("Your text here")
        await ws.flush()
        async for message in ws:
            pass  # base64 audio chunks


asyncio.run(tts_stream())
```
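The WebSocket example leaves the handling of the base64 audio chunks open. One way to turn them back into binary audio is sketched below. It assumes the base64 string has already been pulled out of each message; the message field that carries it is not shown in this document, so that extraction step is left to the caller. `join_base64_audio` is a hypothetical helper name.

```python
import base64
from typing import Iterable


def join_base64_audio(encoded_chunks: Iterable[str]) -> bytes:
    """Decode base64 audio chunks in arrival order and concatenate them."""
    # validate=True rejects characters outside the base64 alphabet
    return b"".join(base64.b64decode(c, validate=True) for c in encoded_chunks)
```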
Character Limits
| Method | Max Text |
|---|---|
| REST | 2,500 chars |
| HTTP Stream | 3,500 chars |
| WebSocket | 2,500 chars/msg |
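Text longer than these limits has to be split before it is sent. The sketch below shows one possible approach: cut at the last space that fits, and fall back to a hard cut when a piece contains no space. `split_text` is a hypothetical helper, and it assumes the limits count Python string characters (code points), which this document does not confirm.

```python
def split_text(text: str, limit: int = 2500) -> list[str]:
    """Split text into pieces of at most `limit` characters.

    Prefers to break at a space; falls back to a hard cut otherwise.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    pieces = []
    rest = text.strip()
    while len(rest) > limit:
        # Look for the last space at or before the limit
        cut = rest.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no space found: hard cut
        pieces.append(rest[:cut].rstrip())
        rest = rest[cut:].lstrip()
    if rest:
        pieces.append(rest)
    return pieces
```

Each piece can then be passed to a separate request, using `limit=3500` for the HTTP stream method.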
Gotchas
| Gotcha | Detail |
|---|---|
| JS method name | |
| | SDK accepts these but API returns 400 for v3. Only |
| v2 voices incompatible | |
| Sample rate >24kHz | 32kHz, 44.1kHz, and 48kHz are available only via REST, not streaming. |
| REST response | Base64-encoded audio in |
| Pronunciation dictionary | |
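The sample-rate gotcha can be caught on the client side before a request is made. The guard below is a sketch that encodes only the rule stated in the table (rates above 24 kHz are REST-only). It does not check whether a given rate is otherwise supported, and `check_sample_rate` is a hypothetical helper name.

```python
# Highest sample rate usable with streaming, per the table above (Hz)
STREAMING_MAX_SAMPLE_RATE = 24_000


def check_sample_rate(sample_rate: int, streaming: bool) -> None:
    """Raise ValueError if a streaming call asks for a REST-only sample rate."""
    if streaming and sample_rate > STREAMING_MAX_SAMPLE_RATE:
        raise ValueError(
            f"{sample_rate} Hz is only available via REST, not streaming"
        )
```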
Full Docs
Fetch voice catalog, streaming protocol, pronunciation dictionary CRUD, and codec options from:
- https://docs.sarvam.ai/llms.txt — comprehensive docs index
- TTS Overview
- Voice Catalog
- HTTP Stream
- Pronunciation Dictionary
- Rate Limits