venice-audio-music

Venice Music / Async Audio

Music (and long-form voice) generation is asynchronous. The flow is:
POST /api/v1/audio/quote      → price in USD
POST /api/v1/audio/queue      → { queue_id }      (funds reserved)
POST /api/v1/audio/retrieve   → status or binary audio
POST /api/v1/audio/complete   → finalize & delete media
For short text-to-speech, use the synchronous `venice-audio-speech` endpoint instead.

Use when

  • You need songs, jingles, scores, soundscapes, or long narration.
  • The selected model uses duration-based or character-based pricing and must be priced before submission.
  • The expected generation time is long enough (> 20 s) that a synchronous call would time out.

Lifecycle

1. `POST /audio/quote`: price it first

bash
curl https://api.venice.ai/api/v1/audio/quote \
  -H "Authorization: Bearer $VENICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "elevenlabs-music",
    "duration_seconds": 60
  }'
Response: `{"quote": 0.48}` (USD).

| Field | Notes |
| --- | --- |
| `model` | Required. Music/audio model from `GET /models?type=music`. |
| `duration_seconds` | Integer or numeric string. Only if the model reports duration metadata. |
| `character_count` | Required for models with `pricing.per_thousand_characters` (long narration). |
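
The field rules above can be sketched as a small request builder. This is only an illustration: `buildQuoteBody`, the `QuotePricing` type, and the idea of counting characters with `[...text].length` are assumptions, not part of the Venice API, and the way Venice counts characters may differ.

```typescript
// Hypothetical subset of a model's pricing metadata; only the field
// named in this section is modelled.
type QuotePricing = { per_thousand_characters?: unknown }

type QuoteBody = {
  model: string
  duration_seconds?: number
  character_count?: number
}

// Build a /audio/quote body. Character-priced models get character_count;
// other models get duration_seconds when a duration is supplied.
function buildQuoteBody(
  model: string,
  pricing: QuotePricing,
  opts: { durationSeconds?: number; text?: string },
): QuoteBody {
  if (pricing.per_thousand_characters !== undefined) {
    if (opts.text === undefined) {
      throw new Error('character-priced model: text is needed to count characters')
    }
    // Assumption: one character per Unicode code point.
    return { model, character_count: [...opts.text].length }
  }
  const body: QuoteBody = { model }
  if (opts.durationSeconds !== undefined) {
    body.duration_seconds = opts.durationSeconds
  }
  return body
}
```

Passing the result through `JSON.stringify` gives a body in the same shape as the curl example above.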

2. `POST /audio/queue`: enqueue

bash
curl https://api.venice.ai/api/v1/audio/queue \
  -H "Authorization: Bearer $VENICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "elevenlabs-music",
    "prompt": "Uplifting indie-folk acoustic track, 120 BPM, major key.",
    "lyrics_prompt": "Verse 1: Walking through the city lights...\nChorus: We are the dreamers...",
    "duration_seconds": 60,
    "voice": "Aria",
    "language_code": "en",
    "speed": 1.0,
    "force_instrumental": false,
    "lyrics_optimizer": false
  }'
Response: `{ "model": "...", "queue_id": "uuid" }`.

| Field | Notes |
| --- | --- |
| `model` | Required. |
| `prompt` | Required. Describe genre, mood, tempo, instruments. Length caps in `/models`. |
| `lyrics_prompt` | Lyrics. Required when `lyrics_required=true`, rejected when `supports_lyrics=false`. |
| `duration_seconds` | Integer or string. Model-dependent. |
| `force_instrumental` | Only when `supports_force_instrumental=true`. |
| `lyrics_optimizer` | Auto-generate lyrics from `prompt`. Requires `supports_lyrics_optimizer=true`. `lyrics_prompt` must be empty. |
| `voice` | For voice-enabled models. See `voices` + `default_voice` in `/models`. |
| `language_code` | ISO 639-1. Requires `supports_language_code=true`. |
| `speed` | Requires `supports_speed=true`. Use model's `min_speed` / `max_speed`. |
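
A client-side pre-flight check can catch most of these field conflicts before they turn into a `400`. The sketch below is an assumption-laden illustration: `validateQueueRequest` and both types are hypothetical names, `voices` is assumed to be a list of voice IDs, and the source does not say whether `lyrics_optimizer` lifts the `lyrics_required` rule, so that rule is applied literally.

```typescript
// Hypothetical subset of model_spec, limited to flags named in this document.
type ModelSpec = {
  supports_lyrics?: boolean
  lyrics_required?: boolean
  supports_lyrics_optimizer?: boolean
  supports_force_instrumental?: boolean
  supports_language_code?: boolean
  supports_speed?: boolean
  min_speed?: number
  max_speed?: number
  voices?: string[] // assumption: plain voice IDs
}

type QueueRequest = {
  model: string
  prompt: string
  lyrics_prompt?: string
  lyrics_optimizer?: boolean
  force_instrumental?: boolean
  language_code?: string
  speed?: number
  voice?: string
}

// Returns a list of human-readable problems; an empty list means the
// request is consistent with the capability flags.
function validateQueueRequest(spec: ModelSpec, req: QueueRequest): string[] {
  const problems: string[] = []
  const hasLyrics = (req.lyrics_prompt ?? '') !== ''
  if (hasLyrics && spec.supports_lyrics === false) problems.push('lyrics_prompt is not supported')
  if (!hasLyrics && spec.lyrics_required === true) problems.push('lyrics_prompt is required')
  if (req.lyrics_optimizer === true) {
    if (spec.supports_lyrics_optimizer !== true) problems.push('lyrics_optimizer is not supported')
    if (hasLyrics) problems.push('lyrics_prompt must be empty when lyrics_optimizer is true')
  }
  if (req.force_instrumental !== undefined && spec.supports_force_instrumental !== true) {
    problems.push('force_instrumental is not supported')
  }
  if (req.language_code !== undefined && spec.supports_language_code !== true) {
    problems.push('language_code is not supported')
  }
  if (req.speed !== undefined) {
    if (spec.supports_speed !== true) {
      problems.push('speed is not supported')
    } else if (
      (spec.min_speed !== undefined && req.speed < spec.min_speed) ||
      (spec.max_speed !== undefined && req.speed > spec.max_speed)
    ) {
      problems.push('speed is outside min_speed..max_speed')
    }
  }
  if (req.voice !== undefined && spec.voices !== undefined && !spec.voices.includes(req.voice)) {
    problems.push('voice is not in the model voice list')
  }
  return problems
}
```

Returning every problem at once, rather than throwing on the first, lets a caller show all conflicts in a single pass.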

3. `POST /audio/retrieve`: poll status / download

bash
curl https://api.venice.ai/api/v1/audio/retrieve \
  -H "Authorization: Bearer $VENICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"elevenlabs-music","queue_id":"..."}' \
  --output track.mp3
  • If still processing: JSON `{"status":"PROCESSING","average_execution_time":...,"execution_duration":...}`. With `--output`, curl writes this JSON into track.mp3, so check the response content type before treating the file as audio.
  • If done: binary audio body (`audio/mpeg` or similar). Save the bytes.
  • Set `delete_media_on_completion: true` to skip step 4.
Poll every 2–5 s; use `average_execution_time` (ms, P80) as a guideline for your first poll delay.
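
One way to turn that guidance into a delay is sketched below. It is a hedged illustration, not Venice's recommended algorithm: `nextPollDelayMs` and the `ProcessingStatus` type are hypothetical, and the choice to wait for the estimated remaining time, clamped to the 2–5 s window, is an assumption layered on the two status fields.

```typescript
// Status payload returned while the job is still running (fields from this document).
type ProcessingStatus = {
  status: 'PROCESSING'
  average_execution_time?: number // ms, P80 expected total
  execution_duration?: number // ms elapsed so far
}

const MIN_POLL_MS = 2000
const MAX_POLL_MS = 5000
const DEFAULT_POLL_MS = 3000

// Wait for the estimated remaining time, but never less than 2 s
// and never more than 5 s between polls.
function nextPollDelayMs(s: ProcessingStatus): number {
  const avg = s.average_execution_time
  const elapsed = s.execution_duration
  if (avg === undefined || elapsed === undefined) return DEFAULT_POLL_MS
  const remaining = avg - elapsed
  return Math.min(MAX_POLL_MS, Math.max(MIN_POLL_MS, remaining))
}
```

With this shape, a job far from its expected finish is polled every 5 s, and one that has overrun the P80 estimate is polled every 2 s.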

4. `POST /audio/complete`: cleanup

bash
curl https://api.venice.ai/api/v1/audio/complete \
  -H "Authorization: Bearer $VENICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"elevenlabs-music","queue_id":"..."}'
Removes the media from Venice storage after you've downloaded it. Required unless you used `delete_media_on_completion: true` on retrieve.

Full loop (TypeScript)

ts
import { promises as fs } from 'node:fs' // needed for fs.writeFile in the poll loop

const base = 'https://api.venice.ai/api/v1'
const headers = {
  Authorization: `Bearer ${process.env.VENICE_API_KEY}`,
  'Content-Type': 'application/json',
}

async function generateTrack() {
  // 1. Quote
  const quote = await fetch(`${base}/audio/quote`, {
    method: 'POST', headers,
    body: JSON.stringify({ model: 'elevenlabs-music', duration_seconds: 60 }),
  }).then(r => r.json())
  console.log('price:', quote.quote)

  // 2. Queue
  const { queue_id, model } = await fetch(`${base}/audio/queue`, {
    method: 'POST', headers,
    body: JSON.stringify({
      model: 'elevenlabs-music',
      prompt: 'Uplifting indie-folk acoustic track, 120 BPM.',
      duration_seconds: 60,
      force_instrumental: true,
    }),
  }).then(r => r.json())

  // 3. Poll
  while (true) {
    const res = await fetch(`${base}/audio/retrieve`, {
      method: 'POST', headers,
      body: JSON.stringify({ model, queue_id }),
    })
    const ct = res.headers.get('content-type') ?? ''
    if (ct.startsWith('audio/')) {
      const buf = Buffer.from(await res.arrayBuffer())
      await fs.writeFile('track.mp3', buf)
      break
    }
    const { status } = await res.json()
    if (status !== 'PROCESSING') throw new Error(`unexpected ${status}`)
    await new Promise(r => setTimeout(r, 3000))
  }

  // 4. Complete
  await fetch(`${base}/audio/complete`, {
    method: 'POST', headers,
    body: JSON.stringify({ model, queue_id }),
  })
}

Capability probing

Before calling `/audio/queue`, inspect the model entry returned by `GET /models?type=music`. Each row's `model_spec` exposes (among other fields):
  • `supports_lyrics`, `lyrics_required`, `supports_lyrics_optimizer`
  • `supports_force_instrumental`, `supports_speed`, `supports_language_code`
  • `voices[]`, `default_voice`
  • `min_prompt_length`, `prompt_character_limit`
  • `min_speed`, `max_speed`
  • Pricing, in one shape per model: `pricing.generation` (per-job), `pricing.per_second` (per second generated), `pricing.per_thousand_characters` (character-priced narration), or `pricing.durations` (duration-tiered map: `{ "<tier>": { usd, diem, min_seconds, max_seconds } }`)
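
To make the four pricing shapes concrete, the sketch below detects which one a model uses and, for tiered models, finds the tier that covers a requested duration. The names `pricingKind` and `findDurationTier` are hypothetical, and treating `min_seconds` and `max_seconds` as inclusive numeric bounds is an assumption; `/audio/quote` remains the authoritative price.

```typescript
// Tier entry shape as listed above; numeric field types are assumed.
type DurationTier = { usd: number; diem: number; min_seconds: number; max_seconds: number }

type Pricing = {
  generation?: unknown
  per_second?: unknown
  per_thousand_characters?: unknown
  durations?: Record<string, DurationTier>
}

type PricingKind = 'generation' | 'per_second' | 'per_thousand_characters' | 'durations'

// Report which of the four pricing shapes is present, if any.
function pricingKind(pricing: Pricing): PricingKind | undefined {
  const kinds: PricingKind[] = ['generation', 'per_second', 'per_thousand_characters', 'durations']
  return kinds.find(kind => pricing[kind] !== undefined)
}

// Find the name of the tier whose range contains the requested duration.
// Assumption: both bounds are inclusive.
function findDurationTier(
  durations: Record<string, DurationTier>,
  seconds: number,
): string | undefined {
  const match = Object.entries(durations).find(
    ([, tier]) => seconds >= tier.min_seconds && seconds <= tier.max_seconds,
  )
  return match?.[0]
}
```

A missing tier (`undefined`) is a useful early signal that `duration_seconds` falls outside what the model offers.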

Errors

| Code | Meaning |
| --- | --- |
| `400` | Wrong params (lyrics on an instrumental-only model, `duration_seconds` outside allowed range, voice not in model's list). |
| `401` | Auth / Pro-only model. |
| `402` | Insufficient balance. Bearer → `INSUFFICIENT_BALANCE`; x402 → `PAYMENT_REQUIRED`. |
| `404` | On `retrieve` / `complete`: unknown / expired `queue_id`. |
| `422` | Content policy violation. `ContentViolationError` may include `suggested_prompt`. |
| `429` | Rate limited. |
| `500` / `503` | Inference or capacity issue. |
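
A caller usually wants to map these codes to a next step. The sketch below is one possible mapping, not behavior documented by Venice: `classifyStatus` and the action names are hypothetical, and treating `429`, `500`, and `503` as worth retrying later is an assumption.

```typescript
type ErrorAction =
  | 'fix_request' // 400: change the parameters
  | 'check_auth' // 401: key or plan problem
  | 'top_up' // 402: insufficient balance
  | 'not_found' // 404: unknown or expired queue_id
  | 'revise_prompt' // 422: content policy violation
  | 'retry_later' // 429, 500, 503: assumed transient
  | 'unknown'

// Map an HTTP status code from the table above to a suggested next step.
function classifyStatus(status: number): ErrorAction {
  switch (status) {
    case 400:
      return 'fix_request'
    case 401:
      return 'check_auth'
    case 402:
      return 'top_up'
    case 404:
      return 'not_found'
    case 422:
      return 'revise_prompt'
    case 429:
    case 500:
    case 503:
      return 'retry_later'
    default:
      return 'unknown'
  }
}
```

For `422`, the `suggested_prompt` field, when present, is a natural input for the retry.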

Gotchas

  • Quote before queue: music can be billed per second, so an unexpected `duration_seconds` can blow through a budget. Use `/audio/quote` to gate the `queue` call against your available balance (`/billing/balance` or `/x402/balance/...`).
  • `queue_id` is a UUIDv4. Store it alongside the `model`; both are required for every subsequent call.
  • Media URLs are ephemeral. Download during `retrieve` and store the file yourself; after `complete`, Venice deletes it.
  • Sending `lyrics_optimizer: true` together with a non-empty `lyrics_prompt` returns a `400`.
  • Poll rate: don't hammer `/retrieve`. 2–5 s is plenty; polling more often does not move the job through the queue any faster.
  • `execution_duration` from the retrieve status is cumulative (ms since enqueue); `average_execution_time` is the P80 expected total.
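
Two of these points reduce to small calculations, sketched below under stated assumptions. `canQueue` and `progressEstimate` are hypothetical helpers; the balance is assumed to have been fetched already and converted to USD, since the response shape of the balance endpoints is not described here.

```typescript
// Gate the queue call: the quoted price must fit both the available
// balance and an optional per-job spending cap (all values in USD).
function canQueue(quoteUsd: number, balanceUsd: number, maxSpendUsd: number = Infinity): boolean {
  return quoteUsd <= balanceUsd && quoteUsd <= maxSpendUsd
}

// Rough progress fraction in [0, 1]: elapsed time over the P80 expected total.
// A value of 1 means the job has reached or passed the estimate, not that it is done.
function progressEstimate(executionDurationMs: number, averageExecutionTimeMs: number): number {
  if (averageExecutionTimeMs <= 0) return 0
  return Math.min(1, Math.max(0, executionDurationMs / averageExecutionTimeMs))
}
```

The per-job cap guards against a mistyped `duration_seconds` even when the account balance is large.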