elevenlabs-music-generation

ElevenLabs AI Music Generation — Pro Pack on RunComfy


Generate full songs and instrumental tracks from a text description: studio-quality 44.1 kHz stereo, 5 seconds to 5 minutes, with section-level structure control. This skill runs ElevenLabs Music on the RunComfy Model API, called through the `runcomfy` CLI.

Install this skill


```bash
npx skills add agentspace-so/runcomfy-agent-skills --skill elevenlabs-music-generation -g
```

Powered by the RunComfy CLI


```bash
# 1. Install (pick one; see the runcomfy-cli skill for details)
npm i -g @runcomfy/cli            # global install
npx -y @runcomfy/cli --version    # zero-install

# 2. Sign in
runcomfy login                    # or in CI: export RUNCOMFY_TOKEN=<token>

# 3. Generate music
runcomfy run elevenlabs/elevenlabs/music-generation \
  --input '{"prompt": "..."}' \
  --output-dir ./out
```

CLI deep dive: [`runcomfy-cli`](https://www.skills.sh/agentspace-so/runcomfy-agent-skills/runcomfy-cli) skill.

When to use ElevenLabs Music


ElevenLabs Music's strength is structured songs with real vocals: it takes a style brief plus lyrics with section markers and returns a coherent, mixed track. Pick it for:
  • Full vocal songs: verse/chorus structure, multilingual lyrics, consistent meter
  • Instrumental beds: set `force_instrumental: true` for background music, podcast intros, game loops
  • Short brand assets: jingles, stingers, theme music (5–30 s)
  • Long-form tracks: up to 5 minutes in a single call
  • Commercial work: output is commercial-friendly
If the user just wants ambient sound or a one-off SFX (thunder, footsteps), that is a sound-effects task, not music. ElevenLabs Music is for songs and tracks.

Endpoint + input schema


Model: `elevenlabs/elevenlabs/music-generation`

| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `prompt` | string | yes | | Style description and lyrics with section markers. See Prompting tips |
| `music_length_ms` | int | no | `40000` | Output duration in ms. 5000–300000 (5 s – 5 min) |
| `force_instrumental` | bool | no | `false` | `true` = instrumental only, no vocals |
| `output_format` | string | no | `mp3_standard` | `mp3_standard` (default), or WAV; see the model page API tab for the full format list |

Output: 44.1 kHz stereo audio. The result JSON contains the generated audio URL, and the CLI downloads it into `--output-dir`.

Pricing: ~$0.0083 per second of generated audio (30 s ≈ $0.25, 60 s ≈ $0.50, 5 min ≈ $2.49). Cost scales with `music_length_ms`, so draft short and finalize long.
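The pricing arithmetic above can be sketched as a small shell helper for budgeting before a run. This is an illustration only: the function name `estimate_cost_usd` is hypothetical (it is not part of the `runcomfy` CLI), and the rate is the approximate per-second figure quoted above, so treat the result as an estimate rather than a bill.

```bash
# Hypothetical helper (not part of the runcomfy CLI): estimate the cost of a
# render from music_length_ms, using the approximate $0.0083/s rate quoted above.
estimate_cost_usd() {
  local ms=$1
  # Reject durations outside the documented 5000-300000 ms range.
  if [ "$ms" -lt 5000 ] || [ "$ms" -gt 300000 ]; then
    echo "music_length_ms must be between 5000 and 300000" >&2
    return 1
  fi
  # ms -> seconds -> dollars, rounded to cents.
  awk -v ms="$ms" 'BEGIN { printf "%.2f\n", ms / 1000 * 0.0083 }'
}

estimate_cost_usd 35000    # a 35 s draft
estimate_cost_usd 300000   # a full 5-minute render
```

Comparing the two figures before a session makes the "draft short, finalize long" trade-off concrete.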

How to invoke


Full vocal song with structure:

```bash
runcomfy run elevenlabs/elevenlabs/music-generation \
  --input '{
    "prompt": "Upbeat indie-pop anthem, bright electric guitars, driving drums, 120 BPM, female lead vocal. [Intro 8 bars] instrumental build. [Verse] Chalk on the palms, laces double-knotted, morning on the ridge. [Chorus] We rise, we strike, we never fade out. [Bridge] soft breakdown, just piano and voice. [Outro] full band, fade.",
    "music_length_ms": 60000
  }' \
  --output-dir ./out
```

Instrumental background bed:

```bash
runcomfy run elevenlabs/elevenlabs/music-generation \
  --input '{
    "prompt": "Calm lo-fi hip-hop instrumental for a study playlist. Warm Rhodes piano, soft vinyl crackle, mellow boom-bap drums, 75 BPM. No vocals. Consistent loop-friendly groove throughout.",
    "music_length_ms": 90000,
    "force_instrumental": true
  }' \
  --output-dir ./out
```

Short brand jingle:

```bash
runcomfy run elevenlabs/elevenlabs/music-generation \
  --input '{
    "prompt": "5-second cheerful brand stinger, bright marimba and a single uplifting chord resolve, no vocals.",
    "music_length_ms": 5000,
    "force_instrumental": true
  }' \
  --output-dir ./out
```

Prompting tips


ElevenLabs Music reads one `prompt` field that carries both the style brief and the lyrics. Structure it well:
  • Lead with the style brief: genre, mood, tempo (BPM), key instruments, vocal type. Example: "Upbeat indie-pop anthem, bright electric guitars, 120 BPM, female lead vocal."
  • Then the lyrics with section markers: `[Intro]`, `[Verse]`, `[Chorus]`, `[Bridge]`, `[Outro]`. Add approximate durations or bar counts, such as `[Intro 8 bars]` or `[Verse 16 bars]`.
  • Keep lyrical meter consistent: even syllable counts per line, clear rhyme scheme. The model follows meter; sloppy meter produces awkward phrasing.
  • Name lead instruments and mix priorities: "electric guitar carries the chorus, drums sit back in the verse."
  • For instrumental, set `force_instrumental: true` AND say "no vocals" in the prompt (belt and suspenders).
  • Multilingual: write the lyrics in the target language; annotate accent/language inline if needed (`[Verse] (sung in Brazilian Portuguese) ...`).
  • Avoid contradictory style instructions: "aggressive metal" + "soft lullaby" in one prompt confuses the model. One coherent direction per call.
  • Draft short, finalize long: validate the direction with a 30–45 s draft (`music_length_ms: 35000`) before paying for a 5-minute render.
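The draft-short, finalize-long tip can be scripted so that both renders share one prompt. This is a minimal sketch that assumes single-line prompts. `json_escape` and `build_input` are hypothetical helper names, not CLI features. Only the `runcomfy run` invocation and the input field names come from this page.

```bash
# Hypothetical helpers (not part of the runcomfy CLI).

# Escape backslashes and double quotes so a single-line prompt can sit inside
# a JSON string. Newlines and other control characters are NOT handled here.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the --input JSON from: prompt, music_length_ms, force_instrumental.
build_input() {
  printf '{"prompt": "%s", "music_length_ms": %d, "force_instrumental": %s}\n' \
    "$(json_escape "$1")" "$2" "$3"
}

PROMPT='Calm lo-fi hip-hop instrumental, warm Rhodes piano, 75 BPM. No vocals.'

# Print the draft request body (35 s) to inspect it before spending anything.
build_input "$PROMPT" 35000 true

# Once the draft sounds right, reuse the same prompt at full length, e.g.:
#   runcomfy run elevenlabs/elevenlabs/music-generation \
#     --input "$(build_input "$PROMPT" 90000 true)" \
#     --output-dir ./out
```

Keeping the prompt in one variable means the only thing that changes between draft and final is `music_length_ms`, which is the point of the tip.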

Common patterns

Theme song for a video

  • Full brief + lyrics + `[Intro]`/`[Verse]`/`[Chorus]` structure, with `music_length_ms` matched to the video length

Podcast intro / outro

  • `force_instrumental: true`, 10–20 s, "loop-friendly, clean ending"

Game background loop

  • `force_instrumental: true`, describe "seamless loop", 60–120 s, consistent groove

Multilingual release (same song, multiple languages)

  • One call per language, identical style brief, swap only the lyric lines

Iterate then commit

  • Draft at `music_length_ms: 35000` to lock genre/tempo/structure, then do the final render at full length

Limitations


  • One `prompt` field carries everything (style + lyrics). There is no separate "lyrics" parameter.
  • 5 s – 5 min per call (`music_length_ms` 5000–300000). For longer pieces, generate sections and stitch externally.
  • Cost scales with duration: a 5-minute render is ~10× a 30-second one.
  • `force_instrumental` is the only vocal toggle: you can't request specific voice identities or clone a singer through this endpoint.
  • This skill pins ElevenLabs Music specifically. Sound effects, text-to-speech, and voice cloning are different ElevenLabs capabilities not exposed through this endpoint.
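For pieces longer than the 5-minute cap, one way to plan the split is to divide the target length into near-equal sections that each fit in a single call. This is a sketch under that assumption. `plan_sections` is a hypothetical helper, and stitching the rendered sections (and keeping them musically continuous) is left to external tools.

```bash
# Hypothetical helper (not part of the runcomfy CLI): split a total duration
# in ms into the fewest near-equal sections of at most 300000 ms each.
plan_sections() {
  local total=$1 max=300000
  if [ "$total" -lt 5000 ]; then
    echo "total must be at least 5000 ms" >&2
    return 1
  fi
  local n=$(( (total + max - 1) / max ))   # ceiling division: number of calls
  local base=$(( total / n ))              # minimum length of each section
  local rest=$(( total - base * n ))       # leftover ms, spread 1 ms at a time
  local i=1 len out=""
  while [ "$i" -le "$n" ]; do
    len=$base
    if [ "$i" -le "$rest" ]; then
      len=$(( base + 1 ))
    fi
    out="$out${out:+ }$len"
    i=$(( i + 1 ))
  done
  echo "$out"
}

plan_sections 720000   # a 12-minute piece -> three music_length_ms values
```

Each printed value can be used as `music_length_ms` for one call. Splitting evenly, rather than filling 5-minute chunks first, avoids ending up with a final section shorter than the 5000 ms minimum.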

Exit codes


| code | meaning |
|---|---|
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |
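In scripts, the exit codes above lend themselves to a small retry wrapper: retry only on 75 and surface everything else immediately. This is a minimal sketch. `run_with_retry` and the `RETRY_DELAY_S` variable are hypothetical names, and the fixed delay is an arbitrary choice rather than documented CLI behavior.

```bash
# Hypothetical wrapper (not part of the runcomfy CLI): rerun a command while it
# exits with 75 (retryable: timeout / 429), up to a maximum number of attempts.
run_with_retry() {
  local max_attempts=$1 attempt=1 status
  shift
  while :; do
    # Capture the exit status without tripping `set -e` in the calling script.
    "$@" && status=0 || status=$?
    # Stop on success, on any non-retryable code, or when attempts run out.
    if [ "$status" -ne 75 ] || [ "$attempt" -ge "$max_attempts" ]; then
      return "$status"
    fi
    sleep "${RETRY_DELAY_S:-5}"
    attempt=$(( attempt + 1 ))
  done
}

# Example (not executed here):
#   run_with_retry 3 runcomfy run elevenlabs/elevenlabs/music-generation \
#     --input '{"prompt": "..."}' --output-dir ./out
```

Codes such as 64, 65, and 77 indicate a problem with the arguments, the input, or the sign-in state, so repeating the same call would not help, and the wrapper returns them unchanged.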

How it works


The skill invokes `runcomfy run elevenlabs/elevenlabs/music-generation` with the JSON body. The CLI POSTs to the RunComfy Model API, polls request status, fetches the result, and downloads the generated audio file into `--output-dir`. `Ctrl-C` cancels the remote request before exit.

Security & Privacy


  • Install via verified package manager only. Use `npm i -g @runcomfy/cli` or `npx -y @runcomfy/cli`. Agents must not pipe an arbitrary remote install script into a shell on the user's behalf. If the operator wants the curl-pipe path documented at `docs.runcomfy.com/cli/install`, they should review the script first.
  • Token storage: `runcomfy login` writes the API token to `~/.config/runcomfy/token.json` with mode 0600. Set the `RUNCOMFY_TOKEN` env var to bypass the file in CI / containers. Never echo the token into a prompt, log it, or check it in.
  • Input boundary (shell injection): the prompt is passed as a JSON string via `--input`. The CLI does not shell-expand prompt content; it transmits the JSON body directly to the Model API over HTTPS. There is no shell-injection surface from prompt content, even with backticks, quotes, or `$(...)` patterns.
  • Lyrics provenance: if the user supplies lyrics, confirm they have the rights to them. Generating music around copyrighted lyrics is the operator's responsibility; the skill does not check.
  • Outbound endpoints (allowlist): only `model-api.runcomfy.net` (request submission) and `*.runcomfy.net` / `*.runcomfy.com` (download whitelist for generated audio). No telemetry, no callbacks.
  • Generated-file size cap: the CLI aborts any single download > 2 GiB.
  • Scope of bash usage: the skill only invokes `runcomfy <subcommand>`. The `npm` / `npx` lines are one-time operator setup, not commands the skill executes per call.

See also
