elevenlabs-music-generation
ElevenLabs AI Music Generation: Pro Pack on RunComfy
Generate full songs and instrumental tracks from a text description: studio-quality 44.1 kHz stereo, 5 seconds to 5 minutes, with section-level structure control. ElevenLabs Music on the RunComfy Model API, called through the `runcomfy` CLI.

Install this skill
```bash
npx skills add agentspace-so/runcomfy-agent-skills --skill elevenlabs-music-generation -g
```

Powered by the RunComfy CLI
```bash
# 1. Install (one of; see the runcomfy-cli skill for details)
npm i -g @runcomfy/cli            # global install
npx -y @runcomfy/cli --version    # zero-install

# 2. Sign in
runcomfy login                    # or in CI: export RUNCOMFY_TOKEN=<token>

# 3. Generate music
runcomfy run elevenlabs/elevenlabs/music-generation \
  --input '{"prompt": "..."}' \
  --output-dir ./out
```

CLI deep dive: [`runcomfy-cli`](https://www.skills.sh/agentspace-so/runcomfy-agent-skills/runcomfy-cli) skill.

When to use ElevenLabs Music
ElevenLabs Music's strength is structured songs with real vocals: it takes a style brief plus lyrics with section markers and returns a coherent, mixed track. Pick it for:
- Full vocal songs: verse/chorus structure, multilingual lyrics, consistent meter
- Instrumental beds: set `force_instrumental: true` for background music, podcast intros, game loops
- Short brand assets: jingles, stingers, theme music (5–30 s)
- Long-form tracks: up to 5 minutes in a single call
- Commercial work: output is commercial-friendly

If the user just wants ambient sound or a one-off SFX (thunder, footsteps), that is a sound-effects task, not music. ElevenLabs Music is for songs and tracks.
Endpoint + input schema
Model: `elevenlabs/elevenlabs/music-generation`

| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `prompt` | string | yes | none | Style description and lyrics with section markers. See Prompting tips. |
| `music_length_ms` | int | no | | Output duration in ms. 5000–300000 (5 s – 5 min). |
| `force_instrumental` | bool | no | | |
| | string | no | | |

Output: 44.1 kHz stereo audio. The result JSON contains the generated audio URL, and the CLI downloads it into `--output-dir`.

Pricing: ~$0.0083 per second of generated audio (30 s ≈ $0.25, 60 s ≈ $0.50, 5 min ≈ $2.49). Cost scales with `music_length_ms`, so draft short and finalize long.
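The pricing arithmetic can be checked before submitting a request. The sketch below assumes the approximate $0.0083 per second rate quoted above; `estimate_cost_usd` is a hypothetical helper, not part of the `runcomfy` CLI, and actual billing may differ.

```shell
#!/bin/sh
# Hypothetical helper: rough cost estimate from the ~$0.0083/s figure.
# Not a runcomfy command; the rate is approximate and may change.
estimate_cost_usd() {
  # $1: music_length_ms (integer milliseconds)
  awk -v ms="$1" 'BEGIN { printf "%.2f\n", (ms / 1000) * 0.0083 }'
}

estimate_cost_usd 30000    # 30 s draft, prints 0.25
estimate_cost_usd 300000   # 5 min render, prints 2.49
```

Running the estimate for both the draft and the final length makes the "draft short, finalize long" trade-off concrete.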
How to invoke
Full vocal song with structure:

```bash
runcomfy run elevenlabs/elevenlabs/music-generation \
  --input '{
    "prompt": "Upbeat indie-pop anthem, bright electric guitars, driving drums, 120 BPM, female lead vocal. [Intro 8 bars] instrumental build. [Verse] Chalk on the palms, laces double-knotted, morning on the ridge. [Chorus] We rise, we strike, we never fade out. [Bridge] soft breakdown, just piano and voice. [Outro] full band, fade.",
    "music_length_ms": 60000
  }' \
  --output-dir ./out
```

Instrumental background bed:

```bash
runcomfy run elevenlabs/elevenlabs/music-generation \
  --input '{
    "prompt": "Calm lo-fi hip-hop instrumental for a study playlist. Warm Rhodes piano, soft vinyl crackle, mellow boom-bap drums, 75 BPM. No vocals. Consistent loop-friendly groove throughout.",
    "music_length_ms": 90000,
    "force_instrumental": true
  }' \
  --output-dir ./out
```

Short brand jingle:

```bash
runcomfy run elevenlabs/elevenlabs/music-generation \
  --input '{
    "prompt": "5-second cheerful brand stinger, bright marimba and a single uplifting chord resolve, no vocals.",
    "music_length_ms": 5000,
    "force_instrumental": true
  }' \
  --output-dir ./out
```

Prompting tips
ElevenLabs Music reads one `prompt` field that carries both the style brief and the lyrics. Structure it well:
- Lead with the style brief: genre, mood, tempo (BPM), key instruments, vocal type. `"Upbeat indie-pop anthem, bright electric guitars, 120 BPM, female lead vocal."`
- Then the lyrics with section markers: `[Intro]`, `[Verse]`, `[Chorus]`, `[Bridge]`, `[Outro]`. Add approximate durations or bar counts: `[Intro 8 bars]`, `[Verse 16 bars]`.
- Keep lyrical meter consistent: even syllable counts per line, clear rhyme scheme. The model follows meter; sloppy meter produces awkward phrasing.
- Name lead instruments and mix priorities: `"electric guitar carries the chorus, drums sit back in the verse."`
- For instrumental, set `force_instrumental: true` AND say "no vocals" in the prompt (belt and suspenders).
- Multilingual: write the lyrics in the target language; annotate accent/language inline if needed (`[Verse] (sung in Brazilian Portuguese) ...`).
- Avoid contradictory style instructions: "aggressive metal" + "soft lullaby" in one prompt confuses the model. One coherent direction per call.
- Draft short, finalize long: validate the direction with a 30–45 s draft (`music_length_ms: 35000`) before paying for a 5-minute render.
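Because the style brief and the lyrics share one `prompt` string, it can help to keep them as separate shell variables and join them only when building the input JSON. The sketch below is one way to do that in plain POSIX shell. `json_escape` and `build_input` are hypothetical helpers, and the escaping covers only backslashes and double quotes, so it assumes single-line text without control characters.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the runcomfy CLI.
json_escape() {
  # Escapes backslashes and double quotes only. Assumes single-line text
  # with no control characters.
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

build_input() {
  # $1: style brief, $2: lyrics with section markers, $3: music_length_ms
  printf '{"prompt": "%s %s", "music_length_ms": %s}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")" "$3"
}

# Style brief first, then lyrics with section markers, at draft length.
build_input \
  'Upbeat indie-pop anthem, bright electric guitars, 120 BPM, female lead vocal.' \
  '[Intro 8 bars] instrumental build. [Chorus] We rise, we strike, we never fade out.' \
  35000
```

The printed JSON can be passed to the CLI as `--input "$(build_input ...)"`. For lyrics that span multiple lines, a real JSON tool is a safer choice than this minimal escaping.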
Common patterns
Theme song for a video
- Full brief + lyrics + `[Intro]/[Verse]/[Chorus]` structure, `music_length_ms` matched to the video length

Podcast intro / outro
- `force_instrumental: true`, 10–20 s, "loop-friendly, clean ending"

Game background loop
- `force_instrumental: true`, describe "seamless loop", 60–120 s, consistent groove

Multilingual release (same song, multiple languages)
- One call per language, identical style brief, swap only the lyric lines

Iterate then commit
- Draft at `music_length_ms: 35000` to lock genre/tempo/structure → final render at full length
Limitations
- One `prompt` field carries everything (style + lyrics). There is no separate "lyrics" parameter.
- 5 s – 5 min per call (`music_length_ms` 5000–300000). For longer pieces, generate sections and stitch externally.
- Cost scales with duration: a 5-minute render is ~10× a 30-second one.
- `force_instrumental` is the only vocal toggle: you can't request specific voice identities or clone a singer through this endpoint.
- This skill pins ElevenLabs Music specifically. Sound effects, text-to-speech, and voice cloning are different ElevenLabs capabilities not exposed through this endpoint.
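The 5000–300000 ms bound can be checked locally before a request is sent. The sketch below is a minimal pre-flight check under that stated range; `check_length_ms` is a hypothetical helper, not something the CLI provides.

```shell
#!/bin/sh
# Hypothetical pre-flight check; not a runcomfy feature.
check_length_ms() {
  # Reject empty or non-numeric values first.
  case "$1" in
    ''|*[!0-9]*)
      echo "music_length_ms must be a whole number of milliseconds" >&2
      return 1 ;;
  esac
  # More than 6 digits is always out of range; checking the length first
  # also avoids integer overflow in the numeric comparisons.
  if [ "${#1}" -gt 6 ] || [ "$1" -lt 5000 ] || [ "$1" -gt 300000 ]; then
    echo "music_length_ms must be within 5000-300000, got $1" >&2
    return 1
  fi
}

check_length_ms 35000 && echo "35000 ok"
```

Failing fast here avoids a round trip that would presumably end in exit code 65 (schema mismatch).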
Exit codes
| code | meaning |
|---|---|
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |
Full reference: docs.runcomfy.com/cli/troubleshooting.
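Since code 75 is the only one marked retryable, a wrapper script can retry on it and pass every other code through. The sketch below shows one way to do that. `run_with_retry`, the attempt limit, and the `RETRY_DELAY_SECONDS` variable are assumptions made for illustration, not documented CLI behavior.

```shell
#!/bin/sh
# Hypothetical wrapper: retries only on exit code 75 (timeout / 429).
# The attempt limit and delay are assumptions, not CLI defaults.
run_with_retry() {
  max_attempts=$1
  shift
  attempt=1
  while :; do
    code=0
    "$@" || code=$?
    case $code in
      0)
        return 0 ;;
      75)
        # Retryable: give up once the attempt budget is spent.
        if [ "$attempt" -ge "$max_attempts" ]; then
          return 75
        fi
        sleep "${RETRY_DELAY_SECONDS:-5}"
        attempt=$((attempt + 1)) ;;
      77)
        echo "not signed in: run 'runcomfy login' or set RUNCOMFY_TOKEN" >&2
        return 77 ;;
      *)
        # 64, 65, 69 and anything else: pass through unchanged.
        return "$code" ;;
    esac
  done
}

# Usage (assumes runcomfy is installed and signed in):
# run_with_retry 3 runcomfy run elevenlabs/elevenlabs/music-generation \
#   --input '{"prompt": "...", "music_length_ms": 35000}' --output-dir ./out
```

Codes 64 and 65 indicate a problem with the command or input itself, so retrying them unchanged would not help.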
How it works
The skill invokes `runcomfy run elevenlabs/elevenlabs/music-generation` with the JSON body. The CLI POSTs to the RunComfy Model API, polls request status, fetches the result, and downloads the generated audio file into `--output-dir`. `Ctrl-C` cancels the remote request before exit.

Security & Privacy
- Install via verified package manager only. Use `npm i -g @runcomfy/cli` or `npx -y @runcomfy/cli`. Agents must not pipe an arbitrary remote install script into a shell on the user's behalf; if the operator wants the curl-pipe path documented at `docs.runcomfy.com/cli/install`, they should review the script first.
- Token storage: `runcomfy login` writes the API token to `~/.config/runcomfy/token.json` with mode 0600. Set the `RUNCOMFY_TOKEN` env var to bypass the file in CI / containers. Never echo the token into a prompt, log it, or check it in.
- Input boundary (shell injection): the prompt is passed as a JSON string via `--input`. The CLI does not shell-expand prompt content; it transmits the JSON body directly to the Model API over HTTPS. No shell-injection surface from prompt content, even with backticks, quotes, or `$(...)` patterns.
- Lyrics provenance: if the user supplies lyrics, confirm they have the rights to them. Generating music around copyrighted lyrics is the operator's responsibility; the skill does not check.
- Outbound endpoints (allowlist): only `model-api.runcomfy.net` (request submission) and `*.runcomfy.net` / `*.runcomfy.com` (download whitelist for generated audio). No telemetry, no callbacks.
- Generated-file size cap: the CLI aborts any single download > 2 GiB.
- Scope of bash usage: the skill only invokes `runcomfy <subcommand>`; the `npm`/`npx` lines are one-time operator setup, not commands the skill executes per call.
See also
- `runcomfy-cli`: the underlying CLI, schema discovery, polling modes, scripting
- ElevenLabs Music model page: full API tab with the latest schema
- All RunComfy models: image, video, and audio endpoints
- `ai-video-generation`: pair a generated track with a generated video
- `ai-avatar-video`: talking-head video (different audio path: speech, not music)