ace-step

Compare original and translation side by side

🇺🇸

Original

English

🇨🇳

Translation

Chinese

ACE Step — Pro Pack on RunComfy

ACE Step — RunComfy专业套件

Tag-driven music generation, inpainting, and outpainting with StepFun-AI's ACE Step open-weights model. Four CLI-reachable endpoints, $0.0002–0.0003 per second of audio, up to 4 minutes per call.

runcomfy.com · ACE Step base · ACE Step 1.5 · CLI docs

基于StepFun-AI的ACE Step开源权重模型，实现标签驱动的音乐生成、修复与扩展功能。提供四个可通过CLI调用的端点，音频单价为每秒0.0002–0.0003美元，单次调用最长可生成4分钟音频。

runcomfy.com · ACE Step基础版 · ACE Step 1.5 · CLI文档

Install this skill

安装此技能

bash

npx skills add agentspace-so/runcomfy-agent-skills --skill ace-step -g

bash

npx skills add agentspace-so/runcomfy-agent-skills --skill ace-step -g

Powered by the RunComfy CLI

基于RunComfy CLI运行

Step 1 — install (one of, see the

runcomfy-cli

skill for details):

bash

npm i -g @runcomfy/cli         # global install
npx -y @runcomfy/cli --version # zero-install

Step 2 — sign in (or set

RUNCOMFY_TOKEN

env var in CI / containers):

bash

runcomfy login

Step 3 — generate:

bash

runcomfy run acestep-ai/ace-step/text-to-audio \
  --input '{"tags": "..."}' \
  --output-dir ./out

CLI deep dive:

runcomfy-cli

skill.

步骤1 — 安装（二选一，详见

runcomfy-cli

技能说明）：

bash

npm i -g @runcomfy/cli         # 全局安装
npx -y @runcomfy/cli --version # 零安装调用

步骤2 — 登录（或在CI/容器中设置

RUNCOMFY_TOKEN

环境变量）：

bash

runcomfy login

步骤3 — 生成音乐：

bash

runcomfy run acestep-ai/ace-step/text-to-audio \
  --input '{"tags": "..."}' \
  --output-dir ./out

CLI深度解读：

runcomfy-cli

技能。

Pick the right endpoint

选择合适的端点

Listed newest first.

ACE Step 1.5 (text-to-audio) —

acestep-ai/ace-step-1.5/text-to-audio

Latest ACE Step generation. 50+ language vocal support, refined structured-lyric handling, otherwise same shape as base. Slightly higher cost ($0.0003/s vs $0.0002/s). Pick for: multilingual lyrics, hero-quality vocal tracks, vocal songs that need clean section structure. Avoid for: cost-sensitive batches where the base model is good enough.

ACE Step (text-to-audio) —

acestep-ai/ace-step/text-to-audio

(default — cheap & fast)

Original ACE Step. Tag-driven composition, optional lyrics, 5–240 s stereo. $0.0002/s — ~27× cheaper than ElevenLabs Music. Pick for: high-volume drafts, background music, jingles, game loops, cost-sensitive iteration. Avoid for: maximally polished commercial vocal hooks — try ACE Step 1.5 or ElevenLabs Music for those.

ACE Step (audio-inpaint) —

acestep-ai/ace-step/audio-inpaint

Regenerate a time range inside an existing track (not mask-based; uses
start_time
/
end_time
in seconds, each anchored to track start or end). Pick for: fix a bad chorus in the middle, swap the bridge, replace a 20 s section without re-rendering the whole song. Avoid for: edits that aren't time-bounded — those don't fit the schema.

ACE Step (audio-outpaint) —

acestep-ai/ace-step/audio-outpaint

Extend an existing track bidirectionally — add intro before, outro after, or both. Pick for: lengthening a 30 s draft into a 2 min cut, adding a fade-in, building a longer arrangement around an existing hook. Avoid for: extending a track past 4 min total — chain calls instead.

按更新时间排序，最新在前。

ACE Step 1.5（文本转音频） —

acestep-ai/ace-step-1.5/text-to-audio

ACE Step最新版本生成功能。支持50+种语言人声，优化了结构化歌词处理逻辑，其余特性与基础版一致。成本略高（每秒0.0003美元，基础版为每秒0.0002美元）。适用场景：多语言歌词、高品质人声曲目、需要清晰段落结构的人声歌曲。不适用场景：对成本敏感的批量任务，且基础版已能满足需求。

ACE Step（文本转音频） —

acestep-ai/ace-step/text-to-audio

（默认选项 — 低成本、快速）

ACE Step原版模型。支持标签驱动创作、可选歌词，可生成5–240秒立体声。每秒0.0002美元 — 约比ElevenLabs Music便宜27倍。适用场景：高批量草稿制作、背景音乐、广告短曲、游戏循环音轨、对成本敏感的迭代任务。不适用场景：追求极致打磨的商业人声钩子 — 此类场景可尝试ACE Step 1.5或ElevenLabs Music。

ACE Step（音频修复） —

acestep-ai/ace-step/audio-inpaint

重新生成现有音轨中的指定时间段内容（非蒙版方式；使用以音轨开头或结尾为锚点的
start_time
/
end_time
参数，单位为秒）。适用场景：修复中间糟糕的副歌部分、替换桥段、重新生成20秒片段而无需重新渲染整首歌曲。不适用场景：非时间范围限定的编辑 — 此类需求不符合当前接口规范。

ACE Step（音频扩展） —

acestep-ai/ace-step/audio-outpaint

双向扩展现有音轨 — 在音轨前添加前奏、在音轨后添加尾声，或同时添加两者。适用场景：将30秒草稿延长为2分钟版本、添加淡入效果、围绕现有钩子构建更长的编曲。不适用场景：将音轨总时长延长至4分钟以上 — 可通过多次调用实现。

Route 1: ACE Step text-to-audio (default)

路径1：ACE Step文本转音频（默认）

Model:

acestep-ai/ace-step/text-to-audio

(or

acestep-ai/ace-step-1.5/text-to-audio

for the 1.5 variant)

模型：

acestep-ai/ace-step/text-to-audio

（或使用

acestep-ai/ace-step-1.5/text-to-audio

调用1.5版本）

Schema (both variants — same shape)

接口规范（两个版本 — 结构一致）

Field	Type	Required	Default	Notes
`tags`	string	yes	—	Comma-separated genre / mood / instrument tags. Drives composition
`lyrics`	string	no	—	Vocal content. Use section markers `[Verse]` , `[Chorus]` , `[Bridge]` . Use `[inst]` or `[instrumental]` for no vocals
`duration`	int	no	`60`	Audio length in seconds. 5–240 (max 4 min per call)
`seed`	int	no	`-1`	Reproducibility; `-1` randomizes

Pricing: ACE Step $0.0002/s · ACE Step 1.5 $0.0003/s. 60 s ≈ $0.012 / $0.018; 240 s ≈ $0.048 / $0.072.

字段	类型	必填	默认值	说明
`tags`	string	是	—	逗号分隔的流派/情绪/乐器标签，决定音乐创作方向
`lyrics`	string	否	—	人声内容。可使用段落标记 `[Verse]` 、 `[Chorus]` 、 `[Bridge]` 。使用 `[inst]` 或 `[instrumental]` 指定纯音乐
`duration`	int	否	`60`	音频时长（秒）。范围5–240（单次调用最长4分钟）
`seed`	int	否	`-1`	用于复现结果； `-1` 表示随机生成

定价：ACE Step 每秒0.0002美元 · ACE Step 1.5 每秒0.0003美元。60秒约0.012/0.018美元；240秒约0.048/0.072美元。

Invoke

调用示例

Tag-driven instrumental:

bash

runcomfy run acestep-ai/ace-step/text-to-audio \
  --input '{
    "tags": "lo-fi hip-hop, mellow, vinyl crackle, rhodes piano, soft drums, 75 BPM",
    "lyrics": "[inst]",
    "duration": 90
  }' \
  --output-dir ./out

Full vocal song with structure (use 1.5 for multilingual):

bash

runcomfy run acestep-ai/ace-step-1.5/text-to-audio \
  --input '{
    "tags": "indie pop, anthemic, electric guitar, driving drums, female vocal, 120 BPM",
    "lyrics": "[Verse]\nChalk on the palms, laces double-knotted\nMorning on the ridge, the sun is rising\n[Chorus]\nWe rise, we strike, we never fade out\nWe rise, we strike, we sing it loud\n[Bridge]\nSoft piano breakdown\n[Outro]\nFull band, fade",
    "duration": 60
  }' \
  --output-dir ./out

标签驱动的纯音乐：

bash

runcomfy run acestep-ai/ace-step/text-to-audio \
  --input '{
    "tags": "lo-fi hip-hop, mellow, vinyl crackle, rhodes piano, soft drums, 75 BPM",
    "lyrics": "[inst]",
    "duration": 90
  }' \
  --output-dir ./out

带结构的完整人声歌曲（多语言场景使用1.5版本）：

bash

runcomfy run acestep-ai/ace-step-1.5/text-to-audio \
  --input '{
    "tags": "indie pop, anthemic, electric guitar, driving drums, female vocal, 120 BPM",
    "lyrics": "[Verse]\nChalk on the palms, laces double-knotted\nMorning on the ridge, the sun is rising\n[Chorus]\nWe rise, we strike, we never fade out\nWe rise, we strike, we sing it loud\n[Bridge]\nSoft piano breakdown\n[Outro]\nFull band, fade",
    "duration": 60
  }' \
  --output-dir ./out

Prompting tips

提示技巧

Tags do the heavy lifting — be specific:

"lo-fi hip-hop, mellow, vinyl crackle, rhodes piano, soft drums, 75 BPM"

beats

"chill music"

Include BPM in tags when it matters — ACE respects tempo language.
Lyrics with section markers:
```
[Verse]
```
,
```
[Chorus]
```
,
```
[Bridge]
```
,
```
[Outro]
```
. Keep meter consistent across lines.
Instrumental shortcut:
```
"lyrics": "[inst]"
```
or
```
"[instrumental]"
```
. Belt-and-suspenders: also say "no vocals" in tags.
Multilingual vocals: ACE Step 1.5 covers 50+ languages. Write lyrics directly in the target language; tag the language too (
```
"japanese vocal, j-pop"
```
).
Fix the seed for reproducibility (
```
"seed": 42
```
); use
```
-1
```
to explore variations.
Cheap draft → polish: ACE Step at 5–10× lower cost is great for iterating tags before committing to a long render.

标签起核心作用 — 越具体越好：

"lo-fi hip-hop, mellow, vinyl crackle, rhodes piano, soft drums, 75 BPM"

效果远优于

"chill music"

。

必要时在标签中包含BPM — ACE模型会遵循节奏要求。
带段落标记的歌词：使用
```
[Verse]
```
、
```
[Chorus]
```
、
```
[Bridge]
```
、
```
[Outro]
```
。保持每行节奏一致。
纯音乐快捷方式：
```
"lyrics": "[inst]"
```
或
```
"[instrumental]"
```
。双重保障：同时在标签中添加"no vocals"。
多语言人声：ACE Step 1.5支持50+种语言。直接用目标语言编写歌词；同时在标签中注明语言（如
```
"japanese vocal, j-pop"
```
）。
固定seed以复现结果（如
```
"seed": 42
```
）；使用
```
-1
```
探索不同变体。
低成本草稿→打磨：ACE Step成本低5–10倍，非常适合在进行长时间渲染前迭代标签组合。

Route 2: ACE Step audio-inpaint

路径2：ACE Step音频修复

Model:

acestep-ai/ace-step/audio-inpaint

Catalog: audio-inpaint

模型：

acestep-ai/ace-step/audio-inpaint

目录：audio-inpaint

Schema

接口规范

Field	Type	Required	Default	Notes
`audio`	string	yes	—	HTTPS URL to MP3 / WAV / FLAC. Up to 60 min
`tags`	string	yes	—	Comma-separated tags steering the regenerated segment
`start_time`	float	no	—	Start of editable segment, in seconds (0–240)
`start_time_relative_to`	enum	no	`start`	`start` or `end` — anchor for `start_time`
`end_time`	float	no	`30`	End of editable segment, in seconds (0–240)
`end_time_relative_to`	enum	no	`start`	`start` or `end` — anchor for `end_time`
`lyrics`	string	no	—	Lyrics for the regenerated segment. Blank = model writes; `[inst]` = no vocals
`seed`	int	no	`-1`	Reproducibility

No mask — region is defined purely by

start_time

end_time

(each anchorable to track start or end).

字段	类型	必填	默认值	说明
`audio`	string	是	—	MP3/WAV/FLAC格式的HTTPS URL。最长支持60分钟
`tags`	string	是	—	逗号分隔的标签，引导重新生成片段的风格
`start_time`	float	否	—	可编辑片段的起始时间（秒），范围0–240
`start_time_relative_to`	enum	否	`start`	`start` 或 `end` — `start_time` 的锚点
`end_time`	float	否	`30`	可编辑片段的结束时间（秒），范围0–240
`end_time_relative_to`	enum	否	`start`	`start` 或 `end` — `end_time` 的锚点
`lyrics`	string	否	—	重新生成片段的歌词。留空则由模型生成； `[inst]` 表示无 vocals
`seed`	int	否	`-1`	用于复现结果

无蒙版 — 编辑区域完全由

start_time

end_time

定义（每个参数可锚定到音轨开头或结尾）。

Invoke

调用示例

Replace 20–40 s of a track with a new bridge:

bash

runcomfy run acestep-ai/ace-step/audio-inpaint \
  --input '{
    "audio": "https://your-cdn.example/original-track.mp3",
    "tags": "indie pop, breakdown, piano only, soft, no drums",
    "start_time": 20,
    "end_time": 40,
    "lyrics": "[inst]"
  }' \
  --output-dir ./out

Anchor end relative to track end (rewrite the last 15 s):

bash

runcomfy run acestep-ai/ace-step/audio-inpaint \
  --input '{
    "audio": "https://your-cdn.example/song.mp3",
    "tags": "indie pop, fade, soft, ambient pad",
    "start_time": 15,
    "start_time_relative_to": "end",
    "end_time": 0,
    "end_time_relative_to": "end"
  }' \
  --output-dir ./out

替换音轨20–40秒为新桥段：

bash

runcomfy run acestep-ai/ace-step/audio-inpaint \
  --input '{
    "audio": "https://your-cdn.example/original-track.mp3",
    "tags": "indie pop, breakdown, piano only, soft, no drums",
    "start_time": 20,
    "end_time": 40,
    "lyrics": "[inst]"
  }' \
  --output-dir ./out

锚定到音轨结尾（重写最后15秒）：

bash

runcomfy run acestep-ai/ace-step/audio-inpaint \
  --input '{
    "audio": "https://your-cdn.example/song.mp3",
    "tags": "indie pop, fade, soft, ambient pad",
    "start_time": 15,
    "start_time_relative_to": "end",
    "end_time": 0,
    "end_time_relative_to": "end"
  }' \
  --output-dir ./out

Tips

技巧

Match the surrounding tags — if the original is "indie pop, electric guitar, 120 BPM", the inpaint segment should share enough of the tags to blend, not contrast.
Inpaint window is up to ~4 min even on a 60-min source — pick a focused range, not the whole track.
Use
_relative_to: "end"
to target the outro/last seconds without computing exact timestamps.

匹配周边风格标签 — 如果原音轨标签是"indie pop, electric guitar, 120 BPM"，修复片段应保留足够多的相同标签以实现融合，避免风格冲突。
修复窗口最长约4分钟 — 即使源音轨长达60分钟，也应选择聚焦的时间段，而非整个音轨。
使用
_relative_to: "end"
— 无需计算精确时间戳即可定位尾声或最后几秒。

Route 3: ACE Step audio-outpaint

路径3：ACE Step音频扩展

Model:

acestep-ai/ace-step/audio-outpaint

Catalog: audio-outpaint

模型：

acestep-ai/ace-step/audio-outpaint

目录：audio-outpaint

Schema

接口规范

Field	Type	Required	Default	Notes
`audio`	string	yes	—	HTTPS URL to MP3 / WAV / FLAC. Up to 60 min
`tags`	string	yes	—	Tags steering the extended sections
`extend_before_duration`	float	no	`0`	Seconds of new audio before the original (0–240)
`extend_after_duration`	float	no	`30`	Seconds of new audio after the original (0–240)
`lyrics`	string	no	—	Optional lyrics for extended sections
`seed`	int	no	`-1`	Reproducibility

字段	类型	必填	默认值	说明
`audio`	string	是	—	MP3/WAV/FLAC格式的HTTPS URL。最长支持60分钟
`tags`	string	是	—	引导扩展部分风格的标签
`extend_before_duration`	float	否	`0`	在原音轨之前添加的新音频时长（秒），范围0–240
`extend_after_duration`	float	否	`30`	在原音轨之后添加的新音频时长（秒），范围0–240
`lyrics`	string	否	—	扩展部分的可选歌词
`seed`	int	否	`-1`	用于复现结果

Invoke

调用示例

Extend a 30 s hook into a 2 min cut (add 30 s intro + 60 s outro):

bash

runcomfy run acestep-ai/ace-step/audio-outpaint \
  --input '{
    "audio": "https://your-cdn.example/hook-30s.mp3",
    "tags": "indie pop, electric guitar, drums, build-up before chorus, fade outro",
    "extend_before_duration": 30,
    "extend_after_duration": 60,
    "lyrics": "[inst]"
  }' \
  --output-dir ./out

Add only a fade-out (no pre-extension):

bash

runcomfy run acestep-ai/ace-step/audio-outpaint \
  --input '{
    "audio": "https://your-cdn.example/track.mp3",
    "tags": "ambient pad, soft fade, low volume tail",
    "extend_before_duration": 0,
    "extend_after_duration": 20
  }' \
  --output-dir ./out

将30秒钩子扩展为2分钟版本（添加30秒前奏+60秒尾声）：

bash

runcomfy run acestep-ai/ace-step/audio-outpaint \
  --input '{
    "audio": "https://your-cdn.example/hook-30s.mp3",
    "tags": "indie pop, electric guitar, drums, build-up before chorus, fade outro",
    "extend_before_duration": 30,
    "extend_after_duration": 60,
    "lyrics": "[inst]"
  }' \
  --output-dir ./out

仅添加淡出效果（不扩展前奏）：

bash

runcomfy run acestep-ai/ace-step/audio-outpaint \
  --input '{
    "audio": "https://your-cdn.example/track.mp3",
    "tags": "ambient pad, soft fade, low volume tail",
    "extend_before_duration": 0,
    "extend_after_duration": 20
  }' \
  --output-dir ./out

Tips

技巧

Tags describe the extension, not the original — what should the new section sound like?
Bidirectional in one call — set both
```
extend_before_duration
```
and
```
extend_after_duration
```
to add intro + outro in one go.
Don't exceed 4 min total — if original is 3 min, you can add max 1 min combined.

标签描述扩展部分，而非原音轨 — 新段落应该是什么风格？
单次调用双向扩展 — 同时设置
```
extend_before_duration
```
和
```
extend_after_duration
```
，可一次性添加前奏和尾声。
总时长不超过4分钟 — 如果原音轨时长为3分钟，最多可添加1分钟的扩展内容。

When to pick ACE Step vs ElevenLabs Music

ACE Step与ElevenLabs Music的选择场景

ACE Step and ElevenLabs Music are different tools:

Dimension	ACE Step	ElevenLabs Music
Cost	$0.0002–0.0003 / s	$0.0083 / s (~27× more)
License	Open-weights (Apache 2.0)	Commercial, ElevenLabs-hosted
Multilingual vocals	50+ languages (1.5 variant)	Strong multilingual support
Structured lyrics	`[Verse]/[Chorus]/[Bridge]` markers	`[Verse]/[Chorus]/[Bridge]` markers
Max duration / call	240 s (4 min)	300 s (5 min)
Inpaint / outpaint	Yes (time-range based)	No
Tag-driven composition	Yes (tags is required field)	Style is part of free-text prompt
Best for	Cost-sensitive batches, drafts, inpaint/outpaint workflows, open-weights pipelines	Premium vocal song hooks, polished commercial cuts

Cheap draft pattern: draft tag combos with ACE Step → lock vibe → final render on ElevenLabs Music if a polished commercial cut is needed.

For the routing skill that picks between them automatically based on intent, see

ai-music

once it ships.

ACE Step和ElevenLabs Music是不同的工具：

维度	ACE Step	ElevenLabs Music
成本	$0.0002–0.0003 / 秒	$0.0083 / 秒（约贵27倍）
许可证	开源权重（Apache 2.0）	商业授权，ElevenLabs托管
多语言人声	50+种语言（1.5版本）	强大的多语言支持
结构化歌词	支持 `[Verse]/[Chorus]/[Bridge]` 标记	支持 `[Verse]/[Chorus]/[Bridge]` 标记
单次调用最长时长	240秒（4分钟）	300秒（5分钟）
音频修复/扩展	支持（基于时间范围）	不支持
标签驱动创作	支持（tags为必填字段）	风格作为自由文本提示的一部分
最佳适用场景	对成本敏感的批量任务、草稿制作、修复/扩展工作流、开源权重管道	高品质人声歌曲钩子、打磨完成的商业曲目

低成本草稿流程：使用ACE Step迭代标签组合→锁定风格→若需打磨完成的商业曲目，再用ElevenLabs Music进行最终渲染。

如需根据意图自动在两者间选择的路由技能，请关注即将推出的

ai-music

。

Common patterns

常见使用模式

Cost-sensitive background music library

对成本敏感的背景音乐库

Route 1 (ACE Step base) with varied tag combos, 60–90 s each,
```
[inst]
```

使用路径1（ACE Step基础版），搭配不同标签组合，时长60–90秒，设置
```
[inst]
```

Multilingual launch (same song, many languages)

多语言发布（同一歌曲，多种语言）

Route 1 (ACE Step 1.5) with identical tags, swap
```
lyrics
```
per language

使用路径1（ACE Step 1.5），标签保持一致，针对不同语言替换
```
lyrics
```
内容

Section repair (bad chorus → new chorus)

段落修复（糟糕的副歌→新副歌）

Route 2 (audio-inpaint) with
```
start_time
```
/
```
end_time
```
around the bad section, tags matching the song style

使用路径2（音频修复），设置
```
start_time
```
/
```
end_time
```
定位糟糕段落，标签匹配歌曲风格

Hook → full track

钩子→完整曲目

Route 3 (audio-outpaint) adds intro before + outro after a tight 30 s hook

使用路径3（音频扩展），在30秒紧凑钩子前后分别添加前奏和尾声

Game loop bed

游戏循环音轨

Route 1 (ACE Step base) with "seamless loop, consistent groove" in tags, 60–120 s

使用路径1（ACE Step基础版），标签中包含"seamless loop, consistent groove"，时长60–120秒

Browse the full catalog

浏览完整目录

ACE Step on RunComfy — all four endpoints (base t2a, 1.5 t2a, inpaint, outpaint)
All RunComfy models — image, video, and audio endpoints
docs.runcomfy.com/cli — CLI install, authentication, troubleshooting

RunComfy上的ACE Step — 四个端点全部包含（基础版文本转音频、1.5版文本转音频、修复、扩展）
所有RunComfy模型 — 图像、视频、音频端点
docs.runcomfy.com/cli — CLI安装、认证、故障排查

Exit codes

退出码

code	meaning
0	success
64	bad CLI args
65	bad input JSON / schema mismatch
69	upstream 5xx
75	retryable: timeout / 429
77	not signed in or token rejected

Full reference: docs.runcomfy.com/cli/troubleshooting.

代码	含义
0	成功
64	CLI参数错误
65	输入JSON错误/ schema不匹配
69	上游服务5xx错误
75	可重试：超时/429限流
77	未登录或令牌被拒绝

完整参考：docs.runcomfy.com/cli/troubleshooting。

How it works

工作原理

The skill picks one of the four ACE Step endpoints based on the user's intent — generate from scratch (t2a base or 1.5), regenerate a time range (inpaint), or extend the canvas (outpaint) — and invokes

runcomfy run

with the matching JSON body. The CLI POSTs to the RunComfy Model API, polls request status, and downloads the generated audio file into

--output-dir

本技能会根据用户意图选择四个ACE Step端点中的一个——从头生成（基础版或1.5版文本转音频）、重新生成时间范围（修复）、或扩展音轨（扩展）——并调用

runcomfy run

传入匹配的JSON参数。CLI会向RunComfy模型API发送POST请求，轮询请求状态，并将生成的音频文件下载到

--output-dir

目录中。

Security & Privacy

安全与隐私

Install via verified package manager only. Use
```
npm i -g @runcomfy/cli
```
or
```
npx -y @runcomfy/cli
```
. Agents must not pipe an arbitrary remote install script into a shell on the user's behalf — if the operator wants the curl-pipe path documented at
```
docs.runcomfy.com/cli/install
```
, they should review the script first.
Token storage:
```
runcomfy login
```
writes the API token to
```
~/.config/runcomfy/token.json
```
with mode 0600. Set
```
RUNCOMFY_TOKEN
```
env var to bypass the file in CI / containers. Never echo the token into a prompt, log it, or check it in.
Input boundary (shell injection): prompts and audio URLs are passed as a JSON string via
```
--input
```
. The CLI does not shell-expand prompt content; it transmits the JSON body directly to the Model API over HTTPS. No shell-injection surface from prompt content.
Indirect prompt injection (third-party content): source
```
audio
```
URLs for inpaint / outpaint are untrusted — embedded steganographic instructions or unusual EXIF can influence generation. Agent mitigations:
- Ingest only audio URLs the user explicitly provided for this task.
- When the output diverges from the prompt, suspect the source audio.
Lyrics provenance: if the user supplies lyrics, confirm they have the rights. Generating music around copyrighted lyrics is the operator's responsibility.
Outbound endpoints (allowlist): only
```
model-api.runcomfy.net
```
and
```
*.runcomfy.net
```
/
```
*.runcomfy.com
```
. No telemetry, no callbacks.
Generated-file size cap: the CLI aborts any single download > 2 GiB.
Scope of bash usage: declared
```
allowed-tools: Bash(runcomfy *)
```
. The skill only invokes
```
runcomfy <subcommand>
```
; install lines are one-time operator setup.

仅通过已验证的包管理器安装。使用
```
npm i -g @runcomfy/cli
```
或
```
npx -y @runcomfy/cli
```
。Agent不得将任意远程安装脚本通过管道传入用户的shell — 如果操作者需要使用
```
docs.runcomfy.com/cli/install
```
文档中的curl管道方式，应先审核脚本内容。
令牌存储：
```
runcomfy login
```
会将API令牌写入
```
~/.config/runcomfy/token.json
```
，权限为0600。在CI/容器中可设置
```
RUNCOMFY_TOKEN
```
环境变量以绕过文件存储。切勿在提示中回显令牌、记录令牌或将其提交到代码仓库。
输入边界（shell注入）：提示内容和音频URL通过
```
--input
```
以JSON字符串形式传递。CLI不会对提示内容进行shell扩展；会直接通过HTTPS将JSON主体传输到模型API。提示内容不存在shell注入风险。
间接提示注入（第三方内容）：修复/扩展使用的源
```
audio
```
URL是不可信的 — 嵌入的隐写指令或异常EXIF信息可能影响生成结果。Agent缓解措施：
- 仅使用用户为此任务明确提供的音频URL。
- 当输出与提示不符时，怀疑源音频存在问题。
歌词来源：如果用户提供歌词，确认其拥有相关权利。围绕受版权保护的歌词生成音乐的责任由操作者承担。
出站端点（白名单）：仅允许访问
```
model-api.runcomfy.net
```
和
```
*.runcomfy.net
```
/
```
*.runcomfy.com
```
。无遥测、无回调。
生成文件大小限制：CLI会中止任何超过2 GiB的单个文件下载。
Bash使用范围：已声明
```
allowed-tools: Bash(runcomfy *)
```
。本技能仅调用
```
runcomfy <subcommand>
```
；安装命令为操作者一次性设置步骤。

ace-step

Original

Translation

ACE Step — Pro Pack on RunComfy

ACE Step — RunComfy专业套件

Install this skill

安装此技能

Powered by the RunComfy CLI

基于RunComfy CLI运行

Pick the right endpoint

选择合适的端点

Route 1: ACE Step text-to-audio (default)

路径1：ACE Step文本转音频（默认）

Schema (both variants — same shape)

接口规范（两个版本 — 结构一致）

Invoke

调用示例

Prompting tips

提示技巧

Route 2: ACE Step audio-inpaint

路径2：ACE Step音频修复

Schema

接口规范

Invoke

调用示例

Tips

技巧

Route 3: ACE Step audio-outpaint

路径3：ACE Step音频扩展

Schema

接口规范

Invoke

调用示例

Tips

技巧

When to pick ACE Step vs ElevenLabs Music

ACE Step与ElevenLabs Music的选择场景

Common patterns

常见使用模式

Cost-sensitive background music library

对成本敏感的背景音乐库

Multilingual launch (same song, many languages)

多语言发布（同一歌曲，多种语言）

Section repair (bad chorus → new chorus)

段落修复（糟糕的副歌→新副歌）

Hook → full track

钩子→完整曲目

Game loop bed

游戏循环音轨

Browse the full catalog

浏览完整目录

Exit codes

退出码

How it works

工作原理

Security & Privacy

安全与隐私

See also

相关链接