seedance-v2
Seedance 2.0 Pro: Pro Pack on RunComfy
ByteDance Seedance 2.0 Pro — multimodal cinematic video generator with native lip-synced audio — hosted on the RunComfy Model API.
bash
npx skills add agentspace-so/runcomfy-skills --skill seedance-v2 -g

When to pick this model (vs siblings)
Seedance 2.0 Pro's distinct strength is multi-modal cinematic short-form: combine character images + scene videos + reference audio into one coherent shot. Pick it when fidelity to a reference identity / scene matters and you want native lip-sync.
| You want | Use |
|---|---|
| Lip-synced spokesperson / dialogue ad | Seedance 2.0 Pro |
| Multi-modal references (image + video + audio) | Seedance 2.0 Pro |
| Brand-consistent multi-language narrative | Seedance 2.0 Pro |
| Currently-#1 blind-vote video quality | HappyHorse 1.0 |
| Audio-driven lip-sync from your own track | Wan 2.7 |
| Motion editing on existing footage | Kling Video O1 |
| Ultra-fast iteration | LTX 2 |
If the user said "Seedance" / "Seedance 2" / "ByteDance video" explicitly, route here regardless.
Prerequisites
- RunComfy CLI: `npm i -g @runcomfy/cli`
- RunComfy account: `runcomfy login` opens a browser device-code flow.
- CI / containers: set `RUNCOMFY_TOKEN=<token>` instead of `runcomfy login`.
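In a CI script, the choice between the two auth paths can be made explicit before any model call. The following is a minimal sketch that assumes only what the list above states (`RUNCOMFY_TOKEN` stands in for `runcomfy login`); the helper names `runcomfy_auth_mode` and `require_token_in_ci` are hypothetical and not part of the CLI.

```shell
# Hypothetical helper: report which auth path a script would use.
# Prints "token" when RUNCOMFY_TOKEN is set and non-empty, else "login".
runcomfy_auth_mode() {
  if [ -n "${RUNCOMFY_TOKEN:-}" ]; then
    echo "token"
  else
    echo "login"
  fi
}

# Fail fast in non-interactive jobs, where a browser device-code flow
# cannot be completed.
require_token_in_ci() {
  if [ "$(runcomfy_auth_mode)" = "login" ]; then
    echo "RUNCOMFY_TOKEN is not set; export it before running in CI" >&2
    return 1
  fi
}
```

Calling `require_token_in_ci || exit 1` at the top of a pipeline step surfaces a missing token immediately instead of stalling on a login prompt.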
Endpoints + input schema
bytedance/seedance-v2/pro

| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `prompt` | string | yes | — | CN ≤ 500 chars OR EN ≤ 1000 words. |
| `image_url` | array | no | | 0–9 references (JPEG/PNG/WebP/BMP/TIFF/GIF). |
| `video_url` | array | no | | 0–3 clips (MP4/MOV), 2–15s each. |
| `audio_url` | array | no | | 0–3 audio refs (WAV/MP3), 2–15s, < 15MB each. |
| | enum | no | | |
| `duration` | int | no | 5 | 4–15 (whole seconds). |
| | enum | no | | |
| `generate_audio` | bool | no | true | In-pass synchronized speech / SFX / music. |
| `seed` | int | no | — | Reproducibility. |
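The numeric limits in the table can be checked locally before a request is submitted. This is a minimal sketch in POSIX shell; the function names are hypothetical, and the checks mirror only the ranges listed above (duration of 4–15 whole seconds, 0–9 image references, 0–3 video or audio references).

```shell
# Succeeds only for a whole number of seconds in the documented 4-15 range.
valid_duration() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, negative, or fractional input
  esac
  [ "$1" -ge 4 ] && [ "$1" -le 15 ]
}

# Usage: valid_ref_count KIND COUNT, where KIND is image, video, or audio.
valid_ref_count() {
  case "$1" in
    image)       max=9 ;;
    video|audio) max=3 ;;
    *)           return 1 ;;
  esac
  case "$2" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$2" -le "$max" ]
}
```

For example, `valid_duration 8 && valid_ref_count image 1` passes for the lip-synced ad request, while `valid_duration 16` fails.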
How to invoke
Default (text only, 5s, 720p with audio):
bash
runcomfy run bytedance/seedance-v2/pro \
--input '{"prompt": "<user prompt>"}' \
--output-dir <absolute/path>

Lip-synced ad with character reference (image-stable, text-evolves):
bash
runcomfy run bytedance/seedance-v2/pro \
--input '{
"prompt": "Medium close-up. The woman explains today'\''s special in a warm friendly tone, slow push-in, soft window light, gentle cafe ambience.",
"image_url": ["https://.../barista-headshot.jpg"],
"duration": 8,
"aspect_ratio": "9:16"
}' \
--output-dir <absolute/path>

Multi-modal (image + video + audio refs):
bash
runcomfy run bytedance/seedance-v2/pro \
--input '{
"prompt": "Subject from image 1 walks through the café from video 1, voice tone matches audio 1.",
"image_url": ["https://.../subject.jpg"],
"video_url": ["https://.../cafe-locked-shot.mp4"],
"audio_url": ["https://.../voice-ref.mp3"]
}' \
--output-dir <absolute/path>

The CLI submits, polls, fetches the result, and downloads any `*.runcomfy.net` / `*.runcomfy.com` URLs into `--output-dir`.
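The `'\''` escape in the lip-synced ad example is needed because the JSON sits inside a single-quoted shell string. One way to avoid it, sketched here with standard shell features only, is to read the JSON from a quoted here-document so that apostrophes pass through untouched. The wrapper name `run_barista_ad` is hypothetical; the flags are the ones shown in the examples.

```shell
# A quoted here-document ('JSON') disables shell expansion inside the body,
# so the apostrophe in "today's" needs no escaping.
input=$(cat <<'JSON'
{
  "prompt": "Medium close-up. The woman explains today's special in a warm friendly tone.",
  "duration": 8,
  "aspect_ratio": "9:16"
}
JSON
)

# Hypothetical wrapper; pass an absolute output directory as the first argument.
run_barista_ad() {
  runcomfy run bytedance/seedance-v2/pro \
    --input "$input" \
    --output-dir "$1"
}
```

Quoting `"$input"` keeps the whole JSON document as a single argument to `--input`.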
Prompting: what actually works
Image vs text division. This is the single most important rule. Stable identity (face, costume, brand mark, logo) → put in `image_url`. Evolving narrative (action, mood, lighting, camera) → put in `prompt`. Describing a face in detail in words wastes tokens and produces drift.

Camera + motion in plain language. "Medium close-up", "slow push-in", "handheld follow", "locked-off wide" all work as directives. Combine:

"Medium close-up. Slow push-in over 3 seconds. Handheld, slight breathing motion."

Audio direction with `generate_audio: true`. Say the tone: "warm friendly conversational", "calm instructional", "crisp newsroom delivery". For ambient: "gentle cafe chatter, distant traffic, no foreground music".

Reference media specs. Videos must be 2–15s; audio must be 2–15s and under 15MB. Out-of-range files are rejected. Match the aspect ratio of refs to your output to avoid crops.
Anti-patterns:
- Mixing radically different aesthetic refs (watercolor + photoreal) → confuses.
- Conflicting style cues in prompt → simplify by removing contradictions.
- Trying to describe stable identity verbally → use `image_url` instead.
- Asking for >15s clips → 422; segment into multiple calls.
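The last anti-pattern suggests segmenting longer pieces into multiple calls. The following is a minimal sketch of one way to plan the segments, assuming each call must stay within the 4–15s range and that near-equal lengths are acceptable. `split_duration` is a hypothetical helper, not a CLI feature.

```shell
# Print segment lengths (whole seconds, each within 4-15) that sum to the total.
split_duration() {
  total=$1
  if [ "$total" -lt 4 ]; then
    echo "total must be at least 4 seconds" >&2
    return 1
  fi
  n=$(( ($total + 14) / 15 ))     # fewest calls that keep every segment <= 15s
  base=$(( $total / $n ))
  extra=$(( $total % $n ))        # the first $extra segments get one more second
  i=0
  out=""
  while [ "$i" -lt "$n" ]; do
    len=$base
    if [ "$i" -lt "$extra" ]; then len=$(( $base + 1 )); fi
    out="$out${out:+ }$len"
    i=$(( $i + 1 ))
  done
  echo "$out"
}

split_duration 40   # prints: 14 13 13
```

Each printed length then becomes the `duration` of one `runcomfy run` call. Splitting evenly, rather than into 15 + 15 + remainder, avoids a final segment shorter than the 4s minimum.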
Where it shines
| Use case | Why Seedance 2.0 Pro |
|---|---|
| Spokesperson / dialogue ads | Native in-pass lip-sync, no separate TTS step |
| Brand-consistent multi-language narratives | Image refs hold identity; text drives translation |
| Cinematic short-form film previs | Camera-shot grammar + multi-modal refs |
| Ad creatives with reference music / VO tone | Audio refs guide voice / mood without locking lip-sync |
| Reproducible variant testing | Seed control + fixed schema |
Sample prompts (verified to produce strong results)
Default playground example:
Golden hour on a quiet cafe terrace: a barista wipes the counter, then
looks up and explains today's special in a friendly tone, natural
lip-sync. Medium close-up, slow push-in; warm side light, soft bokeh
through glass, gentle cafe ambience and subtle film grain.

Multi-modal lip-sync (text + image):
Same person as image 1 in a softly-lit recording booth, leaning into
the mic, says: "We just shipped the biggest update of the year."
Calm conversational tone. Medium close-up, locked tripod, shallow DOF,
warm key light from camera-left.

Limitations
- Duration 4–15s — no longer clips on this endpoint.
- Resolution ceiling 720p on the playground variant.
- Reference media specs — videos / audio must be 2–15s; audio < 15MB.
- Lip-sync quality — depends on prompt clarity; not guaranteed perfect under all conditions.
- No `@`-syntax for character binding; relies on image refs + prompt alignment.
Exit codes
| code | meaning |
|---|---|
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |
Full reference: docs.runcomfy.com/cli/troubleshooting.
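A wrapper script can branch on these codes. This is a minimal sketch that assumes only the table above, where 75 is the one code marked retryable; the helper names, the limit of three attempts, and the backoff delays are illustrative choices, not CLI behavior.

```shell
# Map a runcomfy exit code to a coarse action for the calling script.
classify_exit() {
  case "$1" in
    0)     echo "ok" ;;
    75)    echo "retry" ;;        # timeout / 429
    77)    echo "reauth" ;;       # not signed in or token rejected
    64|65) echo "fix-input" ;;    # bad CLI args or bad input JSON
    *)     echo "fail" ;;         # 69 (upstream 5xx) and anything unlisted
  esac
}

# Run a command up to 3 times, retrying only on exit code 75.
# RETRY_BASE_DELAY (seconds, default 5) doubles after each retry.
run_with_retry() {
  attempt=1
  delay=${RETRY_BASE_DELAY:-5}
  while :; do
    code=0
    "$@" || code=$?
    if [ "$(classify_exit "$code")" != "retry" ] || [ "$attempt" -ge 3 ]; then
      return "$code"
    fi
    sleep "$delay"
    delay=$(( $delay * 2 ))
    attempt=$(( $attempt + 1 ))
  done
}
```

A call would look like `run_with_retry runcomfy run bytedance/seedance-v2/pro --input "$input" --output-dir "$out"`, with `input` and `out` set by the caller. Codes 64, 65, and 77 are returned immediately because repeating the same request cannot fix them.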
How it works
The skill invokes `runcomfy run bytedance/seedance-v2/pro` with a JSON body matching the schema. The CLI POSTs to `https://model-api.runcomfy.net/v1/models/bytedance/seedance-v2/pro`, polls the request, fetches the result, and downloads any `.runcomfy.net` / `.runcomfy.com` URL into `--output-dir`. `Ctrl-C` cancels the remote request before exit.

Security & Privacy
- Token storage: `runcomfy login` writes the API token to `~/.config/runcomfy/token.json` with mode 0600 (owner-only read/write). Set the `RUNCOMFY_TOKEN` env var to bypass the file entirely in CI / containers.
- Input boundary: the user prompt is passed as a JSON string to the CLI via `--input`. The CLI does NOT shell-expand the prompt; it transmits the JSON body directly to the Model API over HTTPS. No shell injection surface from prompt content.
- Third-party content: image / mask / video URLs you pass are fetched by the RunComfy model server, not by the CLI on your machine. Treat external URLs as untrusted; image-based prompt injection is a known risk for any image-edit / video-edit model.
- Outbound endpoints: only `model-api.runcomfy.net` (request submission) and `*.runcomfy.net` / `*.runcomfy.com` (download whitelist for generated outputs). No telemetry, no callbacks.
- Generated-file size cap: the CLI aborts any single download > 2 GiB to prevent disk-fill from a malicious or runaway model output.
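The download whitelist described above amounts to a host-suffix check. The following is a minimal sketch of that idea for your own scripts, assuming HTTPS-only URLs and the two domains listed. `is_allowed_download_url` is a hypothetical helper and does not reflect how the CLI implements its check.

```shell
# Succeeds only for https URLs whose host ends in .runcomfy.net or .runcomfy.com.
is_allowed_download_url() {
  case "$1" in
    https://*) ;;
    *) return 1 ;;
  esac
  host=${1#https://}       # drop the scheme
  host=${host%%/*}         # drop the path
  host=${host%%"?"*}       # drop a query string that follows the host directly
  host=${host%%"#"*}       # drop a fragment that follows the host directly
  case "$host" in
    *@*|*:*) return 1 ;;   # reject userinfo and explicit ports to keep the check simple
  esac
  case "$host" in
    *.runcomfy.net|*.runcomfy.com) return 0 ;;
    *) return 1 ;;
  esac
}
```

Matching on the parsed host rather than the whole URL is what rejects look-alikes such as `https://runcomfy.net.evil.example/` or a whitelisted domain smuggled into a query string.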