# ai-avatar-video
Create AI avatar, talking-head, and lip-sync videos on RunComfy via the `runcomfy` CLI. Routes across ByteDance OmniHuman (audio-driven full-body avatar), Wan-AI Wan 2-7 (audio-driven mouth sync via `audio_url` on a portrait), HappyHorse 1.0 (Arena #1 t2v / i2v with in-pass audio), and Seedance v2 Pro (multi-modal cinematic with reference audio + reference subject). Picks the right model for the user's actual intent — UGC voiceover, virtual presenter, dubbed product demo, lip-synced character, dialog scene — and ships each model's documented prompting patterns plus the minimal `runcomfy run` invoke. Triggers on "talking head", "lip sync", "avatar video", "make X speak", "audio to video", "audio driven avatar", "virtual presenter", "AI spokesperson", "dubbed video", "UGC avatar", "HeyGen alternative", "Synthesia alternative", "digital human", "make this portrait talk", "video from voiceover", or any explicit ask to put words in a face.
Install the skill with either command:

```bash
npx skill4agent add agentspace-so/runcomfy-agent-skills ai-avatar-video
# or:
npx skills add agentspace-so/runcomfy-agent-skills --skill ai-avatar-video -g
```

## Quickstart

```bash
# 1. Install (see the runcomfy-cli skill for details)
npm i -g @runcomfy/cli   # or: npx -y @runcomfy/cli --version

# 2. Sign in
runcomfy login   # or in CI: export RUNCOMFY_TOKEN=<token>

# 3. Generate an avatar video
runcomfy run <vendor>/<model>/<endpoint> \
--input '{"prompt": "...", "audio_url": "https://...", "image_url": "https://..."}' \
--output-dir ./out
```

## Model routing

### `bytedance/omnihuman/api`

ByteDance audio-driven full-body avatar. Feed one portrait + one audio file, get back a video where the subject speaks / sings / gestures naturally. Listed on RunComfy's `/feature/lip-sync` page as the curated default. Pick for: UGC voiceover, virtual presenter, dubbed product demo, multi-language clips from the same portrait. Avoid for: no audio file available (speech must be generated from a script); use HappyHorse 1.0.
### `happyhorse/happyhorse-1-0/text-to-video` / `happyhorse/happyhorse-1-0/image-to-video`

Arena #1 t2v / i2v with in-pass audio generated from the prompt. No external audio file required: quote the spoken line inside the prompt. Pick for: a written script with no audio file, "write a script → get a video" flows, concept clips, i2v talking-head from an existing portrait. Avoid for: precise lip-sync to a specific MP3 (audio is regenerated each call, not locked).
### `bytedance/seedance-v2/pro`

ByteDance multi-modal flagship: up to 9 reference images, 3 reference videos, and 3 reference audio tracks composed in one pass with cinematic motion / lens / lighting control. Pick for: cinematic monologue with reference subject + reference audio + reference scene; ad creative. Avoid for: simple "portrait + audio" jobs (overpowered, slower); use OmniHuman.
### `wan-ai/wan-2-7/text-to-video`

Open-weights with an `audio_url` field: the prompt describes the scene, the audio file drives the mouth. Pick for: full scene control (not just a portrait), a specific voiceover MP3, an open-weights pipeline. Avoid for: the simplest portrait-talks job; use OmniHuman.
### `community/wan-2-2-animate/api`

Community-published variant on the Wan 2-2 base. Audio-driven full-body animation of stylized characters (illustration, anime, mascot). Pick for: stylized / illustrated character + audio (not a photoreal portrait). Avoid for: photoreal subjects; use OmniHuman or Wan 2-7.
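To make the routing above mechanical, here is a minimal decision sketch. The `pick_model` helper and its three flags are illustrative conventions, not part of the skill or the CLI:

```bash
# Hypothetical router: map job traits to a model id from the table above.
#   $1 has_audio  yes|no        do we hold a real audio file (MP3/WAV)?
#   $2 subject    photoreal|stylized
#   $3 scene      portrait|full
pick_model() {
  local has_audio="$1" subject="$2" scene="$3"
  if [ "$has_audio" = "no" ]; then
    echo "happyhorse/happyhorse-1-0/text-to-video"   # script only: audio generated in-pass
  elif [ "$subject" = "stylized" ]; then
    echo "community/wan-2-2-animate/api"             # illustrated character + audio
  elif [ "$scene" = "full" ]; then
    echo "wan-ai/wan-2-7/text-to-video"              # whole scene driven by audio_url
  else
    echo "bytedance/omnihuman/api"                   # curated default: portrait + audio
  fi
}

pick_model yes photoreal portrait   # -> bytedance/omnihuman/api
```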
## Examples

### OmniHuman: portrait + audio

```bash
runcomfy run bytedance/omnihuman/api \
--input '{
"image_url": "https://your-cdn.example/presenter.jpg",
"audio_url": "https://your-cdn.example/voiceover.mp3"
}' \
--output-dir ./out
```

### Wan 2-7: scene prompt, mouth driven by `audio_url`

```bash
runcomfy run wan-ai/wan-2-7/text-to-video \
--input '{
"prompt": "Studio portrait of a woman in her 30s, confident expression, soft window light, neutral gray background.",
"audio_url": "https://your-cdn.example/voiceover.mp3",
"duration": 8
}' \
--output-dir ./out
```

### Wan 2-2 Animate: stylized character + audio

```bash
runcomfy run community/wan-2-2-animate/api \
--input '{
"image_url": "https://your-cdn.example/character.png",
"audio_url": "https://your-cdn.example/voiceover.mp3"
}' \
--output-dir ./out
```

### HappyHorse 1.0 text-to-video: script in the prompt, audio generated in-pass

```bash
runcomfy run happyhorse/happyhorse-1-0/text-to-video \
--input '{
"prompt": "A woman in her 30s, confident expression, looks at the camera and says clearly: \"Welcome to our product demo. Today we are going to show you three things.\" Soft daylight, neutral background.",
"duration": 6,
"aspect_ratio": "9:16",
"resolution": "1080p"
}' \
--output-dir ./out
```

### HappyHorse 1.0 image-to-video: make an existing portrait talk

```bash
runcomfy run happyhorse/happyhorse-1-0/image-to-video \
--input '{
"image_url": "https://your-cdn.example/portrait.jpg",
"prompt": "She looks at the camera and says clearly: \"Hi, I am Aria.\" Audio: friendly tone, neutral accent.",
"duration": 5
}' \
--output-dir ./out
```

HappyHorse prompting pattern: introduce the spoken line with `says clearly: "…"` and steer the generated voice with a trailing cue such as `Audio: friendly tone, neutral accent.`

### Seedance v2 Pro: cinematic monologue with reference subject + audio

```bash
runcomfy run bytedance/seedance-v2/pro \
--input '{
"prompt": "Anamorphic close-up — the subject delivers a confident monologue to camera, golden hour light through window, shallow DoF.",
"reference_images": ["https://your-cdn.example/subject.jpg"],
"reference_audio": ["https://your-cdn.example/voiceover.mp3"],
"duration": 10,
"aspect_ratio": "21:9"
}' \
--output-dir ./out
```

See also RunComfy's `/models/feature/lip-sync` and `/models/feature/character-swap` catalog pages and the recently-added list for newer `audio_url`-capable models.

## Exit codes

| code | meaning |
|---|---|
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |
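Only exit code 75 is marked retryable above. A minimal retry sketch; the `run_with_retry` helper and its backoff schedule are illustrative, not part of the CLI:

```bash
# Retry only on exit code 75 (timeout / 429); every other code is final.
run_with_retry() {
  local attempt rc
  for attempt in 1 2 3; do
    runcomfy run "$@"
    rc=$?
    [ "$rc" -eq 0 ] && return 0
    [ "$rc" -ne 75 ] && return "$rc"   # 64/65/69/77: retrying will not help
    sleep $((attempt * 10))            # linear backoff before the next try
  done
  return 75
}

run_with_retry bytedance/omnihuman/api \
--input '{"image_url": "https://your-cdn.example/presenter.jpg", "audio_url": "https://your-cdn.example/voiceover.mp3"}' \
--output-dir ./out
```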
## Notes

- Invoke models as `runcomfy run <model_id>`, pass parameters as JSON via `--input`, and collect artifacts with `--output-dir`.
- Install with `npm i -g @runcomfy/cli`, or run ad hoc with `npx -y @runcomfy/cli`.
- Auth: `runcomfy login` stores a token at `~/.config/runcomfy/token.json`; in CI, set `RUNCOMFY_TOKEN` instead.
- Network: API calls go to `model-api.runcomfy.net`; allowlist `*.runcomfy.net` and `*.runcomfy.com` in sandboxed environments.
- Agent sandboxing: `allowed-tools: Bash(runcomfy *)` permits every `runcomfy <subcommand>` and nothing else.

Related skills: `runcomfy-cli`, `ai-video-generation`, `lipsync`, `face-swap`, `image-to-video`, `ai-image-generation`.
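For CI, the same flow non-interactively, per the auth note above; the `CI_RUNCOMFY_TOKEN` secret name is a placeholder:

```bash
# Non-interactive auth: skip `runcomfy login`, export the token instead.
export RUNCOMFY_TOKEN="${CI_RUNCOMFY_TOKEN:?set this secret in your CI config}"

runcomfy run bytedance/omnihuman/api \
--input '{"image_url": "https://your-cdn.example/presenter.jpg", "audio_url": "https://your-cdn.example/voiceover.mp3"}' \
--output-dir ./out
```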