muapi-ugc-video-factory

UGC Video Factory


Turn a person photo + product photo (+ optional script & environment) into a vertical 9:16 UGC-style video ad with native dialogue audio.
A three-stage pipeline:
  1. GPT writes a director-grade ultra-realistic lifestyle photography prompt from your inputs.
  2. Nano-Banana Pro Edit fuses the person + product into a single hero photo (1K, 9:16).
  3. Seedance 2.0 VIP Image-to-Video animates the hero photo into a 10s vertical UGC clip with synced spoken audio.

Inputs


| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `person` | image_url | yes | (none) | Photo of the person who will appear in the ad (face + upper body works best). |
| `product` | image_url | yes | (none) | Clear photo of the product (preferably on neutral background, logo/text legible). |
| `script` | text | no | Okay… first of all, ship happens. And this hat is honestly my favorite. It also comes in navy and black, so you can pick your vibe. | The exact line the on-screen person will say (kept short; 1–2 sentences fit 10s comfortably). |
| `environment` | text | no | study room, laptop in front of it | Scene / context where the person is using the product (e.g. "bathroom mirror, morning routine", "coffee shop window seat"). |

If `person` or `product` is missing, ask the user to upload them (`muapi upload file <path>`) or offer to generate placeholders before continuing.
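
The input rules above (two required images, two optional text fields with defaults) can be checked before any model call. The following is a minimal sketch in Python; `UgcInputs`, `missing_required`, and `resolve_inputs` are hypothetical helper names, not part of `muapi`, and only the input names and default values come from the table.

```python
from dataclasses import dataclass

# Defaults copied from the inputs table.
DEFAULT_SCRIPT = (
    "Okay… first of all, ship happens. And this hat is honestly my favorite. "
    "It also comes in navy and black, so you can pick your vibe."
)
DEFAULT_ENVIRONMENT = "study room, laptop in front of it"


@dataclass
class UgcInputs:
    person: str        # image URL (required)
    product: str       # image URL (required)
    script: str = DEFAULT_SCRIPT
    environment: str = DEFAULT_ENVIRONMENT


def missing_required(raw: dict) -> list:
    """Return the required input names that are absent or blank."""
    return [name for name in ("person", "product")
            if not (raw.get(name) or "").strip()]


def resolve_inputs(raw: dict) -> UgcInputs:
    """Apply the documented defaults; raise if a required image is missing."""
    missing = missing_required(raw)
    if missing:
        # The recipe says to ask the user to upload these (or offer placeholders).
        raise ValueError("missing required inputs: " + ", ".join(missing))
    return UgcInputs(
        person=raw["person"].strip(),
        product=raw["product"].strip(),
        script=(raw.get("script") or "").strip() or DEFAULT_SCRIPT,
        environment=(raw.get("environment") or "").strip() or DEFAULT_ENVIRONMENT,
    )
```

An executing agent could call `missing_required` first and prompt the user for each name it returns, then call `resolve_inputs` once both images are available.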
名称类型是否必填默认值说明
person
image_url广告中出现的人物照片(脸部+上半身效果最佳)。
product
image_url清晰的产品照片(最好置于中性背景上,标识/文字清晰可辨)。
script
文本
Okay… first of all, ship happens. And this hat is honestly my favorite. It also comes in navy and black, so you can pick your vibe.
屏幕中人物将说的台词(需简短——1-2句话适合10秒时长)。
environment
文本
study room, laptop in front of it
使用产品的场景/背景(例如:"浴室镜子前,晨间日常"、"咖啡店靠窗座位")。
如果缺少
person
product
,请要求用户上传(
muapi upload file <path>
),或提议生成占位图后再继续。

Steps


Run the three steps sequentially — each step's output feeds the next.
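
As a rough illustration of that hand-off, the sketch below chains three caller-supplied functions in Python. The function `run_pipeline` and its callable parameters are hypothetical stand-ins for the real GPT and `muapi` calls; the sketch only shows the data flow and the approval pause before the video step.

```python
from typing import Callable


def run_pipeline(
    inputs: dict,
    write_prompt: Callable[[dict], str],
    make_hero_image: Callable[[str, list], str],
    make_video: Callable[[str, str], str],
    approve: Callable[[str], bool] = lambda url: True,
) -> str:
    """Run the three steps in order, passing each output to the next step."""
    # Step 1: text prompt for the hero image.
    step1_prompt = write_prompt(inputs)
    # Step 2: fuse person + product (person first) into one hero image URL.
    hero_image = make_hero_image(step1_prompt, [inputs["person"], inputs["product"]])
    # The recipe asks for user approval of the hero image before Step 3.
    if not approve(hero_image):
        raise RuntimeError("hero image rejected; rerun Step 2 before Step 3")
    # Step 3: animate the hero image with the spoken script.
    return make_video(hero_image, inputs["script"])
```

Passing the steps in as callables keeps the sketch independent of any particular client library, so each stage can be stubbed out while testing the flow.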

Step 1 — Director Prompt (GPT)


Use a GPT model (`gpt-5.1` or whichever chat model is available to the executing agent) with temperature 0 and max ~200 tokens to produce the hero-image prompt.
System prompt:
You are a helpful assistant.
User prompt (substitute `{{person}}`, `{{product}}`, `{{environment}}`):
Uploaded images are being analyzed. Ultra-realistic lifestyle photography with {{person}} and {{product}} and {{environment}}.

If the product is wearable (e.g., hat, glasses, hooded sweatshirt), the person wears the product naturally.

If the product is carried in the hand (e.g., cream, bottle, thermos), the person holds the product naturally.

The product is clearly visible and is the main focus of the image. The logo or text on the product must be legible.

The person has a natural and modern look with a minimalist style.

The scene is consistent with the context of the product's use: {{environment}}.

Lighting: soft natural daylight.
Background: clean, aesthetic, slightly blurred (shallow depth of field).
Style: high-end commercial lifestyle photography, realistic textures, 4K quality, vertical 9:16 composition, social-media advertising style. The background and environment should be appropriate to the product (e.g. a woman with a serum could be at home). The person's facial details and the product must remain unchanged.
Capture the GPT response as `{{step1_prompt}}`.
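
The `{{name}}` substitution used in this step can be done with a small template helper. This is a minimal sketch in Python, assuming placeholders are plain word characters inside double braces; `render` is a hypothetical helper name, not something the recipe or `muapi` provides.

```python
import re

# Matches {{name}} with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, values: dict) -> str:
    """Replace every {{name}} in the template; fail loudly on unknown names."""
    def lookup(match):
        name = match.group(1)
        if name not in values:
            raise KeyError("no value supplied for placeholder " + name)
        return str(values[name])
    # A function replacement avoids backslash-escape surprises in the values.
    return PLACEHOLDER.sub(lookup, template)
```

For example, `render("Scene: {{environment}}.", {"environment": "coffee shop window seat"})` yields `Scene: coffee shop window seat.`. Raising on an unknown placeholder is a design choice here, so that a half-filled prompt is never sent to the model.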

Step 2 — Hero Image (Nano-Banana Pro Edit)


Submit a `muapi image edit` call against the `nano-banana-pro-edit` model:
  • Reference images (`image_urls`): `[ {{person}}, {{product}} ]` (order matters; person first).
  • Prompt: `{{step1_prompt}}` from Step 1.
  • Aspect ratio: `9:16`
  • Num images: `1`
  • Resolution: `1K`
  • Output format: `jpeg`
Capture the resulting image URL as `{{hero_image}}`. Briefly show it to the user for approval before kicking off the video step.
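
The Step 2 settings can be gathered into a single request body. The sketch below is in Python and is illustrative only: `image_urls` is the name given in the list above, while every other key (`model`, `prompt`, `aspect_ratio`, `num_images`, `resolution`, `output_format`) is an assumed field name that should be checked against the actual API schema before use.

```python
import json


def build_hero_image_request(step1_prompt: str, person_url: str, product_url: str) -> dict:
    """Collect the Step 2 parameters; key names other than image_urls are assumptions."""
    return {
        "model": "nano-banana-pro-edit",            # assumed key; model name from the recipe
        "image_urls": [person_url, product_url],    # order matters: person first
        "prompt": step1_prompt,                     # assumed key
        "aspect_ratio": "9:16",                     # assumed key
        "num_images": 1,                            # assumed key
        "resolution": "1K",                         # assumed key
        "output_format": "jpeg",                    # assumed key
    }


if __name__ == "__main__":
    body = build_hero_image_request(
        "Ultra-realistic lifestyle photography ...",
        "https://example.com/person.jpg",    # placeholder URL
        "https://example.com/product.jpg",   # placeholder URL
    )
    print(json.dumps(body, indent=2))
```

Building the body in one place makes it easy to confirm the person-first ordering before the request is submitted.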

Step 3 — UGC Video (Seedance 2.0 VIP Image-to-Video)


Submit a `muapi video from-image` call against `seedance-2-vip-image-to-video` (or the `-fast` variant if the executing agent wants lower latency).
  • Start image: `{{hero_image}}` from Step 2.
  • Aspect ratio: `9:16`
  • Duration: `10` seconds.
  • Generate audio: `true` (native dialogue).
  • CFG scale: `0.5`
  • Negative prompt: `blur, distort, low quality`
  • Prompt (substitute `{{script}}`):
Create a 10-second vertical UGC-style video (9:16).

A person is interacting naturally with their setting and product.

The product is used naturally:
- If wearable → the person is wearing it.
- If handheld → the person is holding or applying it.

The video is a single, uninterrupted shot. No cuts. No color changes. No text on screen.

The person looks directly at the camera with a relaxed and natural expression.
They interact comfortably with the product using their hands (adjusting, holding, pointing).

They say in a natural, conversational tone:

"{{script}}"

Subtle hand gestures while speaking.
End with a small smile or nod.

Style: authentic UGC, handheld phone feel, light natural movement, soft daylight, shallow depth of field, TikTok/Reels aesthetic.
Poll the result with `muapi predict wait <request_id>` and download to the user's outputs directory.
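
The Step 3 settings and the `{{script}}` substitution can be assembled in the same way. This Python sketch is an illustration under assumptions: all dictionary keys are hypothetical field names, and swapping straight double quotes for single quotes is a choice made here (so the script cannot close the quoted line in the prompt early), not something the recipe prescribes.

```python
def build_video_request(hero_image: str, prompt_template: str, script: str,
                        fast: bool = False) -> dict:
    """Collect the Step 3 parameters; all key names are assumptions."""
    # Collapse whitespace and avoid breaking the "{{script}}" quotation.
    spoken = " ".join(script.split()).replace('"', "'")
    model = "seedance-2-vip-image-to-video" + ("-fast" if fast else "")
    return {
        "model": model,                                   # model names from the recipe
        "image_url": hero_image,                          # assumed key for the start image
        "prompt": prompt_template.replace("{{script}}", spoken),
        "aspect_ratio": "9:16",
        "duration": 10,                                   # seconds
        "generate_audio": True,                           # native dialogue
        "cfg_scale": 0.5,
        "negative_prompt": "blur, distort, low quality",
    }
```

Here `prompt_template` is expected to be the full Step 3 prompt text with `"{{script}}"` still in place.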

Notes


  • VIP tier supports 9:16 and durations of 4–15s; 10s is the sweet spot for a 1–2 sentence script.
  • Keep the script short: Seedance 2.0 will compress longer scripts and clip words.
  • Seedance VIP tolerates realistic human faces in references (unlike the Chinese tier), making it the right choice for UGC.
  • If you want lower latency at the same quality, swap to `seedance-2-vip-image-to-video-fast`.
  • For multi-shot ads, generate several `{{hero_image}}` variations in Step 2 and animate each independently; Seedance VIP does not support multi-image i2v at 9:16 + audio.
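
The duration and script-length notes above lend themselves to a quick pre-flight check. The Python sketch below is a rough heuristic: the 4–15s range comes from the notes, but the speaking rate of 2.5 words per second is an assumption for conversational English, and `validate_clip` is a hypothetical helper name.

```python
MIN_DURATION_S, MAX_DURATION_S = 4, 15   # VIP tier range from the notes


def estimated_speech_seconds(script: str, words_per_second: float = 2.5) -> float:
    """Estimate speaking time; the default rate is an assumption, not a Seedance figure."""
    return len(script.split()) / words_per_second


def validate_clip(script: str, duration_s: int = 10,
                  words_per_second: float = 2.5) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if not MIN_DURATION_S <= duration_s <= MAX_DURATION_S:
        problems.append(
            "duration %ss is outside the %s-%ss range"
            % (duration_s, MIN_DURATION_S, MAX_DURATION_S)
        )
    needed = estimated_speech_seconds(script, words_per_second)
    if needed > duration_s:
        problems.append(
            "script needs about %.1fs at %.1f words/s but the clip is %ss"
            % (needed, words_per_second, duration_s)
        )
    return problems
```

If the check reports a problem, the options are to shorten the script or to pick a longer duration within the supported range; the rate parameter can be tuned once real clips show how fast the generated speaker talks.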

Trigger Keywords


ugc video factory, ugc video ad, person plus product video, talking product ad, ugc reel, lifestyle product video, vertical ugc video


Notes for the Executing Agent


  • This recipe is LLM-orchestrated: read each phase, gather any missing inputs from the user, then call `muapi` CLI commands. Run `muapi auth configure` first if `MUAPI_API_KEY` is unset.
  • For local files supplied by the user, upload them first: `muapi upload file <path> --output-json --jq '.url'`.
  • Substitute `{{input_name}}` placeholders with the user's actual inputs before issuing each call.
  • If the `muapi` CLI does not yet alias `nano-banana-pro-edit` or `seedance-2-vip-image-to-video`, fall back to the raw API: `curl -X POST https://api.muapi.ai/api/v1/<endpoint> -H "x-api-key: $MUAPI_API_KEY" -H 'content-type: application/json' -d '{...}'`, then poll with `muapi predict wait <request_id>`.
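
For agents that prefer Python over `curl`, the raw-API fallback above could be expressed with the standard library. This is a minimal sketch: the base URL and the `x-api-key` / `content-type` headers are taken from the `curl` line, while `build_fallback_request` is a hypothetical helper and the endpoint name and payload remain whatever the API documentation specifies. The request is only constructed here, not sent.

```python
import json
import os
import urllib.request
from typing import Optional

API_BASE = "https://api.muapi.ai/api/v1"


def build_fallback_request(endpoint: str, payload: dict,
                           api_key: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST mirroring the curl fallback."""
    key = api_key or os.environ.get("MUAPI_API_KEY")
    if not key:
        raise RuntimeError("MUAPI_API_KEY is unset; run `muapi auth configure` first")
    return urllib.request.Request(
        url=API_BASE + "/" + endpoint.strip("/"),
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": key, "content-type": "application/json"},
        method="POST",
    )
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`; the shape of the response is not described in this document, so reading the request ID out of it is left to the API reference.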