arcads-external-api

Arcads external API


Configuration


  • Base URL: `https://external-api.arcads.ai` (or `ARCADS_BASE_URL`).
  • Auth: HTTP Basic. Use `ARCADS_API_KEY` as the username and an empty password unless Arcads documentation for your key specifies otherwise. Example curl: `curl -u "$ARCADS_API_KEY:" "$ARCADS_BASE_URL/v1/products"`.
  • Never print API keys, commit `.env`, or paste keys into `MASTER_CONTEXT.md`.
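The curl example can be mirrored in Python. This is a minimal sketch using only the standard library; the helper names are hypothetical, and it builds the request without sending it so the key never reaches a log or console.

```python
import base64
import urllib.request

DEFAULT_BASE_URL = "https://external-api.arcads.ai"


def basic_auth_header(api_key: str) -> str:
    """Authorization value for HTTP Basic: key as username, empty password."""
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_products_request(api_key: str, base_url: str = DEFAULT_BASE_URL):
    """Build (but do not send) the GET /v1/products request from the curl example."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/products",
        headers={"Authorization": basic_auth_header(api_key)},
        method="GET",
    )
```

In practice the key would come from `os.environ["ARCADS_API_KEY"]` after loading `.env`, and the request would be sent with `urllib.request.urlopen`.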

If the key is missing or the API returns 401/403


  1. Editor-first (default): Ensure `.env` exists (copy from `.env.example` in the repo root). Ask the user to paste `ARCADS_API_KEY` only inside `.env` and save. Do not ask them to paste the key in chat unless they insist.
  2. Chat-assisted: If they paste the key in chat, write `.env` for them, confirm "saved to `.env`" without repeating the key, and remind them that chat history may retain secrets; rotate the key in Arcads if the chat could be shared.
Before the first call, confirm `.gitignore` excludes `.env`.
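A rough way to run the `.gitignore` check is sketched below. The function name is hypothetical and the matching is deliberately simplistic: it only recognises a few literal patterns and ignores negation rules and wildcards.

```python
def gitignore_excludes_env(gitignore_text: str) -> bool:
    """Return True if an active line names .env with one of a few literal patterns."""
    literal_patterns = {".env", "/.env", "**/.env"}
    for raw_line in gitignore_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if line in literal_patterns:
            return True
    return False
```

Inside a git repository, `git check-ignore -q .env` gives the authoritative answer (exit code 0 means the file is ignored).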

Read order


  1. Repo root `MASTER_CONTEXT.md` when present (brand voice, decisions, quirks).
  2. This skill's reference.md for routes, bodies, polling.
  3. prompting/guide.md, then the right `prompting/prompt-library/` file for the model (see table below).

Decision tree: which flow?


All video models use `POST /v2/videos/generate` with the appropriate `model` value (see reference.md for the full `CreateVideoDto` schema).

| User goal | Start here | Prompt library |
| --- | --- | --- |
| Seedance 2.0 UGC video: selfie-style product review / testimonial | `POST /v2/videos/generate` with `model: "seedance-2.0"` | seedance-2.md (platform guide) + seedance-2-ugc.md (9-layer UGC formula) |
| Seedance 2.0 premium product reveal: dark-void, no person, text narrative | `POST /v2/videos/generate` with `model: "seedance-2.0"` | seedance-2.md + seedance-2-premium-reveal.md |
| Seedance 2.0 product hero: elemental effects, no person, splash/mist | `POST /v2/videos/generate` with `model: "seedance-2.0"` | seedance-2.md + seedance-2-product-hero.md |
| Seedance 2.0 studio lookbook: polished, voiceover, multi-look | `POST /v2/videos/generate` with `model: "seedance-2.0"` | seedance-2.md + seedance-2-studio-lookbook.md |
| Seedance 2.0 feature walkthrough: fast-paced feature demo | `POST /v2/videos/generate` with `model: "seedance-2.0"` | seedance-2.md + seedance-2-feature-walkthrough.md |
| Reverse-engineer a video style into a reusable Seedance 2.0 template | Follow the analyze-video skill | prompting/analyze-video/SKILL.md |
| Clone/replicate an existing video ad for a different product | Follow the clone-ad skill | prompting/clone-ad/SKILL.md |
| Raw Sora 2 video from text (plus product) | `POST /v2/videos/generate` with `model: "sora2"` | prompt-library/sora-2.md |
| Sora remix of an existing asset | `POST /v1/sora2/remix/video` | sora-2.md |
| Veo 3.1 video | `POST /v2/videos/generate` with `model: "veo31"` | prompt-library/veo-3-1.md |
| Kling 3.0 video | `POST /v2/videos/generate` with `model: "kling-3.0"` | kling-3.md |
| Grok Video | `POST /v2/videos/generate` with `model: "grok-video"` | See reference.md for fields |
| Nano Banana still image (standalone or as starting frame for video) | `POST /v2/images/generate` with `"model":"nano-banana-2"` by default; optional `"model":"nano-banana"` (Nano Banana Pro) | nano-banana.md |
| B-roll clip (product-level) | `POST /v1/b-roll` | kling-3.md or nano-banana.md for craft; see reference.md for Kling/Nano routing notes |
| Scene generation | `POST /v1/scene` | Same as b-roll row |
| Recreate an influencer from a reference photo | Two-step: (1) `POST /v2/images/generate` with `refImageAsBase64` to generate a still image via Nano Banana, get user approval; (2) upload approved still → `POST /v2/videos/generate` with `model: "veo31"` and `startFrame` for video. Never skip the approval step. | prompt-library/influencer-recreation.md |
| Product showcase: AI person holds/uses a product and talks about it | Three-step: (1) `POST /v2/images/generate` with product `refImageAsBase64`; (2) user approves still; (3) start-frame → video via `POST /v2/videos/generate`. | prompt-library/product-showcase.md |
| UGC / selfie-style (authentic reels, cross-model) | Any video model via `POST /v2/videos/generate` | prompt-library/ugc-selfie-style.md (cross-model UGC guide). For Seedance 2.0 specifically, use seedance-2-ugc.md instead. |
| Create a new AI influencer from text (character sheet) | Two-pass: (1) hero portrait via `POST /v2/images/generate`, get approval; (2) 9 angles with hero as `referenceImages`. Save to `references/influencers/`. | prompt-library/character-sheet.md |
| UGC product selfie: AI influencer holding a product | Combine character hero + product photo + style references as `referenceImages`. | prompt-library/ugc-product-selfie.md |
| Pixar-style 3D animated ad: anthropomorphized cartoon ad with mascot beats | Multi-step: (1) Lock cast sheet; (2) ChatGPT Image 2 storyboard stills via `POST /v2/images/generate` with `model: "gpt-image-2"` (max 5 `referenceImages`); (3) Seedance 2.0 image-to-video per beat via `POST /v2/videos/generate` with `model: "seedance-2.0"` and `startFrame` from each still; (4) ffmpeg-stitch + burn captions. | ../../shared/skills/pixar-style-ad/prompting/guide.md, storyboard-gpt-image-2.md + animate-seedance-2.md |
| Claymation / Aardman-style ad: sculpted plasticine characters, narrator-driven 8-beat story arc, 60–115s | Multi-step: (1) Lock cast sheet (protagonist + supporting character + narrator voice); (2) ChatGPT Image 2 storyboard stills via `POST /v2/images/generate` with `model: "gpt-image-2"` (max 5 `referenceImages`), falling back to `model: "nano-banana"` (Pro) for close-ups if clay texture flattens; (3) Seedance 2.0 image-to-video per beat via `POST /v2/videos/generate` with `model: "seedance-2.0"`; (4) ffmpeg-stitch (optional `fps=12,fps=24` for stop-motion judder) + burn captions. | ../../shared/skills/claymation-ad/prompting/guide.md, storyboard-gpt-image-2.md + animate-seedance-2.md |
| Add captions to a finished video: burn timed narrator/dialogue captions onto an existing MP4 (any source: claymation, pixar, UGC, B-roll) | Out of band (no Arcads API call). Multi-step: (1) `npx hyperframes init <run-id>-captions`; (2) `npx hyperframes transcribe source.mp4 --model medium.en` (NOT `small.en` if there's background music); (3) group word-level transcript into reading phrases; (4) write captions-only HTML over `#ff00ff` magenta bg, never including `<video>` or `<audio>` elements (causes black-bar bug); (5) `npm run render`, then ffmpeg `chromakey=0xff00ff:0.10:0.05` overlay onto source. | ../../shared/skills/caption-video/prompting/guide.md |
| Talking avatar / script (actors, voices) | `POST /v1/scripts`, `POST /v1/scripts/{id}/generate` | prompting/guide.md |
| OmniHuman | `POST /v1/omnihuman` | prompting/guide.md |
| Audio-driven | `POST /v1/audio-driven` | prompting/guide.md |

Prefer the shortest path: if the user only needs a single model, do not create scripts unless they ask for actors/lip-sync workflows.

Creative layer


  • MANDATORY: Before composing any prompt for the API, read the relevant `prompting/prompt-library/*.md` file for the chosen model/workflow. Do NOT skip this step: every prompt must align with the vendor guide's formula and best practices.
  • Build one clear prompt paragraph; avoid keyword soup.
  • For Seedance 2.0 / Sora2 / Veo3.1 / Kling / Grok Video / Nano Banana, align with the official vendor guides linked in each `prompting/prompt-library/*.md` file (do not paste full vendor docs into chat; summarize checks).
  • Merge slot values from the user and from `MASTER_CONTEXT.md`, letting them override defaults where they conflict.

Session setup: auto-create a dated folder


At the start of each session that will generate assets, create a folder and project for the day so everything is organized in the Arcads dashboard:
  1. Get today's date as `YYYY-MM-DD`.
  2. `GET /v1/products` → pick the target product (default to whichever `MASTER_CONTEXT.md` specifies under "My workspace"). If no default is set: if only one product exists, auto-populate `MASTER_CONTEXT.md` with its ID and name; if multiple, ask the user to pick and save their choice to `MASTER_CONTEXT.md`.
  3. Check existing folders (`GET /v1/products/{productId}/folders`). If "Arcads API - {today}" already exists, reuse it. Otherwise:
    • `POST /v1/folders` with `{"productId": "...", "name": "Arcads API - YYYY-MM-DD"}`.
    • `POST /v1/projects` with `{"productId": "...", "folderId": "...", "name": "Arcads API - YYYY-MM-DD"}`.
  4. Store the `projectId` for the session and pass it in every generation call (`projectId` field on Sora2/Veo31/b-roll/scene/image DTOs), and use `POST /v1/assets/add-to-project` after generation for asset types that do not accept `projectId` directly.
This ensures every generated asset is findable in the Arcads dashboard under Product → "Arcads API - {date}".
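The reuse-or-create decision in step 3 can be sketched as follows. The helper names are hypothetical, and the shape of the folder objects (dicts with a `name` key) is an assumption; check reference.md for the actual response of `GET /v1/products/{productId}/folders`.

```python
from datetime import date
from typing import Optional


def session_folder_name(today: date) -> str:
    """Folder/project name for the day, e.g. 'Arcads API - 2026-04-09'."""
    return f"Arcads API - {today.isoformat()}"


def find_existing_folder(folders: list, name: str) -> Optional[dict]:
    """Return the first folder whose name matches, or None if one must be created.

    Assumes each folder is a dict with a 'name' key (not confirmed by the source).
    """
    for folder in folders:
        if folder.get("name") == name:
            return folder
    return None
```

When `find_existing_folder` returns `None`, the agent would go on to call `POST /v1/folders` and `POST /v1/projects` with the same name.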

Credit cost estimation (MANDATORY — show before generating)


Before firing any generation calls, calculate and present the total credit cost to the user as an estimate. Do not generate until the user confirms.
ALWAYS label credit totals as estimates and tell the user to confirm the exact cost in the Arcads platform before generating if precision matters. The Arcads API does not expose billing endpoints; pricing varies by duration, resolution, and reference inputs.

Cost data sources (in priority order)


  1. `logs/arcads-api.jsonl`: historical record of actual `creditsCharged` values for every previous call. Read this first. Grep for entries with the same `model` and similar config (same `duration`, `resolution`, `referenceImagesCount`, `audioEnabled`) and use the recorded `creditsCharged` as the estimate. This is the most accurate source.
  2. `MASTER_CONTEXT.md` → Credit costs: user-provided pricing rules (e.g. "Seedance 2.0 image-to-video ≈ 0.06/sec"). Use when no matching log entry exists.
  3. Ask the user: if neither source has data for the config, ask the user and write the answer into `MASTER_CONTEXT.md`.
Never invent numbers. Always cite the source of the estimate ("based on log entry from YYYY-MM-DD" or "from MASTER_CONTEXT.md rate table").
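The log lookup in source 1 might look like the sketch below. It assumes each line of `logs/arcads-api.jsonl` is a flat JSON object carrying the field names listed above, that later lines are more recent, and it matches configs exactly rather than "similarly"; the function name is hypothetical.

```python
import json

MATCH_KEYS = ("model", "duration", "resolution", "referenceImagesCount", "audioEnabled")


def estimate_from_log(log_lines, config):
    """Return (creditsCharged, entry) for the latest matching log entry, else None."""
    match = None
    for raw_line in log_lines:
        line = raw_line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if "creditsCharged" not in entry:
            continue  # not a billed call
        if all(entry.get(key) == config.get(key) for key in MATCH_KEYS):
            match = entry  # keep scanning so the most recent entry wins
    if match is None:
        return None
    return match["creditsCharged"], match
```

A `None` result means falling through to `MASTER_CONTEXT.md`, and then to asking the user.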

How to calculate


total_credits ≈ sum(credits_per_model × variations_requested) for each model
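As a quick illustration of the formula, here is a minimal sketch; the function name and row layout are hypothetical, and the per-generation figures must come from the sources above, never from guesses.

```python
def total_credits(rows):
    """Sum credits_per_generation x variations over (label, credits, variations) rows."""
    return sum(credits * variations for _label, credits, variations in rows)


# Illustrative figures only: one Seedance clip at ~0.9 and two Veo clips at ~4 each.
example_rows = [("Seedance 2.0 (15s i2v)", 0.9, 1), ("Veo 3.1", 4.0, 2)]
estimated_total = total_credits(example_rows)
```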

Example output to user


Estimated credit cost:
  Seedance 2.0 (15s i2v) × 1 = ~0.9 credits   (from logs/arcads-api.jsonl 2026-04-09)
  Veo 3.1                × 2 = ~8 credits     (from MASTER_CONTEXT.md)
  ─────────────────────────────
  Estimated total: ~8.9 credits

⚠️ Estimate only — confirm exact cost in the Arcads platform before proceeding.
Proceed? (yes/no)
Always wait for confirmation before firing. If the user has a credit balance visible in `MASTER_CONTEXT.md`, warn them if the total would exceed it. If neither the logs nor `MASTER_CONTEXT.md` has data for the config, ask the user before the first generation and save the answer.
Exception — QA-fix retries (still images only): After the user has confirmed the initial batch, automatic regeneration to fix visible defects (see Generated image QA below) does not require asking again for credit confirmation. Each retry is still billed; note the extra `creditsCharged` when summarizing the session.

Generation count: multiple variations per prompt


Before firing any generation call, ask the user how many variations they want for this prompt. Default is 1 if they don't specify.
When the count is greater than 1, send N separate API calls with the identical payload. Do NOT batch them into a single request — the API has no batch parameter. Fire them in parallel where possible, then poll all asset IDs concurrently.
Present results as a numbered list so the user can compare and pick favorites.
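Firing N identical calls in parallel could be sketched like this. `submit` is a hypothetical callable that POSTs one payload and returns an asset ID; it is injected so the sketch makes no assumption about the HTTP client.

```python
from concurrent.futures import ThreadPoolExecutor


def fire_variations(submit, payload: dict, count: int = 1) -> list:
    """Send `count` separate calls with the same payload; return results in order."""
    if count < 1:
        raise ValueError("count must be at least 1")
    with ThreadPoolExecutor(max_workers=count) as pool:
        # Each call gets its own copy so one request cannot mutate another's body.
        futures = [pool.submit(submit, dict(payload)) for _ in range(count)]
        return [future.result() for future in futures]
```

The returned IDs can then be polled concurrently and presented as variation 1, 2, 3, and so on.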

Nano Banana image: model choice (`nano-banana-2` vs Nano Banana Pro)


For `POST /v2/images/generate` when using a Nano Banana engine:
  • Default: `"model": "nano-banana-2"` (Nano Banana 2).
  • Optional: `"model": "nano-banana"` when the user asks for Nano Banana Pro (the API has no `nano-banana-pro` enum; Pro maps to `nano-banana`; see nano-banana.md).
Before the first Nano Banana image call in a workflow, ask: "Use default Nano Banana 2, or Nano Banana Pro?" If they have no preference, use `nano-banana-2`. Include the chosen `model` in the credit estimate (separate rows in `MASTER_CONTEXT.md` if pricing differs).
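The name-to-enum mapping is easy to get wrong, so a small sketch may help. The function name and the accepted spellings of the user's answer are hypothetical; only the two `model` values come from this document.

```python
def nano_banana_model(preference=None) -> str:
    """Map a user's stated preference to the API `model` value."""
    if preference is None:
        return "nano-banana-2"  # default when the user has no preference
    normalized = preference.strip().lower()
    if normalized in ("", "default", "2", "nano banana 2", "nano-banana-2"):
        return "nano-banana-2"
    if normalized in ("pro", "nano banana pro", "nano-banana-pro"):
        return "nano-banana"  # there is no nano-banana-pro enum; Pro maps here
    raise ValueError(f"unrecognized Nano Banana preference: {preference!r}")
```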

Script and dialogue


For any video that features a person speaking, ask the user for the script (the exact words the AI person should say). This is separate from the visual prompt — it's the dialogue.

MANDATORY — dialogue confirmation gate


Before generating any video that contains spoken dialogue, the agent MUST:
  1. Extract the dialogue lines from the full prompt and show them to the user in a dedicated block, separate from the visual/cinematography description.
  2. Present them as a clean, numbered list with beat labels (hook / show / demo / verdict, or similar) and any silent beats clearly marked as `(silent beat — no dialogue)`.
  3. Read the dialogue through at a natural speaking pace, time it against the target duration, and flag the total spoken word count plus whether it comfortably fits.
  4. Explicitly ask for dialogue approval before moving on, e.g. "Approve this dialogue? (yes / edit / rewrite)". Never assume approval from earlier confirmations (tone, template, credit cost). Dialogue approval is its own gate.
  5. Only after the user types `yes` (or equivalent) may you proceed to the credit cost confirmation and then generation. If the user says "edit" or proposes changes, revise and re-present the numbered dialogue block until they approve.
Presentation format (use this exact structure):
📝 Dialogue script (please confirm before I generate)

  1. [HOOK]   "Bro. BRO. Look what just showed up."
  2. [SHOW]   "The PAID SOCIAL stripe? Insane. Like, who greenlit this?"
  3. [DEMO]   (silent beat — thumb brushing the suede, small nod)
  4. [VERDICT] "I'm literally wearing these to the gym tomorrow. You guys have to see these in person."

Total spoken words: ~32  |  Target duration: 15s  |  Fits at natural pace: ✅

Approve this dialogue? (yes / edit / rewrite)
This gate applies to Seedance 2.0, Veo 3.1, Sora 2, and Scene — any flow where the model speaks. Skip for silent flows (B-roll, pure product-hero, premium-reveal with no voiceover, Nano Banana images).
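A sketch of how the confirmation block could be assembled and timed is shown below. The function name and the `(label, text)` beat layout are hypothetical; the 2.5 words-per-second figure is the pacing rule this skill uses for duration selection, and words are counted by whitespace splitting.

```python
WORDS_PER_SECOND = 2.5  # pacing rule used for duration selection in this skill


def format_dialogue_block(beats, target_seconds):
    """beats: list of (label, text) pairs; text=None marks a silent beat.

    Returns (block_text, total_spoken_words, fits_at_natural_pace).
    """
    lines = ["📝 Dialogue script (please confirm before I generate)", ""]
    total_words = 0
    for number, (label, text) in enumerate(beats, start=1):
        if text is None:
            lines.append(f"  {number}. [{label}] (silent beat — no dialogue)")
        else:
            total_words += len(text.split())
            lines.append(f'  {number}. [{label}] "{text}"')
    fits = total_words / WORDS_PER_SECOND <= target_seconds
    lines += [
        "",
        f"Total spoken words: ~{total_words}  |  Target duration: {target_seconds}s"
        f"  |  Fits at natural pace: {'✅' if fits else '❌'}",
        "",
        "Approve this dialogue? (yes / edit / rewrite)",
    ]
    return "\n".join(lines), total_words, fits
```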

Model-specific notes


  • For Seedance 2.0, Veo 3.1, and Sora 2: embed the dialogue in the `prompt` field using a `Dialogue: "..."` or `She speaks: "..."` pattern (these models generate speech from the text prompt).
  • For Seedance 2.0 specifically: before generating, always ask the user whether to enable audio output (`audioEnabled: true`). Also ask whether they want to supply `referenceAudios` (e.g. background music or a specific voice clip). Upload audio files via presigned URL if provided.
  • For Scene (`CreateSceneDto`): use the dedicated `script` field for dialogue and `prompt` for visuals.
  • For B-roll: no speech — b-roll is silent/ambient by nature. If the user wants speech, redirect to Seedance 2.0, Veo 3.1, Sora 2, or Scene.
  • For Nano Banana images: no speech — these are still images. Speech is handled in the subsequent video generation step.

Script length → video duration (auto-select)


Use the script's word count to automatically pick the best `duration` value. Average speaking pace: ~2.5 words per second (~150 WPM). Round up to the next available duration to give breathing room.

Sora 2: duration enum `[4, 8, 12, 16, 20]` seconds


| Script length | Duration |
| --- | --- |
| 1–8 words | 4s |
| 9–18 words | 8s |
| 19–28 words | 12s |
| 29–38 words | 16s |
| 39–48 words | 20s |
| 49+ words | Too long: offer to split (see below) |
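The Sora 2 table above translates directly into a lookup. This is a sketch with a hypothetical function name; returning `None` stands for the "too long, offer to split" case.

```python
# (max_words, duration_seconds) bands copied from the Sora 2 table.
SORA2_BANDS = [(8, 4), (18, 8), (28, 12), (38, 16), (48, 20)]


def sora2_duration(word_count: int):
    """Return the Sora 2 `duration` for a script, or None if it must be split."""
    if word_count < 1:
        raise ValueError("word_count must be at least 1")
    for max_words, seconds in SORA2_BANDS:
        if word_count <= max_words:
            return seconds
    return None
```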

Veo 3.1: no `duration` field


Veo 3.1 auto-determines video length (~8s typical). If the script exceeds ~20 words, warn the user that Veo may truncate dialogue and offer to split or switch to Sora 2 which has longer duration options.

Seedance 2.0 — duration: 4–15 seconds (continuous)


Seedance 2.0 supports any integer from 4 to 15. Use ~2.5 words/second, round up to the nearest second.
| Script length | Duration |
| --- | --- |
| 1–8 words | 4–5s |
| 9–15 words | 6–8s |
| 16–25 words | 9–12s |
| 26–35 words | 13–15s |
| 36+ words | Too long: offer to split into multiple clips |

For no-dialogue styles (product hero, premium reveal), default to 15s.
Resolution: Default to `720p`. Only use `480p` if the user asks for a faster/cheaper test generation.
Aspect ratio: `9:16` (vertical, default for UGC/social) or `16:9` (landscape). No `1:1` support.
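For Seedance 2.0 the table gives a band of seconds rather than a single value, so a sketch of the lookup returns a `(min, max)` pair. The function name is hypothetical, and treating a word count of 0 as a no-dialogue style is an assumption made for illustration.

```python
# (max_words, (min_seconds, max_seconds)) bands copied from the Seedance 2.0 table.
SEEDANCE_BANDS = [(8, (4, 5)), (15, (6, 8)), (25, (9, 12)), (35, (13, 15))]


def seedance_duration_range(word_count: int):
    """Return the (min, max) duration band in seconds, or None if the script must be split."""
    if word_count < 0:
        raise ValueError("word_count cannot be negative")
    if word_count == 0:
        return (15, 15)  # no-dialogue styles default to 15s
    for max_words, band in SEEDANCE_BANDS:
        if word_count <= max_words:
            return band
    return None
```

Picking the upper end of the band leaves the most breathing room; the lower end keeps the clip shorter and typically cheaper.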

B-roll (Kling 3.0): duration enum `[5, 10]` seconds


B-roll is typically wordless. If the user insists on a timed clip with context:
| Script length | Duration |
| --- | --- |
| 1–12 words | 5s |
| 13–24 words | 10s |
| 25+ words | Too long: redirect to Sora 2 / Veo 3.1 for speech |

Scene: no `duration` field


Scene auto-determines length. Use the `script` field for dialogue.

Splitting long scripts into multiple videos


If the script exceeds the maximum duration for the chosen model:
  1. Tell the user the script is too long for a single video and show the word/duration math.
  2. Offer two options:
    • Split into segments — the agent breaks the script at natural sentence boundaries into chunks that each fit within the model's max duration. Each chunk becomes a separate generation call.
    • Switch models — if they're on Kling (10s max), suggest Sora 2 (up to 20s).
  3. If the user chooses to split, generate each segment as a separate video (respecting the generation count — if they asked for 3 variations, generate 3 of each segment).
  4. Offer to stitch the final segments together using `ffmpeg`:
    • Download all segment videos locally.
    • Concatenate using `ffmpeg -f concat -safe 0 -i list.txt -c copy output.mp4` (re-encode if codecs differ).
    • Present the stitched file alongside the individual segments so the user has both.
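The split-and-stitch steps above can be sketched as two small helpers. Both names are hypothetical; sentence boundaries are detected with a simple punctuation regex, and the concat list assumes file paths that contain no single quotes.

```python
import re


def split_script(script: str, max_words: int) -> list:
    """Greedily pack whole sentences into chunks of at most max_words words."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", script.strip()) if s]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if n > max_words:
            raise ValueError("a single sentence exceeds max_words; rewrite it first")
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


def concat_list(paths: list) -> str:
    """Contents of list.txt for ffmpeg's concat demuxer (one `file '...'` line per clip)."""
    return "".join(f"file '{path}'\n" for path in paths)
```

`max_words` would be the largest word count the chosen model's duration table allows, for example 48 for Sora 2.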

Veo 3.1: `startFrame` vs `referenceImages` (pick one)


Veo 3.1 has two mutually exclusive image input modes. Never use both on the same call.
| Mode | Field | When to use |
| --- | --- | --- |
| Start frame | `startFrame` (presigned upload `filePath`) | User provides a reference image of a person or scene they want the video to start from. The video will animate from this exact image. Use this for influencer recreation, character consistency, or any "make this image come alive" request. |
| Reference images | `referenceImages` (array of `filePath` strings) | User provides images for style, mood, or visual tone, not to appear literally in frame. The model uses them as inspiration, not as a first frame. |

Default rule: When the user provides a single reference photo of a person, always use `startFrame` unless they explicitly say they want it as a style reference.
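A guard for the mutual-exclusion rule might look like the sketch below. The function name is hypothetical and the returned dict is only the image-related fragment of the request body; the full `CreateVideoDto` schema lives in reference.md.

```python
def veo31_image_inputs(start_frame=None, reference_images=None) -> dict:
    """Build the image part of a Veo 3.1 payload, refusing to set both modes."""
    if start_frame and reference_images:
        raise ValueError("Veo 3.1: use startFrame or referenceImages, never both")
    fragment = {"model": "veo31"}
    if start_frame:
        fragment["startFrame"] = start_frame  # filePath from a presigned upload
    elif reference_images:
        fragment["referenceImages"] = list(reference_images)
    return fragment
```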

Image handling: auto-upscale small inputs


Before sending any reference image, start frame, or base64 image to the API:
  1. Check dimensions. If the image's longest side is below 1024 px, upscale it using Lanczos resampling so the longest side reaches 1080 px (preserve aspect ratio).
  2. Convert to RGB JPEG (quality 90–95) to strip alpha channels and keep payload size reasonable.
  3. Re-encode as base64 (for `refImageAsBase64`) or upload the resized file (for `startFrame` via presigned URL).
Several Arcads endpoints (notably `POST /v1/b-roll`) reject images below a minimum resolution with `422 — The provided image is too small`. Auto-upscaling prevents this silently so the user never hits the error.
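The dimension check in step 1 reduces to a little arithmetic, sketched here as a pure function with a hypothetical name. It only computes the target size; the actual resampling is left to an image library.

```python
def upscale_target(width: int, height: int, threshold: int = 1024, target: int = 1080):
    """Return the (width, height) to resize to, or None if the image is large enough."""
    longest = max(width, height)
    if longest >= threshold:
        return None
    scale = target / longest  # same factor on both axes preserves aspect ratio
    return (max(1, round(width * scale)), max(1, round(height * scale)))
```

Assuming Pillow is the image library in use, the result can be passed to `Image.resize(size, Image.Resampling.LANCZOS)`, followed by `convert("RGB")` and `save(path, "JPEG", quality=92)`.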

Generated image QA (mandatory)


Applies to still images from Arcads, especially `POST /v2/images/generate` (Nano Banana and other image models). After each image asset reaches `status: generated`, visually inspect the output (download or open the image URL / use the agent's image-reading capability).
Look for: extra or missing hands or fingers; wrong limb count; distorted, duplicated, or merged facial features; melted or fused objects; impossible anatomy; stray limbs; obvious texture or boundary artifacts; unreadable or garbled text if text was requested.
If something looks wrong: Do not hand off the bad frame as the final deliverable without trying again. Regenerate with a revised prompt that explicitly corrects the issue (e.g. "exactly two hands, five fingers each, anatomically correct arms," "single face, no duplicate features"). Do not resend the identical payload and expect a different outcome.
Retry cap: Up to 2 regeneration attempts per originally requested image (3 attempts total including the first). If defects remain after the cap, stop auto-retries, tell the user what still looks wrong, show the best attempt or URLs for all attempts, and ask how they want to proceed.
Credits: Each attempt is a separate generation and is billed. Summarize total credits used for that image after the QA loop ends. See Exception — QA-fix retries under Credit cost estimation.
Video (optional quick check): Before spending heavily on downstream video, you may spot-check scene/b-roll thumbnails or extracted frames for the same kinds of defects; scope is lighter than for hero stills.
Details and checklist items: prompting/prompt-library/nano-banana.md.
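The retry cap can be expressed as a small loop. In this sketch `generate`, `inspect`, and `revise` are hypothetical callables supplied by the agent: one fires the image call, one returns a list of visible defects, and one rewrites the prompt to correct them.

```python
MAX_ATTEMPTS = 3  # the first attempt plus up to 2 regenerations


def generate_with_qa(generate, inspect, revise, prompt):
    """Return (passed, attempts); attempts holds (prompt, image, defects) per try."""
    attempts = []
    for _ in range(MAX_ATTEMPTS):
        image = generate(prompt)
        defects = inspect(image)
        attempts.append((prompt, image, defects))
        if not defects:
            return True, attempts
        # Never resend the identical payload: revise the prompt before retrying.
        prompt = revise(prompt, defects)
    return False, attempts
```

When `passed` is false, the agent stops retrying, reports what still looks wrong, and shows every attempt. The length of `attempts` is the number of billed generations to include in the session summary.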
适用于Arcads生成的静态图像,尤其是
POST /v2/images/generate
(Nano Banana和其他图像模型)。每当图像资产的
status
变为
generated
后,视觉检查输出(下载或打开图像URL / 使用智能体的图像识别能力)。
检查要点: 多余或缺失的手/手指;肢体数量错误;面部特征扭曲、重复或融合;物体融化或粘连;不符合人体结构的部位;游离的肢体;明显的纹理或边界瑕疵;如果要求生成文字,文字难以辨认或出现乱码。
如果发现问题: 在尝试重新生成之前,不要将有缺陷的图像作为最终交付品。重新生成时使用修订后的提示语,明确纠正问题(例如“精确的两只手,每只手五个手指,手臂符合人体结构”,“单一面部,无重复特征”)。不要重复发送完全相同的请求体并期望得到不同的结果。
重试上限: 每个初始请求的图像最多2次重新生成尝试(包括首次生成,总计3次)。如果达到上限后仍有缺陷,停止自动重试,告知用户剩余问题,展示最佳尝试结果或所有尝试的URL,并询问用户如何处理。
信用点: 每次尝试都是独立的生成操作,会单独计费。QA循环结束后,总结该图像的总信用点消耗。请参考“信用成本估算”下的“例外情况:QA修复重试”。
视频(可选快速检查): 在后续视频生成投入大量成本之前,可抽查场景/B-roll缩略图或提取的帧,检查类似缺陷;检查范围比主角静态图像更轻量。
详细信息和检查清单:prompting/prompt-library/nano-banana.md
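
The retry rules above can be sketched as a loop. This is a sketch under stated assumptions, not the skill's implementation: `generate`, `find_defects`, and `revise_prompt` are hypothetical callables you would supply (one generation call, one visual inspection, one prompt rewrite), and only the cap of 2 retries and the "never resend the identical prompt" rule come from the text.

```python
from typing import Callable, List, Tuple

MAX_RETRIES = 2  # 2 regenerations after the first attempt, 3 attempts total


def run_image_qa(
    prompt: str,
    generate: Callable[[str], str],                      # hypothetical: prompt -> image URL
    find_defects: Callable[[str], List[str]],            # hypothetical: image URL -> defects
    revise_prompt: Callable[[str, List[str]], str],      # hypothetical: prompt + defects -> new prompt
    max_retries: int = MAX_RETRIES,
) -> Tuple[bool, List[str]]:
    """Return (passed, image_urls_of_all_attempts)."""
    attempts: List[str] = []
    current = prompt
    for attempt in range(max_retries + 1):
        url = generate(current)  # each call is a separate, billed generation
        attempts.append(url)
        defects = find_defects(url)
        if not defects:
            return True, attempts
        if attempt == max_retries:
            break  # cap reached: stop auto-retries and hand over to the user
        revised = revise_prompt(current, defects)
        if revised == current:
            raise ValueError("revised prompt must differ from the previous one")
        current = revised
    return False, attempts
```

When the function returns `False`, the caller still has every attempt's URL, so it can show the best attempt (or all of them) and ask the user how to proceed.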

Execution checklist (agent)

执行检查清单(智能体)

  1. Session folder: Ensure today's dated folder + project exist (see above).
  2. Resolve
    productId
    (and
    projectId
    from session folder):
    GET /v1/products
    or ask the user.
  3. Ask for script/dialogue: If the output is a video with a person speaking, ask the user for the exact words. Count words to auto-select duration (see "Script length → video duration" above). If too long, offer to split. (Skip for Nano Banana image-only requests.)
    • MANDATORY dialogue confirmation gate (before credit cost / before generation): Extract the dialogue lines from the drafted prompt and present them to the user as a dedicated, numbered block separate from the visual description. Follow the format in Script and dialogue → MANDATORY dialogue confirmation gate. Wait for explicit
      yes
      before moving on. This gate is separate from the credit cost confirmation — both must be satisfied.
  4. Nano Banana image model: For
    POST /v2/images/generate
    , confirm Nano Banana 2 (default) vs Nano Banana Pro (
    nano-banana
    ) per the section above. Skip if not an image call.
  5. Ask for generation count: Ask how many variations the user wants for this prompt. Default to 1.
  6. Show credit cost and get confirmation: Calculate total credits using the cost table above. Present the breakdown to the user. Do NOT proceed until they confirm.
  7. Check
    references/
    folder:
    Before composing the prompt, check the repo-root
    references/
    folder for relevant images:
    references/influencers/
    for person recreation,
    references/products/
    for product showcase,
    references/aesthetics/
    for style/mood. If the user hasn't provided an image but a relevant one exists in
    references/
    , offer to use it. Auto-upscale any reference image if needed. For Veo 3.1, determine whether to use
    startFrame
    or
    referenceImages
    (see section above — default to
    startFrame
    for person photos).
  8. Compose JSON per OpenAPI / reference.md. Primary video endpoint:
    POST /v2/videos/generate
    with the appropriate
    model
    value (see
    CreateVideoDto
    in reference.md). Include
    projectId
    when the DTO supports it. Set
    duration
    based on script length for models that require it. For Nano Banana images, use
    POST /v2/images/generate
    with
    model
    set per the Nano Banana section (
    nano-banana-2
    unless the user chose Pro).
    • Seedance 2.0 extras: Set
      resolution
      to
      720p
      (default). Set
      aspectRatio
      to
      9:16
      (UGC/social) or
      16:9
      (landscape). Include
      audioEnabled
      per user confirmation. If the user provided reference images, upload via presigned URL and pass
      filePath
      strings in
      referenceImages
      (max 3). Same for
      referenceVideos
      and
      referenceAudios
      if provided. Keep
      @(img1)
      tokens in the prompt text alongside the
      referenceImages
      array.
    • ⚠️ Seedance 2.0 mutually exclusive input modes (confirmed 2026-04-09):
      referenceVideos
      and
      referenceImages
      cannot be combined in the same request — the API returns
      HTTP 500 UNKNOWN_ERROR
      . Pick one: image-to-video OR video-to-video.
      referenceAudios
      may be combined with either. See
      reference.md
      for details.
    • Seedance 2.0 v2v + human faces — RESOLVED 2026-04-14: v2v with people/faces in reference videos now works. Previously blocked by content checker (April 9). See
      reference.md
      .
    • Seedance 2.0 audio+image 500 regression — RESOLVED 2026-04-14:
      audioEnabled: true
      +
      referenceImages
      now works. Previously returned HTTP 500 (April 9). Always use freshly obtained presigned URLs. See
      reference.md
      .
  9. POST
    the correct endpoint N times (once per requested variation) with the same payload. Fire in parallel where possible. Immediately after the POST succeeds, append a log entry to
    logs/arcads-api.jsonl
    with the request config (model, duration, resolution, aspectRatio, audioEnabled, reference counts, promptWordCount, assetId). Do NOT log the full prompt text, API keys, or Authorization headers.
  10. Poll:
    GET /v1/videos/{videoId}
    for video IDs;
    GET /v1/assets/{id}
    for asset IDs (including Nano Banana images) until
    status
    is
    generated
    or
    failed
    (see reference.md). Poll all asset IDs concurrently. When polling completes, update the log entry with
    response.status
    ,
    response.creditsCharged
    ,
    response.generationTimeSec
    ,
    response.videoUrl
    ,
    response.thumbnailUrl
    , and
    response.error
    (if failed). See
    logs/README.md
    for the schema.
  11. Generated image QA: For each still image produced in this turn (e.g.
    POST /v2/images/generate
    ), follow Generated image QA: inspect the image; if defective, regenerate with a refined prompt until pass or 2 retries are exhausted. Skip this step for video-only outputs with no still to review.
  12. Assign ALL assets to session project: After generation (and QA retries), check each asset's
    projects
    array. If it does not include the session
    projectId
    , call
    POST /v1/assets/add-to-project
    . This applies to every generated asset — including failed QA attempts and intermediate assets like Nano Banana stills used as starting frames for subsequent video generations. All assets from the session must end up in the same dated project folder.
  13. Present results: Return watch URLs, image URLs, or download URLs for QA-passed stills (or the best attempt after max retries, with a clear note). If multiple variations, present as a numbered list for comparison. Explain
    failed
    with moderation/validation hints if
    422
    occurred. For Nano Banana images used as starting frames, show the image and wait for user approval before proceeding to video generation.
    • ALWAYS open the output folder on the user's machine after saving generated files so they can immediately review:
      open "<output_directory>"
      (macOS). Save videos to
      outputs/
      with a descriptive subfolder (e.g.
      outputs/seedance-tests/
      ,
      outputs/clone-ad-tests/
      ).
  14. Stitch if split: If the script was split into segments, offer to stitch the final videos together with
    ffmpeg
    and provide both the stitched file and individual segments.
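
The Seedance 2.0 input constraints from step 8 can be checked before any request is sent. The following is a minimal sketch, assuming the payload is a plain dict that uses the field names quoted in the checklist (`referenceImages`, `referenceVideos`); the function name `seedance_input_problems` is hypothetical, and reference.md remains the authority on the actual schema.

```python
from typing import Any, Dict, List

MAX_REFERENCE_IMAGES = 3  # "max 3" per the checklist


def seedance_input_problems(payload: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems: List[str] = []
    images = payload.get("referenceImages") or []
    videos = payload.get("referenceVideos") or []
    # Mutually exclusive input modes: image-to-video OR video-to-video.
    if images and videos:
        problems.append(
            "referenceImages and referenceVideos cannot be combined; "
            "pick image-to-video or video-to-video"
        )
    if len(images) > MAX_REFERENCE_IMAGES:
        problems.append(
            f"referenceImages has {len(images)} entries; the maximum is {MAX_REFERENCE_IMAGES}"
        )
    # referenceAudios may accompany either mode, so it is not checked here.
    return problems
```

Running this check locally avoids spending a request on a combination the checklist reports as returning `HTTP 500 UNKNOWN_ERROR`.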
  1. 会话文件夹: 确保当日的日期命名文件夹 + 项目已创建(见上文)。
  2. 确定
    productId
    (以及会话文件夹对应的
    projectId
    ):调用
    GET /v1/products
    或询问用户。
  3. 请求脚本/对话: 如果输出是有人物说话的视频,请用户提供精确的对话内容。根据单词数自动选择时长(见“脚本长度 → 视频时长”)。如果过长,提供拆分选项。(仅生成Nano Banana图像的请求可跳过此步骤。)
    • 强制要求的对话确认环节(信用成本确认前/生成前): 从草拟的提示语中提取对话内容,以独立的编号区块呈现给用户,与视觉描述分开。遵循脚本与对话 → 强制要求——对话确认环节中的格式。等待用户明确输入
      yes
      后再继续。此环节独立于信用成本确认——两者都必须完成。
  4. Nano Banana图像模型: 对于
    POST /v2/images/generate
    请求,确认使用Nano Banana 2(默认)还是Nano Banana Pro(
    nano-banana
    ),见上文。非图像请求可跳过此步骤。
  5. 询问生成数量: 询问用户希望为该提示语生成多少个变体。默认生成1个。
  6. 展示信用成本并获取确认: 使用上述成本表计算总信用点。向用户展示明细。未获得确认前不要继续。
  7. 检查
    references/
    文件夹:
    编写提示语前,检查仓库根目录的
    references/
    文件夹中是否有相关图像:
    references/influencers/
    用于人物复刻,
    references/products/
    用于产品展示,
    references/aesthetics/
    用于风格/氛围参考。如果用户未提供图像但
    references/
    中有相关图像,可提议使用。如有需要,自动放大任何参考图像。针对Veo 3.1,确定使用
    startFrame
    还是
    referenceImages
    (见上文——人物照片默认使用
    startFrame
    )。
  8. 根据OpenAPI / reference.md编写JSON。主要视频接口:
    POST /v2/videos/generate
    ,搭配对应的
    model
    参数值(见reference.md中的
    CreateVideoDto
    )。如果DTO支持,包含
    projectId
    。根据脚本长度为需要时长参数的模型设置
    duration
    。针对Nano Banana图像,使用
    POST /v2/images/generate
    ,根据Nano Banana部分的说明设置
    model
    (默认
    nano-banana-2
    ,除非用户选择Pro版)。
    • Seedance 2.0额外设置:
      resolution
      设为
      720p
      (默认)。将
      aspectRatio
      设为
      9:16
      (UGC/社交平台)或
      16:9
      (横屏)。根据用户确认设置
      audioEnabled
      。如果用户提供参考图像,通过预签名URL上传,并在
      referenceImages
      中传入
      filePath
      字符串(最多3张)。如果用户提供
      referenceVideos
      和
      referenceAudios
      ,同样处理。在提示语文本中保留
      @(img1)
      标记,同时传入
      referenceImages
      数组。
    • ⚠️ Seedance 2.0互斥输入模式(2026-04-09确认):
      referenceVideos
      和
      referenceImages
      不能在同一请求中组合使用——API会返回
      HTTP 500 UNKNOWN_ERROR
      。二选一:图像转视频 或 视频转视频。
      referenceAudios
      可与任意一种模式组合使用。详情请参考
      reference.md
    • Seedance 2.0视频转视频+人脸 — 2026-04-14已解决: 参考视频中有人物/人脸的视频转视频功能现已可用。此前因内容审核被限制(4月9日)。详情请参考
      reference.md
    • Seedance 2.0音频+图像500错误回归 — 2026-04-14已解决:
      audioEnabled: true
      +
      referenceImages
      组合现已可用。此前返回HTTP 500错误(4月9日)。请始终使用最新获取的预签名URL。详情请参考
      reference.md
  9. 针对请求的变体数量,发起N次正确的接口调用(每次调用使用相同的请求体)。尽可能并行发起调用。POST请求成功后立即将请求配置(model、duration、resolution、aspectRatio、audioEnabled、参考数量、promptWordCount、assetId)追加到
    logs/arcads-api.jsonl
    日志文件中。绝对不要记录完整提示语、API密钥或Authorization头。
  10. 轮询: 针对视频ID调用
    GET /v1/videos/{videoId}
    ;针对资产ID(包括Nano Banana图像)调用
    GET /v1/assets/{id}
    ,直到
    status
    变为
    generated
    或
    failed
    (见reference.md)。同时轮询所有资产ID。轮询完成后,更新日志条目,添加
    response.status
    、
    response.creditsCharged
    、
    response.generationTimeSec
    、
    response.videoUrl
    、
    response.thumbnailUrl
    和
    response.error
    (如果失败)。日志schema请参考
    logs/README.md
  11. 生成图像QA: 针对本次生成的每个静态图像(例如
    POST /v2/images/generate
    ),遵循生成图像QA:检查图像;如果有缺陷,使用优化后的提示语重新生成,直至通过或达到2次重试上限。仅生成视频的输出可跳过此步骤。
  12. 将所有资产分配到会话项目: 生成完成(及QA重试)后,检查每个资产的
    projects
    数组。如果不包含会话的
    projectId
    ,调用
    POST /v1/assets/add-to-project
    。这适用于所有生成的资产,包括QA失败的尝试,以及中间资产(例如用作后续视频生成起始帧的Nano Banana静态图像)。会话中的所有资产必须最终存入同一个日期命名的项目文件夹。
  13. 展示结果: 返回观看URL、图像URL或下载URL;静态图像只返回通过QA的版本(或达到重试上限后的最佳尝试结果,并附上明确说明)。如果有多个变体,以编号列表形式呈现,方便对比。如果状态为
    failed
    且出现过
    422
    错误,说明失败原因并给出审核/验证方面的提示。对于用作起始帧的Nano Banana图像,展示图像并等待用户确认后再进行视频生成。
    • 保存生成文件后,始终在用户机器上打开输出文件夹,方便用户立即查看:
      open "<output_directory>"
      (macOS)。将视频保存到
      outputs/
      目录下的描述性子文件夹中(例如
      outputs/seedance-tests/
      、
      outputs/clone-ad-tests/
      )。
  14. 拆分后拼接: 如果脚本被拆分为多个片段,提供使用
    ffmpeg
    拼接最终视频的服务,并同时提供拼接后的文件和单个片段。
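
The polling in step 10 can be sketched as a loop that stops on either terminal status. This is a sketch only: `fetch` is a hypothetical callable that performs the `GET` and returns the parsed JSON body, and the 5-second interval and 10-minute timeout are assumptions, not values from reference.md. Only the terminal statuses `generated` and `failed` come from the checklist.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = {"generated", "failed"}  # from step 10


def poll_until_done(
    fetch: Callable[[], Dict[str, Any]],        # hypothetical: one GET, parsed JSON body
    interval_sec: float = 5.0,                  # assumed polling interval
    timeout_sec: float = 600.0,                 # assumed overall timeout
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch() repeatedly until status is terminal, then return the body."""
    deadline = clock() + timeout_sec
    while True:
        body = fetch()
        if body.get("status") in TERMINAL_STATUSES:
            return body
        if clock() >= deadline:
            raise TimeoutError("asset did not reach a terminal status in time")
        sleep(interval_sec)
```

To poll several asset IDs concurrently, as the checklist asks, one option is to submit one `poll_until_done` call per ID to a `concurrent.futures.ThreadPoolExecutor`. `sleep` and `clock` are parameters so the loop can be exercised without real waiting.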

Errors (user-facing)

用户可见的错误

  • 401/403: Fix API key / workspace access (setup flow above).
  • 404: Wrong UUID; re-fetch lists.
  • 422: Validation or moderation — tighten prompt, remove disallowed content, check required enums (aspect ratio, duration).
  • 500: Retry later; if repeated, stop and report.
  • 401/403: 修复API密钥/工作区权限(见上文设置流程)。
  • 404: UUID错误;重新获取列表。
  • 422: 验证或审核不通过——优化提示语,移除不允许的内容,检查必填枚举值(宽高比、时长)。
  • 500: 稍后重试;如果重复出现,停止操作并报告。
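
The table above can be expressed as a small lookup for an agent's error handler. This is an illustrative sketch: the function name `next_action` and the `already_retried` flag are hypothetical, and treating a second 500 as "repeated" is an assumption about where to draw the line.

```python
def next_action(status_code: int, already_retried: bool = False) -> str:
    """Map an HTTP status from the API to the user-facing next step."""
    if status_code in (401, 403):
        return "fix API key or workspace access"
    if status_code == 404:
        return "re-fetch lists; the UUID is probably wrong"
    if status_code == 422:
        return "tighten prompt, remove disallowed content, check required enums"
    if status_code == 500:
        # Assumption: one retry is allowed before stopping and reporting.
        return "stop and report" if already_retried else "retry later"
    return "unhandled status; report to the user"
```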

Supporting files

支持文件

  • reference.md — endpoints, auth detail, polling, model mapping notes,
    CreateVideoDto
    schema.
  • prompting/guide.md — marketing brief → API.
  • Seedance 2.0:
    • prompting/prompt-library/seedance-2.md — main Seedance 2.0 model guide (platform rules, API parameters, style template directory).
    • prompting/prompt-library/seedance-2-ugc.md — 9-layer UGC selfie-style formula for Seedance 2.0.
    • prompting/prompt-library/seedance-2-premium-reveal.md — dark-void premium product reveal (no person).
    • prompting/prompt-library/seedance-2-product-hero.md — elemental product hero with splash/effects (no person).
    • prompting/prompt-library/seedance-2-studio-lookbook.md — studio lookbook with voiceover.
    • prompting/prompt-library/seedance-2-feature-walkthrough.md — fast-paced feature walkthrough demo.
    • prompting/analyze-video/SKILL.md — reverse-engineer a reference video into a reusable Seedance 2.0 prompting template.
    • prompting/clone-ad/SKILL.md — clone a reference video ad for a different product (end-to-end: analyze → adapt → generate).
  • Other models:
    • prompting/prompt-library/influencer-recreation.md — analyze a reference photo and recreate the influencer.
    • prompting/prompt-library/ugc-selfie-style.md — cross-model UGC guide (iPhone aesthetic, negative prompts, per-model formulas).
    • prompting/prompt-library/product-showcase.md — product-in-hand video workflow (Nano Banana image → approve → video).
    • prompting/prompt-library/nano-banana.md — Nano Banana image prompting guide.
    • prompting/prompt-library/character-sheet.md — generate a 10-image character sheet for a new AI influencer from a text description.
    • prompting/prompt-library/ugc-product-selfie.md — UGC selfie-style still image: character + product + style references.
  • prompting/brand-voice-starter.md — template to copy into
    MASTER_CONTEXT.md
    .
  • reference.md — 接口、认证细节、轮询机制、模型映射说明、
    CreateVideoDto
    schema。
  • prompting/guide.md — 营销简报 → API转换指南。
  • Seedance 2.0:
    • prompting/prompt-library/seedance-2.md — Seedance 2.0模型主指南(平台规则、API参数、风格模板目录)。
    • prompting/prompt-library/seedance-2-ugc.md — Seedance 2.0的9层UGC自拍风格公式。
    • prompting/prompt-library/seedance-2-premium-reveal.md — 深色背景高端产品展示(无人物)。
    • prompting/prompt-library/seedance-2-product-hero.md — 带水花/特效的元素化产品主视觉(无人物)。
    • prompting/prompt-library/seedance-2-studio-lookbook.md — 带旁白的工作室型录。
    • prompting/prompt-library/seedance-2-feature-walkthrough.md — 快节奏功能演示。
    • prompting/analyze-video/SKILL.md — 将参考视频逆向工程为可复用的Seedance 2.0提示语模板。
    • prompting/clone-ad/SKILL.md — 为不同产品克隆参考视频广告(端到端:分析 → 适配 → 生成)。
  • 其他模型:
    • prompting/prompt-library/influencer-recreation.md — 分析参考照片并复刻网红形象。
    • prompting/prompt-library/ugc-selfie-style.md — 跨模型UGC指南(iPhone风格、反向提示语、各模型公式)。
    • prompting/prompt-library/product-showcase.md — 手持产品视频工作流(Nano Banana图像 → 确认 → 视频)。
    • prompting/prompt-library/nano-banana.md — Nano Banana图像提示语指南。
    • prompting/prompt-library/character-sheet.md — 根据文本描述为新AI网红生成10张图像的角色设定集。
    • prompting/prompt-library/ugc-product-selfie.md — UGC自拍风格静态图像:角色 + 产品 + 风格参考。
  • prompting/brand-voice-starter.md — 可复制到
    MASTER_CONTEXT.md
    的品牌调性模板。