gen-ai-persona-creation
AI Influencer Persona
Turn one sentence into a head-to-toe 4-angle casting card in signature wardrobe, a persona profile, platform-tuned captions, and (optionally) a reel with ambient audio. Output: `./<persona-slug>/`.
When to Use
See the description above.
Prerequisites
```bash
gen-ai whoami     # auth + gen-ai install + Node v22+ check
command -v curl   # ships with macOS / Linux / Git-Bash
```
If `gen-ai whoami` fails: run `gen-ai login` or set `PICSART_ACCESS_TOKEN` + `PICSART_USER_ID`. No extra media tools needed.
How to Run
Use the agent's `terminal` tool to invoke `gen-ai` commands as described in the Procedure below.
Quick Reference
See the Procedure for canonical commands.
Procedure
See sections below for the detailed walkthrough.
Pitfalls
See Common Pitfalls below.
Verification
Run `gen-ai whoami` to confirm authentication, then re-run the failed command with `--debug`.
How the skill calls `gen-ai`
```bash
URL=$(gen-ai generate -m <model> -p "<prompt>" --json --no-input | grep -oE 'https?://[^"]+' | head -1)
curl -sSL -o ./<persona-slug>/<file>.<ext> "$URL"
```
Bash footguns: never add `2>&1` or other stderr redirects between `--json --no-input` and the closing `)`; doing so causes a shell parse error before the command runs (verified). Keep the inner pipe strictly `--json --no-input | grep -oE 'https?://[^"]+' | head -1`. One generation per `URL=$(...)`.
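The URL-extraction pipe above can be exercised without spending credits. This is a minimal sketch, assuming the CLI's JSON output carries the asset URL as a quoted string; the `first_url` helper name and the sample payload shape are hypothetical, not taken from real `gen-ai` output.

```bash
# Hypothetical helper: the same grep/head pipe as above, reading JSON from stdin.
first_url() {
  grep -oE 'https?://[^"]+' | head -1
}

# Sample payload (assumed shape, not real gen-ai output).
sample='{"status":"ok","url":"https://cdn.example.com/out/casting.png","thumb":"https://cdn.example.com/out/t.png"}'

# Prints only the first URL found in the payload.
printf '%s\n' "$sample" | first_url
```

If the payload contains no URL, the helper prints nothing, so an empty `$URL` is a usable failure signal before calling `curl`.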
Style routing
| Style | Model | For | Cost |
|---|---|---|---|
| `realistic` | `gemini-3.1-flash-image` | photoreal humans + photoreal common pets / anthropomorphic animals | ~3 cr |
| `stylized` | `grok-imagine` | anime, 3D-animated fruit/object/character, illustration | ~1 cr |

Cross-provider fallback: primary fails → retry with `flux-2-max` (~3 cr, supports `imageUrls`). Both fail → surface error.
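The fallback rule can be sketched as a small wrapper. This is an illustration only: the `generate_with_fallback` name and the `GEN_AI` override variable are hypothetical, and "failure" is assumed to mean that no URL appears in the JSON output.

```bash
# GEN_AI lets a dry run swap in a stub; it defaults to the real CLI (assumption).
GEN_AI="${GEN_AI:-gen-ai}"

# Try the primary model, then flux-2-max; print the first URL obtained.
generate_with_fallback() {
  primary="$1"; prompt="$2"
  for model in "$primary" flux-2-max; do
    url=$("$GEN_AI" generate -m "$model" -p "$prompt" --json --no-input | grep -oE 'https?://[^"]+' | head -1)
    if [ -n "$url" ]; then
      printf '%s\n' "$url"
      return 0
    fi
    echo "model $model returned no URL" >&2
  done
  echo "primary and fallback both failed" >&2
  return 1
}
```

A caller would use it as `URL=$(generate_with_fallback grok-imagine "<prompt>")` and surface the error when the exit status is non-zero.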
Style inference (read the brief)
| Brief contains | Style → opening |
|---|---|
| Fruit/veggie/food + "character" / "anthropomorphic" / "brainrot" | stylized → fruit/object |
| Animal name + "pet" / "influencer" / "creator" / breed (NOT "real form" / "four-legged") | realistic → anthropomorphic humanoid pet (default for animal/pet briefs: fluffy biped in cute clothes, matches project's `pets` category) |
| Animal name + explicit "real form" / "four-legged" / "on all fours" / "real cat / dog / animal" | realistic → real-form quadruped pet (opt-in) |
| "Anime", "manga", "magical girl", "kawaii", "shoujo", "shonen" | stylized → anime |
| "3D rendered", "stylized 3D", "claymation", "feature-film animation" | stylized → 3D character |
| "Illustrated", "painted", "watercolor", "comic book" | stylized → illustration |
| Human profession + demographic, no style cue | realistic → photoreal human |
Both anthropomorphic-humanoid and real-form-quadruped pets are supported, but anthropomorphic is the default for pet briefs. That matches the Picsart project's `pets` category: fluffy biped influencers in cute clothes (tiny sweaters, mini hoodies, bow ties), with food-themed names (Biscuit, Mochi, Nugget, Bean, Waffles, Tofu, Pickle) and gen-z bios ("professional napper | treat negotiator | certified good boy/girl"). Real-form four-legged is the opt-in for creators who explicitly ask for it. Style conflict (e.g. "anime fitness coach") → prefer the stylistic cue.
Most creators want stylized. Don't blindly default to realistic.
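The routing table can be approximated with keyword matching. This is a rough sketch, not the agent's actual logic: the helper names are hypothetical, the keyword lists are copied from the table, matching any "3d" token is a simplifying assumption, and fruit/food detection is left out.

```bash
# Stylistic cues win over everything else (the "style conflict" rule).
infer_style() {
  if printf '%s' "$1" | grep -qiE 'anime|manga|magical[ -]girl|kawaii|shoujo|shonen|3d|claymation|feature-film animation|illustrated|painted|watercolor|comic book|brainrot'; then
    echo stylized
  else
    echo realistic
  fi
}

# Pet briefs default to anthropomorphic; real-form needs an explicit cue.
infer_pet_form() {
  if printf '%s' "$1" | grep -qiE 'real form|four-legged|on all fours|real (cat|dog|animal)'; then
    echo real-form
  else
    echo anthropomorphic
  fi
}

infer_style "anime fitness coach"
infer_pet_form "calico kitten content creator, sleepy baby vibe"
```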
IP-safe wording (mandatory): never name studios / franchises in prompts sent to the model — no "Pixar", "Disney", "Toy Story", "Studio Ghibli", "Marvel", etc. Recognize creator phrasing like "Pixar-style" as a 3D-animated intent (route to stylized 3D) but use generic descriptors in the actual prompt: "3D-animated", "feature-film animation aesthetic", "stylized 3D rendering", "anime cel-shaded illustration". Studio names trigger content policies + downstream IP risk.
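A pre-flight check along these lines could catch studio names before a prompt is sent. It is a sketch under assumptions: the `ip_safe` name is hypothetical and the pattern covers only the names listed above, so it is not an exhaustive filter.

```bash
# Returns non-zero if the prompt names a studio or franchise from the list above.
ip_safe() {
  if printf '%s' "$1" | grep -qiE 'pixar|disney|toy story|studio ghibli|marvel'; then
    echo "prompt names a studio/franchise; use a generic descriptor instead" >&2
    return 1
  fi
  return 0
}

ip_safe "3D-animated strawberry character, feature-film animation aesthetic" && echo "ok to send"
```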
What creators express in their brief (natural language)
The agent extracts intent, so there are no CLI flags to learn:
- Reference image ("from /path/photo.png") → adds `-i` to the casting-card call
- Reel ("add a tiktok reel", "with motion") → triggers Step 4 (~11 extra cr)
- Platform ("for tiktok", "instagram reel", "linkedin") → drives reel AR + caption tuning
- Style ("anime", "3D", "painted", "photoreal") → routes realistic / stylized
- Name ("named Nova") → sets persona name
- Character type ("strawberry character", "golden retriever pet", "magical girl") → picks subject opening
Quick start
Plain English. Examples:
- "Create a persona for: fitness coach, gen-z, neon vibe" (realistic human)
- "Create a fluffy golden puppy pet influencer, sassy queen energy, mini hoodie" (anthropomorphic pet — DEFAULT for pet briefs: fluffy biped in cute clothes)
- "Create a calico kitten content creator, sleepy baby vibe, tiny knitted sweater" (anthropomorphic pet)
- "Create a real four-legged tortoiseshell cat in a sunlit Tokyo apartment" (real-form pet — opt-in only with explicit "real form / four-legged" cue)
- "Make me an anime magical-girl librarian" (stylized)
- "Create a strawberry character, brainrot 3D-animated vibe" (stylized fruit)
- "Create a persona based on /path/photo.png — indie folk musician" (reference)
- "Create a persona for: fitness coach — and add a tiktok reel" (with reel)
Output: `casting.png`, `persona.md`, `_meta.json` (+ `reel-hero.png` + `reel.mp4` if reel requested).
Cost: ~3 cr lean / ~14 cr with reel.
Pipeline
Step 1: Intent
Extract: persona seed | style | reference image | reel + platform requested | name | slug.
Bias hard toward "infer and proceed." Only ask if the brief is truly thin (1–2 words). Invent missing details (gender, age, ethnicity, vibe), note them in `persona.md`, and let the creator re-roll.
If you must ask, ask exactly ONE direct question. Never enumerate A/B/C/D menus. Never stack multiple questions.
GOOD response (only when brief too thin):
Give me a one-liner: vibe / type / niche.
Examples:
- anthropomorphic pet (default for pet briefs): "fluffy golden puppy influencer, sassy queen, mini hoodie" / "calico kitten creator, sleepy baby, tiny sweater"
- real-form pet (opt-in): "real four-legged tortie cat in a sunlit apartment"
- realistic human: "Berlin art curator, dark academia, mid-thirties"
- stylized: "anime magical-girl librarian" / "anthropomorphic strawberry, brainrot 3D"
Add-ons: "from /path/photo.png" / "add a tiktok reel" / "named Mochi"
BAD: A/B/C/D menus + multiple questions stacked. Don't.
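The slug extracted here becomes the output directory name. The source does not specify slug rules, so the following is only one plausible convention (lowercase, non-alphanumerics collapsed to hyphens); the `slugify` helper is hypothetical.

```bash
# Hypothetical slug rule: lowercase, runs of non-alphanumerics become one hyphen.
slugify() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//'
}

slug=$(slugify "Nova Star")
echo "./$slug/"
```

The agent would then create the directory with `mkdir -p "./$slug"` before the first download.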
Step 2: Identity → `persona.md`
Write: name | bio (2–3 sentences) | voice/tone | frozen appearance block (verbatim, reuse in every prompt). The block contains identity DNA only (face geometry, eye/hair/skin, body type, distinguishing marks, wardrobe aesthetic baseline), NOT per-shot deltas (expression, pose, lighting, scene, specific outfits).
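Writing the file could look like the sketch below. The section labels inside `persona.md` are illustrative only, since the source lists the fields but not a fixed layout, and the `write_persona` helper name is hypothetical.

```bash
# Hypothetical layout: labels are illustrative, not a fixed schema.
write_persona() {
  dir="$1"; name="$2"; bio="$3"; voice="$4"; appearance="$5"
  mkdir -p "$dir"
  cat > "$dir/persona.md" <<EOF
# $name

Bio: $bio

Voice/tone: $voice

Frozen appearance block (paste verbatim into every prompt):
$appearance
EOF
}
```

Keeping the appearance block on its own lines makes it easy to copy unchanged into the Step 3 and Step 4 prompts.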
Step 3: Casting card
One call. 4 head-to-toe angles, plain seamless gray, signature wardrobe, neutral expression. 9:16 portrait with a 2×2 grid inside (each panel ≈ 9:16, which fits a full body).
```bash
URL=$(gen-ai generate -m <style-model> -p "<subject-opening> The image shows the same exact character from four camera angles in a 2x2 portrait grid (9:16 canvas). ALL FOUR PANELS share: identical plain seamless studio gray background (flat uniform fill, no gradient/texture/scene). Identical signature wardrobe: same complete outfit head to feet (or, for real-form quadruped pets, identical simple accessories like collar/bandana/harness, never humanoid clothing). Identical neutral expression, relaxed mouth. Identical even soft frontal softbox key + subtle fill + soft ground shadow, no rim lights, no colored gels. Identical hair/fur. Same identity in every panel: <frozen appearance block>. Differs only in angle: TOP-LEFT front-facing full body eyes at camera; TOP-RIGHT 3/4 facing camera-right; BOTTOM-LEFT full left profile looking off-left; BOTTOM-RIGHT 3/4 from behind over-the-shoulder. Magazine fashion model sheet composition, thin clean grid lines. The four panels MUST look like consecutive shots from one session: same wardrobe, backdrop, lighting, character; only angle differs. Absolutely no text, no captions, no watermarks, no logos, no UI elements, no phone, no device, no screen, no social media overlays in any panel." --aspect-ratio 9:16 --json --no-input | grep -oE 'https?://[^"]+' | head -1)
curl -sSL -o ./<persona-slug>/casting.png "$URL"
```
`<style-model>` is one of `gemini-3.1-flash-image`, `grok-imagine`, or `flux-2-max` (fallback).
Reference image: add `-i <reference-path>` to this same call. Identity via i2i, same prompt + cost.
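The optional reference image can be handled by building the argument list conditionally. The sketch below only prints the arguments (one per line) as a dry run; the `casting_argv` helper is hypothetical, and placing `-i` before `-p` simply mirrors the order used in sub-step 4-i.

```bash
# Dry run: print the gen-ai arguments, adding -i only when a reference is given.
casting_argv() {
  model="$1"; prompt="$2"; ref="${3:-}"
  set -- gen-ai generate -m "$model"
  if [ -n "$ref" ]; then
    set -- "$@" -i "$ref"
  fi
  # --json --no-input stay last so nothing sits between them and the closing ).
  set -- "$@" -p "$prompt" --aspect-ratio 9:16 --json --no-input
  printf '%s\n' "$@"
}

casting_argv grok-imagine "<casting prompt>" /path/photo.png
```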
Subject openings (replace `<subject-opening>` above)
- Photoreal human (default): "Professional fashion photograph head-to-toe casting card / model sheet, shot on 85mm lens, RAW photo, 8k UHD, crisp focus, photorealistic, natural skin texture with visible pores, no AI smoothing."
- Anthropomorphic humanoid pet (DEFAULT for pet/animal briefs: fluffy biped in cute clothes, project's `pets` category): "An anthropomorphic [puppy / kitten / bunny / hamster / duckling / fox cub / baby panda / hedgehog / penguin / monkey] character standing upright on two legs like a human, full body visible head to toe, humanoid body proportions, expressive face, [coat detail, e.g. warm golden honey-colored fur / pure snow white fluffy fur / deep midnight black sleek fur / warm ginger orange fur / chocolate brown fur / shimmering silver grey fur / patchy calico orange-white-black fur / soft cream colored fur], adorable, looking directly at camera, professional fashion photograph, shot on 85mm lens, shallow depth of field, cinematic studio lighting with soft key light, photorealistic, RAW photo, 8k ultra high definition, crisp focus."
  - Wardrobe options the agent can pick from when composing the casting card outfit: tiny knitted sweater | mini oversized hoodie | dapper bow tie + collar | flower crown of daisies and roses | tiny stylish sunglasses | flowing superhero cape | stylish bandana around neck | au naturel (no clothing, just fluffy fur).
  - Vibe options for expression / pose: Sassy Queen (hand on hip, serving looks, unbothered) | Silly King (goofy, tongue out, awkward funny pose) | Sleepy Baby (drowsy half-asleep, leaning) | Zoomies Mode (excited, arms up, chaotic joy) | Distinguished (regal, arms crossed, noble) | Mischief Maker (sneaky, hands behind back, guilty-not-sorry).
  - Suggested names (food-themed, project's pool): Biscuit, Mochi, Nugget, Bean, Waffles, Tofu, Dumpling, Peanut, Pickle, Noodle, Churro, Pretzel, Taco, Maple, Truffle, Sesame, Crouton, Muffin, Cupcake, Boba.
  - Suggested bio style (gen-z internet humor, pipe-separated): "professional napper | treat negotiator | certified good boy/girl" / "fluffy & unbothered | snack motivated | full-time cuddle bug" / "chaos gremlin | zoomies champion | will boop for treats".
- Real-form quadruped pet (opt-in only, when the creator explicitly said "real form / four-legged / real cat / on all fours"): "Professional pet portrait photograph head-to-toe model sheet of a [breed] [animal] in their natural anatomical form (four-legged / quadruped, NOT humanoid), full body nose-to-tail visible, shot on 85mm with shallow depth of field, RAW, 8k UHD, photorealistic natural fur with visible individual hairs, no AI smoothing. Pet may wear simple accessories (collar, bandana, harness) but never humanoid clothing; the character is the animal in real anatomical form."
- 3D-animated anthropomorphic fruit / object: "High quality 3D-animated head-to-toe character sheet of an anthropomorphic [fruit/object] character, feature-film animation aesthetic, [fruit/object] serves as the head on a full human-proportioned athletic body, [skin/surface] texture extending naturally to arms and hands, ultra-high resolution, brainrot character-drama vibe, dramatic cinematic studio lighting with soft fill + subtle ground shadow."
- Anime / manga: "High quality anime / manga style head-to-toe character sheet, cel-shaded illustration, clean line art, vibrant saturated colors, soft anime lighting, expressive eyes, [shoujo/shonen/kawaii] aesthetic, magazine character reference sheet composition."
- Stylized 3D-animated human / fantasy: "High quality stylized 3D-animated head-to-toe character sheet, feature-film animation aesthetic, soft global illumination, slightly exaggerated proportions, expressive features, character-animation art direction."
- Painted / illustrated: "Hand-painted editorial illustration head-to-toe character sheet, [watercolor/gouache/digital painting] aesthetic, painterly brushwork, layered soft light, magazine illustration composition."

Casting-card rules (non-negotiable): identical bg / wardrobe-or-accessories / lighting / expression / hair-fur across all 4 panels, only angle differs | bg flat plain gray | full body (head-to-toe for humans/bipeds INCLUDING anthropomorphic humanoid pets, nose-to-tail for real-form quadruped pets) | wardrobe stays the same in all panels (same outfit for humans + anthropomorphic pets; anthropomorphic pets do wear humanoid clothing like tiny sweaters / mini hoodies / bow ties, and only real-form quadruped pets are limited to simple accessories like collar / bandana / harness) | expression and pose match the chosen vibe (Sassy Queen / Silly King / etc. for anthropomorphic pets), neutral default for humans, eyes at camera (or off per profile/back).
Step 4: Reel (only if requested)
Two sub-calls. Seedance treats `imageUrls` as the first frame (verified), so passing the casting-card grid would open the reel on it. So: generate a single-frame reel-hero first, then animate.
Pick the concept first
Don't auto-default to "slow contemplative push-in" — most creator content rewards confident energy.
Concepts: Hook reveal | Power pose | Attitude flick (look-away → snap-back smirk) | Walk-by | Outfit reveal | Vibe drop (lighting shift mid-clip) | Establish-and-hold | Calm narrative beat (only for genuinely-calm-niche personas).
Hook rule: first second must arrest attention. Platform sensitivity: TikTok / IG Reel / Shorts → punchy; LinkedIn / YouTube → professional / calm; fruit / 3D / anime → lean stylized + confident (calm beats fall flat for them).
Camera move (pick ONE): slow push-in | slow pull-out | partial orbit | slow track left/right | static | tilt up | whip-in.
Action (pick ONE): punchy default — confident camera-direct stare with attitude shift | power-pose hold | hair flip | smile breaking through | walking confidently toward camera | outfit-reveal turn | hand gesture | rhythmic vibe | look over shoulder | lighting drop. Quieter (calm-niche only) — looking up from book + soft smile | slow head tilt | hair lift in light wind | eyes opening | lip part.
Environment / lighting: atmospheric specifics > generic. Replace "in a cafe" with "neon-pink Tokyo coffee shop interior, signage reflections" (punchy) OR "rain-streaked window with candlelight, steam from teacup" (calm). Match energy to concept.
Sub-step 4-i: Reel hero (gemini i2i, target AR, single full-body frame)
```bash
URL=$(gen-ai generate -m gemini-3.1-flash-image -i ./<persona-slug>/casting.png -p "<subject-opening from Step 3> Single full-body photograph of the same character from the casting-card reference, head-to-toe in frame. <frozen appearance block>. Wearing the same signature wardrobe shown in casting card. <opening pose / framing for chosen concept>. <atmospheric environment + lighting>. Composition: full body head to toe, framed for video animation in <platform-AR>. Real photograph quality (or stylized rendering per opening). No text, no captions, no watermarks, no logos, no UI, no phone, no device, no screen, no social media overlays." --aspect-ratio <platform-AR> --json --no-input | grep -oE 'https?://[^"]+' | head -1)
curl -sSL -o ./<persona-slug>/reel-hero.png "$URL"
```
Cost: ~3 cr. Apply fallback to `flux-2-max`.
Reel-hero ≠ final action pose. Gemini tends to preserve the casting card's neutral stance even when prompted for power-pose / mid-action (verified). That's fine: the action lands in the Seedance prompt at 4-ii. Don't re-roll the hero just because the pose looks calmer than expected.
Sub-step 4-ii: Animation (Seedance i2v, audio enabled)
Platform → AR + duration:
| Platform | AR | Duration |
|---|---|---|
| tiktok / instagram-reel / instagram-story / youtube-shorts | 9:16 | 8s |
| instagram-feed | 1:1 (Seedance has no 4:5; closest universal) | 6s |
| youtube / linkedin / x / twitter | 16:9 | 8–10s |
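The platform table maps directly onto a `case` lookup. A minimal sketch, assuming 8s as the default where the table allows 8–10s and treating any unlisted platform as an error; the `reel_params` helper name is hypothetical.

```bash
# Prints "<aspect-ratio> <duration-seconds>" for a platform from the table above.
reel_params() {
  case "$1" in
    tiktok|instagram-reel|instagram-story|youtube-shorts) echo "9:16 8" ;;
    instagram-feed) echo "1:1 6" ;;
    youtube|linkedin|x|twitter) echo "16:9 8" ;;  # table allows 8-10s; 8 is an assumed default
    *) echo "unknown platform: $1" >&2; return 1 ;;
  esac
}

params=$(reel_params tiktok)
ar=${params% *}    # text before the space
dur=${params#* }   # text after the space
echo "--aspect-ratio $ar --duration $dur"
```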
```bash
URL=$(gen-ai generate -m seedance-2.0 -i ./<persona-slug>/reel-hero.png -p "<subject-opening>. <frozen appearance block>. Wearing same signature wardrobe. <single action from vocabulary matching the concept; strong language here, this is where action actually lands>. <same atmospheric environment + lighting as hero>. <single camera move from vocabulary>. Audio: <ambient soundscape matching scene: environmental sounds, mood-appropriate underscore; no spoken dialogue, no voiceover, no music vocals>. Single continuous moment, no scene changes, no multiple sequential actions, no fast or chaotic movement. No text, no captions, no watermarks, no logos, no UI, no phone, no device, no screen, no social media overlays." --aspect-ratio <platform-AR> --duration <platform-duration> --generate-audio --json --no-input | grep -oE 'https?://[^"]+' | head -1)
curl -sSL -o ./<persona-slug>/reel.mp4 "$URL"
```
Cost: 1 cr/sec × duration. Total reel: ~8–13 cr.
Seedance prompt order (verified KLING_RULES): Subject → Action → Environment → Camera → Lighting → Audio. One continuous camera move, one primary action; never chain.
Models we DON'T use for reel: any `startFrame`-only i2v (`seedance-i2v`, `hailuo-2.3-fast`, `runway-gen3a-turbo`, `wan-2.7-i2v`, `luma-flash2-i2v`, `pika-frames`) drifts across the clip; `runway-gen4-ref` returns a still PNG (verified, not a video); `kling-3.0-pro` / `veo-3.1` / `veo-3.1-fast` are `startFrame`-only, and multi-image char-ref modes (Kling element / Veo Ingredients) aren't surfaced in the CLI today (roadmap).
Honest constraint: Seedance's `imageUrls` behaves as first frame, not pure char-ref. Single-frame hero + i2v = clean character image opens the reel and animates from there.
Step 5: Captions, deliver
Append captions to `persona.md`: 3 by default, in the persona's voice. The hashtag block ALWAYS leads with `#picsart #picsartcreator`, then platform-specific niche tags.

| Platform | Length | Niche tags after Picsart pair |
|---|---|---|
| tiktok / youtube-shorts | 80–150 chars, single hook | 4–6 trending |
| instagram (reel/story/feed) | 150–300 chars, hook + story | 6–10 |
| youtube standard | 300–500 chars, keyword-dense | 3–5 keyword |
| linkedin | 500–1000 chars, professional | 3–5 industry |
| x / twitter | ≤280 chars total (incl tags) | 1–2 |
| (no platform) | ~150 chars, balanced | 4–6 generic |

Print final summary: `✓ Persona "Lena" delivered. Local: ./lena/. Spent: ~3 credits. Files: casting.png, persona.md (+ _meta.json)`. Add `reel-hero.png` + `reel.mp4` to the file list if a reel was generated.
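The hashtag rule is mechanical enough to sketch. Both helper names are hypothetical; `fits_x` checks only the 280-character ceiling for x / twitter, and `${#1}` counts characters according to the shell's locale, which is an approximation of how the platform counts.

```bash
# Always lead with the Picsart pair, then append niche tags in the order given.
hashtag_block() {
  block="#picsart #picsartcreator"
  for tag in "$@"; do
    block="$block #$tag"
  done
  printf '%s\n' "$block"
}

# True when caption plus tags fits the x / twitter limit from the table.
fits_x() {
  [ "${#1}" -le 280 ]
}

caption="New PR, same chaos. $(hashtag_block fitness)"
fits_x "$caption" && echo "$caption"
```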
Cost transparency
Show the plan before spending. Pull live rates with `gen-ai pricing <model>`; never hardcode them. After each step: `✓ <step> (<credits>)`.
```
Plan:
  Casting card (gemini-3.1-flash-image, 1 image)     ~3 cr
  [ Reel hero (gemini-3.1-flash-image, 1 image)      ~3 cr ]   reel only
  [ Reel animation (seedance-2.0, 8s @ 9:16)         ~8 cr ]
  ────────────────────────────────────────────────────────
  Estimated total                                    ~3 or ~14 cr
Continue? [Y/n]
```
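The plan total is simple arithmetic once the rates are known. In this sketch the rates are passed in as arguments (for example after reading them from `gen-ai pricing <model>`, whose output format is not assumed here); the `estimate_total` helper is hypothetical and handles whole-credit rates only.

```bash
# Args: casting-card credits, reel-hero credits, video credits per second,
#       reel duration in seconds, and "yes" when a reel was requested.
estimate_total() {
  casting_cr="$1"; hero_cr="$2"; video_cr_per_sec="$3"; duration="$4"; with_reel="$5"
  total="$casting_cr"
  if [ "$with_reel" = "yes" ]; then
    total=$((casting_cr + hero_cr + video_cr_per_sec * duration))
  fi
  echo "$total"
}

estimate_total 3 3 1 8 no    # casting card only
estimate_total 3 3 1 8 yes   # casting card + reel hero + 8s animation
```

With the example rates from the plan above, the two calls correspond to the lean and with-reel totals.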
Output
```
./<persona-slug>/
├── persona.md       # name, bio, voice, frozen appearance block, captions
├── casting.png      # head-to-toe 4-angle casting card
├── reel-hero.png    # only if reel requested
├── reel.mp4         # only if reel requested (includes ambient audio)
└── _meta.json       # step parameters
```
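Before printing the final summary, the expected files can be checked against the tree above. A small sketch with a hypothetical `check_outputs` helper; it treats an empty file as missing, which is an assumption about what counts as a failed download.

```bash
# Usage: check_outputs <dir> [reel]
# Returns non-zero if any expected file is absent or empty.
check_outputs() {
  dir="$1"; reel="${2:-no}"
  files="persona.md casting.png _meta.json"
  if [ "$reel" = "reel" ]; then
    files="$files reel-hero.png reel.mp4"
  fi
  missing=0
  for f in $files; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For a run without a reel, `check_outputs ./lena` is enough; pass `reel` as the second argument when Step 4 ran.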
Re-rolls
Natural language. The agent reads `_meta.json` and reruns the right step:
- "Regenerate Lena with darker hair" (~3 cr)
- "Redo the reel with a slow camera push instead of static" (~8 cr)
Confirm spend before re-running.
Limitations (today)
- Local-only output (Drive integration tracked for v1.1, once the new CLI Drive API ships)
- One persona per run (multi-persona via `gen-ai generate -m kling-multi-image-v2-1 -i nova/casting.png -i lena/casting.png -p "<scene>"` or a future Scene Composer skill)
- No premium `gemini-3-pro-image` photoreal tier (deferred)
- No premium motion-control reel (Kling Motion Control V3 + creator motion-ref deferred)
- No voice / talking-head reel (Picsart-Eleven gender unreliable). The reel ships with Seedance ambient audio: environmental + atmospheric underscore, no synthesized speech
- No bespoke music (Seedance underscore via `--generate-audio`; dedicated music pass deferred)
- No Kling-element / Veo-Ingredients char-ref video (not surfaced in CLI)
- No built-in scene variations (casting card is the character; downstream tools handle scenes)