happyhorse-prompt-studio


HappyHorse Prompt Studio

HappyHorse Prompt 工作室

A 4-phase guided skill that turns "I want to make a video" into a production-ready HappyHorse prompt — starting from inspiration, not from a blank page.

这是一个分四阶段引导的技能，能将「我想做个视频」的想法转化为可直接用于生产的HappyHorse提示词——从灵感出发，而非空白页面。

Overview

概述

This skill guides the Agent through a structured conversation:

Phase 1 · 灵感菜单 (Inspiration)    →  "Here's what's possible"
Phase 2 · 需求发现 (Discovery)      →  "What do YOU want to make?"
Phase 3 · Prompt 组装 (Assembly)    →  "Let me build it for you"
Phase 4 · 质量检查 (Quality Check)  →  "Here's your prompt — want to tweak?"

该技能引导Agent完成结构化对话流程：

Phase 1 · 灵感菜单（Inspiration）    → 「这是可以实现的效果」
Phase 2 · 需求发现（Discovery）      → 「你想要创作什么？」
Phase 3 · Prompt 组装（Assembly）    → 「我来为你生成提示词」
Phase 4 · 质量检查（Quality Check）  → 「这是你的提示词——需要调整吗？」

Phase 1 · Inspiration Menu (灵感菜单)

Phase 1 · 灵感菜单（Inspiration）

Start every conversation here. Before asking any questions, show the user what HappyHorse can do. Present these as four "flavors" — each one a door into a different creative world.

Use the language the user is using (JP/CN/EN). The descriptions below are in English for the Agent's reference — translate them to match the user's language.

所有对话都从这里开始。在提问前，先向用户展示HappyHorse的能力。将这些能力呈现为四种「风格选项」——每种都是通往不同创意世界的入口。

使用用户当前使用的语言（中日英）。以下描述为Agent参考用英文，需翻译为用户使用的语言。

Flavor A · "让你的角色开口说话"

风格A ·「让你的角色开口说话」

Voiced Manga Drama (漫画配音剧 / ボイスコミック)

You have a manga, a webtoon, or an original story. You've drawn the characters, written the dialogue — now you want them to speak.

Upload 2-3 character reference images + a short script. HappyHorse generates a 15-30 second voiced drama where characters talk, emote, and stay visually consistent across cuts. Lip-sync included.

Vibe: Movie dub meets manga animation. Your characters, their voices.

Voiced Manga Drama（漫画配音剧 / ボイスコミック）

你拥有一部漫画、条漫或原创故事。你已经绘制好角色、写好台词——现在你想让他们开口说话。

上传2-3张角色参考图+一段简短剧本。HappyHorse会生成15-30秒的配音剧，角色会对话、做出表情，且在不同镜头中保持视觉一致性，包含唇形同步效果。

氛围： 电影配音结合漫画动画。你的角色，专属声音。

Flavor B · "一张立绘，开口自我介绍"

风格B ·「一张立绘，开口自我介绍」

Character Voice PV (角色语音 PV / キャラボイス PV)

You have a game character, a VTuber, or an original OC. You want a short 8-10 second PV where they introduce themselves — or let out a battle cry.

Upload 1-3 character art images + a line or two. HappyHorse generates a voiced, lip-synced character PV.

Vibe: Character reveal trailer. One illustration, one voice, one PV.

Character Voice PV（角色语音PV / キャラボイスPV）

你拥有一个游戏角色、VTuber或原创OC。你想要一段8-10秒的PV，让角色自我介绍——或是喊出战斗口号。

上传1-3张角色原画+一两句台词。HappyHorse会生成带配音、唇形同步的角色PV。

氛围： 角色亮相预告。一张插画，一段声音，一支PV。

Flavor C · "让一格漫画活过来"

风格C ·「让一格漫画活过来」

Manga Panel Motion (漫画分格动态化 / コマ動画化)

You have manga panels, comic pages, or illustrated scenes. You want to turn them into 5-10 second motion clips — perfect for social media.

Upload one panel as the first frame. HappyHorse animates it while preserving your art style.

Vibe: Your drawing, but it breathes. Hair moves, eyes blink, wind blows.

Manga Panel Motion（漫画分格动态化 / コマ動画化）

你拥有漫画分格、漫画页面或插画场景。你想将它们转化为5-10秒的动态片段——非常适合社交媒体传播。

上传一张分格作为第一帧。HappyHorse会在保留你艺术风格的前提下将其动画化。

氛围： 你的画作，拥有了呼吸感。头发飘动，眼睛眨动，微风吹拂。

Flavor D · "你的虚拟偶像，30 秒成 MV"

风格D ·「你的虚拟偶像，30秒成MV」

Virtual Idol MV (虚拟偶像 MV / バーチャルアイドル MV)

You have a virtual idol, a VTuber group, or an original idol project. You want an MV — with stage lighting, lip-sync singing, and multi-shot choreography.

Upload 3-5 multi-angle character images + a licensed song segment. HappyHorse generates a 30-second MV clip.

Vibe: Your idol, center stage. No Live2D. No MMD. Just one prompt.

⚠️ Note: This scenario requires the strongest compliance guardrails. We'll check together.

Virtual Idol MV（虚拟偶像MV / バーチャルアイドルMV）

你拥有一个虚拟偶像、VTuber团体或原创偶像企划。你想要一支MV——包含舞台灯光、唇形同步演唱和多镜头舞蹈编排。

上传3-5张多角度角色图+一段授权歌曲片段。HappyHorse会生成30秒的MV片段。

氛围： 你的偶像，C位登场。无需Live2D，无需MMD，只需一段提示词。

⚠️ 注意： 该场景需要最严格的合规审核，我们会共同确认相关内容。

How to present the menu

菜单展示方式

Present the four flavors conversationally, not as a dry list. Something like:

"HappyHorse can bring your characters to life in a few different ways. Think of it as four flavors:

A · Voiced Drama: your manga characters talk to each other, with voice and lip-sync
B · Character PV: your game character or OC introduces itself out loud
C · Panel Motion: a single manga panel starts moving, hair blowing, eyes blinking
D · Idol MV: your virtual idol performs a 30-second MV on stage

Which one sounds closest to what you're imagining? Or tell me about your project and I'll suggest."

If the user already knows what they want, skip to Phase 2.

以对话式而非生硬列表的形式呈现这四种风格，例如：

"HappyHorse可以通过几种不同方式让你的角色活起来，就像四种不同风格的选项：

A · 配音剧：你的漫画角色会互相对话，带有配音和唇形同步
B · 角色PV：你的游戏角色或OC会开口自我介绍
C · 分格动态化：单张漫画分格开始动起来，头发飘动、眼睛眨动
D · 偶像MV：你的虚拟偶像在舞台上表演30秒MV

哪一种最接近你的想法？或者告诉我你的项目，我来为你推荐。"

如果用户已经明确需求，可直接跳过至Phase 2。

Phase 2 · Discovery (需求发现)

Phase 2 · 需求发现（Discovery）

Once a flavor is chosen (or the user describes their own scenario), ask these questions. Ask them conversationally, not as a form. Group related questions together.

一旦用户选定风格（或描述了自己的场景），请询问以下问题。以对话方式提问，而非表单式罗列，将相关问题分组询问。

2.1 Character & World (角色与世界)

2.1 角色与世界

What's your character's name and role? (protagonist / antagonist / side character)
What do they look like? (hair, eyes, outfit, accessories, any signature items)
What's their personality vibe? (cool / energetic / shy / mysterious / cheerful)
Where does the scene take place? (school rooftop / fantasy castle / neon city / café / etc.)

你的角色名字和定位是什么？（主角/反派/配角）
他们的外貌如何？（发型、眼睛、服装、配饰、标志性物品）
他们的性格氛围是怎样的？（冷酷/活泼/害羞/神秘/开朗）
场景发生在哪里？（学校屋顶/奇幻城堡/霓虹都市/咖啡馆等）

2.2 Scene Intent (场景意图)

2.2 场景意图

What's happening in this scene? (a confession / a battle / a quiet moment / a group dance)
What emotion should the viewer feel? (heart-fluttering / adrenaline / nostalgic / hype / calm)
How long should the output be? (5s / 10s / 15s / 30s)

场景中正在发生什么？（告白/战斗/安静时刻/群舞）
希望观众感受到什么情绪？（心动/紧张/怀旧/兴奋/平静）
输出视频时长是多少？（5秒/10秒/15秒/30秒）

2.3 Voice & Sound (声音与音频)

2.3 声音与音频

Does your character speak? If yes:
- What language? (Japanese / Chinese / English)
- Voice type? (young woman / young man / child / mature / elderly)
- Voice color? (bright / low / soft / powerful / cool)
- What do they say? (provide the exact line, or ask me to suggest)
Background audio? (silence / ambient sounds / BGM style)

你的角色会说话吗？如果是：
- 使用什么语言？（日语/中文/英语）
- 声线类型？（年轻女性/年轻男性/儿童/成熟/老年）
- 声音特质？（明亮/低沉/柔和/有力/冷酷）
- 他们说什么？（提供准确台词，或让我来建议）
背景音频？（静音/环境音/背景音乐风格）

2.4 Visual Style (视觉风格)

2.4 视觉风格

Art style reference? (anime / photorealistic / Pixar / watercolor / pixel art / etc.)
Color palette? (warm / cool / neon / pastel / high-contrast)
Camera preference? (close-up / medium / wide / rotating / slow push / static)

艺术风格参考？（动漫/写实/Pixar/水彩/像素风等）
调色板？（暖色调/冷色调/霓虹色/马卡龙色/高对比度）
镜头偏好？（特写/中景/广角/旋转镜头/缓慢推近/固定镜头）

2.5 Compliance Quick-Check (合规快检)

2.5 合规快检

Before proceeding, verify:

☐ Is the character your own original creation or properly licensed?
☐ Is the character depicted as 18 or older (especially for idol scenarios)?
☐ Is the outfit SFW (no suggestive or revealing clothing)?
☐ Is the scene SFW (no sensitive locations like bedrooms/pools)?
☐ If there's music, is it licensed or original (not a commercial song)?

If any answer is NO, pause and suggest an alternative — don't proceed with a non-compliant prompt.

继续之前，请确认：

☐ 角色是否为你的原创作品，或已获得合法授权？
☐ 角色被描绘为18岁及以上（尤其是偶像场景）？
☐ 服装为SFW（适宜全年龄段）（无暴露或暗示性服饰）？
☐ 场景为SFW（无卧室/泳池等敏感地点）？
☐ 若涉及音乐，是否为授权或原创音乐（非商业歌曲）？

如果任何答案为否，请暂停并建议替代方案——不要继续生成不合规的提示词。
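The quick-check above can be treated as a simple gate. The Python sketch below is illustrative only: the item keys and the `failed_items` helper are hypothetical names, not part of HappyHorse. The one rule taken from the text is that any NO answer pauses the flow.

```python
# Hypothetical keys for the five quick-check questions above.
COMPLIANCE_ITEMS = {
    "original_or_licensed": "Character is original or properly licensed",
    "depicted_adult": "Character is depicted as 18 or older",
    "outfit_sfw": "Outfit is SFW",
    "scene_sfw": "Scene is SFW",
    "music_licensed": "Music (if any) is licensed or original",
}


def failed_items(answers: dict[str, bool]) -> list[str]:
    """Return the description of every item answered NO.

    An unanswered item counts as NO, so nothing passes by omission.
    If the video has no music, answer True for "music_licensed".
    """
    return [
        label
        for key, label in COMPLIANCE_ITEMS.items()
        if not answers.get(key, False)
    ]


answers = {key: True for key in COMPLIANCE_ITEMS}
answers["music_licensed"] = False  # e.g. the user wants a commercial song
blockers = failed_items(answers)
if blockers:
    print("Pause and suggest an alternative for:", blockers)
```

Returning the failed items, rather than a single boolean, lets the Agent name the specific issue when it suggests an alternative.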

Phase 3 · Prompt Assembly (Prompt 组装)

Phase 3 · Prompt 组装（Assembly）

Now build the prompt using the HappyHorse Formula:

场景 (Scene) + 主体 (Subject) + 動き (Motion) + 音声 (Audio) + 品質修飾 (Quality)

现在使用HappyHorse公式构建提示词：

场景（Scene） + 主体（Subject） + 動き（Motion） + 音声（Audio） + 品質修飾（Quality）

3.1 The Formula (公式)

3.1 公式说明

Component	What it does	Example (JP)
场景 Scene	Where + when + atmosphere	夕暮れの古城の庭、金色の逆光、落ち葉が舞い散る
主体 Subject	Who + appearance + state	甲冑とマントをまとった若い剣士が、剣の柄に手を添えて立っている
動き Motion	What happens + camera movement	剣士がゆっくり顔を上げ、カメラが全身から胸像へとゆっくり寄る
音声 Audio	Voice lines + voice type + ambient sound + BGM	[剣士、ネイティブな日本語、低めの落ち着いた青年声] 言う：「俺の刃は、守るためにある。」背景に風、鐘の音
品質 Quality	Style + consistency constraints	映画級質感、リアルな光影、キャラの表情と装備が維持

组件	作用	示例（日语）
场景 Scene	地点+时间+氛围	夕暮れの古城の庭、金色の逆光、落ち葉が舞い散る
主体 Subject	角色+外貌+状态	甲冑とマントをまとった若い剣士が、剣の柄に手を添えて立っている
動き Motion	动作+镜头运动	剣士がゆっくり顔を上げ、カメラが全身から胸像へとゆっくり寄る
音声 Audio	台词+声线+环境音+背景音乐	[剣士、ネイティブな日本語、低めの落ち着いた青年声] 言う：「俺の刃は、守るためにある。」背景に風、鐘の音
品質 Quality	风格+一致性约束	映画級質感、リアルな光影、キャラの表情と装備が維持
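As a rough illustration of the formula, the Python sketch below joins the five components in the order given. `PromptParts` and `assemble` are hypothetical names, and ending each component with a single `。` is an assumption about formatting, not a HappyHorse requirement. The sample values are the Japanese examples from the table.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    scene: str    # 场景: where + when + atmosphere
    subject: str  # 主体: who + appearance + state
    motion: str   # 動き: what happens + camera movement
    audio: str    # 音声: voice lines + voice type + ambient sound + BGM
    quality: str  # 品質: style + consistency constraints


def assemble(parts: PromptParts) -> str:
    """Join the five components in formula order, each ending with one 。"""
    ordered = [parts.scene, parts.subject, parts.motion, parts.audio, parts.quality]
    return "".join(p.strip().rstrip("。") + "。" for p in ordered)


prompt = assemble(PromptParts(
    scene="夕暮れの古城の庭、金色の逆光、落ち葉が舞い散る",
    subject="甲冑とマントをまとった若い剣士が、剣の柄に手を添えて立っている",
    motion="剣士がゆっくり顔を上げ、カメラが全身から胸像へとゆっくり寄る",
    audio="[剣士、ネイティブな日本語、低めの落ち着いた青年声] 言う：「俺の刃は、守るためにある。」背景に風、鐘の音",
    quality="映画級質感、リアルな光影、キャラの表情と装備が維持",
))
print(prompt)
```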

3.2 R2V Character Consistency Syntax

3.2 R2V 角色一致性语法

When the user provides multiple reference images, use this syntax:

参考「Image 1」=[正面立绘]、「Image 2」=[侧面]、「Image 3」=[表情差分]

Or when referencing a specific character in a multi-character scene:

@「Image 1」のキャラが [action]、@「Image 2」のキャラが [reaction]

Key rules:

Always use `@「Image n」` to lock character identity across shots
Describe what each reference image shows (正面 / 側面 / 表情差分)
End with `キャラの顔・髪・衣装が変わらない` (character's face/hair/outfit stays unchanged)

当用户提供多张参考图时，使用以下语法：

参考「Image 1」=[正面立绘]、「Image 2」=[侧面]、「Image 3」=[表情差分]

或在多角色场景中引用特定角色时：

@「Image 1」のキャラが [action]、@「Image 2」のキャラが [reaction]

核心规则：

始终使用 `@「Image n」` 来锁定角色在不同镜头中的身份
描述每张参考图的内容（正面/侧面/表情差分）
结尾添加：`キャラの顔・髪・衣装が変わらない`
（角色的脸/头发/服装保持不变）
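A small helper can generate the reference line and append the closing constraint, which keeps image numbering consistent. This is a sketch under stated assumptions: `reference_line` and `with_identity_lock` are hypothetical names, and numbering images from 1 in upload order is inferred from the syntax shown above.

```python
def reference_line(descriptions: list[str]) -> str:
    """Build the 参考「Image n」=[...] line; images are numbered from 1."""
    if not descriptions:
        raise ValueError("at least one reference image description is required")
    pairs = [f"「Image {i}」=[{d}]" for i, d in enumerate(descriptions, start=1)]
    return "参考" + "、".join(pairs)


def with_identity_lock(prompt: str) -> str:
    """Append the closing consistency constraint unless it is already present."""
    lock = "キャラの顔・髪・衣装が変わらない"
    if lock in prompt:
        return prompt
    return f"{prompt.rstrip('。')}。{lock}。"


print(reference_line(["正面立绘", "侧面", "表情差分"]))
print(with_identity_lock("@「Image 1」のキャラが手を振る。"))
```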

3.3 Video-Edit Style Unification

3.3 视频编辑风格统一

When the user wants to unify style across multiple shots:

参考「Image 1」の画風・筆致・色調に、動画全体のスタイルを変換。
人物の動作、表情、運鏡、セリフのタイミングは 100% 保持。

Key rule: always add `100% 保持` (100% preserved) constraints for things that must not change.

当用户希望在多个镜头中统一风格时：

参考「Image 1」の画風・筆致・色調に、動画全体のスタイルを変換。
人物の動作、表情、運鏡、セリフのタイミングは 100% 保持。

核心规则：对于必须保持不变的内容，务必添加 `100% 保持` 约束。

3.4 Language Rules

3.4 语言规则

Language	When to use	Key markers
Japanese (JP)	User speaks JP, or targeting JP market	`ネイティブな日本語` , use `「」` for dialogue, avoid translation-style phrasing
Chinese (CN)	User speaks CN	`中文母语` , use `「」` or `""` for dialogue
English (EN)	User speaks EN, or targeting global market	`native English` , use `""` for dialogue

Japanese-specific tips:

Use `ネイティブな日本語` to ensure natural Japanese (not translation-style)
Specify voice color with JP adjectives: `明るく元気な少女声`, `低めの落ち着いた青年声`, `柔らかい囁くような声`
Keep dialogue in `「」` brackets
Avoid mixing languages in dialogue unless intentionally bilingual

语言	使用场景	关键标记
日语（JP）	用户使用日语，或面向日本市场	`ネイティブな日本語` ，对话使用 `「」` ，避免翻译腔
中文（CN）	用户使用中文	`中文母语` ，对话使用 `「」` 或 `""`
英语（EN）	用户使用英语，或面向全球市场	`native English` ，对话使用 `""`
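The language table can be read as a lookup from language to marker and dialogue brackets. The Python sketch below formats a voice tag on that basis. The markers and brackets come from the table, and the JP verb `言う：` comes from the examples in this document. The CN and EN verbs, the separators, and the name `voice_tag` are assumptions made for illustration.

```python
# Markers and brackets are from the language table; "verb" and "sep" for
# CN and EN are assumptions, since the document only shows the JP form.
LANGUAGE_RULES = {
    "JP": {"marker": "ネイティブな日本語", "open": "「", "close": "」", "sep": "、", "verb": "言う："},
    "CN": {"marker": "中文母语", "open": "「", "close": "」", "sep": "、", "verb": "说："},
    "EN": {"marker": "native English", "open": '"', "close": '"', "sep": ", ", "verb": "says: "},
}


def voice_tag(lang: str, speaker: str, voice: str, emotion: str, line: str) -> str:
    """Return '[speaker, marker, voice, emotion] verb' followed by the quoted line."""
    rule = LANGUAGE_RULES[lang]
    tag = rule["sep"].join([speaker, rule["marker"], voice, emotion])
    return f"[{tag}] {rule['verb']}{rule['open']}{line}{rule['close']}"


print(voice_tag("JP", "少女", "明るく元気な若い女性声", "嬉しそう", "ねえ！来てくれたんだ！"))
print(voice_tag("EN", "Knight", "low calm voice", "resolute", "My blade exists to protect."))
```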

日语专属技巧：

使用 `ネイティブな日本語` 确保日语自然流畅（无翻译腔）
用日语形容词描述声线：`明るく元気な少女声`、`低めの落ち着いた青年声`、`柔らかい囁くような声`
对话放在 `「」` 括号中
除非有意打造双语效果，否则避免在对话中混合语言

3.5 Prompt Templates by Flavor

3.5 按风格分类的提示词模板

Flavor A · Voiced Manga Drama

风格A · 漫画配音剧

[场景描述：时间/地点/光线/氛围]

[角色A描述] と [角色B描述] が [位置关系]。

[角色A动作描述]。
[角色A, 语言, 声色描述, 情绪] 言う：「[台词]」

[角色B反应动作]。
[角色B, 语言, 声色描述, 情绪] 答える：「[台词]」

[镜头运动描述]。
[环境音/BGM描述]。

映画級質感、リアルな光影、キャラの顔・髪・衣装が変わらない。

[场景描述：时间/地点/光线/氛围]

[角色A描述] と [角色B描述] が [位置关系]。

[角色A动作描述]。
[角色A, 语言, 声色描述, 情绪] 言う：「[台词]」

[角色B反应动作]。
[角色B, 语言, 声色描述, 情绪] 答える：「[台词]」

[镜头运动描述]。
[环境音/BGM描述]。

映画級質感、リアルな光影、キャラの顔・髪・衣装が変わらない。

Flavor B · Character Voice PV

风格B · 角色语音PV

[场景描述：地点/光线/氛围]

[角色外观描述] が [姿态]。

[角色动作 + 镜头运动]。
[角色, 语言, 声色描述, 情绪] 言う：「[台词]」

[环境音/BGM描述]。

映画級質感、キャラの顔・髪・衣装・装備が変わらない。

[场景描述：地点/光线/氛围]

[角色外观描述] が [姿态]。

[角色动作 + 镜头运动]。
[角色, 语言, 声色描述, 情绪] 言う：「[台词]」

[环境音/BGM描述]。

映画級質感、キャラの顔・髪・衣装・装備が変わらない。

Flavor C · Manga Panel Motion

风格C · 漫画分格动态化

[图片中已有内容，无需重复描述]

[角色简短动作：眨眼/头发飘动/微笑/转头]。
[环境动态：风吹/光斑/落叶/雨滴]。
[镜头运动：缓慢推近/横移/拉远/固定]。

[环境音/BGM描述]。

原画風、キャラの顔・髪・衣装・構図・色調が変わらない。

[图片中已有内容，无需重复描述]

[角色简短动作：眨眼/头发飘动/微笑/转头]。
[环境动态：风吹/光斑/落叶/雨滴]。
[镜头运动：缓慢推近/横移/拉远/固定]。

[环境音/BGM描述]。

原画風、キャラの顔・髪・衣装・構図・色調が変わらない。

Flavor D · Virtual Idol MV

风格D · 虚拟偶像MV

参考「Image 1」=[正面]、「Image 2」=[侧面]、「Image 3」=[表情差分]
中的虚拟偶像角色形象。

[舞台场景描述]
[灯光描述]

分镜 1（N 秒）：[景别]，角色 [姿态/动作]。
[角色, 语言, 声色, 情绪] 歌う：「[歌词]」

分镜 2（N 秒）：[景别]，角色 [舞蹈动作]。
[歌词]

分镜 3（N 秒）：[景别]，角色 [表情]。
[歌词]

[J-POP MV 美学 / 电影级光影]，キャラの顔・髪・衣装が変わらない。
口型与歌词重音精准匹配。

参考「Image 1」=[正面]、「Image 2」=[侧面]、「Image 3」=[表情差分]
中的虚拟偶像角色形象。

[舞台场景描述]
[灯光描述]

分镜1（N秒）：[景别]，角色 [姿态/动作]。
[角色, 语言, 声色, 情绪] 歌う：「[歌词]」

分镜2（N秒）：[景别]，角色 [舞蹈动作]。
[歌词]

分镜3（N秒）：[景别]，角色 [表情]。
[歌词]

[J-POP MV美学 / 电影级光影]，キャラの顔・髪・衣装が変わらない。
口型与歌词重音精准匹配。

3.6 Assembling the Output

3.6 输出组装

Present the final prompt in a code block so the user can copy it directly. Include:

The prompt itself (in the user's language)
A brief breakdown of what each part does
Suggested model variant (t2v / i2v / r2v / video-edit)
Estimated cost (720P: ¥0.9/sec, 1080P: ¥1.6/sec)

Example output format:

Here's your prompt — ready to paste into HappyHorse:

[PROMPT HERE]


**Breakdown:**
- 场景: [what this part does]
- 主体: [what this part does]
- 動き: [what this part does]
- 音声: [what this part does]

**Model:** happyhorse-1.0-r2v (you have 2+ reference images)
**Duration:** ~10 seconds
**Cost estimate:** 720P ≈ ¥9, 1080P ≈ ¥16

Want me to adjust anything? (voice tone, camera angle, add another character, change the line?)

将最终提示词放在代码块中，方便用户直接复制。内容需包含：

提示词本身（使用用户的语言）
简短拆解说明各部分作用
推荐模型变体（t2v / i2v / r2v / video-edit）
预估成本（720P：¥0.9/秒，1080P：¥1.6/秒）

示例输出格式：

这是你的提示词——可直接粘贴到HappyHorse中使用：

[提示词内容]


**拆解说明：**
- 场景：[该部分作用]
- 主体：[该部分作用]
- 動き：[该部分作用]
- 音声：[该部分作用]

**模型：** happyhorse-1.0-r2v（你提供了2张以上参考图）
**时长：** ~10秒
**成本预估：** 720P ≈ ¥9，1080P ≈ ¥16

需要调整什么内容吗？（声线、镜头角度、添加其他角色、修改台词？）
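The cost line in the output is plain arithmetic on the per-second rates quoted above (720P: ¥0.9/sec, 1080P: ¥1.6/sec). A minimal Python sketch follows. `estimate_cost` is a hypothetical helper, and it assumes cost scales linearly with duration, with no minimum charge or rounding by the billing system.

```python
# Per-second rates quoted in this section, in yuan (¥).
RATE_PER_SECOND = {"720P": 0.9, "1080P": 1.6}


def estimate_cost(duration_seconds: float, resolution: str) -> float:
    """Return the estimated cost in ¥ for a single generation."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return round(RATE_PER_SECOND[resolution] * duration_seconds, 2)


for seconds in (8, 10):
    print(seconds, "s:", estimate_cost(seconds, "720P"), "/", estimate_cost(seconds, "1080P"))
```

For a 10-second clip this gives ¥9 at 720P and ¥16 at 1080P, matching the example output format above.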

Phase 4 · Quality Check (质量检查)

Phase 4 · 质量检查（Quality Check）

Before finalizing, run through this checklist silently. If anything fails, fix before presenting.

最终呈现前，请默默完成以下检查清单。若有任何项不通过，需先修正再展示。

4.1 Prompt Quality

4.1 提示词质量

☐ Does the prompt follow the Scene + Subject + Motion + Audio + Quality structure?
☐ Is the camera movement explicitly stated? (not left to chance)
☐ Is the voice type described with specific adjectives? (not vague)
☐ Is the dialogue in the correct brackets for the language? (「」 for JP, "" for EN)
☐ Is the "stays unchanged" constraint included at the end?
☐ Is the prompt length between 150 and 300 characters? (too short = under-specified; too long = hard to control)

☐ 提示词是否遵循「场景+主体+动作+音频+质量」的结构？
☐ 是否明确指定了镜头运动？（而非随机生成）
☐ 是否用具体形容词描述了声线类型？（而非模糊表述）
☐ 对话是否使用了对应语言的正确括号？（日语用「」，英语用""）
☐ 结尾是否添加了「保持不变」的约束？
☐ 提示词长度是否在150-300字符之间？（过短=描述不足；过长=难以控制）
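Several items in this checklist are mechanical enough to pre-screen in code before the Agent's own review. The Python sketch below is one possible reading, not an official validator. `check_prompt` is a hypothetical name, length is counted in Unicode code points, the camera keywords are a small sample taken from this document's examples, and the `変わらない` check only applies to Japanese prompts.

```python
BRACKETS = {"JP": ("「", "」"), "EN": ('"', '"')}
# Sample camera keywords from this document's examples; extend as needed.
CAMERA_HINTS = ("カメラ", "固定", "寄る", "パン")


def check_prompt(prompt: str, language: str = "JP", has_dialogue: bool = True) -> list[str]:
    """Return a list of problems found; an empty list means the screen passed."""
    problems = []
    if not 150 <= len(prompt) <= 300:
        problems.append(f"length {len(prompt)} is outside 150-300 characters")
    if has_dialogue:
        opener, closer = BRACKETS[language]
        if opener not in prompt or closer not in prompt:
            problems.append("no dialogue in the expected brackets")
    if not any(hint in prompt for hint in CAMERA_HINTS):
        problems.append("camera movement is not stated")
    # The "stays unchanged" constraint should sit near the end of the prompt.
    if "変わらない" not in prompt[-40:]:
        problems.append('missing closing "stays unchanged" constraint')
    return problems


example = (
    "昼休みの学校の屋上、青空と白い雲、風が心地よい。"
    "桜色のロングヘアの少女がフェンスに寄りかかり、こちらを見て笑っている。"
    "少女が手を振り、カメラがゆっくり寄る。"
    "[少女、ネイティブな日本語、明るく元気な若い女性声、嬉しそう] 言う：「ねえ！来てくれたんだ！」"
    "背景に風の音、遠くで校庭のざわめき、明るいピアノの BGM。"
    "映画級質感、キャラの顔・髪・制服が変わらない。"
)
print(check_prompt(example))
print(check_prompt("少女が笑う。"))
```

A keyword screen like this cannot judge whether the voice description is specific enough, so that item still needs the Agent's judgment.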

4.2 Compliance Check

4.2 合规检查

☐ No existing anime/manga/game IP referenced?
☐ No real person likeness?
☐ Character depicted as adult?
☐ Outfit is SFW?
☐ Scene location is SFW?
☐ If music is involved, it's licensed/original?

☐ 未引用现有动漫/漫画/游戏IP？
☐ 未使用真人肖像？
☐ 角色被描绘为成年人？
☐ 服装为SFW？
☐ 场景地点为SFW？
☐ 若涉及音乐，为授权或原创音乐？

4.3 Optimization Tips

4.3 优化建议

If the prompt looks good, offer these pro-tips:

"Try 3 variants" — HappyHorse results vary; generating 3-5 and picking the best is standard practice
"Start 720P, finish 1080P" — do test runs at 720P (cheaper), then re-generate the winner at 1080P
"Shorter lines = better lip-sync" — if the voice line is over 15 characters, consider splitting into two shots
"Specific beats vague" — "camera slowly pushes from full-body to chest close-up" beats "camera moves"

若提示词符合要求，可提供以下专业技巧：

「尝试3种变体」——HappyHorse的结果会有差异，生成3-5个版本后挑选最佳是标准做法
「先720P测试，再1080P定稿」——先用720P做测试（成本更低），再用1080P重新生成最优版本
「台词越短，唇形同步效果越好」——如果台词超过15个字符，考虑拆分为两个镜头
「具体描述优于模糊表述」——「镜头从全身缓慢推近至胸部特写」比「镜头移动」效果更好
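The lip-sync tip (split lines over 15 characters into two shots) can be sketched as a helper that cuts a line at the punctuation mark nearest its midpoint. `split_for_lip_sync` is a hypothetical name. Cutting at punctuation is an assumption about where a natural pause falls, and a half may still exceed the limit for very long lines.

```python
MAX_LINE_CHARS = 15  # threshold from the lip-sync tip above


def split_for_lip_sync(line: str, limit: int = MAX_LINE_CHARS) -> list[str]:
    """Split a long line in two at the punctuation mark nearest its middle."""
    if len(line) <= limit:
        return [line]
    # Candidate cut points sit just after a punctuation mark (not the final one).
    breaks = [i + 1 for i, ch in enumerate(line[:-1]) if ch in "、。！？,.!?"]
    if not breaks:
        return [line]  # no natural pause; leave the rewrite to the user
    cut = min(breaks, key=lambda i: abs(i - len(line) / 2))
    return [line[:cut].strip(), line[cut:].strip()]


print(split_for_lip_sync("ねえ！来てくれたんだ！"))
print(split_for_lip_sync("俺の刃は、守るためにある。だから、ここは通さない。"))
```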

Free-Form Mode (自由模式)

自由模式

If the user's scenario doesn't fit Flavors A-D, use the formula directly:

Ask: "What's your scene? Describe it like you're telling a friend about a movie you just watched."
Extract: scene, subject, motion, audio, quality from their description
Assemble using the formula
Apply the quality check

This mode is especially useful for:

Product advertisements
Educational explainers
Abstract / artistic videos
Non-character-driven content

如果用户的场景不符合A-D四种风格，可直接使用公式：

提问：「你的场景是什么？像给朋友讲刚看过的电影一样描述它。」
从用户描述中提取：场景、主体、动作、音频、质量
使用公式组装提示词
执行质量检查

该模式尤其适用于：

产品广告
教育类讲解视频
抽象/艺术视频
非角色驱动内容

Common Pitfalls (常见问题)

常见问题

Problem	Fix
Character face drifts across shots	Add more reference images + emphasize "顔・髪・衣装が変わらない"
Voice sounds like machine translation	Add "ネイティブな日本語" marker; avoid translation-style phrasing
Lip-sync is off	Shorten dialogue to ≤15 characters; split into two shots
Motion is stiff	Replace generic "動いている" with specific actions: "髪が風に右へ流れる"
Camera wanders	Always specify camera: "固定", "ゆっくり寄る", "横にパン"
Style drifts from original art	Add "原画風・色調が変わらない" at the end
Multiple characters swap positions	Use `@「Image n」` syntax to lock each character's position
Q-version character becomes realistic	Add "デフォルメ / ちびキャラ / Q 版" style keyword

问题	解决方案
角色面部在不同镜头中偏移	添加更多参考图+强调「顔・髪・衣装が変わらない」
声音带有翻译腔	添加「ネイティブな日本語」标记；避免翻译腔表述
唇形同步不准确	将台词缩短至≤15字符；拆分为两个镜头
动作生硬	用具体动作替换通用的「動いている」：「髪が風に右へ流れる」
镜头随意移动	始终明确指定镜头：「固定」、「ゆっくり寄る」、「横にパン」
风格偏离原画	在结尾添加「原画風・色調が変わらない」
多角色位置互换	使用 `@「Image n」` 语法锁定每个角色的位置
Q版角色变成写实风格	添加「デフォルメ / ちびキャラ / Q版」风格关键词

CLI Quick-Start (for users who want to run it immediately)

CLI快速入门（适用于想要立即运行的用户）

If the user has `bailian-cli` installed, they can run the prompt directly:

如果用户已安装 `bailian-cli`，可直接运行提示词：

Text-to-video (no input image)

Text-to-video（无输入图片）

bl video generate --prompt "[PROMPT]" --download output.mp4

Image-to-video (one input image)

Image-to-video（单张输入图片）

bl video generate --image ./char.png --prompt "[PROMPT]" --download output.mp4

Reference-to-video (multiple input images for character consistency)

Reference-to-video（多张输入图片保证角色一致性）

bl video generate --image ./char1.png --image ./char2.png --prompt "[PROMPT]" --download output.mp4

Video editing (style transfer on existing video)

Video editing（对现有视频进行风格迁移）

bl video edit --video ./input.mp4 --image ./style-ref.png --prompt "[PROMPT]" --download output.mp4
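For scripted use, the `bl video generate` commands above can be built programmatically. The Python sketch below only constructs the argument list and does not run anything. It uses only the flags shown in this section (`--image`, `--prompt`, `--download`). `build_generate_command` and `variant_for` are hypothetical helpers, and mapping image count to t2v / i2v / r2v follows the descriptions in this document rather than any confirmed CLI behavior.

```python
import shlex


def build_generate_command(prompt: str, images: list[str], output: str = "output.mp4") -> list[str]:
    """Build argv for `bl video generate` with zero, one, or several images."""
    argv = ["bl", "video", "generate"]
    for path in images:
        argv += ["--image", path]
    argv += ["--prompt", prompt, "--download", output]
    return argv


def variant_for(images: list[str]) -> str:
    """Pick t2v / i2v / r2v from the number of input images."""
    return {0: "t2v", 1: "i2v"}.get(len(images), "r2v")


argv = build_generate_command("[PROMPT]", ["./char1.png", "./char2.png"])
print(variant_for(["./char1.png", "./char2.png"]), "->", shlex.join(argv))
```

Passing the prompt as a single list element avoids shell-quoting problems with `「」` and spaces if the list is later handed to `subprocess.run`.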

---


Example Interactions

交互示例

Example 1 · First-time user (Flavor B)

示例1 · 首次使用用户（风格B）

User: "I have a manga character I want to animate. She's a schoolgirl with pink hair, and I want her to say something cute."

Agent: "That sounds like Flavor B — a Character Voice PV! Let me ask a few things:

1. What's her name and personality? (shy? cheerful? cool?)
2. What's she saying? Got a line in mind, or want me to suggest one?
3. Where is she? (school rooftop? cherry blossom park? classroom?)
4. Voice vibe — bright and cute, or soft and shy?

And quick compliance check: is she your own original character?"

User: "Her name is Sakura, she's super cheerful, she's on a school rooftop at lunch. She says 'Hey! You made it!' in Japanese. Bright voice!"

Agent: [assembles prompt using Flavor B template]

"Here's your prompt:

昼休みの学校の屋上、青空と白い雲、風が心地よい。桜色のロングヘアの少女がフェンスに寄りかかり、こちらを見て笑っている。

少女が手を振り、カメラがゆっくり寄る。 [少女、ネイティブな日本語、明るく元気な若い女性声、嬉しそう] 言う：「ねえ！来てくれたんだ！」

背景に風の音、遠くで校庭のざわめき、明るいピアノの BGM。映画級質感、キャラの顔・髪・制服が変わらない。


Model: happyhorse-1.0-i2v (one image input)
Duration: ~8 seconds
Cost: 720P ≈ ¥7.2, 1080P ≈ ¥12.8

Want to tweak anything?"

用户：「我有一个漫画角色想做成动画。她是粉发女学生，我想让她说些可爱的话。」

Agent：「这听起来很适合风格B——角色语音PV！我问几个问题：

1. 她的名字和性格是什么？（害羞？开朗？冷酷？）
2. 她说什么？有想好的台词吗，还是让我来建议？
3. 她在哪里？（学校屋顶？樱花公园？教室？）
4. 声线氛围——明亮可爱，还是温柔害羞？

还有个快速合规确认：她是你原创的角色吗？」

用户：「她叫Sakura，非常开朗，午休时在学校屋顶。她用日语说『Hey! You made it!』。明亮的声音！」

Agent：[使用风格B模板组装提示词]

「这是你的提示词：

昼休みの学校の屋上、青空と白い雲、風が心地よい。桜色のロングヘアの少女がフェンスに寄りかかり、こちらを見て笑っている。

少女が手を振り、カメラがゆっくり寄る。 [少女、ネイティブな日本語、明るく元気な若い女性声、嬉しそう] 言う：「ねえ！来てくれたんだ！」

背景に風の音、遠くで校庭のざわめき、明るいピアノの BGM。映画級質感、キャラの顔・髪・制服が変わらない。


模型：happyhorse-1.0-i2v（单张图片输入）
时长：~8秒
成本：720P ≈ ¥7.2，1080P ≈ ¥12.8

需要调整什么内容吗？」

Example 2 · Experienced user (free-form)

示例2 · 资深用户（自由模式）

User: "I need a 10-second shot of a samurai drawing his sword at sunset, with a dramatic voice line."

Agent: "Got it — free-form, single character. Quick questions:

1. What does he say? (Japanese line?)
2. Voice type — deep and gravelly?
3. Camera — slow push to close-up? Or wide shot?
4. Art style — realistic? Anime?

Compliance: original character?"

用户：「我需要一个10秒的镜头，武士在日落时拔剑，配一句有张力的台词。」

Agent：「好的——自由模式，单角色。快速问几个问题：

1. 他说什么？（日语台词？）
2. 声线类型——低沉沙哑？
3. 镜头——缓慢推近特写？还是广角镜头？
4. 艺术风格——写实？动漫？

合规确认：是原创角色吗？」

Final Notes for the Agent

Agent最终注意事项

Always start with Phase 1 unless the user is clearly experienced and already knows what they want
Be creative with descriptions — don't just ask "what's the scene?", say "paint the picture for me — where are we, what time of day, what's the vibe?"
Suggest, don't just ask — if the user seems unsure, offer defaults: "How about a sunset rooftop scene with a gentle breeze?"
Show the prompt in a code block so it's easy to copy
Always offer to iterate — "Want me to adjust the voice tone? Change the camera angle? Add a second character?"
Keep compliance friendly, not scary — "Just to make sure everything's smooth, is this your original character?" not "COMPLIANCE CHECK: CONFIRM IP STATUS"

始终从Phase 1开始，除非用户明确是资深用户且已清楚需求
创意化描述提问——不要只问「场景是什么？」，可以说「给我描绘一下画面——我们在哪里，什么时间，氛围如何？」
主动建议而非只提问——如果用户不确定，提供默认选项：「要不要试试日落屋顶+微风的场景？」
将提示词放在代码块中，方便用户复制
始终提供迭代调整选项——「需要调整声线吗？修改镜头角度？添加第二个角色？」
合规检查要友好而非生硬——「为了确保流程顺畅，确认一下这是你的原创角色吗？」而非「合规检查：确认IP状态」