video-shortform


Video Shortform Skill


Short-form (≤ 10s) is the sweet spot for current text-to-video models: they're great at one shot with one idea, weaker at multi-cut narratives. Plan one shot per call.

Special case: `hyperframes-html` is not a photoreal text-to-video model. It's a local HTML-to-MP4 renderer. For that model, do not roleplay cinematography or "real-world" camera physics. Treat the brief as a motion design card / title-frame / product interstitial, ask at most one clarifying question, then dispatch immediately.

Resource map


```
video-shortform/
├── SKILL.md
└── example.html
```

Workflow


Step 0 — Read the project metadata


Read `videoModel`, `videoLength` (seconds), and `videoAspect`. These are hard-locks: clamp the prompt to whatever the chosen model supports (Seedance 2 caps at 10s; Kling 4 supports up to 10s plus image-to-video; Veo 3 supports 8s with audio).
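The clamping step can be sketched as a small shell helper. The model slugs below are illustrative stand-ins (the real values come from `videoModel` in the project metadata); the caps mirror the limits stated above:

```shell
#!/bin/sh
# Clamp a requested videoLength to the chosen model's hard cap.
# Model slugs are illustrative; caps mirror the limits stated above.
clamp_length() {
  model="$1"; requested="$2"
  case "$model" in
    seedance-2) cap=10 ;;  # Seedance 2 caps at 10s
    kling-4)    cap=10 ;;  # Kling 4 supports up to 10s
    veo-3)      cap=8  ;;  # Veo 3 supports 8s with audio
    *)          cap=8  ;;  # conservative default for unknown models
  esac
  if [ "$requested" -gt "$cap" ]; then
    echo "$cap"
  else
    echo "$requested"
  fi
}
```

For example, `clamp_length veo-3 10` prints `8`: the requested length is silently pulled down to the model's ceiling rather than passed through and rejected upstream.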

Step 1 — Plan the shot


Write the shotlist BEFORE calling the model:

| Slot | Content |
| --- | --- |
| Subject | What's in frame? |
| Camera | Static / pan / push-in / orbit? |
| Lighting | Key direction + temperature |
| Motion | What moves, at what pace? Subject motion vs camera motion. |
| Sound | Ambient bed? (only if the model supports audio) |

Normally, show this to the user as a one-sentence plan before dispatching; they can redirect cheaply.

For `hyperframes-html`, skip the extra pre-dispatch narration once the user has answered the discovery form. Collapse the plan into the actual generation prompt and dispatch immediately.

Step 2 — Compose the prompt


Use the format the upstream model prefers (Seedance: motion + camera + mood; Kling: subject + camera + style; Veo: subject + cinematography + sound). Bind the project's `videoAspect` and `videoLength` directly to the API parameters; never put them in prose.

For `hyperframes-html`, write a concise motion-design brief instead of a camera-realism prompt. Focus on subject, layout, palette, motion character, and overall tone. Do not spend turns narrating environment checks, missing side files, or "I am about to dispatch" status updates.

Step 3 — Dispatch via the media contract


Use the unified dispatcher; do not call provider APIs by hand:

```bash
node "$OD_BIN" media generate \
  --project "$OD_PROJECT_ID" \
  --surface video \
  --model "<videoModel from metadata>" \
  --aspect "<videoAspect from metadata>" \
  --length <videoLength seconds> \
  --output "<short-slug>-<seconds>s.mp4" \
  --prompt "<assembled shot prompt from Step 2>"
```

The command prints one line of JSON: `{"file": {"name": "...", ...}}`. The bytes land in the project; the FileViewer plays it automatically.
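If the filename needs to be pulled out of that one-line JSON for the Step 4 reply, a dependency-free sketch with `sed` works for this flat shape (the payload below is a stand-in for a real dispatcher response; a real JSON parser is sturdier for anything nested):

```shell
#!/bin/sh
# Extract file.name from the dispatcher's one-line JSON response.
# The payload below stands in for real `media generate` output.
out='{"file": {"name": "espresso-8s.mp4", "size": 1048576}}'
name=$(printf '%s' "$out" | sed -n 's/.*"name": *"\([^"]*\)".*/\1/p')
echo "$name"   # espresso-8s.mp4
```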

Step 4 — Hand off


Reply with: shot summary, the filename returned by the dispatcher, and one sentence on what to try if the user wants a variation.

For `hyperframes-html`, keep the reply especially short: what was rendered, the filename, and one concrete variation idea.

Hard rules


- One shot per turn. Multi-shot timelines belong in a hyperframes / interactive-video skill, not here.
- Match `videoAspect` exactly; re-renders are slow.
- Never ship a video without saving the file; the user expects something to play in the file viewer.
- When the underlying model fails (NSFW filter, content policy, timeout), report the error verbatim. Don't silently retry.
- Do not claim a render has been "sent", "started", or "is running" unless you have already called `od media generate`.