veo-3.2-prompter
Veo 3.2 Prompt Designer Skill
This skill transforms a user's scattered multimodal assets (images, videos, audio) and creative intent into a structured, executable prompt for the Google Veo 3.2 video generation model (Artemis engine). It acts as an expert prompt engineer, aiming for the highest-quality output the underlying model can produce.
When to Use
- When the user provides assets (images, videos, audio) for video generation with Veo 3.2.
- When the user's request is complex and requires careful prompt construction for the Veo model.
- When using any Google Veo 3.x model for video generation.
Core Function
This skill analyzes all user inputs and generates a single, optimized JSON object containing the final prompt and recommended parameters. The internal workflow (Recognition, Mapping, Construction) is handled automatically and should not be exposed to the user.
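As a rough illustration of the output shape described above, the following Python sketch models the JSON object with dataclasses. The field names mirror the usage example in this document; the class names (`PrompterOutput`, `ReferenceImage`, `RecommendedParameters`) are hypothetical and are not part of the skill or of any Google API.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ReferenceImage:
    file: str
    reference_type: str  # "SUBJECT" is the only value shown in this document


@dataclass
class RecommendedParameters:
    model: str
    duration_seconds: int
    aspect_ratio: str
    resolution: str
    generate_audio: bool


@dataclass
class PrompterOutput:
    """Hypothetical container for the single JSON object the skill emits."""

    final_prompt: str
    recommended_parameters: RecommendedParameters
    reference_images: List[ReferenceImage] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses, so the result is plain JSON.
        return json.dumps(asdict(self), indent=2)
```

Keeping the prompt, the reference images, and the parameters in one object matches the document's description of a single JSON handoff to the next skill in the chain.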
Internal Workflow
- Phase 1: Recognition. Analyze uploaded assets and user intent. Use `atomic_element_mapping.md` to classify each asset into its atomic element role(s).
- Phase 2: Mapping. For each atomic element, determine the optimal reference method (reference image, text prompt, or hybrid). Use the mapping table to decide.
- Phase 3: Construction. Assemble the final prompt using the 5-Part Framework (Shot → Subject → Environment → Camera → Style) and attach reference images via the Gemini API's `RawReferenceImage` system.
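The Construction phase can be sketched in Python as a fixed-order join of the five framework parts. This is a minimal sketch, not the skill's actual implementation: the names `PromptParts`, `FRAMEWORK_ORDER`, and `build_prompt` are hypothetical, the comma-separated joining style is an assumption based on the usage example, and the optional trailing `audio` cue is likewise an assumption, since the 5-Part Framework itself does not name an audio part.

```python
from dataclasses import dataclass

# Order taken from the 5-Part Framework: Shot -> Subject -> Environment -> Camera -> Style.
FRAMEWORK_ORDER = ("shot", "subject", "environment", "camera", "style")


@dataclass
class PromptParts:
    shot: str
    subject: str
    environment: str
    camera: str
    style: str
    audio: str = ""  # assumption: optional audio cue appended as its own sentence


def build_prompt(parts: PromptParts) -> str:
    """Join the five parts in framework order; reject empty parts."""
    values = [getattr(parts, name).strip().rstrip(".,") for name in FRAMEWORK_ORDER]
    missing = [name for name, value in zip(FRAMEWORK_ORDER, values) if not value]
    if missing:
        raise ValueError("missing framework parts: " + ", ".join(missing))

    prompt = ", ".join(values) + "."
    audio = parts.audio.strip().rstrip(".")
    if audio:
        prompt += " " + audio + "."
    return prompt
```

Failing fast on an empty part keeps an incomplete prompt from reaching the model; a real implementation might instead fill the gap from the mapping table.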
Usage Example
User Request: "Make a cinematic shot of this perfume bottle rotating on a dark surface, like a luxury commercial."
User uploads `perfume.png`.
Agent using `veo-3.2-prompter`:
The agent internally processes the request and assets, then outputs the final JSON to the next skill in the chain.
Final Output (for internal use):
```json
{
  "final_prompt": "Hero shot, a frosted glass perfume bottle with gold cap rotating slowly on a reflective dark surface, three-point studio lighting with soft key and rim light creating subtle caustics, smooth 180-degree arc, hyper-realistic luxury commercial style with shallow depth of field. Crystalline chime, soft ambient pad.",
  "reference_images": [
    {
      "file": "perfume.png",
      "reference_type": "SUBJECT"
    }
  ],
  "recommended_parameters": {
    "model": "veo-3.2-generate",
    "duration_seconds": 8,
    "aspect_ratio": "16:9",
    "resolution": "1080p",
    "generate_audio": true
  }
}
```
Veo 3.2 Key Differentiators
| Feature | Capability |
|---|---|
| Engine | Artemis — world-model physics simulation (not pixel prediction) |
| Max duration | ~30s native continuous generation |
| Audio | Native dialogue + synchronized SFX |
| Reference images | Up to 3 |
| Video extension | Chain clips via previous video input |
| First/last frame | Specify start and/or end keyframes |
| Resolutions | 720p, 1080p, 4K (with upscaling) |
| Aspect ratios | 16:9, 9:16 |
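The limits in the table above lend themselves to a quick sanity check on the skill's JSON output before it is passed along. The Python sketch below is illustrative only: `check_output` and the constants are hypothetical names, and treating "~30s" as a hard 30-second ceiling is an assumption, since the table gives only an approximate figure.

```python
from typing import List

# Limits copied from the table; the exact 30-second cutoff is an assumption.
MAX_DURATION_SECONDS = 30
MAX_REFERENCE_IMAGES = 3
RESOLUTIONS = {"720p", "1080p", "4K"}
ASPECT_RATIOS = {"16:9", "9:16"}


def check_output(output: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems: List[str] = []
    params = output.get("recommended_parameters", {})

    refs = output.get("reference_images", [])
    if len(refs) > MAX_REFERENCE_IMAGES:
        problems.append(f"{len(refs)} reference images; at most {MAX_REFERENCE_IMAGES} allowed")

    duration = params.get("duration_seconds")
    is_number = isinstance(duration, (int, float)) and not isinstance(duration, bool)
    if not is_number or not 0 < duration <= MAX_DURATION_SECONDS:
        problems.append(f"duration_seconds={duration!r} is outside (0, {MAX_DURATION_SECONDS}]")

    if params.get("resolution") not in RESOLUTIONS:
        problems.append(f"unsupported resolution: {params.get('resolution')!r}")

    if params.get("aspect_ratio") not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {params.get('aspect_ratio')!r}")

    return problems
```

Returning every problem at once, rather than raising on the first, lets the agent correct all parameters in a single pass.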
Knowledge Base
This skill relies on an internal knowledge base to make informed decisions. The agent MUST consult these files during execution.
- `references/atomic_element_mapping.md`: Core Knowledge. Contains the "Asset Type → Atomic Element" and "Atomic Element → Optimal Reference Method" mapping tables, adapted for Veo 3.2's reference image system.
- `references/veo_syntax_guide.md`: Veo 3.2 Gemini API syntax reference, covering `RawReferenceImage`, `GenerateVideosConfig`, video extension, and first/last frame specification.