sora

Sora Video Generation Skill

Creates or manages short video clips for the current project (product demos, marketing spots, cinematic shots, UI mocks). Defaults to sora-2 and a structured prompt augmentation workflow, and prefers the bundled CLI for deterministic runs. Note: $sora is a skill tag in prompts, not a shell command.

When to use

  • Generate a new video clip from a prompt
  • Remix an existing video by ID
  • Poll status, list jobs, or download assets (video/thumbnail/spritesheet)
  • Batch runs (many prompts or variants)

Decision tree (create vs remix vs status/download vs batch)

  • If the user has a video id and wants a change → remix
  • If the user has a video id and wants status or assets → status/poll/download
  • If the user needs many prompts/assets → create-batch
  • If the user asks for two versions with a small change (same shot, different subject/detail) → create the base, then remix for the variation
  • Otherwise → create (or create-and-poll if they need a ready asset in one step)
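
The decision tree above can be sketched as a small function. This is only an illustration: the boolean flag names are hypothetical, and the returned strings reuse the action names from the list (remix, status, create-batch, create, create-and-poll) rather than confirmed CLI subcommands.

```python
def choose_action(
    has_video_id: bool = False,
    wants_change: bool = False,
    wants_status_or_assets: bool = False,
    many_prompts: bool = False,
    small_variation_pair: bool = False,
    needs_ready_asset: bool = False,
) -> list:
    """Return the ordered actions suggested by the decision tree."""
    if has_video_id and wants_change:
        return ["remix"]
    if has_video_id and wants_status_or_assets:
        return ["status"]  # stands in for status/poll/download
    if many_prompts:
        return ["create-batch"]
    if small_variation_pair:
        # Create the base clip first, then remix it for the variation.
        return ["create", "remix"]
    return ["create-and-poll"] if needs_ready_asset else ["create"]


print(choose_action(has_video_id=True, wants_change=True))
print(choose_action(small_variation_pair=True))
```

The branches are checked in the same order as the bullets, so a request that matches several conditions resolves to the first one listed.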

Workflow

  1. Decide intent: create vs remix vs status/download vs batch.
  2. Collect inputs: prompt, model, size, seconds, and any input reference image.
  3. If batch: write a temporary JSONL under tmp/ (one job per line), run once, then delete the JSONL.
  4. Prefer CLI augmentation flags (--use-case, --scene, --camera, etc.) instead of pre-writing a structured prompt. If you already produced a structured prompt file, pass --no-augment to avoid double-wrapping.
  5. Run the bundled CLI (scripts/sora.py) with sensible defaults (see references/cli.md). For long prompts, prefer --prompt-file to avoid shell-escaping issues; pair with --no-augment if the prompt is already structured.
  6. For async jobs, poll until complete (or use create-and-poll).
  7. Download assets (video/thumbnail/spritesheet) and save locally.
  8. Remove intermediate files created during invocation (for example prompt.txt, remix_job.json, temp JSONL). If the sandbox blocks rm, skip cleanup or truncate the files without surfacing an error.
  9. Iterate with a single targeted change per prompt.
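
Steps 3 and 8 (temporary batch JSONL, then cleanup) might look roughly like the sketch below. The file name sora_batch.jsonl and the job keys ("prompt", "seconds") are assumptions for illustration; the real batch schema is whatever references/cli.md documents. The demo writes to a scratch directory so it leaves nothing behind, whereas the workflow itself uses tmp/.

```python
import json
import tempfile
from pathlib import Path


def write_batch_jsonl(jobs: list, directory: str = "tmp") -> Path:
    """Write one JSON object per line and return the file path."""
    folder = Path(directory)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / "sora_batch.jsonl"  # hypothetical file name
    with path.open("w", encoding="utf-8") as handle:
        for job in jobs:
            handle.write(json.dumps(job, ensure_ascii=False) + "\n")
    return path


def cleanup(paths: list) -> None:
    """Delete intermediate files; truncate instead if deletion is blocked."""
    for path in paths:
        try:
            path.unlink(missing_ok=True)
        except OSError:
            try:
                path.write_text("")  # fallback: truncate the file
            except OSError:
                pass  # skip quietly, as step 8 asks


# Job keys below are assumed, not taken from the CLI reference.
jobs = [
    {"prompt": "Slow orbit around a matte black camera", "seconds": "4"},
    {"prompt": "Same shot with a teal/sand/rust palette", "seconds": "4"},
]

with tempfile.TemporaryDirectory() as scratch:
    batch_file = write_batch_jsonl(jobs, directory=scratch)
    line_count = len(batch_file.read_text(encoding="utf-8").splitlines())
    # ... the batch command would run once here (see references/cli.md) ...
    cleanup([batch_file])
    still_exists = batch_file.exists()

print(line_count, still_exists)
```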

Authentication

  • OPENAI_API_KEY must be set for live API calls.
If the key is missing, give the user these steps:
  1. Create an API key in the OpenAI platform UI: https://platform.openai.com/api-keys
  2. Set OPENAI_API_KEY as an environment variable in their system.
  3. Offer to guide them through setting the environment variable for their OS/shell if needed.
  • Never ask the user to paste the full key in chat. Ask them to set it locally and confirm when ready.
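
A pre-flight check along these lines can confirm the key is present without ever echoing it. The function name and the hint table are hypothetical; the shell commands shown are the usual ways to set an environment variable in bash/zsh and PowerShell, with a placeholder instead of a real key.

```python
import os

# Hints for setting the variable locally; "<your-key>" is a placeholder.
SHELL_HINTS = {
    "bash/zsh": 'export OPENAI_API_KEY="<your-key>"',
    "powershell": '$env:OPENAI_API_KEY = "<your-key>"',
}


def api_key_status(env=None) -> str:
    """Report whether OPENAI_API_KEY is set, without revealing its value."""
    env = os.environ if env is None else env
    key = (env.get("OPENAI_API_KEY") or "").strip()
    if key:
        return "OPENAI_API_KEY is set"
    hints = "; ".join(f"{shell}: {cmd}" for shell, cmd in SHELL_HINTS.items())
    return f"missing OPENAI_API_KEY (set it locally, e.g. {hints})"


print(api_key_status({}))
print(api_key_status({"OPENAI_API_KEY": "sk-example"}))
```

Passing a plain dict as env keeps the check easy to exercise without touching the real environment.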

Defaults & rules

  • Default model: sora-2 (use sora-2-pro for higher fidelity).
  • Default size: 1280x720.
  • Default seconds: 4 (allowed: "4", "8", "12" as strings).
  • Always set size and seconds via API params; prose will not change them.
  • Use the OpenAI Python SDK (openai package); do not use raw HTTP.
  • Require OPENAI_API_KEY before any live API call.
  • If uv cache permissions fail, set UV_CACHE_DIR=/tmp/uv-cache.
  • Input reference images must be jpg/png/webp and should match target size.
  • Download URLs expire after about 1 hour; copy assets to your own storage.
  • Prefer the bundled CLI and never modify scripts/sora.py unless the user asks.
  • Sora can generate audio; if a user requests voiceover/audio, specify it explicitly in the Audio: and Dialogue: lines and keep it short.
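
As a rough sketch of how these defaults become explicit request parameters, the helper below validates the inputs and returns a dict. The helper name is hypothetical, and the size check only verifies the WIDTHxHEIGHT shape because the per-model size list lives in references/video-api.md. The final comment shows how the dict would presumably be handed to the OpenAI Python SDK; treat that call as an assumption to verify against the SDK version in use.

```python
import re

ALLOWED_MODELS = {"sora-2", "sora-2-pro"}
ALLOWED_SECONDS = {"4", "8", "12"}


def build_create_params(prompt, model="sora-2", size="1280x720", seconds="4"):
    """Validate inputs and return keyword arguments for a video-create call."""
    seconds = str(seconds)  # the API expects a string enum, so 8 becomes "8"
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    if seconds not in ALLOWED_SECONDS:
        raise ValueError('seconds must be "4", "8", or "12"')
    if not re.fullmatch(r"\d+x\d+", size):
        raise ValueError(f"size must look like WIDTHxHEIGHT, got {size!r}")
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {"model": model, "prompt": prompt, "size": size, "seconds": seconds}


params = build_create_params("A slow orbit around a matte black camera", seconds=8)
print(params)
# Assumed SDK usage (not executed here): OpenAI().videos.create(**params)
```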

API limitations

  • Models are limited to sora-2 and sora-2-pro.
  • API access to Sora models requires an organization-verified account.
  • Duration is limited to 4/8/12 seconds and must be set via the seconds parameter.
  • The API expects seconds as a string enum ("4", "8", "12").
  • Output sizes are limited by model (see references/video-api.md for the supported sizes).
  • Video creation is async; you must poll for completion before downloading.
  • Rate limits apply by usage tier (do not list specific limits).
  • Content restrictions are enforced by the API (see Guardrails below).
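
Because creation is asynchronous, a download has to wait on a polling loop. The sketch below keeps the loop independent of any SDK by taking a fetch_status callable. The status strings ("queued", "in_progress", "completed", "failed") are assumptions about what the API reports, and in real use fetch_status would wrap a status lookup for the video id.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}  # assumed status names


def poll_until_done(
    fetch_status: Callable[[], str],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call fetch_status until it returns a terminal status or time runs out."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} seconds")
        sleep(interval)


# Demo with a fake status source instead of a live API call.
fake_statuses = iter(["queued", "in_progress", "completed"])
final_status = poll_until_done(
    lambda: next(fake_statuses), interval=0, sleep=lambda _seconds: None
)
print(final_status)
```

Only a "completed" result should lead to a download; a "failed" result is better surfaced to the user than retried blindly.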

Guardrails (must enforce)

  • Only content suitable for audiences under 18.
  • No copyrighted characters or copyrighted music.
  • No real people (including public figures).
  • Input images with human faces are rejected.

Prompt augmentation

Reformat prompts into a structured, production-oriented spec. Only make implicit details explicit; do not invent new creative requirements.
Template (include only relevant lines):
Use case: <where the clip will be used>
Primary request: <user's main prompt>
Scene/background: <location, time of day, atmosphere>
Subject: <main subject>
Action: <single clear action>
Camera: <shot type, angle, motion>
Lighting/mood: <lighting + mood>
Color palette: <3-5 color anchors>
Style/format: <film/animation/format cues>
Timing/beats: <counts or beats>
Audio: <ambient cue / music / voiceover if requested>
Text (verbatim): "<exact text>"
Dialogue:
<dialogue>
- Speaker: "Short line."
</dialogue>
Constraints: <must keep/must avoid>
Avoid: <negative constraints>
Augmentation rules:
  • Keep it short; add only details the user already implied or provided elsewhere.
  • For remixes, explicitly list invariants ("same shot, change only X").
  • If any critical detail is missing and blocks success, ask a question; otherwise proceed.
  • If you pass a structured prompt file to the CLI, add --no-augment to avoid the tool re-wrapping it.
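
To make the "include only relevant lines" rule concrete, here is a minimal rendering sketch. The dictionary keys (use_case, primary_request, and so on) are hypothetical; only the line labels come from the template above, and the bundled CLI's own augmentation may work differently.

```python
from typing import Optional

# (key, label) pairs in template order; the keys are illustrative names.
TEMPLATE_FIELDS = [
    ("use_case", "Use case"),
    ("primary_request", "Primary request"),
    ("scene", "Scene/background"),
    ("subject", "Subject"),
    ("action", "Action"),
    ("camera", "Camera"),
    ("lighting", "Lighting/mood"),
    ("palette", "Color palette"),
    ("style", "Style/format"),
    ("timing", "Timing/beats"),
    ("audio", "Audio"),
    ("text", "Text (verbatim)"),
    ("constraints", "Constraints"),
    ("avoid", "Avoid"),
]


def render_prompt(fields: dict, dialogue: Optional[list] = None) -> str:
    """Render only the fields that were provided, in template order."""
    lines = []
    for key, label in TEMPLATE_FIELDS:
        value = str(fields.get(key) or "").strip()
        if value:
            if key == "text":
                value = f'"{value}"'  # on-screen text is quoted verbatim
            lines.append(f"{label}: {value}")
        if key == "text" and dialogue:
            # The dialogue block sits between Text and Constraints.
            lines.append("Dialogue:")
            lines.append("<dialogue>")
            lines.extend(f'- {speaker}: "{line}"' for speaker, line in dialogue)
            lines.append("</dialogue>")
    return "\n".join(lines)


prompt_text = render_prompt(
    {
        "use_case": "product teaser",
        "primary_request": "a close-up of a matte black camera on a pedestal",
        "action": "slow 30-degree orbit over 4 seconds",
        "constraints": "no logos, no text",
    }
)
print(prompt_text)
```

A prompt rendered this way is already structured, so it would be passed with --no-augment as the last rule describes.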

Examples

Generation example (single shot)

Use case: product teaser
Primary request: a close-up of a matte black camera on a pedestal
Action: slow 30-degree orbit over 4 seconds
Camera: 85mm, shallow depth of field, gentle handheld drift
Lighting/mood: soft key light, subtle rim, premium studio feel
Constraints: no logos, no text

Remix example (invariants)

Primary request: same shot and framing, switch palette to teal/sand/rust with warmer backlight
Constraints: keep the subject and camera move unchanged

Prompting best practices (short list)

  • One main action + one camera move per shot.
  • Use counts or beats for timing ("two steps, pause, turn").
  • Keep text short and the camera locked-off for UI or on-screen text.
  • Add a brief avoid line when artifacts appear (flicker, jitter, fast motion).
  • Shorter prompts leave the model more creative freedom; longer prompts give you more control.
  • Put dialogue in a dedicated block; keep lines short for 4-8s clips.
  • State invariants explicitly for remixes (same shot, same camera move).
  • Iterate with single-change follow-ups to preserve continuity.

Guidance by asset type

Use these modules when the request is for a specific artifact. They provide targeted templates and defaults.
  • Cinematic shots: references/cinematic-shots.md
  • Social ads: references/social-ads.md

CLI + environment notes

  • CLI commands + examples: references/cli.md
  • API parameter quick reference: references/video-api.md
  • Prompting guidance: references/prompting.md
  • Sample prompts: references/sample-prompts.md
  • Troubleshooting: references/troubleshooting.md
  • Network/sandbox tips: references/codex-network.md

Reference map

  • references/cli.md: how to run create/poll/remix/download/batch via scripts/sora.py.
  • references/video-api.md: API-level knobs (models, sizes, duration, variants, status).
  • references/prompting.md: prompt structure and iteration guidance.
  • references/sample-prompts.md: copy/paste prompt recipes (examples only; no extra theory).
  • references/cinematic-shots.md: templates for filmic shots.
  • references/social-ads.md: templates for short social ad beats.
  • references/troubleshooting.md: common errors and fixes.
  • references/codex-network.md: network/approval troubleshooting.