gpt-image-2
GPT Image 2 — Interactive Image Generation
Generate and edit images via OpenAI's GPT Image 2 API with an interactive, guided workflow.
Interactive Flow
When the user invokes this skill, guide them through these steps using AskUserQuestion. Do not skip steps — the interactive flow is the core experience.
Step 1: What are we making?
Ask the user what they want to create. Offer these options:
- Single image — one image from a text prompt
- Photo edit — transform an existing photo into a style
- Carousel — 5-10 cohesive slides for LinkedIn/Instagram
- Variants — multiple versions of the same concept
- Quick generate — skip questions, just run the prompt
If the user already provided a clear prompt (e.g. "generate an editorial image of a rocket"), skip to Step 3.
Step 2: Style selection
Show the user the available presets grouped by category. Read `presets.yaml` and present them:
Visual styles (no text in image):
editorial, blueprint, ink, risograph, wireframe, constellation, brutalist, grain
Text-heavy (leverages GPT Image 2 text rendering):
infographic, slide, diagram, poster, menu, manga
Community favorites:
trading-card, pixar, app-mockup, isometric, action-figure, cinematic, panorama
Custom — user describes their own style
Ask: "Which style? Or describe your own."
Step 3: Platform & sizing
Ask where this will be used:
- YouTube thumbnail (1280×720)
- Instagram square (1080×1080)
- Slides/presentation (1920×1080)
- Blog hero (1200×630)
- X/Twitter (1600×900)
- Story (1080×1920)
- Custom size
- No resize (use API default)
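The platform choices above amount to a lookup from a target platform to a pixel size. A minimal sketch of that lookup follows; the dimensions come from the list, while the short keys and the `resolve_size` helper are illustrative assumptions and may not match the names used in `platforms.yaml`.

```python
from typing import Optional, Tuple

# Dimensions are taken from the platform list above. The keys are
# hypothetical labels chosen for this sketch, not confirmed preset names.
PLATFORM_SIZES = {
    "youtube-thumbnail": (1280, 720),
    "instagram-square": (1080, 1080),
    "slides": (1920, 1080),
    "blog-hero": (1200, 630),
    "x-twitter": (1600, 900),
    "story": (1080, 1920),
}


def resolve_size(
    platform: Optional[str],
    custom: Optional[Tuple[int, int]] = None,
) -> Optional[Tuple[int, int]]:
    """Return (width, height), or None to keep the API default size."""
    if custom is not None:
        width, height = custom
        if width <= 0 or height <= 0:
            raise ValueError("custom size must be positive")
        return (width, height)
    if platform is None:
        return None  # the "No resize" option
    try:
        return PLATFORM_SIZES[platform]
    except KeyError:
        raise ValueError(f"unknown platform: {platform!r}") from None
```

A custom size takes precedence over a named platform here, which is one reasonable design choice; the actual script may resolve the two differently.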
Step 4: Draft first, then final
Always generate a draft first unless the user says "skip draft" or uses `--draft false`.
- Generate with `--draft` (quality=low, ~$0.006/image)
- Show the image to the user using the Read tool
- Ask: "Like this direction? I can: (a) generate final quality, (b) adjust the prompt, (c) try a different style, (d) regenerate with a new seed"
- If approved, generate final with `--quality high` (~$0.21/image)
- Use `--seed` from the draft to maintain composition when upgrading to final
This draft→final flow saves ~97% on iteration costs.
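The ~97% figure follows directly from the two per-image prices quoted above. A short worked sketch of the arithmetic; the function names are illustrative and not part of the script:

```python
DRAFT_COST = 0.006  # USD per draft image (quality=low), from the text above
FINAL_COST = 0.21   # USD per final image (quality=high), from the text above


def iteration_savings(draft_cost: float = DRAFT_COST,
                      final_cost: float = FINAL_COST) -> float:
    """Fraction saved per iteration by drafting instead of rendering a final."""
    return 1 - draft_cost / final_cost


def session_cost(draft_rounds: int, finals: int = 1) -> float:
    """Total cost of a session: draft rounds plus the approved finals."""
    return round(draft_rounds * DRAFT_COST + finals * FINAL_COST, 3)


print(f"{iteration_savings():.1%}")  # 97.1%
print(session_cost(draft_rounds=4))  # four drafts, then one final
```

Four draft rounds followed by one final come to about $0.23, compared with $1.05 for five attempts at final quality.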
Step 5: Show result and offer next actions
After generation, always:
- Show the image using the Read tool
- Open it with `open <path>` for full-resolution preview
- Report the cost
- Offer: "Want to (a) generate variants, (b) edit this further, (c) use as reference for more images, (d) done?"
Carousel Workflow
When the user wants a carousel (5-10 slides):
1. Story arc
Ask: "What's the story? Give me the key message and I'll draft a 10-slide arc."
Then propose a slide-by-slide plan like:
Slide 1: [Cover] — hook headline + hero image
Slide 2: [Problem] — bold statement
Slide 3: [Context] — illustration + explanation
...
Slide 10: [CTA] — call to action with URL
Ask the user to approve or modify the plan.
2. Style consistency
Use the same preset + seed range across all slides. For carousels:
- Pick one visual style for all slides
- Use `--seed` to lock composition patterns
- Include pagination dots in prompts (e.g., "10 small dots at bottom, third dot highlighted orange")
- Maintain consistent color palette and typography
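One way to keep the pagination wording identical across slides is to generate it rather than type it per slide. The sketch below is an illustration only: `pagination_hint` and `slide_prompts` are hypothetical helpers, and the way the style text is joined into the prompt is an assumption, not the script's behavior.

```python
from typing import List

ORDINALS = ["first", "second", "third", "fourth", "fifth",
            "sixth", "seventh", "eighth", "ninth", "tenth"]


def pagination_hint(slide: int, total: int, color: str = "orange") -> str:
    """Describe the pagination dots for one slide (1-based index)."""
    if not 1 <= slide <= total <= len(ORDINALS):
        raise ValueError("slide must be within 1..total and total at most 10")
    return (f"{total} small dots at bottom, "
            f"{ORDINALS[slide - 1]} dot highlighted {color}")


def slide_prompts(contents: List[str], style: str) -> List[str]:
    """Append the same style text and a per-slide pagination hint to each slide."""
    total = len(contents)
    return [
        f"{text}, {style}, {pagination_hint(index, total)}"
        for index, text in enumerate(contents, start=1)
    ]
```

Calling `pagination_hint(3, 10)` reproduces the example phrase from the list above, so every slide differs only in which dot is highlighted.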
3. Draft batch
Generate all slides as drafts first ($0.006 × 10 = $0.06 total). Show them all to the user as a contact sheet or one by one. Ask which ones to regenerate or adjust.
4. Final batch
Only generate finals for approved slides. Offer to generate them all at once with the `-y` flag.
Photo Edit Workflow
When the user wants to transform a photo:
- Ask for the source image (file path or clipboard)
- For clipboard: save with `osascript` to a temp file
- Show available styles and ask which to try
- Generate a draft edit first
- Show result, ask if they want adjustments
- Generate final when approved
Use `--edit <path>` for the API call.
Cost Awareness
Always communicate costs before generating:
| Quality | Per image | 10-slide carousel |
|---|---|---|
| low (draft) | $0.006 | $0.06 |
| medium | $0.05 | $0.50 |
| high (default) | $0.21 | $2.10 |
| high + thinking | $0.25-0.42 | $2.50-4.20 |
Thinking mode adds 20-100% cost. Only suggest it for text-heavy or complex compositions.
The script auto-confirms when cost < $0.50. Above that, it prompts the user.
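The table and the confirmation rule can be expressed as a small estimator. This is a sketch under stated assumptions, not the script's actual logic: the prices and the $0.50 threshold come from the text above, while the function names and the choice to apply the 20-100% thinking surcharge to any quality level are illustrative.

```python
from typing import Tuple

PRICE_PER_IMAGE = {"low": 0.006, "medium": 0.05, "high": 0.21}  # USD, from the table
THINKING_SURCHARGE = (0.20, 1.00)  # thinking mode adds 20-100%
AUTO_CONFIRM_BELOW = 0.50          # USD; at or above this, ask the user


def estimate_cost(quality: str, n: int = 1,
                  thinking: bool = False) -> Tuple[float, float]:
    """Return a (low, high) cost range in USD for n images."""
    base = PRICE_PER_IMAGE[quality] * n
    if not thinking:
        return (round(base, 3), round(base, 3))
    low_extra, high_extra = THINKING_SURCHARGE
    return (round(base * (1 + low_extra), 3), round(base * (1 + high_extra), 3))


def needs_confirmation(cost_range: Tuple[float, float]) -> bool:
    """Prompt unless the worst-case cost is strictly below the threshold."""
    return cost_range[1] >= AUTO_CONFIRM_BELOW


print(estimate_cost("high", 10))                       # ten finals
print(needs_confirmation(estimate_cost("low", 10)))    # draft carousel
print(needs_confirmation(estimate_cost("medium", 10))) # medium carousel
```

Note the boundary case: a 10-slide medium carousel costs exactly $0.50, which is not below the threshold, so under this reading it would still prompt.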
Prompt Engineering Tips
When helping users write prompts, apply these patterns:
- Structure: Scene → Subject → Detail → Lighting → Constraint
- Front-load the subject: put the main thing first
- For text in images: quote exact text with single quotes: `'with the headline "Hello World"'`
- Character consistency: maintain a 5-tuple: age + appearance + hairstyle + distinctive features + clothing
- Style tags at end: append tags like `editorial-magazine`, `studio-product` to converge batches
- Use `--seed` for iteration: lock composition, vary only the prompt details
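The structure and style-tag tips above can be sketched as a small prompt builder. The helper names and the comma-joined output format are assumptions made for illustration; the tips themselves do not prescribe a separator.

```python
from typing import Sequence


def build_prompt(scene: str, subject: str, detail: str = "",
                 lighting: str = "", constraint: str = "",
                 style_tags: Sequence[str] = ()) -> str:
    """Join parts in Scene → Subject → Detail → Lighting → Constraint order,
    then append style tags at the end."""
    parts = [scene, subject, detail, lighting, constraint]
    body = ", ".join(p.strip() for p in parts if p and p.strip())
    tags = ", ".join(style_tags)
    return f"{body}, {tags}" if tags else body


def describe_character(age: str, appearance: str, hairstyle: str,
                       features: str, clothing: str) -> str:
    """Render the 5-tuple so the same wording can be reused in every prompt."""
    return ", ".join([age, appearance, hairstyle, features, clothing])


prompt = build_prompt(
    scene="a quiet harbor at dawn",
    subject="a red fishing boat",
    lighting="soft golden light",
    style_tags=["editorial-magazine"],
)
print(prompt)
```

Reusing one `describe_character` string across a batch keeps the five attributes word-for-word identical, which is the point of the 5-tuple tip.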
CLI Reference
```bash
# Basic generation
scripts/gpt_image_2.py "prompt" output.png

# With preset and platform
scripts/gpt_image_2.py --preset editorial --platform square "subject" out.png

# Draft mode (~$0.006/image)
scripts/gpt_image_2.py --draft "prompt" out.png

# With thinking for complex layouts
scripts/gpt_image_2.py --thinking medium --preset diagram "OAuth flow" out.png

# Seed for reproducibility
scripts/gpt_image_2.py --seed 42 "prompt" out.png

# Edit existing photo
scripts/gpt_image_2.py --edit photo.png "transform into constellation style" out.png

# Variants with contact sheet
scripts/gpt_image_2.py --n 4 --preset ink "mountain" out.png

# Cost estimate
scripts/gpt_image_2.py --estimate --n 10 --quality high "batch test"

# Skip confirmation
scripts/gpt_image_2.py -y --n 10 "batch" out.png

# Dry run (show prompt without API call)
scripts/gpt_image_2.py --dry-run --preset editorial "test" out.png
```
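When driving the CLI from another program, the draft-then-final flow can be assembled as argument lists. The sketch below only builds the commands and does not run them. It uses flags that appear in the reference above, but `build_command` is a hypothetical wrapper, and combining `--draft` with an explicit `--seed` is an assumption about how the script accepts them together.

```python
import shlex
from typing import List, Optional

SCRIPT = "scripts/gpt_image_2.py"


def build_command(prompt: str, output: str, *, draft: bool = False,
                  quality: Optional[str] = None, seed: Optional[int] = None,
                  preset: Optional[str] = None,
                  platform: Optional[str] = None) -> List[str]:
    """Assemble an argv list; flags come first, then prompt and output path."""
    cmd = [SCRIPT]
    if draft:
        cmd.append("--draft")
    if quality:
        cmd += ["--quality", quality]
    if seed is not None:
        cmd += ["--seed", str(seed)]
    if preset:
        cmd += ["--preset", preset]
    if platform:
        cmd += ["--platform", platform]
    cmd += [prompt, output]
    return cmd


# Same seed and preset for both runs, so the final keeps the draft's composition.
draft_cmd = build_command("a rocket at launch", "draft.png",
                          draft=True, seed=42, preset="editorial")
final_cmd = build_command("a rocket at launch", "final.png",
                          quality="high", seed=42, preset="editorial")
print(shlex.join(draft_cmd))
print(shlex.join(final_cmd))
```

Passing a list such as `draft_cmd` to `subprocess.run(draft_cmd, check=True)` avoids shell quoting problems with prompts that contain spaces or quotes.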
Files
- `scripts/gpt_image_2.py`: main CLI (Python, requires PyYAML)
- `presets.yaml`: 21 style presets (visual + text-heavy + community)
- `platforms.yaml`: 8 platform sizing presets
- `references/api_reference.md`: full API documentation
- `~/.config/gpt-image-2/config.yaml`: user defaults
- `~/.config/gpt-image-2/history.jsonl`: generation log
- `~/.config/gpt-image-2/last.json`: last run (for `again`)