gpt-image-2
🪞 GPT Image 2: Image Generation via Your ChatGPT Subscription
Generate images with GPT Image 2 (ChatGPT Images 2.0) inside your agent, using your existing ChatGPT Plus or Pro subscription — no separate OpenAI access, no Fal or Replicate tokens, no per-image billing.
Text-to-image, image-to-image editing, style transfer, and multi-reference composition. Runs entirely through the local `codex` CLI you're already logged into.
Heads up: this skill requires a ChatGPT Plus or Pro subscription plus the Codex CLI installed locally. If you are missing either, you can use GPT Image 2 in the browser via RunComfy instead (hosted, no ChatGPT subscription or local install needed; RunComfy account required):
- Text-to-image: https://www.runcomfy.com/models/openai/gpt-image-2/text-to-image
- Image edit (i2i): https://www.runcomfy.com/models/openai/gpt-image-2/edit
The rest of this document covers the local Codex CLI flow for agents whose user has a ChatGPT subscription.

Example output: a plain flat-color icon repainted via `--ref` in ukiyo-e style, with composition preserved, rendering swapped, and a period-appropriate red seal added by the model unprompted.
When to trigger
Trigger when the user explicitly asks for GPT Image 2 via their ChatGPT subscription, for example:
- "use GPT Image 2" / "use gpt-image-2" / "use ChatGPT Images 2.0"
- "use Image 2" / "image 2 this"
- the user attaches a reference image and asks to remix / edit / restyle it
Do not auto-trigger for a plain "generate an image" request if the user didn't specify this route. If they did specify it, do not silently fall back to HTML mockups, screenshots, or a different image model.
How to invoke
A single bash script handles everything: it runs `codex exec` with the right flags, then decodes the generated image from the persisted session rollout.

Text-to-image:

    bash scripts/gen.sh \
      --prompt "<user's raw prompt>" \
      --out <absolute/path/to/output.png>

Image-to-image (the `--ref` flag is repeatable for multi-reference composition):

    bash scripts/gen.sh \
      --prompt "<user's raw prompt, e.g. 'repaint in watercolor'>" \
      --ref /absolute/path/to/reference.png \
      --out <absolute/path/to/output.png>

Optional: `--timeout-sec 300` (default 300).
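The two invocations above differ only in the repeated `--ref` flag, so an agent-side wrapper can assemble both from one function. The following is a minimal Python sketch, not part of the skill itself: the helper names `build_gen_command` and `run_gen` are hypothetical, and it assumes only the flags shown above (`--prompt`, `--ref`, `--out`, `--timeout-sec`).

```python
import subprocess
from typing import List, Optional, Sequence


def build_gen_command(prompt: str, out: str, refs: Sequence[str] = (),
                      timeout_sec: Optional[int] = None,
                      script: str = "scripts/gen.sh") -> List[str]:
    """Assemble the argv for scripts/gen.sh (hypothetical helper).

    The prompt is passed through untouched as a single argv entry.
    """
    cmd = ["bash", script, "--prompt", prompt]
    for ref in refs:
        # --ref is repeatable: one flag per reference image.
        cmd += ["--ref", ref]
    cmd += ["--out", out]
    if timeout_sec is not None:
        cmd += ["--timeout-sec", str(timeout_sec)]
    return cmd


def run_gen(prompt: str, out: str, refs: Sequence[str] = (),
            timeout_sec: Optional[int] = None) -> str:
    """Run the script and return the output path it prints on success."""
    result = subprocess.run(build_gen_command(prompt, out, refs, timeout_sec),
                            capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"gen.sh exited with code {result.returncode}")
    # Exit code 0 means the output path was printed on stdout.
    return result.stdout.strip()
```

Passing an argv list rather than a shell string means no quoting or escaping is applied to the prompt, which keeps it raw on its way to the script.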
Default behavior
- Pass the user's prompt through raw. Do not translate, polish, or add style modifiers unless the user asked for it.
- Choose the output path. Default to `./image-<YYYYMMDD-HHMMSS>.png` in the current working directory if the user didn't specify.
- Deliver the image. After the script succeeds, display / attach the output file. Do not stop at "done, see path X".
- Text-heavy layouts are fine. Image 2 handles infographics and timeline prompts well. Do not preemptively warn just because a prompt has a lot of text.
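The default output name from the list above can be produced with a timestamp format string. This is a small sketch under the assumption that local time is intended; the function name `default_output_path` is hypothetical.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def default_output_path(now: Optional[datetime] = None,
                        cwd: Optional[Path] = None) -> Path:
    """Return <cwd>/image-<YYYYMMDD-HHMMSS>.png (hypothetical helper)."""
    now = now or datetime.now()   # assumption: local time, not UTC
    cwd = cwd or Path.cwd()
    return cwd / f"image-{now:%Y%m%d-%H%M%S}.png"
```

The `now` and `cwd` parameters exist only so the function can be exercised deterministically.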
Hard constraints
- Do not switch routes without permission. If the user said "use GPT Image 2", do not substitute DALL·E, Midjourney, an HTML mockup, or a manual screenshot workflow.
- Do not rewrite the prompt unless asked.
- Do not imply this skill works without a local `codex` login and a valid ChatGPT subscription with image-generation entitlement.
Prerequisites
- `codex` CLI installed: `brew install codex` or see openai/codex.
- Logged in with a ChatGPT plan that includes Image 2: `codex login`.
- `python3` on PATH (ships with macOS; `apt install python3` on Linux).
This skill does not grant image-generation capability on its own. It exposes the capability the user already has through their ChatGPT subscription.
Exit codes
| code | meaning |
|---|---|
| 0 | success — output path printed on stdout |
| 2 | bad args |
| 3 | missing … |
| 4 | |
| 5 | |
| 6 | no new session file detected |
| 7 | imagegen did not produce an image payload (feature not enabled, quota, or capability refused) |
On failure, name the layer in one sentence instead of dumping the full stderr at the user.
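One way to follow that rule is a small lookup from exit code to a one-sentence summary. The sketch below is illustrative only: the function name `describe_failure` is hypothetical, and it covers only the codes whose meaning is stated in the table, falling back to a generic sentence for the rest.

```python
# Only codes with a meaning given in the table above are listed here.
KNOWN_EXIT_CODES = {
    2: "bad arguments were passed to the script",
    6: "no new session file was detected after the run",
    7: ("imagegen did not produce an image payload "
        "(feature not enabled, quota, or capability refused)"),
}


def describe_failure(code: int) -> str:
    """Return a one-sentence summary for a gen.sh exit code."""
    if code == 0:
        return ""  # success: nothing to report
    reason = KNOWN_EXIT_CODES.get(code)
    if reason is None:
        # Unlisted code: report the number rather than guess at a cause.
        return f"Image generation failed with exit code {code}."
    return f"Image generation failed: {reason}."
```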
How it works
The `codex` CLI reuses the logged-in ChatGPT session and exposes an `imagegen` tool (gated behind the `image_generation` feature flag). The script:
- snapshots `~/.codex/sessions/` before the run
- runs `codex exec --enable image_generation --sandbox read-only ...` (with `-i <file>` for each reference image)
- diffs the sessions directory, then invokes `scripts/extract_image.py` to scan every new rollout JSONL for a base64 image payload (PNG / JPEG / WebP magic-header match)
- decodes the largest matching blob and writes it to `--out`
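The snapshot, diff, and extract steps can be sketched in Python as follows. This is not the actual `scripts/extract_image.py`, which is not reproduced in this document. It assumes rollout files end in `.jsonl` and, since the record layout is not described here, it simply walks every string value in each JSON record. All function names are hypothetical.

```python
import base64
import json
from pathlib import Path
from typing import Iterable, Iterator, Optional, Set

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"


def looks_like_image(data: bytes) -> bool:
    """Match PNG, JPEG, or WebP (RIFF....WEBP) magic headers."""
    if data.startswith((PNG_MAGIC, JPEG_MAGIC)):
        return True
    return data[:4] == b"RIFF" and data[8:12] == b"WEBP"


def snapshot(sessions_dir: Path) -> Set[Path]:
    """Record which rollout files exist right now."""
    return set(sessions_dir.rglob("*.jsonl"))


def iter_strings(node) -> Iterator[str]:
    """Yield every string value nested anywhere in a JSON value."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for value in node:
            yield from iter_strings(value)


def largest_image(rollout_files: Iterable[Path]) -> Optional[bytes]:
    """Decode the largest base64 blob that carries an image magic header."""
    best: Optional[bytes] = None
    for path in rollout_files:
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip blank or malformed lines
                for text in iter_strings(record):
                    if len(text) < 64:
                        continue  # too short to hold an image
                    if text.startswith("data:image/") and "," in text:
                        text = text.split(",", 1)[1]  # drop a data-URL prefix
                    try:
                        data = base64.b64decode(text, validate=True)
                    except ValueError:
                        continue  # not base64 (binascii.Error is a ValueError)
                    if looks_like_image(data) and (best is None or len(data) > len(best)):
                        best = data
    return best
```

Taking `snapshot(dir)` before the run and subtracting it from `snapshot(dir)` afterwards yields only the files the run created, which is what keeps earlier sessions out of the scan.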
Two non-obvious flags other wrappers get wrong on codex-cli 0.111.0+:
- `--enable image_generation` is required; the feature is still under development and off by default.
- `--ephemeral` must not be used: ephemeral sessions aren't persisted, so the image payload has nowhere to live.
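A wrapper can guard against both mistakes by checking its argument list before launching the CLI. The sketch below is a hypothetical pre-flight check, not something the skill ships. It assumes the flag and its value are passed as two separate arguments, as in the command shown above.

```python
from typing import List, Sequence


def check_codex_exec_args(argv: Sequence[str]) -> List[str]:
    """Return a list of problems with a codex exec argv (empty means OK)."""
    problems: List[str] = []

    # The feature flag must be switched on explicitly.
    enabled = any(
        arg == "--enable" and i + 1 < len(argv) and argv[i + 1] == "image_generation"
        for i, arg in enumerate(argv)
    )
    if not enabled:
        problems.append("missing --enable image_generation")

    # Ephemeral sessions are not persisted, so no rollout file would exist.
    if "--ephemeral" in argv:
        problems.append("--ephemeral must not be used")

    return problems
```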
Data handling
The script is narrowly scoped on purpose:
- It reads only session rollout files created by its own `codex exec` invocation. The sessions directory is snapshotted before the call and diffed after, so any prior `~/.codex/sessions/*` files (which may contain unrelated Codex conversations) are never touched, read, or transmitted.
- It writes only two kinds of file: the output PNG at the caller's `--out` path, and short-lived `mktemp` logs that are auto-deleted on exit via a trap.
- No environment variables are read. No credentials are requested. No other paths under `~/.codex/` are accessed.
- No network calls leave this skill. The only outbound traffic is the one made by the `codex` CLI itself (to OpenAI, using the user's existing ChatGPT login); this skill does not add endpoints, telemetry, or callbacks.
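The script itself handles its temporary logs with `mktemp` and a shell trap. For a wrapper written in Python, a rough analogue of that short-lived-log pattern might look like the sketch below. The name `temp_log` and the file prefix are hypothetical. Unlike a shell trap, a `finally` block does not run if the process is killed outright.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def temp_log(prefix: str = "gpt-image-2-") -> Iterator[str]:
    """Create a private log file and delete it when the block exits."""
    fd, path = tempfile.mkstemp(prefix=prefix, suffix=".log")
    os.close(fd)  # callers reopen the path as needed
    try:
        yield path
    finally:
        # Runs on normal exit and on exceptions alike.
        try:
            os.remove(path)
        except FileNotFoundError:
            pass
```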
What this skill is not
Not a direct OpenAI API client. Not a capability grant — it depends on the user's working Codex CLI login. Not a multi-tenant service (one call per invocation; concurrent calls are serialized by the filesystem-snapshot diff).