gpt-image-2

🪞 GPT Image 2 — Image Generation via Your ChatGPT Subscription

Generate images with GPT Image 2 (ChatGPT Images 2.0) inside your agent, using your existing ChatGPT Plus or Pro subscription: no separate OpenAI access, no Fal or Replicate tokens, no per-image billing.
Text-to-image, image-to-image editing, style transfer, and multi-reference composition. Runs entirely through the local codex CLI you're already logged into.
Heads up: this skill requires a ChatGPT Plus or Pro subscription plus the Codex CLI installed locally. If you are missing either, you can use GPT Image 2 in the browser via RunComfy instead (hosted, no ChatGPT subscription or local install needed; RunComfy account required).
The rest of this document covers the local Codex CLI flow for agents whose user has a ChatGPT subscription.
Figure: GPT Image 2 example, a flat-color lobster repainted as a 1950s ukiyo-e woodblock print.
Example output: a plain flat-color icon repainted via --ref in ukiyo-e style. Composition preserved, rendering swapped, period-appropriate red seal added by the model unprompted.

When to trigger

Trigger when the user explicitly asks for GPT Image 2 via their ChatGPT subscription, for example:
  • "use GPT Image 2" / "use gpt-image-2" / "use ChatGPT Images 2.0"
  • "use Image 2" / "image 2 this"
  • the user attached a reference image and asked to remix / edit / restyle it
Do not auto-trigger for a plain "generate an image" request if the user didn't specify this route. If they did specify it, do not silently fall back to HTML mockups, screenshots, or a different image model.

How to invoke

A single bash script handles everything: it runs codex exec with the right flags, then decodes the generated image from the persisted session rollout.
Text-to-image:
bash
bash scripts/gen.sh \
  --prompt "<user's raw prompt>" \
  --out <absolute/path/to/output.png>
Image-to-image (reference flag is repeatable for multi-reference composition):
bash
bash scripts/gen.sh \
  --prompt "<user's raw prompt, e.g. 'repaint in watercolor'>" \
  --ref /absolute/path/to/reference.png \
  --out <absolute/path/to/output.png>
Optional: --timeout-sec 300 (default 300).
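The two invocations above differ only in the repeatable --ref flag. As a minimal sketch, assuming an agent drives the script from Python, the argument list could be assembled as follows. build_gen_command is a hypothetical helper, not part of the skill, and it uses only the flags documented above.

```python
from typing import List, Optional, Sequence


def build_gen_command(
    prompt: str,
    out_path: str,
    refs: Sequence[str] = (),
    timeout_sec: Optional[int] = None,
    script: str = "scripts/gen.sh",
) -> List[str]:
    """Assemble the argv for scripts/gen.sh (hypothetical helper)."""
    # The prompt travels as one argv element, so it reaches the script unchanged.
    cmd = ["bash", script, "--prompt", prompt]
    # --ref is repeatable: emit one flag per reference image.
    for ref in refs:
        cmd += ["--ref", ref]
    # Only pass --timeout-sec when the caller overrides the script default.
    if timeout_sec is not None:
        cmd += ["--timeout-sec", str(timeout_sec)]
    cmd += ["--out", out_path]
    return cmd


if __name__ == "__main__":
    print(build_gen_command("repaint in watercolor", "/tmp/out.png", refs=["/tmp/a.png"]))
```

Passing the list to subprocess.run (rather than a shell string) avoids quoting problems when the user's prompt contains quotes or shell metacharacters.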

Default behavior

  • Pass the user's prompt through raw. Do not translate, polish, or add style modifiers unless the user asked for it.
  • Choose the output path. Default to ./image-<YYYYMMDD-HHMMSS>.png in the current working directory if the user didn't specify.
  • Deliver the image. After the script succeeds, display / attach the output file. Do not stop at "done, see path X".
  • Text-heavy layouts are fine. Image 2 handles infographics and timeline prompts well. Do not preemptively warn just because a prompt has a lot of text.
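The default output path rule above can be sketched in a few lines. This is an illustration only: default_output_path is a hypothetical helper, and the choice of local time for the timestamp is an assumption, since the document does not say which clock is used.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def default_output_path(now: Optional[datetime] = None, cwd: Optional[Path] = None) -> Path:
    """Return <cwd>/image-<YYYYMMDD-HHMMSS>.png (hypothetical helper)."""
    now = now or datetime.now()  # assumption: local time
    cwd = cwd or Path.cwd()      # Path.cwd() is already absolute
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return cwd / f"image-{stamp}.png"


if __name__ == "__main__":
    print(default_output_path())
```

Because Path.cwd() is absolute, the result can be handed straight to --out, which the invocation examples show with an absolute path.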

Hard constraints

  • Do not switch routes without permission. If the user said "use GPT Image 2", do not substitute DALL·E, Midjourney, an HTML mockup, or a manual screenshot workflow.
  • Do not rewrite the prompt unless asked.
  • Do not imply this skill works without a local codex login and a valid ChatGPT subscription with image-generation entitlement.

Prerequisites

  1. codex CLI installed: brew install codex, or see openai/codex.
  2. Logged in with a ChatGPT plan that includes Image 2: codex login.
  3. python3 on PATH (ships with macOS; apt install python3 on Linux).
This skill does not grant image-generation capability on its own. It exposes the capability the user already has through their ChatGPT subscription.
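An agent may want to confirm the two binaries are present before calling the script. The sketch below is a hypothetical pre-flight check (missing_tools is not part of the skill). It covers only the PATH lookup. It cannot tell whether codex login has been completed or whether the plan includes Image 2.

```python
import shutil
from typing import Callable, Iterable, List, Optional


def missing_tools(
    tools: Iterable[str] = ("codex", "python3"),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the names of required tools that are not found on PATH."""
    # shutil.which returns None when no executable of that name is found.
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    absent = missing_tools()
    print("all prerequisites found" if not absent else f"missing: {', '.join(absent)}")
```

The which parameter is injectable purely so the check can be exercised without touching the real PATH.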

Exit codes

| code | meaning |
|------|---------|
| 0 | success; output path printed on stdout |
| 2 | bad args |
| 3 | codex or python3 CLI missing |
| 4 | --ref file does not exist |
| 5 | codex exec failed (auth? network? model?) |
| 6 | no new session file detected |
| 7 | imagegen did not produce an image payload (feature not enabled, quota, or capability refused) |
On failure, name the layer in one sentence instead of dumping the full stderr at the user.
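One way to follow the "name the layer in one sentence" rule is a small lookup keyed on the script's exit code. The wording of each message below is illustrative. Only the codes and their meanings come from the table above, and describe_failure is a hypothetical helper.

```python
# Exit codes as documented for scripts/gen.sh; message wording is illustrative.
EXIT_CODE_LAYERS = {
    2: "bad arguments were passed to the script",
    3: "codex or python3 is missing from PATH",
    4: "a --ref file does not exist",
    5: "codex exec failed (possibly auth, network, or model)",
    6: "no new session file was detected after the run",
    7: "imagegen produced no image payload (feature not enabled, quota, or capability refused)",
}


def describe_failure(exit_code: int) -> str:
    """Turn a non-zero exit code into a single user-facing sentence."""
    layer = EXIT_CODE_LAYERS.get(exit_code, "an undocumented failure occurred")
    return f"Image generation failed (exit {exit_code}): {layer}."


if __name__ == "__main__":
    print(describe_failure(7))
```

Exit code 0 is deliberately left out of the mapping, since the helper is only meant to be called on failure.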

How it works

The codex CLI reuses the logged-in ChatGPT session and exposes an imagegen tool (gated behind the image_generation feature flag). The script:
  1. snapshots ~/.codex/sessions/ before the run
  2. runs codex exec --enable image_generation --sandbox read-only ... (with -i <file> for each reference image)
  3. diffs the sessions directory, then invokes scripts/extract_image.py to scan every new rollout JSONL for a base64 image payload (PNG / JPEG / WebP magic-header match)
  4. decodes the largest matching blob and writes it to --out
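Steps 3 and 4 can be sketched as below. This is not the actual scripts/extract_image.py, whose source is not shown here. It is a minimal illustration of the same idea. The rollout JSON schema is not documented, so the sketch assumes nothing about field names and simply walks every string value. Tolerating a data-URI prefix is also an assumption.

```python
import base64
import binascii
import json
from pathlib import Path
from typing import Iterable, Iterator, Optional


def is_image(blob: bytes) -> bool:
    """Match PNG, JPEG, or WebP magic headers."""
    return (
        blob.startswith(b"\x89PNG\r\n\x1a\n")
        or blob.startswith(b"\xff\xd8\xff")
        or (blob[:4] == b"RIFF" and blob[8:12] == b"WEBP")
    )


def iter_strings(node: object) -> Iterator[str]:
    """Yield every string value nested anywhere in a decoded JSON value."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def largest_image_payload(rollouts: Iterable[Path]) -> Optional[bytes]:
    """Return the largest decodable image blob found in the given JSONL files."""
    best: Optional[bytes] = None
    for path in rollouts:
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip blank or malformed lines
                for text in iter_strings(record):
                    # Assumption: payloads may carry a "data:...;base64," prefix.
                    candidate = text.split("base64,", 1)[-1]
                    try:
                        blob = base64.b64decode(candidate, validate=True)
                    except (binascii.Error, ValueError):
                        continue  # not base64, so not a payload
                    if is_image(blob) and (best is None or len(blob) > len(best)):
                        best = blob
    return best
```

Picking the largest blob is one way to read step 4. A caller would write the returned bytes to the --out path and treat a None result as the "no image payload" case.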
Two non-obvious flags other wrappers get wrong on codex-cli 0.111.0+:
  • --enable image_generation is required; the feature is still under development and off by default.
  • --ephemeral must not be used: ephemeral sessions aren't persisted, so the image payload has nowhere to live.
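A wrapper could guard against both mistakes with a small check on the argument list before launching codex. The sketch below is hypothetical (check_codex_exec_args is not part of the skill) and recognizes only the space-separated form --enable image_generation shown above. Whether the CLI also accepts other spellings is not covered here.

```python
from typing import List, Sequence


def check_codex_exec_args(argv: Sequence[str]) -> List[str]:
    """Return a list of problems with a codex exec argv (empty means OK)."""
    problems: List[str] = []
    # Look for the exact pair: "--enable" immediately followed by "image_generation".
    enabled = any(
        arg == "--enable" and i + 1 < len(argv) and argv[i + 1] == "image_generation"
        for i, arg in enumerate(argv)
    )
    if not enabled:
        problems.append("missing --enable image_generation")
    # Ephemeral sessions are not persisted, so there would be no rollout to read.
    if "--ephemeral" in argv:
        problems.append("--ephemeral must not be used")
    return problems


if __name__ == "__main__":
    print(check_codex_exec_args(["codex", "exec", "--ephemeral"]))
```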

Data handling

The script is narrowly scoped on purpose:
  • It reads only session rollout files created by its own codex exec invocation. The sessions directory is snapshotted before the call and diffed after, so any prior ~/.codex/sessions/* files (which may contain unrelated Codex conversations) are never touched, read, or transmitted.
  • It writes only two kinds of file: the output PNG at the caller's --out path, and short-lived mktemp logs that are auto-deleted on exit via a trap.
  • No environment variables are read. No credentials are requested. No other paths under ~/.codex/ are accessed.
  • No network calls leave this skill. The only outbound traffic is the one made by the codex CLI itself (to OpenAI, using the user's existing ChatGPT login); this skill does not add endpoints, telemetry, or callbacks.
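The snapshot-then-diff guarantee in the first bullet can be illustrated as follows. This is a sketch of the technique, not the script's actual code, which is bash. The helper names are hypothetical, and the recursive walk is an assumption that covers both flat and nested layouts under the sessions directory.

```python
from pathlib import Path
from typing import List, Set


def snapshot(sessions_dir: Path) -> Set[Path]:
    """Record the path of every file under the sessions directory.

    Only directory entries are listed; no file contents are opened.
    """
    if not sessions_dir.is_dir():
        return set()
    return {p for p in sessions_dir.rglob("*") if p.is_file()}


def new_files(before: Set[Path], sessions_dir: Path) -> List[Path]:
    """Return files that exist now but were absent from the earlier snapshot."""
    return sorted(snapshot(sessions_dir) - before)


if __name__ == "__main__":
    sessions = Path.home() / ".codex" / "sessions"
    before = snapshot(sessions)
    # ... the codex exec run would happen here ...
    print(new_files(before, sessions))
```

Only the paths returned by new_files would be handed to the extractor, which is how pre-existing session files stay unread.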

What this skill is not

Not a direct OpenAI API client. Not a capability grant — it depends on the user's working Codex CLI login. Not a multi-tenant service (one call per invocation; concurrent calls are serialized by the filesystem-snapshot diff).