gpt-image-edit
GPT Image Edit — Pro Pack on RunComfy
OpenAI GPT Image 2 `/edit` endpoint (ChatGPT Images 2.0 image-to-image) on the RunComfy Model API. Strongest in its class at preserving identity through targeted edits and rewriting embedded text in any script (Latin, kana, CJK, Cyrillic, Arabic).

```bash
npx skills add agentspace-so/runcomfy-skills --skill gpt-image-edit -g
```

When to pick this model (vs siblings)
| You want | Use |
|---|---|
| Edit multilingual / embedded text in image | GPT Image Edit |
| Identity preservation through translated headline variants | GPT Image Edit |
| Layout-precise edit (move headline, swap CTA, etc.) | GPT Image Edit |
| Up to 10 reference images | GPT Image Edit |
| Batch up to 20 images consistently | Nano Banana Edit |
| Single-shot precise local edit, source-fidelity-first | Flux Kontext |
| Generate from scratch with GPT Image 2 | sibling skill |
| Batch SKU galleries with stable identity | Nano Banana Edit |
Prerequisites
- RunComfy CLI — `npm i -g @runcomfy/cli`
- RunComfy account — `runcomfy login` opens a browser device-code flow.
- CI / containers — set `RUNCOMFY_TOKEN=<token>` instead of `runcomfy login`.
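In scripts that must run both locally and in CI, you can branch on whether `RUNCOMFY_TOKEN` is set before deciding whether `runcomfy login` is needed. A minimal sketch; the function name is illustrative:

```shell
#!/usr/bin/env bash
# Decide how the CLI will authenticate: an exported RUNCOMFY_TOKEN
# (CI / containers) takes priority; otherwise fall back to the
# interactive device-code flow via `runcomfy login`.
auth_mode() {
  if [ -n "${RUNCOMFY_TOKEN:-}" ]; then
    echo "env-token"          # token env var is set; no login needed
  else
    echo "device-code-login"  # would require `runcomfy login`
  fi
}
```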
Endpoints + input schema
`openai/gpt-image-2/edit`

| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `prompt` | string | yes | — | Edit instruction. Lead with preservation, end with the change. |
| `images` | string[] | yes | — | Up to 10 publicly-fetchable HTTPS URLs. First is primary; rest are auxiliary. |
| `size` | enum | no | `auto` | 3 fixed values + `auto`; any other value returns 422. |
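The required fields can be sanity-checked client-side before submitting, so a schema mismatch fails fast with a local message instead of a 422 from the endpoint. A sketch mirroring the constraints above; the function name is illustrative:

```shell
#!/usr/bin/env bash
# Mirror the endpoint schema: non-empty prompt, 1-10 images,
# each a publicly fetchable HTTPS URL.
validate_input() {
  local prompt="$1"; shift
  [ -n "$prompt" ] || { echo "prompt is required" >&2; return 1; }
  [ "$#" -ge 1 ] && [ "$#" -le 10 ] || { echo "need 1-10 image URLs" >&2; return 1; }
  local url
  for url in "$@"; do
    case "$url" in
      https://*) ;;                                   # ok
      *) echo "not an HTTPS URL: $url" >&2; return 1 ;;
    esac
  done
  echo "ok"
}
```

For example, `validate_input "Keep the face unchanged" https://example.com/a.jpg` prints `ok`, while a non-HTTPS URL fails with a message.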
How to invoke
Single-ref preservation edit:

```bash
runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Keep the person'\''s face, pose, and brand mark unchanged. Replace the background with a soft warm-grey studio sweep and a gentle floor shadow.",
    "images": ["https://.../portrait.jpg"]
  }' \
  --output-dir <absolute/path>
```

Multilingual text rewrite (preserve everything except the headline):
```bash
runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Keep the photograph, layout, and brand mark exactly as in the input. Replace only the in-image headline. The new headline reads \"今日のおすすめ\" in bold Japanese kana, same position and font weight as before.",
    "images": ["https://.../poster-en.jpg"]
  }' \
  --output-dir <absolute/path>
```

Multi-ref composition:
```bash
runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Compose subject from image 1 into the room from image 2. Match the lighting and color palette of image 2. Keep image 1 subject identity (face, pose, clothing) unchanged.",
    "images": ["https://.../subject.jpg", "https://.../room.jpg"]
  }' \
  --output-dir <absolute/path>
```
Prompting — what actually works
- Lead with preservation goals. Always: "Keep [face / pose / clothing / brand / framing] unchanged." Then state the change. The model honors what's stated up front.
- Multilingual text — quote the characters, name the script: "the headline reads \"コーヒー\" in bold Japanese kana", "the label says \"АРОМА\" in Cyrillic, white on black", "the right-margin caption reads \"تخفيض\" in Arabic right-to-left". Don't paraphrase — quote.
- Directional language for spatial edits. Concrete spatial scopes work: "move the headline from top-right to bottom-center", "remove the leftmost object only", "replace the watermark in the bottom-right corner".
- Multi-ref numbering. When passing multiple `images`, refer to them by number: "subject from image 1, lighting from image 2, color palette from image 3". The model routes cues correctly.
- Use `size: "auto"` to preserve the input ratio. Only override it when the edit explicitly changes framing (e.g. cropping a 16:9 to 1:1).

Anti-patterns:
- Long compound edit instructions ("change A and B and C and D") → drift increases per added scope.
- Missing preservation goals → the model subtly rewrites the face / brand / framing.
- Paraphrasing in-image text instead of quoting it → the text comes out different.
- Asking for a `size` outside the 3 fixed values + `auto` → 422.
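Prompts that quote literal characters tend to collide with shell quoting (note the `'\''` escape in the single-ref example earlier). One option is to JSON-encode the `--input` body with a small helper instead of hand-escaping it; a sketch assuming `python3` is available on the machine running the CLI:

```shell
#!/usr/bin/env bash
# JSON-encode the --input body instead of hand-escaping quotes in the
# shell. Keeps quoted headlines (e.g. "コーヒー") byte-for-byte intact.
build_input() {
  local prompt="$1"; shift
  python3 - "$prompt" "$@" <<'EOF'
import json, sys
prompt, *images = sys.argv[1:]
# ensure_ascii=False keeps kana / CJK / Cyrillic characters literal
print(json.dumps({"prompt": prompt, "images": images}, ensure_ascii=False))
EOF
}
```

Then pass it through directly: `runcomfy run openai/gpt-image-2/edit --input "$(build_input 'the headline reads "コーヒー"' https://example.com/poster.jpg)" --output-dir out/`.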
Where it shines
| Use case | Why GPT Image Edit |
|---|---|
| Multilingual ad localization | One source asset → many language variants of the same headline |
| Brand-safe headline / CTA swaps | Layout precision + preservation language hold the rest stable |
| Multi-ref composition (subject from one, scene from another) | Numbered refs route cues correctly |
| Layout-precise repositioning | Directional language ("top-right to bottom-center") honored |
| Identity preservation across signage edits | Strongest in class for face / brand preservation through targeted edits |
Sample prompts (verified to produce strong results)
Background swap with full preservation (page example):

```
Turn the background into a bright minimal white-to-soft-gray studio
sweep with gentle floor shadow; add a large headline in-image that
reads "OPEN STUDIO" in a bold clean sans-serif, high contrast, centered;
keep the main person or product, pose, and face identity unchanged
```

Multilingual variant:

```
Keep the photograph, layout, lighting, and brand mark exactly as in the
input. Replace only the in-image headline.
The new headline reads "コーヒー" in bold Japanese kana, same position
and font weight as before.
```

Multi-ref composition:

```
Compose subject from image 1 into the kitchen from image 2.
Match the warm window light and color palette of image 2.
Keep subject identity (face, pose, clothing) from image 1 unchanged.
```
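The localization workflow (one source asset, many headline variants) comes down to stamping the multilingual template out once per headline. A sketch; the helper name and headline list are illustrative:

```shell
#!/usr/bin/env bash
# Emit one edit prompt per headline, reusing the multilingual template:
# preserve everything, replace only the in-image headline.
localized_prompts() {
  local h
  for h in "$@"; do
    printf 'Keep the photograph, layout, lighting, and brand mark exactly as in the input. Replace only the in-image headline. The new headline reads "%s", same position and font weight as before.\n' "$h"
  done
}
```

Each emitted line can then drive a separate `runcomfy run openai/gpt-image-2/edit --input ...` call against the same source image.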
Limitations
- `size`: 3 fixed values + `auto` — anything else 422s.
- `images`: up to 10 — first is primary, the rest are auxiliary cues.
- Long compound prompts drift — split into multiple passes when needed.
- For batch consistency across many SKU images, Nano Banana Edit (up to 20) is better.
- Photorealism on portraits — Nano Banana Pro wins head-to-head.
Exit codes
| code | meaning |
|---|---|
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |
Full reference: docs.runcomfy.com/cli/troubleshooting.
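Of these, only 75 is retryable; the rest indicate caller, auth, or upstream errors. A minimal retry wrapper, sketched here (not part of the CLI; the linear backoff is illustrative):

```shell
#!/usr/bin/env bash
# Re-run a command while it exits 75 (retryable: timeout / 429).
# Any other exit code is passed through unchanged.
run_with_retry() {
  local max="$1"; shift
  local attempt=0 rc
  while :; do
    "$@" && rc=0 || rc=$?       # capture exit code without tripping set -e
    [ "$rc" -ne 75 ] && return "$rc"
    attempt=$((attempt + 1))
    [ "$attempt" -ge "$max" ] && return "$rc"
    sleep "$attempt"            # linear backoff: 1s, 2s, ...
  done
}
```

For example: `run_with_retry 3 runcomfy run openai/gpt-image-2/edit --input "$JSON" --output-dir out/`.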
How it works
The skill invokes `runcomfy run openai/gpt-image-2/edit` with a JSON body matching the schema. The CLI POSTs to `https://model-api.runcomfy.net/v1/models/openai/gpt-image-2/edit`, polls the request, fetches the result, and downloads any `.runcomfy.net` / `.runcomfy.com` URL into `--output-dir`. Ctrl-C cancels the remote request before exit.

Security & Privacy
- Token storage: `runcomfy login` writes the API token to `~/.config/runcomfy/token.json` with mode 0600 (owner-only read/write). Set the `RUNCOMFY_TOKEN` env var to bypass the file entirely in CI / containers.
- Input boundary: the user prompt is passed to the CLI as a JSON string via `--input`. The CLI does NOT shell-expand the prompt; it transmits the JSON body directly to the Model API over HTTPS. No shell-injection surface from prompt content.
- Third-party content: image / mask / video URLs you pass are fetched by the RunComfy model server, not by the CLI on your machine. Treat external URLs as untrusted; image-based prompt injection is a known risk for any image-edit / video-edit model.
- Outbound endpoints: only `model-api.runcomfy.net` (request submission) and `*.runcomfy.net` / `*.runcomfy.com` (download whitelist for generated outputs). No telemetry, no callbacks.
- Generated-file size cap: the CLI aborts any single download > 2 GiB to prevent disk-fill from a malicious or runaway model output.