gpt-image-edit
GPT Image Edit — Pro Pack on RunComfy
OpenAI GPT Image 2 `/edit` endpoint (ChatGPT Images 2.0 image-to-image) on the RunComfy Model API. Strongest in its class at preserving identity through targeted edits and rewriting embedded text in any script (Latin, kana, CJK, Cyrillic, Arabic).

```bash
npx skills add agentspace-so/runcomfy-skills --skill gpt-image-edit -g
```

When to pick this model (vs siblings)
| You want | Use |
|---|---|
| Edit multilingual / embedded text in image | GPT Image Edit |
| Identity preservation through translated headline variants | GPT Image Edit |
| Layout-precise edit (move headline, swap CTA, etc.) | GPT Image Edit |
| Up to 10 reference images | GPT Image Edit |
| Batch up to 20 images consistently | Nano Banana Edit |
| Single-shot precise local edit, source-fidelity-first | Flux Kontext |
| Generate from scratch with GPT Image 2 | sibling skill |
| Batch SKU galleries with stable identity | Nano Banana Edit |
Prerequisites
- RunComfy CLI — `npm i -g @runcomfy/cli`
- RunComfy account — `runcomfy login` opens a browser device-code flow.
- CI / containers — set `RUNCOMFY_TOKEN=<token>` instead of `runcomfy login`.
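In scripts that must run both locally and in CI, you can branch on whether `RUNCOMFY_TOKEN` is set before deciding whether `runcomfy login` is needed. A minimal sketch; the function name is illustrative:

```shell
#!/usr/bin/env bash
# Decide how the CLI will authenticate: an exported RUNCOMFY_TOKEN
# (CI / containers) takes priority; otherwise fall back to the
# interactive device-code flow via `runcomfy login`.
auth_mode() {
  if [ -n "${RUNCOMFY_TOKEN:-}" ]; then
    echo "env-token"          # token env var is set; no login needed
  else
    echo "device-code-login"  # would require `runcomfy login`
  fi
}
```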
Endpoints + input schema
`openai/gpt-image-2/edit`

| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `prompt` | string | yes | — | Edit instruction. Lead with preservation, end with the change. |
| `images` | string[] | yes | — | Up to 10 publicly-fetchable HTTPS URLs. First is primary; rest are auxiliary. |
| `size` | enum | no | `auto` | 3 fixed values + `auto`; any other value returns 422. |
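The required fields can be sanity-checked client-side before submitting, so a schema mismatch fails fast with a local message instead of a 422 from the endpoint. A sketch mirroring the constraints above; the function name is illustrative:

```shell
#!/usr/bin/env bash
# Mirror the endpoint schema: non-empty prompt, 1-10 images,
# each a publicly fetchable HTTPS URL.
validate_input() {
  local prompt="$1"; shift
  [ -n "$prompt" ] || { echo "prompt is required" >&2; return 1; }
  [ "$#" -ge 1 ] && [ "$#" -le 10 ] || { echo "need 1-10 image URLs" >&2; return 1; }
  local url
  for url in "$@"; do
    case "$url" in
      https://*) ;;                                   # ok
      *) echo "not an HTTPS URL: $url" >&2; return 1 ;;
    esac
  done
  echo "ok"
}
```

For example, `validate_input "Keep the face unchanged" https://example.com/a.jpg` prints `ok`, while a non-HTTPS URL fails with a message.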
How to invoke
Single-ref preservation edit:

```bash
runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Keep the person'\''s face, pose, and brand mark unchanged. Replace the background with a soft warm-grey studio sweep and a gentle floor shadow.",
    "images": ["https://.../portrait.jpg"]
  }' \
  --output-dir <absolute/path>
```

Multilingual text rewrite (preserve everything except the headline):
```bash
runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Keep the photograph, layout, and brand mark exactly as in the input. Replace only the in-image headline. The new headline reads \"今日のおすすめ\" in bold Japanese kana, same position and font weight as before.",
    "images": ["https://.../poster-en.jpg"]
  }' \
  --output-dir <absolute/path>
```

Multi-ref composition:
```bash
runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Compose subject from image 1 into the room from image 2. Match the lighting and color palette of image 2. Keep image 1 subject identity (face, pose, clothing) unchanged.",
    "images": ["https://.../subject.jpg", "https://.../room.jpg"]
  }' \
  --output-dir <absolute/path>
```
Prompting — what actually works
- Lead with preservation goals. Always: "Keep [face / pose / clothing / brand / framing] unchanged." Then state the change. The model honors what's stated up front.
- Multilingual text — quote the characters, name the script: "the headline reads \"コーヒー\" in bold Japanese kana", "the label says \"АРОМА\" in Cyrillic, white on black", "the right-margin caption reads \"تخفيض\" in Arabic right-to-left". Don't paraphrase — quote.
- Directional language for spatial edits. Concrete spatial scopes work: "move the headline from top-right to bottom-center", "remove the leftmost object only", "replace the watermark in the bottom-right corner".
- Multi-ref numbering. When passing multiple `images`, refer to them by number: "subject from image 1, lighting from image 2, color palette from image 3". The model routes cues correctly.
- Use `size: "auto"` to preserve the input ratio. Only override it when the edit explicitly changes framing (e.g. cropping a 16:9 to 1:1).

Anti-patterns:
- Long compound edit instructions ("change A and B and C and D") → drift increases per added scope.
- Missing preservation goals → the model subtly rewrites the face / brand / framing.
- Paraphrasing in-image text instead of quoting it → the text comes out different.
- Asking for a `size` outside the 3 fixed values + `auto` → 422.
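Prompts that quote literal characters tend to collide with shell quoting (note the `'\''` escape in the single-ref example earlier). One option is to JSON-encode the `--input` body with a small helper instead of hand-escaping it; a sketch assuming `python3` is available on the machine running the CLI:

```shell
#!/usr/bin/env bash
# JSON-encode the --input body instead of hand-escaping quotes in the
# shell. Keeps quoted headlines (e.g. "コーヒー") byte-for-byte intact.
build_input() {
  local prompt="$1"; shift
  python3 - "$prompt" "$@" <<'EOF'
import json, sys
prompt, *images = sys.argv[1:]
# ensure_ascii=False keeps kana / CJK / Cyrillic characters literal
print(json.dumps({"prompt": prompt, "images": images}, ensure_ascii=False))
EOF
}
```

Then pass it through directly: `runcomfy run openai/gpt-image-2/edit --input "$(build_input 'the headline reads "コーヒー"' https://example.com/poster.jpg)" --output-dir out/`.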
Where it shines
| Use case | Why GPT Image Edit |
|---|---|
| Multilingual ad localization | One source asset → many language variants of the same headline |
| Brand-safe headline / CTA swaps | Layout precision + preservation language hold the rest stable |
| Multi-ref composition (subject from one, scene from another) | Numbered refs route cues correctly |
| Layout-precise repositioning | Directional language ("top-right to bottom-center") honored |
| Identity preservation across signage edits | Strongest in class for face / brand preservation through targeted edits |
Sample prompts (verified to produce strong results)
Background swap with full preservation (page example):

```
Turn the background into a bright minimal white-to-soft-gray studio
sweep with gentle floor shadow; add a large headline in-image that
reads "OPEN STUDIO" in a bold clean sans-serif, high contrast, centered;
keep the main person or product, pose, and face identity unchanged
```

Multilingual variant:

```
Keep the photograph, layout, lighting, and brand mark exactly as in the
input. Replace only the in-image headline.
The new headline reads "コーヒー" in bold Japanese kana, same position
and font weight as before.
```

Multi-ref composition:

```
Compose subject from image 1 into the kitchen from image 2.
Match the warm window light and color palette of image 2.
Keep subject identity (face, pose, clothing) from image 1 unchanged.
```
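The localization workflow (one source asset, many headline variants) comes down to stamping the multilingual template out once per headline. A sketch; the helper name and headline list are illustrative:

```shell
#!/usr/bin/env bash
# Emit one edit prompt per headline, reusing the multilingual template:
# preserve everything, replace only the in-image headline.
localized_prompts() {
  local h
  for h in "$@"; do
    printf 'Keep the photograph, layout, lighting, and brand mark exactly as in the input. Replace only the in-image headline. The new headline reads "%s", same position and font weight as before.\n' "$h"
  done
}
```

Each emitted line can then drive a separate `runcomfy run openai/gpt-image-2/edit --input ...` call against the same source image.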
Limitations
- `size`: 3 fixed values + `auto` — anything else 422s.
- `images`: up to 10 — first is primary, the rest are auxiliary cues.
- Long compound prompts drift — split into multiple passes when needed.
- For batch consistency across many SKU images, Nano Banana Edit (up to 20) is better.
- Photorealism on portraits — Nano Banana Pro wins head-to-head.
Exit codes
| code | meaning |
|---|---|
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |
Full reference: docs.runcomfy.com/cli/troubleshooting.
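Of these, only 75 is retryable; the rest indicate caller, auth, or upstream errors. A minimal retry wrapper, sketched here (not part of the CLI; the linear backoff is illustrative):

```shell
#!/usr/bin/env bash
# Re-run a command while it exits 75 (retryable: timeout / 429).
# Any other exit code is passed through unchanged.
run_with_retry() {
  local max="$1"; shift
  local attempt=0 rc
  while :; do
    "$@" && rc=0 || rc=$?       # capture exit code without tripping set -e
    [ "$rc" -ne 75 ] && return "$rc"
    attempt=$((attempt + 1))
    [ "$attempt" -ge "$max" ] && return "$rc"
    sleep "$attempt"            # linear backoff: 1s, 2s, ...
  done
}
```

For example: `run_with_retry 3 runcomfy run openai/gpt-image-2/edit --input "$JSON" --output-dir out/`.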
How it works
The skill invokes `runcomfy run openai/gpt-image-2/edit` with a JSON body matching the schema. The CLI POSTs to `https://model-api.runcomfy.net/v1/models/openai/gpt-image-2/edit`, polls the request, fetches the result, and downloads any `.runcomfy.net` / `.runcomfy.com` URL into `--output-dir`. Ctrl-C cancels the remote request before exit.

Security & Privacy
- Token storage: `runcomfy login` writes the API token to `~/.config/runcomfy/token.json` with mode 0600 (owner-only read/write). Set the `RUNCOMFY_TOKEN` env var to bypass the file entirely in CI / containers.
- Input boundary: the user prompt is passed to the CLI as a JSON string via `--input`. The CLI does NOT shell-expand the prompt; it transmits the JSON body directly to the Model API over HTTPS. No shell-injection surface from prompt content.
- Third-party content: image / mask / video URLs you pass are fetched by the RunComfy model server, not by the CLI on your machine. Treat external URLs as untrusted; image-based prompt injection is a known risk for any image-edit / video-edit model.
- Outbound endpoints: only `model-api.runcomfy.net` (request submission) and `*.runcomfy.net` / `*.runcomfy.com` (download whitelist for generated outputs). No telemetry, no callbacks.
- Generated-file size cap: the CLI aborts any single download > 2 GiB to prevent disk-fill from a malicious or runaway model output.