openrouter-images

OpenRouter Images

Generate images from text prompts and edit existing images via OpenRouter's chat completions API with image modalities.

Prerequisites

The `OPENROUTER_API_KEY` environment variable must be set. Get a key at https://openrouter.ai/keys
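
A caller can fail fast when the key is missing, before any request is made. The sketch below is a minimal illustration; `requireApiKey` is a hypothetical helper name, and the bundled scripts may validate the variable differently.

```typescript
// Hypothetical helper (not part of the skill's scripts): returns the API key
// from an environment-like object, or throws a descriptive error.
function requireApiKey(env: Record<string, string | undefined>): string {
  const raw = env.OPENROUTER_API_KEY;
  const key = raw ? raw.trim() : "";
  if (!key) {
    throw new Error(
      "OPENROUTER_API_KEY is not set. Get a key at https://openrouter.ai/keys",
    );
  }
  return key;
}
```

In a Node script this would typically be called as `requireApiKey(process.env)`.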

First-Time Setup

```bash
cd <skill-path>/scripts && npm install
```

Decision Tree

Pick the right script based on what the user is asking:

| User wants to... | Script | Example |
| --- | --- | --- |
| Generate an image from a text description | `generate.ts "prompt"` | "Create an image of a sunset over mountains" |
| Generate with specific aspect ratio | `generate.ts "prompt" --aspect-ratio 16:9` | "Make a wide landscape image of a forest" |
| Generate with a different model | `generate.ts "prompt" --model <id>` | "Generate using gemini-2.5-flash-image" |
| Edit or modify an existing image | `edit.ts path "prompt"` | "Make the sky purple in photo.png" |
| Transform an image with instructions | `edit.ts path "prompt"` | "Add a party hat to the animal in this image" |

Generate Image

Create a new image from a text prompt:

```bash
cd <skill-path>/scripts && npx tsx generate.ts "a red panda wearing sunglasses"
cd <skill-path>/scripts && npx tsx generate.ts "a futuristic cityscape at night" --aspect-ratio 16:9
cd <skill-path>/scripts && npx tsx generate.ts "pixel art of a dragon" --output dragon.png
cd <skill-path>/scripts && npx tsx generate.ts "a watercolor painting" --model google/gemini-2.5-flash-image
```

Options

| Flag | Description | Default |
| --- | --- | --- |
| `--model <id>` | OpenRouter model ID | `google/gemini-3.1-flash-image-preview` |
| `--output <path>` | Output file path | `image-YYYYMMDD-HHmmss.png` |
| `--aspect-ratio <r>` | Aspect ratio (e.g. `16:9`, `1:1`, `4:3`) | Model default |
| `--image-size <s>` | Image size (e.g. `1K`, `2K`) | Model default |
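
To make the default `--output` pattern concrete, here is one way a name of the form `image-YYYYMMDD-HHmmss.png` could be built. This is only a sketch: `defaultOutputName` is a hypothetical function, and it assumes local time, whereas the actual script may use UTC or another clock.

```typescript
// Hypothetical sketch of the default output name, assuming local time.
function defaultOutputName(now: Date = new Date()): string {
  // Zero-pad single-digit values to two characters.
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  const date = `${now.getFullYear()}${pad(now.getMonth() + 1)}${pad(now.getDate())}`;
  const time = `${pad(now.getHours())}${pad(now.getMinutes())}${pad(now.getSeconds())}`;
  return `image-${date}-${time}.png`;
}
```

Under that assumption, a run at 14:30:22 local time on 5 March 2026 would produce `image-20260305-143022.png`, matching the file name shape shown in the output examples.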

Edit Image

Modify an existing image with a text prompt:

```bash
cd <skill-path>/scripts && npx tsx edit.ts photo.png "make the sky purple"
cd <skill-path>/scripts && npx tsx edit.ts avatar.jpg "add a party hat" --output avatar-hat.png
cd <skill-path>/scripts && npx tsx edit.ts scene.png "convert to watercolor style" --model google/gemini-2.5-flash-image
```

Options

| Flag | Description | Default |
| --- | --- | --- |
| `--model <id>` | OpenRouter model ID | `google/gemini-3.1-flash-image-preview` |
| `--output <path>` | Output file path | `image-YYYYMMDD-HHmmss.png` |
| `--aspect-ratio <r>` | Aspect ratio (e.g. `16:9`, `1:1`, `4:3`) | Model default |
| `--image-size <s>` | Image size (e.g. `1K`, `2K`) | Model default |

Supported input formats: `.png`, `.jpg`, `.jpeg`, `.webp`, `.gif`
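
The supported extensions map naturally onto MIME types. The sketch below shows one way to validate an input path and wrap already-encoded image bytes in a data URL. Both `mimeTypeFor` and `toDataUrl` are hypothetical names, and sending the source image as a base64 data URL is an assumption; the source does not say how `edit.ts` uploads the image.

```typescript
// Extension-to-MIME map covering the input formats listed above.
const MIME_BY_EXTENSION: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".webp": "image/webp",
  ".gif": "image/gif",
};

// Hypothetical helper: resolves the MIME type or rejects unsupported files.
function mimeTypeFor(filePath: string): string {
  const dot = filePath.lastIndexOf(".");
  const ext = dot === -1 ? "" : filePath.slice(dot).toLowerCase();
  const mime = MIME_BY_EXTENSION[ext];
  if (!mime) {
    const supported = Object.keys(MIME_BY_EXTENSION).join(", ");
    throw new Error(`Unsupported input format "${ext}". Supported: ${supported}`);
  }
  return mime;
}

// Hypothetical helper: assumes the image is sent as a base64 data URL.
function toDataUrl(filePath: string, base64Data: string): string {
  return `data:${mimeTypeFor(filePath)};base64,${base64Data}`;
}
```

Checking the extension up front gives the user a clear error before any API call is made.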

Output Format

generate.ts

```json
{
  "model": "google/gemini-3.1-flash-image-preview",
  "prompt": "a red panda wearing sunglasses",
  "images_saved": ["/absolute/path/to/image-20260305-143022.png"],
  "count": 1
}
```

edit.ts

```json
{
  "model": "google/gemini-3.1-flash-image-preview",
  "source_image": "photo.png",
  "prompt": "make the sky purple",
  "images_saved": ["/absolute/path/to/image-20260305-143055.png"],
  "count": 1
}
```
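
A caller that wraps these scripts may want to parse the JSON and turn it into a short summary. The following is a sketch under assumptions: it presumes the JSON arrives on stdout (the source only states that the model's text response goes to stderr), and `ScriptResult`, `parseScriptResult`, and `summarize` are hypothetical names.

```typescript
// Shape of the JSON shown above; source_image appears only for edit.ts.
interface ScriptResult {
  model: string;
  prompt: string;
  images_saved: string[];
  count: number;
  source_image?: string;
}

// Hypothetical parser: validates the fields before trusting them.
function parseScriptResult(stdout: string): ScriptResult {
  const data = JSON.parse(stdout) as Partial<ScriptResult>;
  if (
    typeof data.model !== "string" ||
    typeof data.prompt !== "string" ||
    !Array.isArray(data.images_saved)
  ) {
    throw new Error("Unexpected script output shape");
  }
  const result: ScriptResult = {
    model: data.model,
    prompt: data.prompt,
    images_saved: data.images_saved,
    count: typeof data.count === "number" ? data.count : data.images_saved.length,
  };
  if (typeof data.source_image === "string") {
    result.source_image = data.source_image;
  }
  return result;
}

// Hypothetical one-line summary naming the model, source, and saved files.
function summarize(result: ScriptResult): string {
  const action = result.source_image ? `Edited ${result.source_image}` : "Generated";
  return `${action} with ${result.model}: ${result.images_saved.join(", ")}`;
}
```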

API Response Shapes

Image generation uses `POST /api/v1/responses` with `modalities: ["image", "text"]`. See the Responses API reference and image generation guide for full request details.

The image-specific output item type is `image_generation_call`, which is not obvious from the general Responses API docs:

```json
{
  "type": "image_generation_call",
  "id": "imagegen-abc123",
  "status": "completed",
  "result": "<base64-encoded image data>"
}
```

This appears alongside standard `message` output items in the `output` array. Text and image outputs may each be absent depending on the model and prompt.
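
Because image and text items share one `output` array and either may be missing, client code has to filter by item type. The sketch below picks out `image_generation_call` items and decodes their base64 `result`. The names `isImageCall`, `decodeBase64`, and `extractImages` are hypothetical, and the code relies only on the fields shown above rather than on the full response schema.

```typescript
// Fields taken from the example item above.
interface ImageGenerationCall {
  type: "image_generation_call";
  id: string;
  status: string;
  result: string; // base64-encoded image data
}

// Type guard: accepts only items that look like a finished image call.
function isImageCall(item: unknown): item is ImageGenerationCall {
  if (typeof item !== "object" || item === null) return false;
  const candidate = item as Record<string, unknown>;
  return (
    candidate.type === "image_generation_call" &&
    typeof candidate.result === "string"
  );
}

// Decodes base64 into bytes using the global atob (available in Node 16+).
function decodeBase64(data: string): Uint8Array {
  const binary = atob(data);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Returns the decoded bytes of every image item; an empty array means the
// response carried no image output.
function extractImages(response: { output?: unknown[] }): Uint8Array[] {
  const items = response.output ?? [];
  return items.filter(isImageCall).map((item) => decodeBase64(item.result));
}
```

An empty return value is a normal outcome here, so callers should handle it rather than treat it as a failure.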

Using a Different Model

The default model is `google/gemini-3.1-flash-image-preview` (Nano Banana 2). To use a different model, pass `--model <id>` with any OpenRouter model ID that supports image output modalities.

Use the `openrouter-models` skill to discover image-capable models:

```bash
cd <openrouter-models-skill-path>/scripts && npx tsx search-models.ts --modality image
```

Presenting Results

  • After generating or editing, display the saved image to the user
  • Include the model used and any text response the model provided (printed to stderr)
  • If multiple images are returned, show all of them
  • When the user doesn't specify an output path, tell them where the file was saved
  • For edit operations, mention the source image that was modified