baoyu-imagine
Image Generation (AI SDK)
Official API-based image generation. Supports OpenAI, Azure OpenAI, Google, OpenRouter, DashScope (阿里通义万象), MiniMax, Jimeng (即梦), Seedream (豆包) and Replicate providers.
Script Directory
Agent Execution:
- `{baseDir}` = this SKILL.md file's directory
- Script path = `{baseDir}/scripts/main.ts`
- Resolve `${BUN_X}` runtime: if `bun` installed → `bun`; if `npx` available → `npx -y bun`; else suggest installing bun
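The runtime rule above can be sketched as a small function. This is only an illustration, not the skill's actual code: `resolveBunX` and the `CommandProbe` callback are hypothetical names, and how a command is detected on PATH is left to the caller.

```typescript
// Hypothetical probe: returns true when a command is available on PATH.
type CommandProbe = (command: string) => boolean;

// Mirrors the three documented cases: bun, then npx, then "suggest installing bun".
function resolveBunX(isAvailable: CommandProbe): string | null {
  if (isAvailable("bun")) return "bun";
  if (isAvailable("npx")) return "npx -y bun";
  return null; // nothing usable: the agent should suggest installing bun
}

// Example: a machine that only has npx.
const installed = new Set<string>(["npx"]);
const bunX = resolveBunX((command) => installed.has(command));
console.log(bunX ?? "No runtime found; suggest installing bun");
```

The resolved string is what the `${BUN_X}` placeholder stands for in the commands throughout this document.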
Step 0: Load Preferences ⛔ BLOCKING
**CRITICAL**: This step MUST complete BEFORE any image generation. Do NOT skip or defer.
Check EXTEND.md existence (priority: project → XDG config → user home):

```bash
# macOS, Linux, WSL, Git Bash
test -f .baoyu-skills/baoyu-imagine/EXTEND.md && echo "project"
test -f "${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-imagine/EXTEND.md" && echo "xdg"
test -f "$HOME/.baoyu-skills/baoyu-imagine/EXTEND.md" && echo "user"
```

```powershell
# PowerShell (Windows)
if (Test-Path .baoyu-skills/baoyu-imagine/EXTEND.md) { "project" }
$xdg = if ($env:XDG_CONFIG_HOME) { $env:XDG_CONFIG_HOME } else { "$HOME/.config" }
if (Test-Path "$xdg/baoyu-skills/baoyu-imagine/EXTEND.md") { "xdg" }
if (Test-Path "$HOME/.baoyu-skills/baoyu-imagine/EXTEND.md") { "user" }
```

| Result | Action |
|--------|--------|
| Found | Load, parse, apply settings. If `default_model.[provider]` is null → ask model only (Flow 2) |
| Not found | ⛔ Run first-time setup ([references/config/first-time-setup.md](references/config/first-time-setup.md)) → Save EXTEND.md → Then continue |

**CRITICAL**: If not found, complete the full setup (provider + model + quality + save location) using AskUserQuestion BEFORE generating any images. Generation is BLOCKED until EXTEND.md is created.

| Path | Location |
|------|----------|
| `.baoyu-skills/baoyu-imagine/EXTEND.md` | Project directory |
| `${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-imagine/EXTEND.md` | XDG config directory |
| `$HOME/.baoyu-skills/baoyu-imagine/EXTEND.md` | User home |

Legacy compatibility: if `.baoyu-skills/baoyu-image-gen/EXTEND.md` exists and the new path does not, the runtime renames it to `baoyu-imagine`. If both files exist, the runtime leaves them unchanged and uses the new path.
**EXTEND.md Supports**: Default provider | Default quality | Default aspect ratio | Default image size | Default models | Batch worker cap | Provider-specific batch limits
Schema: `references/config/preferences-schema.md`
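As a rough sketch of the lookup order checked by the shell commands in this step, the function below lists the candidate EXTEND.md paths and returns the first one that exists. The names `listExtendCandidates` and `findExtend` are hypothetical, the `exists` callback stands in for a real filesystem check, and treating the first hit as the winner is an assumption based on the stated priority.

```typescript
type ConfigSource = "project" | "xdg" | "user";

interface Candidate {
  source: ConfigSource;
  path: string;
}

// Builds the three locations in priority order.
// An unset or empty XDG_CONFIG_HOME falls back to "$HOME/.config", like ${XDG_CONFIG_HOME:-$HOME/.config}.
function listExtendCandidates(home: string, xdgConfigHome?: string): Candidate[] {
  const xdg = xdgConfigHome && xdgConfigHome.length > 0 ? xdgConfigHome : `${home}/.config`;
  return [
    { source: "project", path: ".baoyu-skills/baoyu-imagine/EXTEND.md" },
    { source: "xdg", path: `${xdg}/baoyu-skills/baoyu-imagine/EXTEND.md` },
    { source: "user", path: `${home}/.baoyu-skills/baoyu-imagine/EXTEND.md` },
  ];
}

// Returns the first existing candidate, or null when first-time setup is required.
function findExtend(candidates: Candidate[], exists: (path: string) => boolean): Candidate | null {
  return candidates.find((candidate) => exists(candidate.path)) ?? null;
}

// Example with a fake filesystem that only contains the user-level file.
const candidates = listExtendCandidates("/home/alice");
const found = findExtend(candidates, (path) => path === "/home/alice/.baoyu-skills/baoyu-imagine/EXTEND.md");
console.log(found ? `Load preferences from ${found.source}` : "Run first-time setup");
```

A `null` result corresponds to the "Not found" row: generation stays blocked until setup has written EXTEND.md.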
Usage

```bash
# Basic
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image cat.png

# With aspect ratio
${BUN_X} {baseDir}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9

# High quality
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k

# From prompt files
${BUN_X} {baseDir}/scripts/main.ts --promptfiles system.md content.md --image out.png

# With reference images (Google, OpenAI, Azure OpenAI, OpenRouter, Replicate, MiniMax, or Seedream 4.0/4.5/5.0)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png

# With reference images (explicit provider/model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png

# Azure OpenAI (model means deployment name)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider azure --model gpt-image-1.5

# OpenRouter (recommended default model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openrouter

# OpenRouter with reference images
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider openrouter --model google/gemini-3.1-flash-image-preview --ref source.png

# Specific provider
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openai

# DashScope (阿里通义万象)
${BUN_X} {baseDir}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope

# DashScope Qwen-Image 2.0 Pro (recommended for custom sizes and text rendering)
${BUN_X} {baseDir}/scripts/main.ts --prompt "为咖啡品牌设计一张 21:9 横幅海报,包含清晰中文标题" --image out.png --provider dashscope --model qwen-image-2.0-pro --size 2048x872

# DashScope legacy Qwen fixed-size model
${BUN_X} {baseDir}/scripts/main.ts --prompt "一张电影感海报" --image out.png --provider dashscope --model qwen-image-max --size 1664x928

# MiniMax
${BUN_X} {baseDir}/scripts/main.ts --prompt "A fashion editorial portrait by a bright studio window" --image out.jpg --provider minimax

# MiniMax with subject reference (best for character/portrait consistency)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A girl stands by the library window, cinematic lighting" --image out.jpg --provider minimax --model image-01 --ref portrait.png --ar 16:9

# MiniMax with custom size (documented for image-01)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cinematic poster" --image out.jpg --provider minimax --model image-01 --size 1536x1024

# Replicate (google/nano-banana-pro)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate

# Replicate with specific model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana

# Batch mode with saved prompt files
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json

# Batch mode with explicit worker count
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json --jobs 4 --json
```
Batch File Format

```json
{
  "jobs": 4,
  "tasks": [
    {
      "id": "hero",
      "promptFiles": ["prompts/hero.md"],
      "image": "out/hero.png",
      "provider": "replicate",
      "model": "google/nano-banana-pro",
      "ar": "16:9",
      "quality": "2k"
    },
    {
      "id": "diagram",
      "promptFiles": ["prompts/diagram.md"],
      "image": "out/diagram.png",
      "ref": ["references/original.png"]
    }
  ]
}
```

Paths in `promptFiles`, `image`, and `ref` are resolved relative to the batch file's directory. `jobs` is optional (overridden by CLI `--jobs`). Top-level array format (without `jobs` wrapper) is also accepted.
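To make the two accepted shapes concrete, here is a minimal sketch of how a parsed batch file could be normalized. The type names, `normalizeBatch`, and `resolveFromBatchDir` are hypothetical; the field list is taken from the example above, which fields are optional is an assumption, and the path helper handles only POSIX-style paths.

```typescript
// Field names follow the example batch file; optionality is assumed for illustration.
interface BatchTask {
  id?: string;
  promptFiles?: string[];
  image: string;
  provider?: string;
  model?: string;
  ar?: string;
  quality?: string;
  ref?: string[];
}

interface BatchFile {
  jobs?: number;
  tasks: BatchTask[];
}

// Accepts either the { jobs, tasks } wrapper or a bare top-level array of tasks.
// A CLI --jobs value, when given, overrides the value stored in the file.
function normalizeBatch(parsed: BatchFile | BatchTask[], cliJobs?: number): BatchFile {
  const file: BatchFile = Array.isArray(parsed) ? { tasks: parsed } : parsed;
  const jobs = cliJobs ?? file.jobs;
  return jobs === undefined ? { tasks: file.tasks } : { jobs, tasks: file.tasks };
}

// Resolves a task path against the batch file's directory (absolute paths pass through).
function resolveFromBatchDir(batchDir: string, relativePath: string): string {
  if (relativePath.startsWith("/")) return relativePath;
  return `${batchDir.replace(/\/+$/, "")}/${relativePath}`;
}

const batch = normalizeBatch([{ id: "hero", image: "out/hero.png" }], 4);
console.log(batch.jobs, batch.tasks.map((task) => resolveFromBatchDir("/work/batch", task.image)));
```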
Options
| Option | Description |
|---|---|
| `--prompt` | Prompt text |
| `--promptfiles` | Read prompt from files (concatenated) |
| `--image` | Output image path (required in single-image mode) |
| `--batchfile` | JSON batch file for multi-image generation |
| `--jobs` | Worker count for batch mode (default: auto, max from config, built-in default 10) |
| `--provider` | Force provider (default: auto-detect) |
| `--model` | Model ID (Google: …) |
| `--ar` | Aspect ratio (e.g., …) |
| `--size` | Size (e.g., …) |
| `--quality` | Quality preset (default: …) |
| `--imageSize` | Image size for Google/OpenRouter (default: from quality) |
| `--ref` | Reference images. Supported by Google multimodal, OpenAI GPT Image edits, Azure OpenAI edits (PNG/JPG only), OpenRouter multimodal models, Replicate, MiniMax subject-reference, and Seedream 5.0/4.5/4.0. Not supported by Jimeng, Seedream 3.0, or removed SeedEdit 3.0 |
| | Number of images |
| `--json` | JSON output |
Environment Variables
| Variable | Description |
|---|---|
| OpenAI API key |
| Azure OpenAI API key |
| OpenRouter API key |
| Google API key |
| DashScope API key (阿里云) |
| MiniMax API key |
| Replicate API token |
| Jimeng (即梦) Volcengine access key |
| Jimeng (即梦) Volcengine secret key |
| Seedream (豆包) Volcengine ARK API key |
| OpenAI model override |
| Azure default deployment name |
| Backward-compatible alias for Azure default deployment/model name |
| OpenRouter model override (default: |
| Google model override |
| DashScope model override (default: |
| MiniMax model override (default: |
| Replicate model override (default: google/nano-banana-pro) |
| Jimeng model override (default: jimeng_t2i_v40) |
| Seedream model override (default: doubao-seedream-5-0-260128) |
| Custom OpenAI endpoint |
| Azure resource endpoint or deployment endpoint |
| Azure image API version (default: |
| Custom OpenRouter endpoint (default: |
| Optional app/site URL for OpenRouter attribution |
| Optional app name for OpenRouter attribution |
| Custom Google endpoint |
| Custom DashScope endpoint |
| Custom MiniMax endpoint (default: |
| Custom Replicate endpoint |
| Custom Jimeng endpoint (default: |
| Jimeng region (default: |
| Custom Seedream endpoint (default: |
| Override batch worker cap |
| Override provider concurrency, e.g. |
| Override provider start gap, e.g. |
Load Priority: CLI args > EXTEND.md > env vars > `<cwd>/.baoyu-skills/.env` > `~/.baoyu-skills/.env`
Model Resolution
Model priority (highest → lowest), applies to all providers:
- CLI flag: `--model <id>`
- EXTEND.md: `default_model.[provider]`
- Env var: `<PROVIDER>_IMAGE_MODEL` (e.g., `GOOGLE_IMAGE_MODEL`)
- Built-in default

For Azure, `--model` / `default_model.azure` should be the Azure deployment name. `AZURE_OPENAI_DEPLOYMENT` is the preferred env var, and `AZURE_OPENAI_IMAGE_MODEL` remains as a backward-compatible alias.

EXTEND.md overrides env vars. If both EXTEND.md `default_model.google: "gemini-3-pro-image-preview"` and env var `GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview` exist, EXTEND.md wins.

Agent MUST display model info before each generation:
- Show: `Using [provider] / [model]`
- Show switch hint: `Switch model: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODEL`
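The priority chain above can be illustrated with a short function. This is a sketch under stated assumptions: `resolveModel`, `modelBanner`, and the `ModelSources` shape are hypothetical names, and the built-in default in the example is a placeholder rather than a real model ID.

```typescript
interface ModelSources {
  cliModel?: string;           // value of --model <id>
  extendModel?: string | null; // default_model.[provider] from EXTEND.md (may be null)
  envModel?: string;           // <PROVIDER>_IMAGE_MODEL
  builtinDefault: string;      // provider's built-in default
}

interface ResolvedModel {
  model: string;
  source: "cli" | "extend" | "env" | "builtin";
}

// Walks the sources from highest to lowest priority and returns the first one that is set.
function resolveModel(sources: ModelSources): ResolvedModel {
  if (sources.cliModel) return { model: sources.cliModel, source: "cli" };
  if (sources.extendModel) return { model: sources.extendModel, source: "extend" };
  if (sources.envModel) return { model: sources.envModel, source: "env" };
  return { model: sources.builtinDefault, source: "builtin" };
}

// Formats the line the agent must show before each generation.
function modelBanner(provider: string, model: string): string {
  return `Using ${provider} / ${model}`;
}

// EXTEND.md and the env var are both set, so EXTEND.md wins.
const resolved = resolveModel({
  extendModel: "gemini-3-pro-image-preview",
  envModel: "gemini-3.1-flash-image-preview",
  builtinDefault: "placeholder-default-model", // placeholder, not a real default
});
console.log(modelBanner("google", resolved.model));
```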
DashScope Models
Use `--model qwen-image-2.0-pro` or set `default_model.dashscope` / `DASHSCOPE_IMAGE_MODEL` when the user wants official Qwen-Image behavior.

Official DashScope model families:
- `qwen-image-2.0-pro`, `qwen-image-2.0-pro-2026-03-03`, `qwen-image-2.0`, `qwen-image-2.0-2026-03-03`
  - Free-form `size` in `width*height` format
  - Total pixels must stay between `512*512` and `2048*2048`
  - Default size is approximately `1024*1024`
  - Best choice for custom ratios such as `21:9` and text-heavy Chinese/English layouts
- `qwen-image-max`, `qwen-image-max-2025-12-30`, `qwen-image-plus`, `qwen-image-plus-2026-01-09`, `qwen-image`
  - Fixed sizes only: `1664*928`, `1472*1104`, `1328*1328`, `1104*1472`, `928*1664`
  - Default size is `1664*928`
  - `qwen-image` currently has the same capability as `qwen-image-plus`
- Legacy DashScope models such as `z-image-turbo`, `z-image-ultra`, `wanx-v1`
  - Keep using them only when the user explicitly asks for legacy behavior or compatibility

When translating CLI args into DashScope behavior:
- `--size` wins over `--ar`
- For `qwen-image-2.0*`, prefer explicit `--size`; otherwise infer from `--ar` and use the official recommended resolutions below
- For `qwen-image-max/plus/image`, only use the five official fixed sizes; if the requested ratio is not covered, switch to `qwen-image-2.0-pro`
- `--quality` is a baoyu-imagine compatibility preset, not a native DashScope API field. Mapping `normal` / `2k` onto the `qwen-image-2.0*` table below is an implementation inference, not an official API guarantee

Recommended `qwen-image-2.0*` sizes for common aspect ratios:

| Ratio | `normal` | `2k` |
|---|---|---|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |

DashScope official APIs also expose `negative_prompt`, `prompt_extend`, and `watermark`, but `baoyu-imagine` does not expose them as dedicated CLI flags today.
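The size rules for the two Qwen-Image families can be checked with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, and converting a CLI-style `WxH` value into the `width*height` form is an assumption about how the two notations relate.

```typescript
// The five fixed sizes listed for qwen-image-max / plus / image.
const QWEN_FIXED_SIZES = ["1664*928", "1472*1104", "1328*1328", "1104*1472", "928*1664"];
// Pixel budget listed for the qwen-image-2.0* family.
const MIN_TOTAL_PIXELS = 512 * 512;
const MAX_TOTAL_PIXELS = 2048 * 2048;

interface Size {
  width: number;
  height: number;
}

// Parses "2048x872" or "2048*872"; returns null for anything else.
function parseSize(size: string): Size | null {
  const match = /^(\d+)[xX*](\d+)$/.exec(size.trim());
  if (!match) return null;
  return { width: Number(match[1]), height: Number(match[2]) };
}

// Formats a size in the width*height notation.
function toDashScopeSize(size: Size): string {
  return `${size.width}*${size.height}`;
}

// qwen-image-2.0*: any size whose total pixel count is within the documented range.
function isValidFreeFormSize(size: Size): boolean {
  const total = size.width * size.height;
  return total >= MIN_TOTAL_PIXELS && total <= MAX_TOTAL_PIXELS;
}

// qwen-image-max / plus / image: only the five fixed sizes.
function isFixedQwenSize(size: Size): boolean {
  return QWEN_FIXED_SIZES.includes(toDashScopeSize(size));
}

const banner = parseSize("2048x872");
if (banner !== null) {
  console.log(toDashScopeSize(banner), isValidFreeFormSize(banner), isFixedQwenSize(banner));
}
```

A size that passes `isValidFreeFormSize` but fails `isFixedQwenSize`, such as the 21:9 banner above, is the case where the text recommends switching to `qwen-image-2.0-pro`.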
MiniMax Models
Use `--model image-01` or set `default_model.minimax` / `MINIMAX_IMAGE_MODEL` when the user wants MiniMax image generation.

Official MiniMax image model options currently documented in the API reference:
- `image-01` (recommended default)
  - Supports text-to-image and subject-reference image generation
  - Supports official `aspect_ratio` values: `1:1`, `16:9`, `4:3`, `3:2`, `2:3`, `3:4`, `9:16`, `21:9`
  - Supports documented custom `width` / `height` output sizes when using `--size <WxH>`
  - `width` and `height` must both be between `512` and `2048`, and both must be divisible by `8`
- `image-01-live`
  - Lower-latency variant
  - Use `--ar` for sizing; MiniMax documents custom `width` / `height` as only effective for `image-01`

MiniMax subject reference notes:
- `--ref` files are sent as MiniMax `subject_reference`
- MiniMax docs currently describe `subject_reference[].type` as `character`
- Official docs say `image_file` supports public URLs or Base64 Data URLs; `baoyu-imagine` sends local refs as Data URLs
- Official docs recommend front-facing portrait references in JPG/JPEG/PNG under 10MB
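The `image-01` size constraints lend themselves to a quick pre-flight check. The function below is a minimal sketch with a hypothetical name; it only encodes the range and divisibility rules stated above and says nothing about how the script itself validates input.

```typescript
// Returns a list of human-readable problems; an empty list means the size is acceptable.
function validateMiniMaxSize(width: number, height: number): string[] {
  const problems: string[] = [];
  const dimensions: Array<[string, number]> = [
    ["width", width],
    ["height", height],
  ];
  for (const [name, value] of dimensions) {
    if (value < 512 || value > 2048) {
      problems.push(`${name} must be between 512 and 2048 (got ${value})`);
    }
    if (value % 8 !== 0) {
      problems.push(`${name} must be divisible by 8 (got ${value})`);
    }
  }
  return problems;
}

// The size used in the MiniMax custom-size usage example.
console.log(validateMiniMaxSize(1536, 1024));
// A size that breaks both rules for the height.
console.log(validateMiniMaxSize(1000, 500));
```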
OpenRouter Models
Use full OpenRouter model IDs, e.g.:
- `google/gemini-3.1-flash-image-preview` (recommended, supports image output and reference-image workflows)
- `google/gemini-2.5-flash-image-preview`
- `black-forest-labs/flux.2-pro`
- Other OpenRouter image-capable model IDs

Notes:
- OpenRouter image generation uses `/chat/completions`, not the OpenAI `/images` endpoints
- If `--ref` is used, choose a multimodal model that supports image input and image output
- `--imageSize` maps to OpenRouter `imageGenerationOptions.size`; `--size <WxH>` is converted to the nearest OpenRouter size and inferred aspect ratio when possible
Replicate Models
Supported model formats:
- `owner/name` (recommended for official models), e.g. `google/nano-banana-pro`
- `owner/name:version` (community models by version), e.g. `stability-ai/sdxl:<version>`

Examples:

```bash
# Use Replicate default model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate

# Override model explicitly
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
```
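For illustration, the two Replicate model formats can be told apart with a small parser. `parseReplicateModel` and `ReplicateModelRef` are hypothetical names, and the exact character rules in the pattern are an assumption, not something taken from Replicate's documentation.

```typescript
interface ReplicateModelRef {
  owner: string;
  name: string;
  version?: string; // present only for the owner/name:version format
}

// Splits "owner/name" or "owner/name:version" into its parts; throws on anything else.
function parseReplicateModel(id: string): ReplicateModelRef {
  const match = /^([^/:\s]+)\/([^/:\s]+)(?::([^/:\s]+))?$/.exec(id);
  const owner = match?.[1];
  const name = match?.[2];
  const version = match?.[3];
  if (owner === undefined || name === undefined) {
    throw new Error(`Invalid Replicate model id: ${id}`);
  }
  return version === undefined ? { owner, name } : { owner, name, version };
}

console.log(parseReplicateModel("google/nano-banana-pro"));
console.log(parseReplicateModel("stability-ai/sdxl:abc123")); // "abc123" is a made-up version for the example
```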
Provider Selection
- `--ref` provided + no `--provider` → auto-select Google first, then OpenAI, then Azure, then OpenRouter, then Replicate, then Seedream, then MiniMax (MiniMax subject reference is more specialized toward character/portrait consistency)
- `--provider` specified → use it (if `--ref`, must be `google`, `openai`, `azure`, `openrouter`, `replicate`, `seedream`, or `minimax`)
- Only one API key available → use that provider
- Multiple available → default to Google
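The selection rules above can be sketched as a single function. This is an illustration with hypothetical names; in particular, falling back to the first available provider when several keys exist but none is Google is an assumption, since the rules do not cover that case.

```typescript
type Provider =
  | "google" | "openai" | "azure" | "openrouter" | "dashscope"
  | "minimax" | "jimeng" | "seedream" | "replicate";

// Providers that accept --ref, in the documented auto-selection order.
const REF_PRIORITY: Provider[] = ["google", "openai", "azure", "openrouter", "replicate", "seedream", "minimax"];

interface SelectionInput {
  explicit?: Provider;   // value of --provider, if any
  hasRef: boolean;       // whether --ref was passed
  available: Provider[]; // providers with an API key configured
}

function selectProvider(input: SelectionInput): Provider {
  if (input.explicit) {
    if (input.hasRef && !REF_PRIORITY.includes(input.explicit)) {
      throw new Error(`${input.explicit} does not support reference images`);
    }
    return input.explicit;
  }
  if (input.hasRef) {
    const pick = REF_PRIORITY.find((provider) => input.available.includes(provider));
    if (pick === undefined) throw new Error("No configured provider supports reference images");
    return pick;
  }
  const [first] = input.available;
  if (first === undefined) throw new Error("Missing API key");
  if (input.available.length === 1) return first;
  // Assumption: when Google is not among several available providers, use the first one.
  return input.available.includes("google") ? "google" : first;
}

console.log(selectProvider({ hasRef: true, available: ["minimax", "openai"] }));
```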
Quality Presets

| Preset | Google imageSize | OpenAI Size | OpenRouter size | Replicate resolution | Use Case |
|---|---|---|---|---|---|
| `normal` | 1K | 1024px | 1K | 1K | Quick previews |
| `2k` | 2K | 2048px | 2K | 2K | Covers, illustrations, infographics |

Google/OpenRouter imageSize: Can be overridden with `--imageSize 1K|2K|4K`
Aspect Ratios
Supported: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `2.35:1`
- Google multimodal: uses `imageConfig.aspectRatio`
- OpenAI: maps to closest supported size
- OpenRouter: sends `imageGenerationOptions.aspect_ratio`; if only `--size <WxH>` is given, aspect ratio is inferred automatically
- Replicate: passes `aspect_ratio` to model; when `--ref` is provided without `--ar`, defaults to `match_input_image`
- MiniMax: sends official `aspect_ratio` values directly; if `--size <WxH>` is given without `--ar`, `width` / `height` are sent for `image-01`
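One plausible way to infer an aspect ratio from a `WxH` size is to pick the supported ratio closest to the requested shape. The sketch below compares ratios on a logarithmic scale so that wide and tall shapes are treated symmetrically. It is an assumption about what such an inference could look like, with hypothetical function names, and is not the script's actual algorithm.

```typescript
// The ratios listed as supported above.
const SUPPORTED_RATIOS = ["1:1", "16:9", "9:16", "4:3", "3:4", "2.35:1"];

// Converts "16:9" into the number 16 / 9.
function ratioValue(ratio: string): number {
  const parts = ratio.split(":");
  return Number(parts[0]) / Number(parts[1]);
}

// Returns the supported ratio whose shape is closest to width / height.
function nearestAspectRatio(width: number, height: number): string {
  const target = Math.log(width / height);
  let best = "1:1";
  let bestDistance = Infinity;
  for (const ratio of SUPPORTED_RATIOS) {
    const distance = Math.abs(target - Math.log(ratioValue(ratio)));
    if (distance < bestDistance) {
      best = ratio;
      bestDistance = distance;
    }
  }
  return best;
}

console.log(nearestAspectRatio(2048, 872));  // the 21:9-style banner size from the usage examples
console.log(nearestAspectRatio(1664, 928));
```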
Generation Mode
Default: Sequential generation.
Batch Parallel Generation: When `--batchfile` contains 2 or more pending tasks, the script automatically enables parallel generation.

| Mode | When to Use |
|---|---|
| Sequential (default) | Normal usage, single images, small batches |
| Parallel batch | Batch mode with 2+ tasks |

Execution choice:

| Situation | Preferred approach | Why |
|---|---|---|
| One image, or 1-2 simple images | Sequential | Lower coordination overhead and easier debugging |
| Multiple images already have saved prompt files | Batch (`--batchfile`) | Reuses finalized prompts, applies shared throttling/retries, and gives predictable throughput |
| Each image still needs separate reasoning, prompt writing, or style exploration | Subagents | The work is still exploratory, so each image may need independent analysis before generation |
| Output comes from | Batch ( | That workflow already produces prompt files, so direct batch execution is the intended path |

Rule of thumb:
- Prefer batch over subagents once prompt files are already saved and the task is "generate all of these"
- Use subagents only when generation is coupled with per-image thinking, rewriting, or divergent creative exploration

Parallel behavior:
- Default worker count is automatic, capped by config, built-in default 10
- Provider-specific throttling is applied only in batch mode, and the built-in defaults are tuned for faster throughput while still avoiding obvious RPM bursts
- You can override worker count with `--jobs <count>`
- Each image retries automatically up to 3 attempts
- Final output includes success count, failure count, and per-image failure reasons
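The parallel behavior described above (a bounded number of workers, up to 3 attempts per image, and a final summary) can be sketched as a small worker pool. All names here are hypothetical, and provider-specific throttling is deliberately left out, so this shows the general pattern rather than the script's implementation.

```typescript
interface Task {
  id: string;
  run: () => Promise<void>; // stands in for one image generation
}

interface BatchSummary {
  succeeded: number;
  failed: number;
  failures: { id: string; reason: string }[];
}

// Runs the task, retrying on failure until maxAttempts is reached.
async function withRetry(run: () => Promise<void>, maxAttempts = 3): Promise<void> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await run();
      return;
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

// Processes tasks with at most `jobs` workers pulling from a shared queue.
async function runBatch(tasks: Task[], jobs: number): Promise<BatchSummary> {
  const summary: BatchSummary = { succeeded: 0, failed: 0, failures: [] };
  let next = 0;
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const task = tasks[next++];
      if (task === undefined) break;
      try {
        await withRetry(task.run);
        summary.succeeded++;
      } catch (error) {
        summary.failed++;
        summary.failures.push({ id: task.id, reason: error instanceof Error ? error.message : String(error) });
      }
    }
  }
  const workerCount = Math.max(1, Math.min(jobs, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return summary;
}

// Example: one task that always succeeds, one that always fails.
runBatch(
  [
    { id: "hero", run: async () => {} },
    { id: "diagram", run: async () => { throw new Error("provider error"); } },
  ],
  4,
).then((summary) => console.log(summary));
```

Because JavaScript runs these workers on a single thread, the shared `next` counter needs no locking; each worker claims an index before its first `await`.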
Error Handling
- Missing API key → error with setup instructions
- Generation failure → auto-retry up to 3 attempts per image
- Invalid aspect ratio → warning, proceed with default
- Reference images with unsupported provider/model → error with fix hint
Extension Support
Custom configurations via EXTEND.md. See Preferences section for paths and supported options.