baoyu-image-gen
Image Generation (AI SDK)
Official API-based image generation. Supports OpenAI, Google, OpenRouter, DashScope (阿里通义万象), Jimeng (即梦), Seedream (豆包) and Replicate providers.
Script Directory
Agent Execution:
- `{baseDir}` = this SKILL.md file's directory
- Script path = `{baseDir}/scripts/main.ts`
- Resolve `${BUN_X}` runtime: if `bun` installed → `bun`; if `npx` available → `npx -y bun`; else suggest installing bun
Step 0: Load Preferences ⛔ BLOCKING
CRITICAL: This step MUST complete BEFORE any image generation. Do NOT skip or defer.
Check EXTEND.md existence from the current working directory:
```bash
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "found"
```

| Result | Action |
|---|---|
| Found | Load, parse, apply settings. If |
| Not found | ⛔ STOP. Do NOT generate any images. Read references/config/first-time-setup.md and follow its Flow 1 checklist step by step. This is a multi-turn interactive setup that requires asking the user multiple questions. Resume image generation only after Step 5 (verify) passes. |
CRITICAL: The first-time setup is a multi-step interactive workflow, NOT a single action. You must ask the user questions and wait for answers at each step.
| Path | Location |
|---|---|
| `.baoyu-skills/baoyu-image-gen/EXTEND.md` | Relative to current working directory |
EXTEND.md Supports: Default provider | Default quality | Default aspect ratio | Default image size | Default models | Batch worker cap | Provider-specific batch limits
Schema: `references/config/preferences-schema.md`
Usage

```bash
# Basic
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image cat.png

# With aspect ratio
${BUN_X} {baseDir}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9

# High quality
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k

# From prompt files
${BUN_X} {baseDir}/scripts/main.ts --promptfiles system.md content.md --image out.png

# With reference images (Google, OpenAI, OpenRouter, Replicate, or Seedream 4.0/4.5/5.0)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png

# With reference images (explicit provider/model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png

# OpenRouter (recommended default model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openrouter

# OpenRouter with reference images
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider openrouter --model google/gemini-3.1-flash-image-preview --ref source.png

# Specific provider
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openai

# DashScope (阿里通义万象)
${BUN_X} {baseDir}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope

# DashScope Qwen-Image 2.0 Pro (recommended for custom sizes and text rendering)
${BUN_X} {baseDir}/scripts/main.ts --prompt "为咖啡品牌设计一张 21:9 横幅海报,包含清晰中文标题" --image out.png --provider dashscope --model qwen-image-2.0-pro --size 2048x872

# DashScope legacy Qwen fixed-size model
${BUN_X} {baseDir}/scripts/main.ts --prompt "一张电影感海报" --image out.png --provider dashscope --model qwen-image-max --size 1664x928

# Replicate (google/nano-banana-pro)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate

# Replicate with specific model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana

# Batch mode with saved prompt files
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json

# Batch mode with explicit worker count
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json --jobs 4 --json
```
Batch File Format

```json
{
  "jobs": 4,
  "tasks": [
    {
      "id": "hero",
      "promptFiles": ["prompts/hero.md"],
      "image": "out/hero.png",
      "provider": "replicate",
      "model": "google/nano-banana-pro",
      "ar": "16:9",
      "quality": "2k"
    },
    {
      "id": "diagram",
      "promptFiles": ["prompts/diagram.md"],
      "image": "out/diagram.png",
      "ref": ["references/original.png"]
    }
  ]
}
```

Paths in `promptFiles`, `image`, and `ref` are resolved relative to the batch file's directory. `jobs` is optional (overridden by CLI `--jobs`). Top-level array format (without `jobs` wrapper) is also accepted.
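To make the two accepted shapes and the path rule concrete, a loader could normalize a parsed batch file roughly as follows. This is a minimal sketch and not the script's actual code: `BatchTask`, `NormalizedBatch`, `resolveFrom`, and `normalizeBatch` are hypothetical names, and the path join is a simplified POSIX-only stand-in for a real path library.

```typescript
// Hypothetical task shape, mirroring the fields in the JSON example.
interface BatchTask {
  id?: string;
  promptFiles?: string[];
  image: string;
  ref?: string[];
  provider?: string;
  model?: string;
  ar?: string;
  quality?: string;
}

interface NormalizedBatch {
  jobs: number | undefined;
  tasks: BatchTask[];
}

// Simplified POSIX-style join (assumption): absolute paths are kept as-is,
// relative ones are placed under the batch file's directory.
function resolveFrom(batchDir: string, p: string): string {
  if (p.startsWith("/")) return p;
  return `${batchDir.replace(/\/+$/, "")}/${p}`;
}

// Accepts either { jobs, tasks: [...] } or a bare top-level array of tasks.
function normalizeBatch(raw: unknown, batchDir: string, cliJobs?: number): NormalizedBatch {
  let fileJobs: number | undefined;
  let rawTasks: unknown;
  if (Array.isArray(raw)) {
    rawTasks = raw;
  } else if (typeof raw === "object" && raw !== null) {
    const wrapper = raw as { jobs?: number; tasks?: unknown };
    fileJobs = wrapper.jobs;
    rawTasks = wrapper.tasks;
  }
  if (!Array.isArray(rawTasks)) {
    throw new Error("Batch file must be a task array or an object with a tasks array");
  }
  const tasks = (rawTasks as BatchTask[]).map((task): BatchTask => {
    const resolved: BatchTask = { ...task, image: resolveFrom(batchDir, task.image) };
    if (task.promptFiles) resolved.promptFiles = task.promptFiles.map((p) => resolveFrom(batchDir, p));
    if (task.ref) resolved.ref = task.ref.map((p) => resolveFrom(batchDir, p));
    return resolved;
  });
  // CLI --jobs overrides the optional jobs field from the file.
  return { jobs: cliJobs ?? fileJobs, tasks };
}
```

Passing the batch file's directory explicitly keeps the sketch free of file-system calls, so the same function works for both the wrapped and the bare-array format.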
Options
| Option | Description |
|---|---|
| `--prompt` | Prompt text |
| `--promptfiles` | Read prompt from files (concatenated) |
| `--image` | Output image path (required in single-image mode) |
| `--batchfile` | JSON batch file for multi-image generation |
| `--jobs` | Worker count for batch mode (default: auto, max from config, built-in default 10) |
| `--provider` | Force provider (default: auto-detect) |
| `--model` | Model ID (Google: |
| `--ar` | Aspect ratio (e.g., |
| `--size` | Size (e.g., |
| `--quality` | Quality preset (default: |
| `--imageSize` | Image size for Google/OpenRouter (default: from quality) |
| `--ref` | Reference images. Supported by Google multimodal, OpenAI GPT Image edits, OpenRouter multimodal models, Replicate, and Seedream 5.0/4.5/4.0. Not supported by Jimeng, Seedream 3.0, or removed SeedEdit 3.0 |
| | Number of images |
| `--json` | JSON output |
Environment Variables
| Variable | Description |
|---|---|
| | OpenAI API key |
| | OpenRouter API key |
| | Google API key |
| | DashScope API key (阿里云) |
| | Replicate API token |
| | Jimeng (即梦) Volcengine access key |
| | Jimeng (即梦) Volcengine secret key |
| | Seedream (豆包) Volcengine ARK API key |
| `OPENAI_IMAGE_MODEL` | OpenAI model override |
| `OPENROUTER_IMAGE_MODEL` | OpenRouter model override (default: |
| `GOOGLE_IMAGE_MODEL` | Google model override |
| `DASHSCOPE_IMAGE_MODEL` | DashScope model override (default: |
| `REPLICATE_IMAGE_MODEL` | Replicate model override (default: google/nano-banana-pro) |
| `JIMENG_IMAGE_MODEL` | Jimeng model override (default: jimeng_t2i_v40) |
| `SEEDREAM_IMAGE_MODEL` | Seedream model override (default: doubao-seedream-5-0-260128) |
| | Custom OpenAI endpoint |
| | Custom OpenRouter endpoint (default: |
| | Optional app/site URL for OpenRouter attribution |
| | Optional app name for OpenRouter attribution |
| | Custom Google endpoint |
| | Custom DashScope endpoint |
| | Custom Replicate endpoint |
| | Custom Jimeng endpoint (default: |
| | Jimeng region (default: |
| | Custom Seedream endpoint (default: |
| | Override batch worker cap |
| | Override provider concurrency, e.g. |
| | Override provider start gap, e.g. |
Load Priority: CLI args > EXTEND.md > env vars > `<cwd>/.baoyu-skills/.env` > `~/.baoyu-skills/.env`
Model Resolution
Model priority (highest → lowest), applies to all providers:
- CLI flag: `--model <id>`
- EXTEND.md: `default_model.[provider]`
- Env var: `<PROVIDER>_IMAGE_MODEL` (e.g., `GOOGLE_IMAGE_MODEL`)
- Built-in default

EXTEND.md overrides env vars. If both EXTEND.md `default_model.google: "gemini-3-pro-image-preview"` and env var `GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview` exist, EXTEND.md wins.

Agent MUST display model info before each generation:
- Show: `Using [provider] / [model]`
- Show switch hint: `Switch model: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODEL`
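
The priority order can be pictured as a first-match lookup over four sources. The following is a minimal sketch under stated assumptions, not the script's implementation: `ModelSources`, `resolveModel`, and `modelBanner` are hypothetical names, the built-in default is passed in by the caller, and empty strings are treated as unset.

```typescript
// Hypothetical container for the four sources, highest priority first.
interface ModelSources {
  cliModel?: string; // value of --model <id>
  extendDefaults?: Record<string, string | null | undefined>; // EXTEND.md default_model.[provider]
  env?: Record<string, string | undefined>; // environment, e.g. GOOGLE_IMAGE_MODEL
  builtinDefault: string; // provider's built-in default, supplied by the caller
}

function resolveModel(provider: string, sources: ModelSources): { model: string; source: string } {
  const envKey = `${provider.toUpperCase()}_IMAGE_MODEL`;
  const candidates: Array<[string, string | null | undefined]> = [
    ["cli", sources.cliModel],
    ["extend", sources.extendDefaults?.[provider]],
    ["env", sources.env?.[envKey]],
    ["builtin", sources.builtinDefault],
  ];
  // The first non-empty candidate wins (assumption: "" counts as unset).
  for (const [source, value] of candidates) {
    if (value) return { model: value, source };
  }
  throw new Error(`No model configured for provider ${provider}`);
}

// The line the agent is asked to display before each generation.
function modelBanner(provider: string, model: string): string {
  return `Using ${provider} / ${model}`;
}
```

Returning the winning source next to the model makes it easy to explain to the user why a given model was picked.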
DashScope Models
Use `--model qwen-image-2.0-pro` or set `default_model.dashscope` / `DASHSCOPE_IMAGE_MODEL` when the user wants official Qwen-Image behavior.

Official DashScope model families:
- `qwen-image-2.0-pro`, `qwen-image-2.0-pro-2026-03-03`, `qwen-image-2.0`, `qwen-image-2.0-2026-03-03`
  - Free-form `size` in `width*height` format
  - Total pixels must stay between `512*512` and `2048*2048`
  - Default size is approximately `1024*1024`
  - Best choice for custom ratios such as `21:9` and text-heavy Chinese/English layouts
- `qwen-image-max`, `qwen-image-max-2025-12-30`, `qwen-image-plus`, `qwen-image-plus-2026-01-09`, `qwen-image`
  - Fixed sizes only: `1664*928`, `1472*1104`, `1328*1328`, `1104*1472`, `928*1664`
  - Default size is `1664*928`
  - `qwen-image` currently has the same capability as `qwen-image-plus`
- Legacy DashScope models such as `z-image-turbo`, `z-image-ultra`, `wanx-v1`
  - Keep using them only when the user explicitly asks for legacy behavior or compatibility
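
The size rules for the two Qwen-Image families lend themselves to a small pre-flight check. The sketch below is illustrative only: `parseSize`, `checkDashScopeSize`, and the prefix-based family detection are assumptions made for this example, and legacy models are deliberately left unchecked because their size rules are not listed here.

```typescript
// Fixed sizes listed for qwen-image-max / qwen-image-plus / qwen-image.
const FIXED_SIZES: string[] = ["1664*928", "1472*1104", "1328*1328", "1104*1472", "928*1664"];

const MIN_PIXELS = 512 * 512;
const MAX_PIXELS = 2048 * 2048;

// Accepts the CLI form "2048x872" as well as the DashScope form "2048*872".
function parseSize(size: string): { width: number; height: number } {
  const match = /^(\d+)[x*](\d+)$/i.exec(size.trim());
  if (!match) throw new Error(`Invalid size: ${size}`);
  return { width: Number(match[1]), height: Number(match[2]) };
}

// Returns null when the size looks acceptable, otherwise a short reason.
function checkDashScopeSize(model: string, size: string): string | null {
  const { width, height } = parseSize(size);
  if (model.startsWith("qwen-image-2.0")) {
    // Free-form family: only the total pixel count is bounded.
    const pixels = width * height;
    if (pixels < MIN_PIXELS || pixels > MAX_PIXELS) {
      return `total pixels ${pixels} outside ${MIN_PIXELS}..${MAX_PIXELS}`;
    }
    return null;
  }
  if (model.startsWith("qwen-image")) {
    // Fixed-size family: the size must be one of the five listed values.
    const normalized = `${width}*${height}`;
    return FIXED_SIZES.includes(normalized)
      ? null
      : `${normalized} is not a fixed size; consider qwen-image-2.0-pro`;
  }
  // Legacy or unknown models: no rule is stated, so nothing is checked.
  return null;
}
```

For example, `2048x872` passes for `qwen-image-2.0-pro` (1,785,856 pixels) but is rejected for `qwen-image-max`, which matches the advice to switch to `qwen-image-2.0-pro` for ratios the fixed sizes do not cover.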

When translating CLI args into DashScope behavior:
- `--size` wins over `--ar`
- For `qwen-image-2.0*`, prefer explicit `--size`; otherwise infer from `--ar` and use the official recommended resolutions below
- For `qwen-image-max/plus/image`, only use the five official fixed sizes; if the requested ratio is not covered, switch to `qwen-image-2.0-pro`
- `--quality` is a baoyu-image-gen compatibility preset, not a native DashScope API field. Mapping `normal` / `2k` onto the `qwen-image-2.0*` table below is an implementation inference, not an official API guarantee

Recommended `qwen-image-2.0*` sizes for common aspect ratios:

| Ratio | | |
|---|---|---|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |

DashScope official APIs also expose `negative_prompt`, `prompt_extend`, and `watermark`, but `baoyu-image-gen` does not expose them as dedicated CLI flags today.
OpenRouter Models
Use full OpenRouter model IDs, e.g.:
- `google/gemini-3.1-flash-image-preview` (recommended, supports image output and reference-image workflows)
- `google/gemini-2.5-flash-image-preview`
- `black-forest-labs/flux.2-pro`
- Other OpenRouter image-capable model IDs

Notes:
- OpenRouter image generation uses `/chat/completions`, not the OpenAI `/images` endpoints
- If `--ref` is used, choose a multimodal model that supports image input and image output
- `--imageSize` maps to OpenRouter `imageGenerationOptions.size`; `--size <WxH>` is converted to the nearest OpenRouter size and inferred aspect ratio when possible
Replicate Models
Supported model formats:
- `owner/name` (recommended for official models), e.g. `google/nano-banana-pro`
- `owner/name:version` (community models by version), e.g. `stability-ai/sdxl:<version>`

Examples:

```bash
# Use Replicate default model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate

# Override model explicitly
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
```
Provider Selection
- `--ref` provided + no `--provider` → auto-select Google first, then OpenAI, then OpenRouter, then Replicate (Jimeng and Seedream do not support reference images)
- `--provider` specified → use it (if `--ref`, must be `google`, `openai`, `openrouter`, or `replicate`)
- Only one API key available → use that provider
- Multiple available → default to Google
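
The four rules above can be read as a small decision function. This is a hedged sketch, not the script's code: `Provider`, `SelectionInput`, and `selectProvider` are hypothetical names, the `--ref` fallback order is assumed to consider only providers with a configured key, and falling back to the first available provider when Google has no key is an assumption the rules do not state.

```typescript
type Provider = "google" | "openai" | "openrouter" | "dashscope" | "replicate" | "jimeng" | "seedream";

// Order used when --ref is given without --provider.
const REF_ORDER: Provider[] = ["google", "openai", "openrouter", "replicate"];

interface SelectionInput {
  provider?: Provider; // value of --provider, if any
  hasRef: boolean; // whether --ref was passed
  available: Provider[]; // providers whose API key is configured
}

function selectProvider({ provider, hasRef, available }: SelectionInput): Provider {
  if (provider) {
    if (hasRef && !REF_ORDER.includes(provider)) {
      throw new Error(`--ref requires one of: ${REF_ORDER.join(", ")}`);
    }
    return provider;
  }
  const first = available[0];
  if (first === undefined) {
    throw new Error("Missing API key: no provider is configured");
  }
  if (hasRef) {
    const match = REF_ORDER.find((p) => available.includes(p));
    if (!match) throw new Error("No configured provider supports reference images");
    return match;
  }
  if (available.length === 1) return first;
  // Multiple keys: default to Google; otherwise (assumption) take the first one.
  return available.includes("google") ? "google" : first;
}
```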
Quality Presets
| Preset | Google imageSize | OpenAI Size | OpenRouter size | Replicate resolution | Use Case |
|---|---|---|---|---|---|
| `normal` | 1K | 1024px | 1K | 1K | Quick previews |
| `2k` | 2K | 2048px | 2K | 2K | Covers, illustrations, infographics |

Google/OpenRouter imageSize: Can be overridden with `--imageSize 1K|2K|4K`
Aspect Ratios
Supported: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `2.35:1`
- Google multimodal: uses `imageConfig.aspectRatio`
- OpenAI: maps to closest supported size
- OpenRouter: sends `imageGenerationOptions.aspect_ratio`; if only `--size <WxH>` is given, aspect ratio is inferred automatically
- Replicate: passes `aspect_ratio` to model; when `--ref` is provided without `--ar`, defaults to `match_input_image`
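
One plausible way to infer an aspect ratio from a `WxH` size is to pick the nearest supported ratio. The sketch below assumes a nearest-match rule on a logarithmic scale; the script's actual inference method is not documented here, and `ratioValue` and `inferAspectRatio` are hypothetical names.

```typescript
const SUPPORTED_RATIOS: string[] = ["1:1", "16:9", "9:16", "4:3", "3:4", "2.35:1"];

// Converts "16:9" into the number 16 / 9.
function ratioValue(ratio: string): number {
  const parts = ratio.split(":");
  const w = Number(parts[0]);
  const h = Number(parts[1]);
  if (!(w > 0) || !(h > 0)) throw new Error(`Invalid aspect ratio: ${ratio}`);
  return w / h;
}

// Picks the supported ratio closest to width / height. Distances are compared
// on a log scale so that 2:1 and 1:2 count as equally far from 1:1.
function inferAspectRatio(width: number, height: number): string {
  if (!(width > 0) || !(height > 0)) throw new Error("width and height must be positive");
  const target = Math.log(width / height);
  let best = "1:1";
  let bestDistance = Infinity;
  for (const ratio of SUPPORTED_RATIOS) {
    const distance = Math.abs(Math.log(ratioValue(ratio)) - target);
    if (distance < bestDistance) {
      best = ratio;
      bestDistance = distance;
    }
  }
  return best;
}
```

Under this rule, the `2048x872` banner size from the usage examples maps to `2.35:1` and the `1664x928` fixed size maps to `16:9`.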
Generation Mode
Default: Sequential generation.
Batch Parallel Generation: When `--batchfile` contains 2 or more pending tasks, the script automatically enables parallel generation.

| Mode | When to Use |
|---|---|
| Sequential (default) | Normal usage, single images, small batches |
| Parallel batch | Batch mode with 2+ tasks |
Execution choice:
| Situation | Preferred approach | Why |
|---|---|---|
| One image, or 1-2 simple images | Sequential | Lower coordination overhead and easier debugging |
| Multiple images already have saved prompt files | Batch (`--batchfile`) | Reuses finalized prompts, applies shared throttling/retries, and gives predictable throughput |
| Each image still needs separate reasoning, prompt writing, or style exploration | Subagents | The work is still exploratory, so each image may need independent analysis before generation |
Rule of thumb:
- Prefer batch over subagents once prompt files are already saved and the task is "generate all of these"
- Use subagents only when generation is coupled with per-image thinking, rewriting, or divergent creative exploration
Parallel behavior:
- Default worker count is automatic, capped by config, built-in default 10
- Provider-specific throttling is applied only in batch mode, and the built-in defaults are tuned for faster throughput while still avoiding obvious RPM bursts
- You can override worker count with `--jobs <count>`
- Each image retries automatically up to 3 attempts
- Final output includes success count, failure count, and per-image failure reasons
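
The retry and reporting behavior can be illustrated with a stripped-down runner. This is a sketch under stated assumptions rather than the script's logic: it is synchronous for brevity, omits worker pooling and provider throttling entirely, and `TaskResult`, `runWithRetry`, and `summarize` are hypothetical names.

```typescript
interface TaskResult {
  id: string;
  ok: boolean;
  attempts: number;
  error?: string;
}

const MAX_ATTEMPTS = 3;

// Runs one task, retrying until it succeeds or MAX_ATTEMPTS is reached.
// `generate` stands in for a single image-generation call and throws on failure.
function runWithRetry(id: string, generate: (attempt: number) => void): TaskResult {
  let lastError = "";
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    try {
      generate(attempt);
      return { id, ok: true, attempts: attempt };
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  return { id, ok: false, attempts: MAX_ATTEMPTS, error: lastError };
}

// Builds the final report: success count, failure count, per-image reasons.
function summarize(results: TaskResult[]): { succeeded: number; failed: number; failureReasons: string[] } {
  const failures = results.filter((r) => !r.ok);
  return {
    succeeded: results.length - failures.length,
    failed: failures.length,
    failureReasons: failures.map((r) => `${r.id}: ${r.error ?? "unknown error"}`),
  };
}
```

Keeping only the last error per image keeps the report short; a fuller version might record every attempt.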
Error Handling
- Missing API key → error with setup instructions
- Generation failure → auto-retry up to 3 attempts per image
- Invalid aspect ratio → warning, proceed with default
- Reference images with unsupported provider/model → error with fix hint
Extension Support
Custom configurations via EXTEND.md. See Preferences section for paths and supported options.
Attribution
Based on baoyu-image-gen by JimLiu, licensed under MIT.
Modified and adapted for the Buda.im platform.