baoyu-image-gen

Image Generation (AI SDK)

Official API-based image generation. Supports OpenAI, Google and DashScope (阿里通义万象) providers.

Script Directory

Agent Execution:

`SKILL_DIR` = this SKILL.md file's directory
Script path = `${SKILL_DIR}/scripts/main.ts`

Preferences (EXTEND.md)

Use Bash to check EXTEND.md existence (priority order):

Check project-level first

test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"

Then user-level (cross-platform: $HOME works on macOS/Linux/WSL)

test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"


┌──────────────────────────────────────────────────┬───────────────────┐
│                       Path                       │     Location      │
├──────────────────────────────────────────────────┼───────────────────┤
│ .baoyu-skills/baoyu-image-gen/EXTEND.md          │ Project directory │
├──────────────────────────────────────────────────┼───────────────────┤
│ $HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md    │ User home         │
└──────────────────────────────────────────────────┴───────────────────┘

┌───────────┬───────────────────────────────────────────────────────────────────────────┐
│  Result   │                                  Action                                   │
├───────────┼───────────────────────────────────────────────────────────────────────────┤
│ Found     │ Read, parse, apply settings                                               │
├───────────┼───────────────────────────────────────────────────────────────────────────┤
│ Not found │ Use defaults                                                              │
└───────────┴───────────────────────────────────────────────────────────────────────────┘

**EXTEND.md Supports**: Default provider | Default quality | Default aspect ratio
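
The lookup order above can be sketched as a small function. This is an illustration only, not the skill's actual code: `findExtendFile` and the injected `exists` predicate are hypothetical names, chosen so the priority order can be exercised without touching the disk.

```typescript
// Hypothetical helper: returns the first EXTEND.md candidate that exists.
type ExtendLocation = { path: string; location: "project" | "user" };

function findExtendFile(
  home: string,
  exists: (path: string) => boolean,
): ExtendLocation | null {
  const relative = ".baoyu-skills/baoyu-image-gen/EXTEND.md";
  const candidates: ExtendLocation[] = [
    { path: relative, location: "project" }, // checked first
    { path: `${home}/${relative}`, location: "user" }, // checked second
  ];
  for (const candidate of candidates) {
    if (exists(candidate.path)) return candidate;
  }
  return null; // not found: the caller falls back to defaults
}
```

Under Node or Bun, `exists` could plausibly be `fs.existsSync`; a `null` result corresponds to the "Not found → Use defaults" row.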

Usage

Basic

npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image cat.png

With aspect ratio

npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9

High quality

npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k

From prompt files

npx -y bun ${SKILL_DIR}/scripts/main.ts --promptfiles system.md content.md --image out.png

With reference images (Google multimodal only)

npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png

Specific provider

npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider openai

DashScope (阿里通义万象)

npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope

Options

| Option | Description |
|--------|-------------|
| `--prompt <text>`, `-p` | Prompt text |
| `--promptfiles <files...>` | Read prompt from files (concatenated) |
| `--image <path>` | Output image path (required) |
| `--provider google\|openai\|dashscope` | Force provider (default: google) |
| `--model <id>`, `-m` | Model ID |
| `--ar <ratio>` | Aspect ratio (e.g., `16:9`, `1:1`, `4:3`) |
| `--size <WxH>` | Size (e.g., `1024x1024`) |
| `--quality normal\|2k` | Quality preset (default: 2k) |
| `--imageSize 1K\|2K\|4K` | Image size for Google (default: from quality) |
| `--ref <files...>` | Reference images (Google multimodal only) |
| `--n <count>` | Number of images |
| `--json` | JSON output |

Environment Variables

| Variable | Description |
|----------|-------------|
| `OPENAI_API_KEY` | OpenAI API key |
| `GOOGLE_API_KEY` | Google API key |
| `DASHSCOPE_API_KEY` | DashScope API key (阿里云) |
| `OPENAI_IMAGE_MODEL` | OpenAI model override |
| `GOOGLE_IMAGE_MODEL` | Google model override |
| `DASHSCOPE_IMAGE_MODEL` | DashScope model override (default: z-image-turbo) |
| `OPENAI_BASE_URL` | Custom OpenAI endpoint |
| `GOOGLE_BASE_URL` | Custom Google endpoint |
| `DASHSCOPE_BASE_URL` | Custom DashScope endpoint |

Load Priority: CLI args > env vars > `<cwd>/.baoyu-skills/.env` > `~/.baoyu-skills/.env`
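
One way to read this priority chain is as a first-defined-wins lookup. The sketch below is an assumption about how such a merge could look, not the script's real implementation; `resolveValue` and its parameter names are hypothetical, and the two `.env` files are assumed to be already parsed into plain maps.

```typescript
type EnvMap = Record<string, string | undefined>;

// Returns the first defined value, walking the sources from highest to lowest priority.
function resolveValue(
  cliValue: string | undefined, // e.g. the value passed to --model
  envName: string, // e.g. "GOOGLE_IMAGE_MODEL"
  processEnv: EnvMap, // real environment variables
  projectDotenv: EnvMap, // parsed <cwd>/.baoyu-skills/.env
  userDotenv: EnvMap, // parsed ~/.baoyu-skills/.env
): string | undefined {
  return (
    cliValue ??
    processEnv[envName] ??
    projectDotenv[envName] ??
    userDotenv[envName]
  );
}
```

For example, `resolveValue(undefined, "GOOGLE_IMAGE_MODEL", {}, { GOOGLE_IMAGE_MODEL: "model-a" }, { GOOGLE_IMAGE_MODEL: "model-b" })` picks the project-level value, because no CLI argument or environment variable is set.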

Provider Selection

`--provider` specified → use it
Only one API key available → use that provider
Multiple available → default to Google
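
The three rules above can be sketched as follows. `selectProvider` is a hypothetical name, and the sketch follows the rules literally: the source does not say what happens when several keys are present but the Google key is not among them, so that case still returns `"google"` here.

```typescript
type Provider = "google" | "openai" | "dashscope";
type Env = Record<string, string | undefined>;

const KEY_NAMES: Record<Provider, string> = {
  google: "GOOGLE_API_KEY",
  openai: "OPENAI_API_KEY",
  dashscope: "DASHSCOPE_API_KEY",
};

function selectProvider(requested: Provider | undefined, env: Env): Provider {
  // Rule 1: an explicit --provider always wins.
  if (requested !== undefined) return requested;

  const providers: Provider[] = ["google", "openai", "dashscope"];
  const available = providers.filter((p) => Boolean(env[KEY_NAMES[p]]));
  const [first, ...rest] = available;

  // No key at all: fail with a hint instead of guessing.
  if (first === undefined) {
    throw new Error(
      "No API key found; set GOOGLE_API_KEY, OPENAI_API_KEY, or DASHSCOPE_API_KEY",
    );
  }
  // Rule 2: exactly one key available, use that provider.
  if (rest.length === 0) return first;
  // Rule 3: several keys available, default to Google.
  return "google";
}
```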

Quality Presets

| Preset | Google imageSize | OpenAI Size | Use Case |
|--------|------------------|-------------|----------|
| `normal` | 1K | 1024px | Quick previews |
| `2k` (default) | 2K | 2048px | Covers, illustrations, infographics |

Google imageSize: can be overridden with `--imageSize 1K|2K|4K`
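
A minimal sketch of the preset-to-`imageSize` mapping with the override applied on top. The names `QUALITY_TO_IMAGE_SIZE` and `resolveImageSize` are hypothetical; only the preset values and the override rule come from the table above.

```typescript
type Quality = "normal" | "2k";
type ImageSize = "1K" | "2K" | "4K";

// Google imageSize implied by each quality preset.
const QUALITY_TO_IMAGE_SIZE: Record<Quality, ImageSize> = {
  normal: "1K",
  "2k": "2K",
};

// An explicit --imageSize wins; otherwise the preset decides (default preset: 2k).
function resolveImageSize(
  quality: Quality = "2k",
  override?: ImageSize,
): ImageSize {
  return override ?? QUALITY_TO_IMAGE_SIZE[quality];
}
```

Note that `4K` is reachable only through the override, since neither preset maps to it.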

Aspect Ratios

Supported: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `2.35:1`

Google multimodal: uses `imageConfig.aspectRatio`
Google Imagen: uses `aspectRatio` parameter
OpenAI: maps to closest supported size
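
"Maps to closest supported size" could work roughly as sketched below. This is an assumption-heavy illustration: the candidate size list, the log-ratio distance, and the names `parseAspectRatio` and `closestSize` are all hypothetical, and the script may use a different list or metric.

```typescript
type Size = { width: number; height: number };

// Parses "16:9" or "2.35:1" into width / height; returns null when malformed.
function parseAspectRatio(text: string): number | null {
  const match = /^(\d+(?:\.\d+)?):(\d+(?:\.\d+)?)$/.exec(text.trim());
  if (!match) return null;
  const width = Number(match[1]);
  const height = Number(match[2]);
  return width > 0 && height > 0 ? width / height : null;
}

// Assumed candidate sizes, for illustration only.
const CANDIDATE_SIZES: Size[] = [
  { width: 1024, height: 1024 },
  { width: 1536, height: 1024 },
  { width: 1024, height: 1536 },
];

// Picks the candidate whose ratio is nearest in log space,
// so that 2:1 and 1:2 are treated symmetrically around 1:1.
function closestSize(ratio: number, sizes: Size[] = CANDIDATE_SIZES): Size {
  if (sizes.length === 0) throw new Error("no candidate sizes");
  const distance = (s: Size) =>
    Math.abs(Math.log(s.width / s.height) - Math.log(ratio));
  return sizes.reduce((best, size) =>
    distance(size) < distance(best) ? size : best,
  );
}
```

A `null` from `parseAspectRatio` gives the caller a place to warn and continue with a default ratio.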

Generation Mode

Default: Sequential generation (one image at a time). This ensures stable output and easier debugging.

Parallel Generation: Use only when the user explicitly requests parallel or concurrent generation.

| Mode | When to Use |
|------|-------------|
| Sequential (default) | Normal usage, single images, small batches |
| Parallel | User explicitly requests, large batches (10+) |

Parallel Settings (when requested):

| Setting | Value |
|---------|-------|
| Recommended concurrency | 4 subagents |
| Max concurrency | 8 subagents |
| Use case | Large batch generation when user requests parallel |

Agent Implementation (parallel mode only):

Launch multiple generations in parallel using Task tool

Each Task runs as background subagent with run_in_background=true

Collect results via TaskOutput when all complete
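
The Task tool and subagents are agent-side mechanisms, so there is no code for them here. As an analogy only, the same concurrency cap (4 recommended, 8 maximum) can be expressed in code with a small worker pool. `runWithLimit` and the constant names are hypothetical.

```typescript
const RECOMMENDED_CONCURRENCY = 4;
const MAX_CONCURRENCY = 8;

// Runs `worker` over all items with at most `concurrency` calls in flight
// (capped at MAX_CONCURRENCY). Results keep the order of `items`.
async function runWithLimit<T, R>(
  items: T[],
  worker: (item: T) => Promise<R>,
  concurrency: number = RECOMMENDED_CONCURRENCY,
): Promise<R[]> {
  const lanes = Math.max(
    1,
    Math.min(concurrency, MAX_CONCURRENCY, items.length),
  );
  const results: R[] = new Array<R>(items.length);
  let next = 0; // index of the next item to claim

  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++; // claim an item before awaiting
      results[index] = await worker(items[index] as T);
    }
  }

  await Promise.all(Array.from({ length: lanes }, () => lane()));
  return results; // collected once every lane has finished
}
```

Each lane plays the role of one background subagent, and the final `Promise.all` mirrors collecting results once all tasks are complete.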

Error Handling

Missing API key → error with setup instructions
Generation failure → auto-retry once
Invalid aspect ratio → warning, proceed with default
Reference images with non-multimodal model → warning, ignore refs
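
The "auto-retry once" rule can be sketched as a thin wrapper around whatever function performs the generation. `generateWithRetry` is a hypothetical name, and `generate` stands in for the real provider call, which is not shown in this document.

```typescript
// Calls `generate`, retrying up to `retries` more times on failure (default: once).
// If every attempt fails, the last error is rethrown to the caller.
async function generateWithRetry<T>(
  generate: () => Promise<T>,
  retries: number = 1,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await generate();
    } catch (error) {
      lastError = error; // remember the failure and try again if attempts remain
    }
  }
  throw lastError;
}
```

With the default of one retry, a generation that fails twice in a row surfaces the second error after exactly two attempts.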

Extension Support

Custom configurations via EXTEND.md. See Preferences section for paths and supported options.
