multi-agent-image

Multi-Agent Image

`multi-agent-image` is a standalone Hermes skill for image generation workflows.

It is designed for cases where a simple one-line prompt is not enough. Instead of sending raw user input directly to an image model, this skill:

analyzes the request,
compiles it into a design-aware prompt,
generates through `gpt-image-2`,
archives the result,
and optionally reuses successful outputs as future style references.

This skill is independent at runtime. The design compiler is built into this repository and does not require an external skill.

When to Use

Use this skill when the user wants one or more of the following:

Design-oriented poster generation
Product images or ad visuals
PPT cover visuals or chapter art
Infographic-like or teaching/demo visuals
Style reference reuse from prior generations
Interactive “show examples first, then generate” flow
Batch generation for multiple directions or aspect ratios
Series generation where multiple images should share one visual language

Do not use this skill for:

pixel-accurate UI recreation
editable charts
exact typography output inside the image
tasks that require vector, HTML, or PPT-native assets rather than raster images

Architecture

text

User Request
    ↓
[Prompt Engineer]
    ↓
[Style Scout]
    ↓
[Internal Design Compiler]
    ↓
[GPT-Image-2 Generation]
    ↓
[QA + Archive]
    ↓
[Case Library]

Optional layers on top of the main path:

Interactive reference selection
Batch generation
Series generation
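
The main path in the diagram can be read as a chain of stages that each take and return a shared record. The sketch below is illustrative only: `run_pipeline` and the placeholder stage functions are hypothetical and do not reflect how `orchestrator_v2.py` is actually structured.

```python
def run_pipeline(request, stages):
    """Pass a dict through each (name, stage) pair in order and record the path taken."""
    data = {"request": request}
    trace = []
    for name, stage in stages:
        data = stage(data)  # each stage returns an updated dict
        trace.append(name)
    return data, trace


# Placeholder stages standing in for the boxes in the diagram.
# Saving to the case library is folded into the last stage here.
STAGES = [
    ("prompt_engineer", lambda d: {**d, "task": "poster"}),
    ("style_scout", lambda d: {**d, "reference": None}),
    ("design_compiler", lambda d: {**d, "prompt": f"{d['task']}: {d['request']}"}),
    ("generation", lambda d: {**d, "image": "output/example.png"}),
    ("qa_archive", lambda d: {**d, "archived": True}),
]

data, trace = run_pipeline("AI bootcamp poster", STAGES)
print(trace)
```

Keeping every stage behind the same call shape is what would let optional layers such as batch or series generation wrap the main path without changing it.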

Setup

1. Deploy the skill

The skill source lives in:

bash

~/.hermes/skills/multi-agent-image/

Install runtime files into the working directory:

bash

python3 ~/.hermes/skills/multi-agent-image/scripts/install.py

This prepares:

`~/.hermes/agents/multi-agent-image/output/`
`~/.hermes/agents/multi-agent-image/case_library/`
agent role folders and memory files
local runtime scripts copied from the skill

2. Install Python dependencies

bash

pip install openai requests

3. Set API key

bash

export OPENAI_API_KEY="sk-..."

This key is used with the apimart-compatible GPT-Image-2 endpoints in this skill.
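
Because generation cannot proceed without the key, a wrapper script may want to check for it before doing any work. This is a minimal sketch; `require_api_key` is a hypothetical helper and not part of the skill, whose scripts may read the key differently.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the skill")
    return key
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.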

Core Components

scripts/design_compiler.py

Internal prompt compiler.

Responsibilities:

detect task type
choose defaults for aspect and quality
build `design_reasoning`
compress it into `compiled_brief`
produce the final generation prompt

This is the core logic that makes the skill independent.

scripts/design_image.py

CLI entrypoint for the internal compiler.

Use it when you want:

prompt-only output
a local design compilation test
direct generation without the full multi-agent workflow

Example:

bash

cd ~/.hermes/agents/multi-agent-image
python3 design_image.py \
  --task poster \
  --brief "AI训练营招生海报，强调速度、增长、实战" \
  --direction balanced \
  --aspect 3:4 \
  --prompt-only

It prints:

`design_reasoning`
`compiled_brief`
`prompt`
`settings`

scripts/orchestrator_v2.py

Main workflow entrypoint.

Responsibilities:

run prompt analysis
choose task and generation parameters
optionally select a reference from the case library
call the internal compiler
call GPT-Image-2
archive outputs
auto-save successful results into the case library

scripts/gpt_image2_generator.py

Low-level GPT-Image-2 client.

Responsibilities:

submit async generation tasks
poll task status
download image results

Use this when you want direct API access without the full workflow.
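
The submit, poll, download sequence can be sketched as a generic polling loop. Everything here is an assumption made for illustration: `wait_for_task`, the status values `pending`, `completed`, and `failed`, and the response shape are hypothetical, since this document does not describe the endpoint schema.

```python
import time


def wait_for_task(get_status, task_id, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll get_status(task_id) until the task finishes or the timeout elapses.

    get_status is a caller-supplied function returning a dict such as
    {"status": "pending"} or {"status": "completed", "url": "..."}.
    These status names are assumptions, not the real API schema.
    """
    waited = 0.0
    while True:
        info = get_status(task_id)
        status = info.get("status")
        if status == "completed":
            return info
        if status == "failed":
            raise RuntimeError(f"task {task_id} failed: {info.get('error')}")
        if waited >= timeout:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout}s")
        sleep(interval)
        waited += interval
```

Injecting `get_status` and `sleep` keeps the loop independent of any HTTP client and lets it be tested without network access or real delays.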

scripts/case_library.py

Persistent library of past generations.

Responsibilities:

save outputs by task type
store metadata and rating
search by brief, prompt, or tags
return image paths for reuse as references

scripts/case_selector.py

Interactive helper for Hermes dialogue flows.

Responsibilities:

render user-facing selection text
parse replies like `1`, `n`, `case_001`, or `搜索蓝色` ("search blue")
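
Reply parsing of this kind can be sketched as a small classifier. `parse_reply` and its return values are hypothetical, and the meaning assigned to each reply (for example, `n` as a negative confirmation) is an assumption rather than the behavior of `case_selector.py`.

```python
import re


def parse_reply(text):
    """Classify a user reply into a (kind, value) pair.

    The categories mirror the example replies; how the real selector
    interprets each one is not specified in this document.
    """
    reply = text.strip()
    if re.fullmatch(r"[0-9]+", reply):
        return ("index", int(reply))            # "1" -> pick a listed case by number
    if reply.lower() in ("n", "y"):
        return ("confirm", reply.lower() == "y")
    if re.fullmatch(r"case_\d+", reply):
        return ("case_id", reply)               # "case_001" -> explicit case id
    if reply.startswith("搜索"):
        # "搜索蓝色" -> search the library for "蓝色"
        return ("search", reply[len("搜索"):].strip())
    return ("unknown", reply)
```

Returning an explicit `unknown` kind lets the dialogue layer ask again instead of guessing.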

scripts/interactive_run.py

Two-phase dialogue wrapper.

Use it when the workflow needs to ask the user before generating.

scripts/batch_generator_v2.py

Batch generation entrypoint.

Supports:

same brief, multiple directions
same brief, multiple aspect ratios
multiple briefs in one run

scripts/series_generator.py

Style-consistent series generator.

Workflow:

generate a master image
extract style signals from its compiled brief
generate child images that follow the same visual system

templates/linear_batch.py

Editable template for resumable sequential runs.

Useful when you want:

explicit scene lists
filesystem-based progress monitoring
style propagation from the first generated image
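
A resumable sequential run with filesystem-based progress might look like the following. This is a sketch under stated assumptions, not the contents of `templates/linear_batch.py`: `run_scenes`, the `(name, brief)` scene shape, the file naming, and the `generate(brief, style_ref)` callable are all hypothetical.

```python
from pathlib import Path


def run_scenes(scenes, out_dir, generate):
    """Generate each scene in order, skipping ones whose output file exists.

    scenes is a list of (name, brief) pairs and generate(brief, style_ref)
    returns image bytes; both shapes are assumptions for this sketch.
    The first image's path is passed on as the style reference.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    style_ref = None
    done = []
    for index, (name, brief) in enumerate(scenes):
        target = out_dir / f"{index:02d}_{name}.png"
        if not target.exists():  # resume: finished scenes are skipped
            target.write_bytes(generate(brief, style_ref))
        if style_ref is None:
            style_ref = target  # propagate style from the first image
        done.append(target)
    return done
```

Because progress lives in the output directory, rerunning after an interruption only generates the missing files, and listing the directory shows how far the run has got.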

Internal Design Compiler

The internal compiler produces three layers:

design_reasoning

This captures design intent before generation.

Typical fields:

`task`
`communication_goal`
`audience`
`channel`
`visual_system`
`hierarchy_strategy`
`safe_zone_strategy`
`lighting_strategy`
`palette_strategy`
`anti_filler_rules`
`anti_slop_rules`

compiled_brief

This is a compressed design brief for generation.

It includes:

what the image is for
what should dominate visually
what space should remain available
what to avoid

prompt

Final model-facing prompt used for GPT-Image-2.

The prompt is generated from design logic, not just from a list of style keywords.
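
To make the relationship between the three layers concrete, here is a deliberately simplified sketch. `compile_layers`, the field values, and the prompt wording are placeholders invented for illustration; the real logic in `scripts/design_compiler.py` is not shown in this document and will differ.

```python
def compile_layers(task, brief):
    """Build the three layers for one request (illustrative only)."""
    # Layer 1: design intent, using a few of the typical field names.
    design_reasoning = {
        "task": task,
        "communication_goal": brief,
        "safe_zone_strategy": "keep the top third clear for a headline",
        "anti_slop_rules": ["no HUD overlays", "no fog", "no generic clutter"],
    }
    # Layer 2: a compressed brief derived from the reasoning.
    compiled_brief = {
        "purpose": f"{task}: {brief}",
        "keep_clear": design_reasoning["safe_zone_strategy"],
        "avoid": design_reasoning["anti_slop_rules"],
    }
    # Layer 3: the model-facing prompt, built from the brief.
    prompt = (
        f"{compiled_brief['purpose']}. "
        f"Layout: {compiled_brief['keep_clear']}. "
        f"Avoid: {', '.join(compiled_brief['avoid'])}."
    )
    return {
        "design_reasoning": design_reasoning,
        "compiled_brief": compiled_brief,
        "prompt": prompt,
    }
```

The point of the layering is that each step only reads the one before it, so the prompt inherits the safe zone and avoidance rules rather than restating style keywords.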

Supported Tasks

The built-in compiler understands these task classes:

`poster`
`product`
`ppt`
`infographic`
`teaching`
`auto`

Default aspect assumptions:

`poster` → `3:4`
`product` → `1:1`
`ppt` → `16:9`
`infographic` → `4:3`
`teaching` → `16:9`

Direction modes:

`conservative`
`balanced`
`bold`

Quality modes:

`draft`
`final`
`premium`

Current generation channel:

`gpt-image-2`
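
The defaults and allowed modes listed above can be expressed as a lookup plus validation. This is a sketch: `resolve_settings` is a hypothetical helper, and the fallback choices of `balanced` and `final` are assumptions, since the document lists the allowed values but not which ones are defaults.

```python
DEFAULT_ASPECTS = {
    "poster": "3:4",
    "product": "1:1",
    "ppt": "16:9",
    "infographic": "4:3",
    "teaching": "16:9",
}
DIRECTIONS = ("conservative", "balanced", "bold")
QUALITIES = ("draft", "final", "premium")


def resolve_settings(task, aspect=None, direction="balanced", quality="final"):
    """Fill in the default aspect for a task and validate the other modes."""
    if task not in DEFAULT_ASPECTS:
        # "auto" would need task detection first, which this sketch omits.
        raise ValueError(f"unsupported task for this sketch: {task!r}")
    if direction not in DIRECTIONS:
        raise ValueError(f"unknown direction: {direction!r}")
    if quality not in QUALITIES:
        raise ValueError(f"unknown quality: {quality!r}")
    return {
        "task": task,
        "aspect": aspect or DEFAULT_ASPECTS[task],  # explicit aspect wins
        "direction": direction,
        "quality": quality,
        "model": "gpt-image-2",
    }
```

An explicit `aspect` overrides the task default, which matches how the usage examples pass `--aspect` alongside `--task`.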

Usage

Quick start

bash

cd ~/.hermes/agents/multi-agent-image
python3 quick_start.py "AI训练营招生海报，强调速度、增长、实战"

Prompt-only compilation

bash

cd ~/.hermes/agents/multi-agent-image
python3 design_image.py \
  --task product \
  --brief "高端陶瓷咖啡杯电商首图，温暖晨光，突出釉面质感" \
  --prompt-only

Full orchestrated generation

python

from orchestrator_v2 import run

run("AI训练营招生海报，强调速度增长实战")

Force task and visual settings

python

from orchestrator_v2 import run

run(
    "高端咖啡杯商品图",
    task="product",
    direction="balanced",
    aspect="1:1",
    quality="final",
    use_reference=False,
)

Interactive Workflow

Use the two-phase pattern when Hermes should ask before generating.

Phase 1: prepare text for the user

python

from interactive_run import prepare

text = prepare("帮我做张 AI 训练营海报", task="poster")
print(text)

Phase 2: execute after the user chooses

python

from interactive_run import execute

result = execute("帮我做张 AI 训练营海报", user_choice="1", task="poster")

Supported reply patterns:

`1`, `2`, `3`
`n`
`y`
`case_001`
`搜索蓝色`

Batch Generation

Same brief, multiple directions

python

from batch_generator_v2 import batch_styles

batch_styles("AI训练营海报", task="poster")

Same brief, multiple aspect ratios

python

from batch_generator_v2 import batch_aspects

batch_aspects("AI训练营海报", task="poster", aspects=["1:1", "16:9", "9:16"])

Multiple briefs

python

from batch_generator_v2 import batch_briefs

batch_briefs(["海报A", "海报B", "海报C"], task="poster")

Series Generation

Use this when several outputs should feel like the same campaign or product family.

python

from series_generator import SeriesGenerator

sg = SeriesGenerator()
sg.create_series(
    master_brief="AI训练营系列视觉，科技蓝，专业商务感",
    items=[
        {"name": "主海报", "brief": "AI训练营招生主海报", "aspect": "3:4"},
        {"name": "Banner", "brief": "官网 Banner", "aspect": "16:9"},
        {"name": "朋友圈", "brief": "朋友圈推广方形图", "aspect": "1:1"},
    ],
    task="poster",
    direction="balanced",
)

Case Library

Case library directory:

text

~/.hermes/agents/multi-agent-image/case_library/

Output directory:

text

~/.hermes/agents/multi-agent-image/output/

Typical case structure:

text

case_library/
├── poster/
│   └── case_001_example/
│       ├── image.png
│       └── metadata.json

Typical metadata fields:

`case_id`
`task`
`brief`
`prompt`
`params`
`tags`
`rating`
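
Given the directory layout and metadata fields above, a keyword search over the library can be sketched with the standard library alone. `find_cases` is a hypothetical function, not the API of `case_library.py`; it assumes each case folder holds `metadata.json` next to `image.png` and that `tags` is a list of strings.

```python
import json
from pathlib import Path


def find_cases(library_dir, keyword, task=None):
    """Return (image_path, metadata) pairs whose brief, prompt, or tags contain keyword.

    Assumes the layout <library>/<task>/<case>/metadata.json.
    """
    root = Path(library_dir)
    pattern = f"{task}/*/metadata.json" if task else "*/*/metadata.json"
    matches = []
    for meta_path in sorted(root.glob(pattern)):
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
        haystack = [meta.get("brief", ""), meta.get("prompt", ""), *meta.get("tags", [])]
        if any(keyword in text for text in haystack):
            matches.append((meta_path.parent / "image.png", meta))
    # Highest-rated cases first; a missing rating counts as 0.
    matches.sort(key=lambda pair: pair[1].get("rating") or 0, reverse=True)
    return matches
```

Sorting by `rating` means the first result is the strongest candidate to reuse as a style reference.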

Validation Guidance

Before generating at scale, test prompt quality first:

bash

python3 design_image.py \
  --task poster \
  --brief "AI训练营招生海报，强调速度、增长、实战" \
  --direction balanced \
  --aspect 3:4 \
  --prompt-only

What to check:

Does `design_reasoning` state a clear communication goal?
Is there an explicit safe zone?
Is hierarchy obvious?
Do `anti_slop_rules` remove HUD overlays, fog, and generic clutter?
Does the prompt describe a single strong visual idea rather than a pile of elements?
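
Part of this checklist can be automated once the compiled result is available as a dictionary. The sketch below assumes a shape of `{"design_reasoning": {...}, "prompt": "..."}` using the typical field names listed earlier; `check_compiled` is hypothetical, and converting the CLI output into such a dictionary is not covered here. Whether the prompt expresses one strong visual idea still needs a human read.

```python
def check_compiled(output):
    """Return a list of problems found in a compiled result dict."""
    reasoning = output.get("design_reasoning", {})
    problems = []
    # Goal, safe zone, and hierarchy should each be stated explicitly.
    for field in ("communication_goal", "safe_zone_strategy", "hierarchy_strategy"):
        if not reasoning.get(field):
            problems.append(f"missing {field}")
    if not reasoning.get("anti_slop_rules"):
        problems.append("missing anti_slop_rules")
    if not output.get("prompt", "").strip():
        problems.append("empty prompt")
    return problems
```

An empty list means the structural checks passed, not that the design is good.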

Current Limits

Current image provider is centered on `gpt-image-2`
QA scoring is intentionally lightweight
Series generation is heavier than one-off generation because it produces a master image before any child images
The skill is optimized for raster outputs, not editable assets
Some reference documents remain longer than necessary, but the main runtime path is consistent

Version History

`v1.0.0`: Initial multi-agent workflow for GPT-Image-2 generation
`v2.0.0`: Added case library, interactive reference selection, and image-to-image style reuse
`v2.1.0`: Added stronger download retry logic, batch workflows, and series generation
`v2.2.0`: Packaged as a reusable Hermes skill with install script and runtime layout
`v3.0.0`: Internalized the design compiler and removed external runtime dependency
