GPT Image 2

This is a focused Skill for GPT Image 2. It can run in three runtime environments whose behavior differs significantly, so the first step is always to determine the current operating mode.

It only handles two types of image tasks:

Image generation: `POST /images/generations`
Image editing: `POST /images/edits`

This file covers operating modes, Skill structure, environment variables, saving/naming rules, the template index, and mode-aware workflows. Detailed templates all live under `references/`, organized in two levels:

Level 1: Category directories
Level 2: Individual template Markdown files

Operating Modes (Must Read, Confirm Before Any Operation)

This Skill comes with a lightweight detection script. Run it first, then decide how to proceed based on the results:

```bash
node skills/gpt-image-2/scripts/check-mode.js
# To get structured results for upper-level programs:
node skills/gpt-image-2/scripts/check-mode.js --json
```

The output will indicate `mode = A`, `A?`, or `B-or-C`, along with a `recommendation`. The three modes are defined as follows:

Mode A · Garden Local Image Generation

Trigger Condition: Environment variable `ENABLE_GARDEN_IMAGEGEN` is truthy (`1` / `true` / `yes` / `on`) and `OPENAI_API_KEY` exists.

Behavior: Complete end-to-end workflow of "select template → write prompt → call script → generate and save image".

Use `scripts/generate.js` for text-to-image generation and `scripts/edit.js` for editing existing images.
Prompts are saved to `garden-gpt-image-2/prompt/` by default, and images are saved to `garden-gpt-image-2/image/`.
This is the most powerful mode: you are the "owner" of the image tool.

Mode B · Host-Native Delegated Image Generation

Trigger Condition: Garden is not enabled (`ENABLE_GARDEN_IMAGEGEN` is not set / false), but the current host Agent has built-in image generation tools or an image MCP.

Typical Identification Signals (you should self-check):

Tools such as `image_generation`, `imagegen`, `dalle`, `nano_banana`, `mcp__*image*`, `make_image`, or similar names appear in your toolset
Users call this Skill in clients that support native image generation, such as ChatGPT / Codex / Gemini / Cursor
Users explicitly say "use your own tool to generate images"

Behavior: This Skill degrades to a prompt engineering guide:

Still follow the workflow of "select template → fill in fields → render final prompt".
Do not call `node scripts/generate.js` (there is no API key, so the call will fail).
Call the host's built-in image tool directly, passing the rendered prompt as input.
If users wish, you can save the prompt file to `garden-gpt-image-2/prompt/`, but the image storage location is determined by the host and is not mandated by this Skill.

Mode C · Advisor Pure Prompt Consultant

Trigger Condition: Garden is not enabled, and the host Agent has no image generation tools.

Behavior: This Skill degrades to a "high-quality prompt writing consultant":

Follow the workflow of "select template → fill in fields → render final prompt", and ask users if information is missing.
Print the final prompt directly to users and save a copy to `garden-gpt-image-2/prompt/<task-slug>-<timestamp>.md`.
Attach a short "how to use" suggestion (e.g., paste into ChatGPT / Midjourney / DALL·E / Sora / Nano Banana / your own backend / third-party GPT Image 2 gateway).
Do not pretend image generation was successful. Clearly inform users: "A high-quality reusable prompt has been generated. Please execute it with your image tool."

Mode Decision Table

| Condition | Mode | Call Script? | Save Prompt? | Save Image? |
| --- | --- | --- | --- | --- |
| `ENABLE_GARDEN_IMAGEGEN=1` + API key exists | A | ✅ `generate.js` / `edit.js` | ✅ Auto | ✅ Auto |
| `ENABLE_GARDEN_IMAGEGEN=1` but no API key | A? | ❌ (Ask for API key first) | N/A | N/A |
| Garden not enabled + host has image tools | B | ❌ (Use host tools) | Optional | Determined by host |
| Garden not enabled + host has no image tools | C | ❌ | ✅ Mandatory | ❌ (Impossible) |
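
The decision table can be read as a small pure function. The following is only a sketch of that logic, not the actual contents of `check-mode.js`: the function names, the `hostHasImageTools` argument, and the case-insensitive comparison are assumptions made for illustration.

```javascript
// Hypothetical sketch of the mode decision table (not the real check-mode.js).
const TRUTHY = new Set(['1', 'true', 'yes', 'on']);

// Assumption: the switch is compared case-insensitively after trimming.
function isGardenEnabled(env) {
  const raw = String(env.ENABLE_GARDEN_IMAGEGEN ?? '').trim().toLowerCase();
  return TRUTHY.has(raw);
}

// hostHasImageTools: true / false when the agent knows its toolset,
// undefined when it cannot tell (the script alone cannot see host tools).
function decideMode(env, hostHasImageTools) {
  if (isGardenEnabled(env)) {
    return env.OPENAI_API_KEY ? 'A' : 'A?';
  }
  if (hostHasImageTools === undefined) return 'B-or-C';
  return hostHasImageTools ? 'B' : 'C';
}
```

Keeping the host-tool check as an explicit argument reflects the split described above: the environment decides whether Mode A applies, while distinguishing B from C relies on the agent inspecting its own toolset.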

When Mode is Uncertain

If you cannot determine whether you are in Mode B or C, directly ask users: "Shall I use the image tool in your environment to generate images, or just write the prompt for you?"
If a Mode A script call fails (401 / network / quota), report the error and ask "Switch to Mode B / C?"

User Input Tools

When this Skill needs to ask users questions, follow these rules:

Prioritize using the user input tools provided by the current runtime.
If no corresponding tool exists, ask with short plain text numbered questions.
Combine questions as much as possible and ask them all at once.

Skill Structure

`scripts/check-mode.js`: Run this first to detect the operating mode (A / B / C)
`scripts/generate.js`: Text-to-image generation (only used in Mode A)
`scripts/edit.js`: Image editing based on original image/mask (only used in Mode A)
`scripts/shared.js`: Shared logic for requests, saving, and environment variable reading
`references/`: Hierarchical structured prompt templates (used in all three modes A / B / C)

Environment Variables

Read configurations in the following order:

CLI parameters
`process.env`
`<cwd>/.env`
`<cwd>/.gateway.env`
`~/.gateway.env`

Core variables:

`ENABLE_GARDEN_IMAGEGEN`: Mode switch. Mode A is enabled when set to `1` / `true` / `yes` / `on`; Mode B / C applies if it is unset or set to any other value.
`OPENAI_API_KEY`: Required for Mode A; not needed for B / C.
`OPENAI_BASE_URL`: Defaults to `https://api.openai.com/v1`; can point to third-party compatible gateways.
`OPENAI_IMAGE_MODEL`: Defaults to `gpt-image-2`; can be replaced with models supported by the gateway (e.g., `gpt-image-1` / `dall-e-3`).

The default implementation works with OpenAI-compatible APIs and does not hardcode any third-party gateways.
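
As an illustration of the lookup order, here is a minimal sketch that assumes earlier sources take precedence over later ones and that CLI parameters have already been mapped to the variable names above. The helper names (`parseEnvText`, `readEnvFile`, `resolveConfig`, `defaultSources`) are hypothetical and are not taken from `shared.js`; the env-file parser handles only simple `KEY=value` lines.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const KEYS = ['ENABLE_GARDEN_IMAGEGEN', 'OPENAI_API_KEY', 'OPENAI_BASE_URL', 'OPENAI_IMAGE_MODEL'];
const DEFAULTS = {
  OPENAI_BASE_URL: 'https://api.openai.com/v1',
  OPENAI_IMAGE_MODEL: 'gpt-image-2',
};

// Parse simple KEY=value lines; comment lines and blank lines are skipped.
function parseEnvText(text) {
  const out = {};
  for (const line of text.split(/\r?\n/)) {
    const m = line.match(/^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*?)\s*$/);
    if (!m) continue;
    out[m[1]] = m[2].replace(/^(['"])(.*)\1$/, '$2'); // strip matching quotes
  }
  return out;
}

// A missing or unreadable file simply contributes nothing.
function readEnvFile(file) {
  try {
    return parseEnvText(fs.readFileSync(file, 'utf8'));
  } catch {
    return {};
  }
}

// sources is ordered from highest to lowest priority (assumed precedence).
function resolveConfig(sources) {
  const config = {};
  for (const key of KEYS) {
    const hit = sources.find((s) => s[key] !== undefined && s[key] !== '');
    config[key] = hit ? hit[key] : DEFAULTS[key];
  }
  return config;
}

// The five sources listed above, in the documented order.
function defaultSources(cliArgs = {}) {
  return [
    cliArgs,
    process.env,
    readEnvFile(path.join(process.cwd(), '.env')),
    readEnvFile(path.join(process.cwd(), '.gateway.env')),
    readEnvFile(path.join(os.homedir(), '.gateway.env')),
  ];
}
```

A call such as `resolveConfig(defaultSources({ OPENAI_IMAGE_MODEL: 'gpt-image-1' }))` would then prefer the CLI value for the model and fall back through the remaining sources for everything else.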

Default Output Directories

If users do not explicitly specify output paths, uniformly use the following directories in the current workspace:

Prompt directory: `garden-gpt-image-2/prompt/` (Recommended for all three modes A / B / C for easy reuse and version management)
Image directory: `garden-gpt-image-2/image/` (Only used in Mode A; determined by host in Mode B, no images generated in Mode C)

If the directories do not exist, scripts (Mode A) must create them automatically; in Mode B / C, run `mkdir -p` manually before writing prompts.
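
In Node.js, the equivalent of `mkdir -p` is `fs.mkdirSync` with `recursive: true`. The sketch below shows how a script might create both default directories; the function name `ensureOutputDirs` and its parameters are hypothetical, not part of the Skill's scripts.

```javascript
const fs = require('fs');
const path = require('path');

// Create the default prompt and image directories inside the workspace.
// With { recursive: true }, mkdirSync does not throw if they already exist.
function ensureOutputDirs(workspace = process.cwd(), root = 'garden-gpt-image-2') {
  const dirs = {
    prompt: path.join(workspace, root, 'prompt'),
    image: path.join(workspace, root, 'image'),
  };
  for (const dir of Object.values(dirs)) {
    fs.mkdirSync(dir, { recursive: true });
  }
  return dirs;
}
```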

Default Naming Rules

If users do not explicitly specify filenames, scripts should automatically generate filenames related to the current task and append the current timestamp to avoid duplicates.

Naming rules:

Prompt: `garden-gpt-image-2/prompt/<task-slug>-<timestamp>.md`
Image: `garden-gpt-image-2/image/<task-slug>-<timestamp>.png`

Where:

`<task-slug>`: A relevant short name automatically extracted based on current user requirements
`<timestamp>`: Current timestamp, e.g., `20260424-153045`

Examples:

garden-gpt-image-2/prompt/live-commerce-ui-20260424-153045.md

garden-gpt-image-2/image/live-commerce-ui-20260424-153045.png

garden-gpt-image-2/prompt/vr-headset-exploded-view-20260424-153102.md

garden-gpt-image-2/image/vr-headset-exploded-view-20260424-153102.png
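
A minimal sketch of how such filenames could be produced is shown below. It assumes the slug is derived mechanically from a short task name (in practice the agent may choose the slug itself) and that the timestamp uses local time in `YYYYMMDD-HHMMSS` form, matching the examples above. The function names are hypothetical.

```javascript
const path = require('path');

// Lowercase, replace runs of non-alphanumerics with '-', trim stray dashes.
function slugify(text, maxLength = 40) {
  const slug = text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '')
    .slice(0, maxLength)
    .replace(/-+$/, '');
  return slug || 'image-task'; // fallback name is an assumption
}

// Local-time timestamp in YYYYMMDD-HHMMSS form.
function formatTimestamp(date = new Date()) {
  const pad = (n) => String(n).padStart(2, '0');
  const day = `${date.getFullYear()}${pad(date.getMonth() + 1)}${pad(date.getDate())}`;
  const time = `${pad(date.getHours())}${pad(date.getMinutes())}${pad(date.getSeconds())}`;
  return `${day}-${time}`;
}

// Prompt and image share one base name so they are easy to pair up later.
function defaultPaths(taskName, date = new Date(), root = 'garden-gpt-image-2') {
  const base = `${slugify(taskName)}-${formatTimestamp(date)}`;
  return {
    prompt: path.posix.join(root, 'prompt', `${base}.md`),
    image: path.posix.join(root, 'image', `${base}.png`),
  };
}
```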

Prompt Saving Rules

| Mode | Mandatory to Save Prompt? | Description |
| --- | --- | --- |
| Mode A | ✅ Mandatory | Must save prompt when entering actual generation/editing workflow |
| Mode B | Recommended | Default to save for easy reuse; skip if users say "no" |
| Mode C | ✅ Mandatory | Users take the prompt to execute themselves; not saving is useless |

General rules (applicable to all three modes):

If users explicitly provide a prompt file path, use that file directly as input.
If users directly provide a text prompt, save the final prompt to `garden-gpt-image-2/prompt/` first.
If users explicitly specify `--prompt-output`, respect the user-specified path.
Otherwise, use the default naming rules to save automatically.
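
The general rules above amount to a short decision about where the prompt is read from and where it is written. The sketch below is one possible reading: `planPromptSave` and its field names are hypothetical, and the choice to ignore an output path when an existing prompt file is supplied is an assumption, since the rules do not cover that combination.

```javascript
// Decide where the prompt comes from and where (if anywhere) to save it.
function planPromptSave({ promptFile, promptText, promptOutput, defaultPath }) {
  if (promptFile) {
    // An existing prompt file is used directly as input; nothing new is written.
    return { source: 'file', readFrom: promptFile, writeTo: null };
  }
  if (!promptText) {
    throw new Error('Either a prompt file or prompt text is required');
  }
  // An explicit output path wins; otherwise fall back to the default naming rules.
  return { source: 'text', readFrom: null, writeTo: promptOutput || defaultPath };
}
```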

Image Saving Rules (Only Mode A)

If users explicitly specify `--image` or `--output`, respect the user-specified path.
Otherwise, save to `garden-gpt-image-2/image/` by default.
Filenames should be semantically related to the current task and appended with a timestamp.

Mode B follows the saving method determined by the host image tool; Mode C does not generate images.

Quick Usage

0. Detect Operating Mode (First Step for Any Task)

```bash
node skills/gpt-image-2/scripts/check-mode.js
```

The output tells you whether you are in Mode A / B / C, which determines whether to call `generate.js` / `edit.js` next. Steps 1-4 below apply only to Mode A.

1. Text-to-Image Generation (Mode A)

```bash
node skills/gpt-image-2/scripts/generate.js \
  --prompt "A cute baby sea otter" \
  --size 1024x1024 \
  --quality high
```

2. Generate Image with Prompt File (Mode A)

```bash
node skills/gpt-image-2/scripts/generate.js \
  --promptfile garden-gpt-image-2/prompt/poster-20260424-153045.md
```

3. Edit Existing Image (Mode A)

```bash
node skills/gpt-image-2/scripts/edit.js \
  --image assets/source.png \
  --prompt "Replace the background with a clean studio scene"
```

4. Local Editing with Mask (Mode A)

```bash
node skills/gpt-image-2/scripts/edit.js \
  --image assets/source.png \
  --mask assets/mask.png \
  --prompt "Replace only the masked area with a glass vase"
```
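
For orientation, an edit request of this kind is sent to `POST /images/edits` as multipart form data. The sketch below shows how such a form could be assembled with the `FormData` and `Blob` globals available in Node.js 18 and later. It is not the actual `edit.js`: the function name is hypothetical, the PNG content type is assumed, and the field names (`model`, `prompt`, `image`, `mask`) follow the OpenAI-compatible images API.

```javascript
const fs = require('fs');
const path = require('path');

// Build the multipart form for an edit request; the mask is optional.
function buildEditForm({ imagePath, maskPath, prompt, model = 'gpt-image-2' }) {
  const toBlob = (file) => new Blob([fs.readFileSync(file)], { type: 'image/png' });
  const form = new FormData();
  form.append('model', model);
  form.append('prompt', prompt);
  form.append('image', toBlob(imagePath), path.basename(imagePath));
  if (maskPath) {
    form.append('mask', toBlob(maskPath), path.basename(maskPath));
  }
  return form;
}
```

The resulting form can be passed as the `body` of a `fetch` call together with an `Authorization: Bearer <key>` header; `fetch` sets the multipart boundary header itself, so `Content-Type` should not be set manually.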

5. "Usage" for Mode B / C

There is no command-line entry point; in these modes the Skill is only a prompt engineering guide:

Mode B: Render the final prompt → call the host's built-in `image_generation`-type tool (pass the prompt as a parameter) → get the image.
Mode C: Render the final prompt → save to `garden-gpt-image-2/prompt/<task-slug>-<timestamp>.md` → display the content directly to users → tell users which image tools can reuse it directly.

JSON Template Working Method

When JSON templates are provided in `references/`, follow these rules:

First find the closest category directory from `SKILL.md`.
Then locate the specific template file.
`{argument ...}` in the template indicates replaceable parameters.
Values explicitly provided by users are filled in directly.
If users do not provide a value but the template marks a `default`, use the default value first.
If missing information will significantly affect the result, actively ask users.
Users can also explicitly say "generate randomly for me"; in that case, keep the default value or randomize reasonably within the scope the template allows.
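
The fill-in rules above can be sketched as a small renderer. The exact placeholder syntax is not specified here beyond `{argument ...}`, so the form `{argument name="..." default="..."}` used below is purely an assumption for illustration, as is the function name `renderTemplate`.

```javascript
// Hypothetical placeholder form: {argument name="host" default="a friendly presenter"}
const PLACEHOLDER = /\{argument\s+name="([^"]+)"(?:\s+default="([^"]*)")?\s*\}/g;

// User values win, then template defaults; anything else is reported as missing
// so the agent can ask a targeted question instead of guessing.
function renderTemplate(template, values = {}) {
  const missing = [];
  const prompt = template.replace(PLACEHOLDER, (match, name, fallback) => {
    if (values[name] !== undefined) return String(values[name]);
    if (fallback !== undefined) return fallback;
    missing.push(name);
    return match; // leave the placeholder visible until it is resolved
  });
  return { prompt, missing };
}
```

Returning the list of unresolved names, rather than throwing, keeps the decision about whether to ask the user or to randomize with the caller.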

Questioning Rules

When the template lacks key variables, do not ask generally like "What style do you want?" Instead, ask precisely based on the template fields.

For example, when the live commerce UI template lacks the main subject, prioritize asking:

Who is the host?
Use real photos, celebrity names, character descriptions, or generate completely randomly?

When missing product information, ask:

What is the product name?
Is the product price specified?
Do you want me to automatically complete comments and gift content?

Template Index

Read only the closest specific template file for the task type; do not read the entire `references/` at once.

1. Methodology Master Document

Read first:

```
references/prompt-writing.md
```

Applicable to:

You haven't decided how to construct JSON templates
You need to judge which fields to ask, which can be default, and which can be randomized
You need to abstract cases into reusable templates

2. UI Mockups (`references/ui-mockups/`)

Suitable for various "interface + content" mockup visuals. Currently implemented:

`live-commerce-ui.md`: E-commerce live streaming screenshot mockup (host + chat area + gift area + product card)
`social-interface-mockup.md`: Social platform post detail page mockup (Twitter/X, Xiaohongshu, Weibo, Threads, etc.)
`product-card-overlay.md`: Landing page hero / detail page main image (character + product + selling points + price)
`chat-interface-scene.md`: Chat / dialogue interface mockup (iMessage, WeChat, group chat, AI assistant)
`short-video-cover-ui.md`: Short video cover / live streaming thumbnail (YouTube, Douyin, Bilibili, VTuber stream)
`landing-page-case-study.md`: Dark-mode SaaS / marketing case study long page UI mockup (multiple sections + scroll narrative + data cards + CTA)

3. Product Visuals (`references/product-visuals/`)

Suitable for visuals with "products as the visual center". Currently implemented:

`exploded-view-poster.md`: Product exploded view poster (vertical stacked main body + callout + top logo + bottom brand area)
`white-background-product.md`: E-commerce pure white background main image (single product / multi-angle / minimalist marketing overlay)
`premium-studio-product.md`: High-end studio commercial product image (magazine advertisement-level atmosphere)
`packaging-showcase.md`: Gift box / packaging display image (outer box + content display)
`lifestyle-product-scene.md`: Lifestyle product scene image (product appears in real scenarios)
`ecommerce-marketing-board.md`: Chinese-style e-commerce super composite sales board (main image + detail page + selling points + usage steps + scenarios + TVC storyboard combination in one image)

4. Maps (`references/maps/`)

Suitable for "map-style visuals" (infographics have been extracted to independent category 17). Currently implemented:

`food-map.md`: Hand-drawn city food map (numbered spots + legend + central mascot)
`travel-route-map.md`: Travel route map (multi-day itinerary / single-day city walk / outdoor route)
`illustrated-city-map.md`: Illustrated city style map (landmarks + landscapes + cultural elements)
`store-distribution-map.md`: Brand store / service coverage distribution map
`itinerary-day-trip-map.md`: One-day trip split poster (left parchment itinerary card + right fantasy realistic map, 5-7 stations strictly aligned)

5. Slides & Visual Docs (`references/slides-and-visual-docs/`)

Suitable for visual documents that "explain one thing clearly on one page". Currently implemented:

`dense-explainer-slides.md`: Irasutoya × Kasumigaseki hybrid high-density explanation Slide
`policy-style-slide.md`: Policy / government announcement / white paper style explanation Slide
`visual-report-page.md`: Business report executive summary / investor briefing / annual report overview page
`educational-diagram-slide.md`: Educational schematic (concept / mechanism / process decomposition)

6. Poster & Campaigns (`references/poster-and-campaigns/`)

Suitable for "brand key visuals + campaigns + banners + magazine covers". Currently implemented:

`brand-poster.md`: Brand main poster (product / character / pure text proposition)
`campaign-kv.md`: Campaign Key Visual + derivative layout system
`banner-hero.md`: Web hero / landing page / app banner (horizontal composition + CTA)
`editorial-cover.md`: Magazine / journal / publication cover
`biomimetic-concept-poster.md`: Biomimetic industrial design concept poster (natural prototype → evolution bar → hero render → multi-view technical drawing)
`vintage-editorial-infographic.md`: Vintage archive / 1940s editorial-style infographic poster (character + formula + timeline + model, Bell Labs style)
`character-catalog-poster.md`: Multi-version infographic poster of the same character (constellation / element / dynasty / personality series cards)
`lineup-comparison-poster.md`: Series product lineup comparison infographic poster (30+ SKUs in one image + legend + level key)

7. Portraits & Characters (`references/portraits-and-characters/`)

Suitable for "character visuals". Currently implemented:

`professional-portrait.md`: Professional business portrait (LinkedIn / team page / media illustrations)
`founder-portrait.md`: Founder media blockbuster portrait (dramatic lighting + title space reserved)
`virtual-host.md`: VTuber / virtual host profile card + live preview
`character-sheet.md`: Comprehensive character setting sheet (three views + expressions + clothing + color palette)
`pose-reference-sheet.md`: N×N pose / action dictionary reference sheet (multiple poses of the same character, dance / combat / fitness)

8. Scenes & Illustrations (`references/scenes-and-illustrations/`)

Suitable for illustration-style visuals focusing on "atmosphere + story + emotion". Currently implemented:

`healing-scene.md`: Healing daily / seasonal scene illustration
`concept-scene.md`: Cinematic concept large scene / IP key art
`picture-book-scene.md`: Children's book / picture book inner page / holiday card
`minimalist-mood-scene.md`: Minimalist blank atmosphere image / literary wallpaper

9. Editing Workflows (`references/editing-workflows/`)

Suitable for image modification tasks based on existing images (corresponding to `scripts/edit.js`). Currently implemented:

`background-replacement.md`: Background replacement (product / portrait / outdoor / studio scene)
`local-object-replacement.md`: Local object replacement (with or without mask)
`object-removal.md`: Removal of clutter / passers-by / wires / defects
`product-retouching.md`: Product retouching (gloss / label / shadow / defects)
`portrait-local-edit.md`: Portrait local modification (hairstyle / clothing / makeup / accessories)

10. Avatars & Profile (`references/avatars-and-profile/`)

Suitable for "personal image" visuals such as stylized avatars / character settings / grids / stickers / series portraits. Currently implemented:

`style-transfer-selfie.md`: Convert reference image characters into any style such as cosplay / gothic / retro film / idol photo
`character-grid-portrait.md`: N×N grid portrait of the same character (multiple professions / expressions / dynasties / styles)
`themed-3d-icon.md`: Kawaii 3D / Minecraft / skeuomorphic 3D app icon-style avatar
`sticker-set.md`: Sticker set / emoji collection (independent elements + stroke + label)
`cultural-portrait-series.md`: Dynasty / myth / literature / ethnic series portraits

11. Storyboards & Sequences (`references/storyboards-and-sequences/`)

Suitable for "narrative sequence" visuals such as multi-storyboard / comics / relationship diagrams / process steps. Currently implemented:

`four-panel-comic.md`: 4-panel comic / satire comic / joke comic (exposition → development → climax → resolution + dialogue bubbles)
`manga-spread-page.md`: Single-page / double-page manga storyboard (irregular grids + dialogue + inner thoughts)
`anime-key-visual.md`: Single-image anime KV / light novel cover / IP poster
`character-relationship-diagram.md`: Character relationship diagram poster (cards + relationship lines + legend)
`recipe-process-flowchart.md`: Recipe / tutorial / process step diagram (numbering + illustrations + descriptions)
`product-tvc-storyboard.md`: Product TVC commercial advertisement storyboard (9-panel real-shot texture + shot description + duration)
`cinematic-storyboard-grid.md`: Cinematic narrative storyboard contact sheet (3×4 / 4×4, continuous narrative + cinematic still)
`process-photo-board.md`: Real-person cinematic process board (equipment wearing / makeup / training / operation decomposition, numbering + step progression)

12. Grids & Collages (`references/grids-and-collages/`)

Suitable for "multi-panel grid / collage / project board" visuals. Currently implemented:

`banner-grid-2x2.md`: 2×2 marketing banner set (generate 4 unified series designs at once)
`lookbook-grid.md`: 7-day lookbook / 9-grid self-care / TOP N list image
`mixed-style-multi-panel.md`: Mixed-style collage (same subject interpreted in different styles)
`anime-pitch-board.md`: Anime / game / film project pitch board (KV + characters + worldview + copy)
`ad-banner-multi-grid.md`: Multi-industry / multi-theme mixed advertisement banner grid (each grid has independent industry + style + copy)

13. Branding & Packaging (`references/branding-and-packaging/`)

Suitable for "brand identity system / mascot / packaging design" visuals. Currently implemented:

`brand-identity-board.md`: Brand identity system board (logo + color scheme + font + application mockup)
`mascot-brand-kit.md`: Mascot multi-panel brand identity set (main image + three views + expressions + applications)
`cosmetic-packaging.md`: Cosmetic / skincare single bottle / series / gift box packaging
`beverage-label-design.md`: Beverage / food / condiment label design (Chinese style / Japanese style / Western style)
`full-mascot-brand-doc.md`: 18+ module large-scale brand identity + mascot full-process document (DNA / moodboard / sketch / line drawing / 3D / color scheme / material / application overview in one image)
`character-merch-board.md`: IP character + peripheral / packaging / poster / social profile multi-element comprehensive brand board

14. Typography & Text Layout (`references/typography-and-text-layout/`)

Suitable for types where "text is the main visual" such as "text-first / bilingual layout". Currently implemented:

`title-safe-poster.md`: Large-text proposition poster (Japanese high-energy / Swiss minimalist / retro printing)
`bilingual-layout-visual.md`: Chinese-English / Chinese-Japanese bilingual layout visual (culture / academic / cross-cultural brand)

15. Assets & Props (`references/assets-and-props/`)

Suitable for "set of materials / game assets" visuals such as icon sets / game screenshots. Currently implemented:

`retro-skeuomorphic-icons.md`: Skeuomorphic / Y2K / pixel icon set (unified style in a set)
`game-screenshot-mockup.md`: In-game screenshot mockup (HUD + subtitles + task panel)

16. Academic Figures (`references/academic-figures/`)

Suitable for illustrations for papers / top conference submissions / academic posters / defense PPT / proposal defense / journal submission Graphical Abstract. The overall preference is a white background + publication fonts + geometric precision + low-saturation engineering colors (mainly dark blue / gray-blue / black-gray, ≤3 main colors) + printable in monochrome. Fabricated quantitative data (values / contour lines / color scale ranges / formulas) is strictly prohibited.

CS / CV / ML direction:

`method-pipeline-overview.md`: Method overview diagram / pipeline figure (multi-stage blocks + data flow; variant 4 provides left/middle/right three-stage technical roadmap for engineering)
`neural-network-architecture.md`: Neural network architecture diagram (layer blocks + tensor shape + skip connections)
`qualitative-comparison-grid.md`: Multi-method qualitative comparison grid (rows = samples, columns = methods)

Engineering / natural sciences / general defense:

`scientific-schematic.md`: Concept / principle / experimental device schematic (high degree of freedom, natural language template)
`mechanism-diagram.md`: Mechanism schematic / causal link / transformation path (central object + multi-stage transformation + result area; includes three variants: three-stage causal chain / cyclic self-excitation / multi-branch competition)
`multi-condition-comparison.md`: Multi-condition / multi-scenario result comparison diagram (side-by-side results of the same object under different conditions, 2×2 / 1×N / M×N; emphasizes strict uniformity between panels)
`publication-chart.md`: Publication-ready data chart (bar / line / scatter / heatmap / box)

Overview / abstract / defense homepage:

`graphical-abstract.md`: Graphical Abstract for journal submission (four variants: horizontal 4-stage / central expansion / square / vertical)
`research-overview-poster.md`: Research overview diagram for proposal / defense / presentation homepage (three layers top-middle-bottom + five modules; includes three variants: central radiation / left-right double column / minimalist)

Selection strategy: For CS/CV/ML papers, prefer `method-pipeline-overview` + `qualitative-comparison-grid`; for engineering / energy / chemical engineering / materials directions, prefer `method-pipeline-overview` variant 4 + `mechanism-diagram` + `multi-condition-comparison`; use `graphical-abstract` for journal submission abstract images; use `research-overview-poster` for the defense PPT homepage.

17. Infographics (`references/infographics/`)

Suitable for "large-scale information visualization" visuals such as infographics / high-density popular science / hand-drawn infographics / KPI dashboards. Currently implemented:

`legend-heavy-infographic.md`: High-legend-density popular science / causal chain / evolution / anatomical diagram (bilingual)
`hand-drawn-infographic.md`: Hand-drawn style infographic (macaron / morandi / blackboard / kraft paper; natural language template)
`bento-grid-infographic.md`: Bento grid modular infographic (high-density multi-module widget arrangement)
`comparison-infographic.md`: Binary / multi-element comparison infographic (A vs B / package tiers / misconceptions vs correct answers)
`step-by-step-infographic.md`: Step-by-step tutorial infographic (illustrative, warm; non-engineering flowchart)
`kpi-dashboard-infographic.md`: KPI dashboard-style infographic (annual review / Wrapped / business dashboard)

18. Technical Diagrams (`references/technical-diagrams/`)

Suitable for engineering schematics such as system architecture / process / sequence / state machine / ER / mind map / network topology. All templates share a dark grid background + monospaced font + role-coded color scheme, and each template also comes with a light variant.

⚠️ Note: This directory generates PNG bitmaps, not editable SVG; use mermaid / draw.io / excalidraw / Figma if editable versions are needed. Currently implemented:

`system-architecture.md`: System architecture diagram (frontend + backend + DB + cache + queue + external services)
`flowchart-decision.md`: Flowchart / decision diagram (BPMN shape semantics + Yes/No branches)
`sequence-diagram.md`: Sequence diagram (actor + lifeline + message arrow + activation bar)
`state-machine.md`: State machine / lifecycle diagram (state + transition + guard / action)
`er-diagram.md`: ER diagram / data model diagram (entity + fields + PK/FK + crow's foot relationships)
`mind-map-tech.md`: Technical topic mind map (central + radial branches)
`network-topology.md`: Network topology diagram (device glyphs + zone / VPC + bandwidth / protocol labels)

Prompt Workflow (Mode-Aware)

Regardless of mode (A / B / C), the first 6 steps are shared; the difference lies in steps 7-8 for "image generation".

1. Run `check-mode.js` to determine the mode (A / B / C).
2. Judge whether the task is image generation or editing.
3. Identify which category directory it belongs to (refer to the "Template Index" above).
4. Read only the corresponding specific template file; do not read the entire `references/` at once.
5. Strictly follow the template format: most templates use JSON main templates (preferred for structured tasks), while a few (such as `infographics/hand-drawn-infographic.md` and `academic-figures/scientific-schematic.md`) use a hybrid form of "structured natural language + parameters", because forcing JSON would restrict creative freedom.
6. Map user input to template parameters; ask targeted clarification questions if key information is missing.

The prompt is now rendered. Branch by mode below:

7-A. Mode A: Save the final prompt to `garden-gpt-image-2/prompt/`, call `scripts/generate.js` / `scripts/edit.js`, and save images to `garden-gpt-image-2/image/`.
7-B. Mode B: Pass the final prompt directly to the host's image tool; save a copy of the prompt to `garden-gpt-image-2/prompt/` as needed.
7-C. Mode C: Save the final prompt to `garden-gpt-image-2/prompt/<task-slug>-<timestamp>.md`, display the complete prompt to users in the conversation, and attach a short "how to use / recommended tools" suggestion.

8. After the task ends, tell users in one sentence: what the current mode is, where the prompt is saved, and where the image (if any) is saved.

Important Constraints

General:

JSON in template files is a prompt structure template, not an API request body template.
In all three modes, the final content passed to the image model is a "rendered prompt string": it can be flattened JSON or structured natural-language paragraphs, used exactly as the template specifies.
Unless explicitly requested by users, do not copy the "mode description" from SKILL.md into the final prompt——that is meta-information for the Agent.

Only applicable to Mode A:

Generation scripts use a JSON body
Editing scripts use multipart form data
Responses are parsed from `data[0].b64_json` first, with `data[0].url` as a fallback
Do not introduce additional special query parameters unless explicitly required by the upstream interface
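
To make the Mode A constraints concrete, here is a hedged sketch of a JSON generation request and of the response parsing order. It is not the code in `generate.js` or `shared.js`: the function names are hypothetical, and passing `size` and `quality` through unchanged is an assumption based on the CLI flags shown in Quick Usage.

```javascript
// Build the URL and fetch options for POST /images/generations (JSON body).
function buildGenerationRequest({
  baseUrl = 'https://api.openai.com/v1',
  apiKey,
  model = 'gpt-image-2',
  prompt,
  size,
  quality,
}) {
  const body = { model, prompt };
  if (size) body.size = size;
  if (quality) body.quality = quality;
  return {
    url: `${baseUrl.replace(/\/+$/, '')}/images/generations`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(body),
    },
  };
}

// Prefer data[0].b64_json; fall back to data[0].url.
function extractImage(payload) {
  const first = payload && Array.isArray(payload.data) ? payload.data[0] : undefined;
  if (!first) throw new Error('Response contains no image data');
  if (first.b64_json) {
    return { kind: 'base64', bytes: Buffer.from(first.b64_json, 'base64') };
  }
  if (first.url) return { kind: 'url', url: first.url };
  throw new Error('Response has neither b64_json nor url');
}
```

A caller would pass `url` and `init` to `fetch`, parse the JSON response, and then either write `bytes` to the image path or download the returned `url`.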

When to Ask Questions

Only ask questions when one of the following is missing or unclear and would significantly affect the result:

No prompt target
No original image for editing tasks
Subject identity or visual type determines the result direction
Product / price / copy / UI text is a core component of the image
Users express multiple conflicting goals at the same time

Otherwise, prioritize making reasonable defaults and proceed.
