gemini-image-generator

Gemini Image Generator

Overview

Generate images using Google's Gemini API with support for text-to-image generation, image editing, and multi-image reference inputs. Supports both the fast Gemini 2.5 Flash model and the high-quality Gemini 3 Pro model with up to 4K resolution.

When to Use

  • Generating app icons, logos, and UI assets
  • Creating marketing visuals and promotional graphics
  • Prototyping UI designs with AI-generated placeholders
  • Generating game sprites and 2D assets
  • Creating concept art and mood boards
  • Editing or modifying existing images with text prompts
  • Style transfer using reference images

Prerequisites

  • Python 3.9+
  • google-genai package
  • GEMINI_API_KEY environment variable

Installation

bash
pip install google-genai

Getting an API Key

  1. Go to Google AI Studio
  2. Sign in with your Google account
  3. Click "Create API Key"
  4. Copy the key and set it as an environment variable:
bash
export GEMINI_API_KEY="your-api-key"
Add to your shell profile (~/.zshrc or ~/.bashrc) for persistence:
bash
echo 'export GEMINI_API_KEY="your-api-key"' >> ~/.zshrc
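Before making any request, it can help to fail early with a clear message when the key is missing. The following is a minimal sketch, not taken from scripts/generate_image.py; the helper name `require_api_key` and its `env` parameter are illustrative assumptions.

```python
import os


def require_api_key(env=None):
    """Return the Gemini API key, or raise a clear error if it is missing.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GEMINI_API_KEY environment variable not set")
    return key
```

Calling `require_api_key()` at startup surfaces a missing key before any network call is attempted.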

Quick Start

Generate a simple image:
bash
python scripts/generate_image.py -p "A fluffy orange cat sitting on a windowsill, warm sunlight, cozy atmosphere"
Generate with specific aspect ratio:
bash
python scripts/generate_image.py -p "Modern tech startup banner" -a 16:9 -o banner.png
Edit an existing image:
bash
python scripts/generate_image.py -p "Make the sky more dramatic with sunset colors" -i photo.jpg -o edited.png

Command Reference

python scripts/generate_image.py [options]

Required:
  -p, --prompt TEXT         Text prompt describing the image

Optional:
  -o, --output PATH         Output file path (default: auto-generated)
  -m, --model MODEL         Model to use (default: gemini-3-pro-image-preview)
  -a, --aspect-ratio RATIO  Aspect ratio (default: 1:1)
  -s, --size SIZE           Image size: 1K, 2K, 4K (default: 1K, Pro only)
  -i, --input-image PATH    Input image for editing mode
  -r, --reference-images    Reference image(s), can be repeated (max 14)
  -v, --verbose             Show detailed progress
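For readers who want to see how such an interface might be declared, here is a hedged `argparse` sketch that mirrors the options listed above. It is a reconstruction from the option table only; the real scripts/generate_image.py may define its parser differently, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="generate_image.py")
    parser.add_argument("-p", "--prompt", required=True,
                        help="Text prompt describing the image")
    parser.add_argument("-o", "--output", default=None,
                        help="Output file path (auto-generated when omitted)")
    parser.add_argument("-m", "--model", default="gemini-3-pro-image-preview",
                        help="Model to use")
    parser.add_argument("-a", "--aspect-ratio", default="1:1",
                        help="Aspect ratio, e.g. 16:9")
    parser.add_argument("-s", "--size", choices=["1K", "2K", "4K"], default="1K",
                        help="Image size (Pro model only)")
    parser.add_argument("-i", "--input-image", default=None,
                        help="Input image for editing mode")
    # action="append" lets -r be given several times, one path per flag.
    parser.add_argument("-r", "--reference-images", action="append", default=[],
                        help="Reference image, can be repeated")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Show detailed progress")
    return parser
```

With this sketch, `build_parser().parse_args(["-p", "cat", "-r", "a.png", "-r", "b.png"])` collects both reference paths into a single list.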

Models

Model                        Resolution   Best For
gemini-3-pro-image-preview   Up to 4K     Final assets, high quality, professional work
gemini-2.5-flash-image       1024px       Quick iterations, prototyping, batch generation

The Pro model is used by default. Use the Flash model for faster generation when quality is less critical:
bash
python scripts/generate_image.py -p "Quick concept sketch" -m gemini-2.5-flash-image

Aspect Ratios

Ratio   Use Case
1:1     App icons, profile pictures, thumbnails
2:3     Portrait photos, book covers
3:2     Landscape photos, postcards
3:4     Portrait photos, social media posts
4:3     Traditional photos, presentations
4:5     Instagram posts, portrait social media
5:4     Large format prints
9:16    Stories, vertical videos, mobile wallpapers
16:9    Widescreen banners, video thumbnails, headers
21:9    Ultrawide banners, cinematic headers
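A small validator can catch a mistyped ratio before a request is sent. This sketch assumes the ratios in the table above are the complete supported set; `SUPPORTED_ASPECT_RATIOS` and `describe_aspect_ratio` are hypothetical names, not part of the script.

```python
# Ratios copied from the table above; assumed to be the full supported set.
SUPPORTED_ASPECT_RATIOS = (
    "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
)


def describe_aspect_ratio(ratio):
    """Return (width / height, orientation) for a supported ratio string."""
    if ratio not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(f"Unsupported aspect ratio: {ratio!r}")
    width, height = (int(part) for part in ratio.split(":"))
    if width == height:
        orientation = "square"
    elif width > height:
        orientation = "landscape"
    else:
        orientation = "portrait"
    return width / height, orientation
```

For example, `describe_aspect_ratio("9:16")` reports a portrait orientation, while an unlisted value such as `"7:5"` raises `ValueError`.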

Image Sizes

Available for Gemini 3 Pro model only:
Size   Resolution   Use Case
1K     1024px       Web graphics, thumbnails
2K     2048px       Print materials, detailed graphics
4K     4096px       High-resolution prints, large displays

bash
python scripts/generate_image.py -p "Detailed landscape" -s 4K -o landscape_4k.png
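The interaction between model and size can be expressed as a tiny helper. This is a sketch under the assumption that a size request is simply ignored for non-Pro models; whether the real script ignores it or reports an error is not stated here, and `effective_size` is a hypothetical name.

```python
PRO_MODEL = "gemini-3-pro-image-preview"
VALID_SIZES = ("1K", "2K", "4K")


def effective_size(model, size="1K"):
    """Return the size that will actually apply for the given model.

    Assumption: only the Pro model honors 2K/4K; any other model yields 1K.
    """
    if size not in VALID_SIZES:
        raise ValueError(f"Unknown size {size!r}; expected one of {VALID_SIZES}")
    return size if model == PRO_MODEL else "1K"
```

So `effective_size("gemini-2.5-flash-image", "4K")` falls back to `"1K"`, which matches the Flash model's fixed 1024px output.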

Prompt Engineering Guide

Prompt Structure

Use this formula for effective prompts:
[Subject] + [Style] + [Details] + [Quality modifiers]
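The formula can be applied mechanically when prompts are assembled in code. A minimal sketch follows; the function name `build_prompt` and the choice of a comma separator are illustrative assumptions.

```python
def build_prompt(subject, style="", details=(), quality=()):
    """Join subject, style, details, and quality modifiers into one prompt string.

    Empty or whitespace-only pieces are skipped so optional parts can be omitted.
    """
    parts = [subject, style, *details, *quality]
    return ", ".join(part.strip() for part in parts if part and part.strip())


prompt = build_prompt(
    "a fluffy orange tabby cat sitting on a windowsill",
    style="watercolor",
    details=["golden hour lighting"],
    quality=["high quality", "detailed"],
)
```

Keeping the four pieces separate makes it easy to vary one of them (for example the style) while holding the rest fixed during iteration.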

Techniques

1. Be Specific About the Subject
Bad:  "a cat"
Good: "a fluffy orange tabby cat sitting on a windowsill"
2. Specify Art Style
  • Photorealistic, cartoon, anime, oil painting, watercolor
  • Digital art, 3D render, pixel art, vector illustration
  • Specific styles: "in the style of Studio Ghibli", "cyberpunk aesthetic"
3. Include Environment and Lighting
  • "golden hour lighting", "dramatic shadows", "soft ambient light"
  • "neon-lit cityscape", "cozy interior", "misty forest"
4. Add Quality Modifiers
  • "high quality", "detailed", "professional"
  • "sharp focus", "studio lighting", "cinematic"
5. Specify Composition
  • "centered composition", "rule of thirds"
  • "close-up", "wide shot", "bird's eye view", "isometric"

Example Prompts by Use Case

App Icon
Minimalist app icon for a weather app, blue gradient background,
white cloud with golden sun rays, flat design, rounded corners,
iOS style, clean and modern
Marketing Banner
Professional tech startup banner, abstract geometric shapes
flowing from left to right, purple and blue gradient,
modern and clean aesthetic, corporate style
Game Sprite
Pixel art character sprite, fantasy warrior with glowing sword,
32x32 style, transparent background, retro 16-bit game aesthetic,
vibrant colors
Product Photo
Professional product photo of wireless earbuds on white background,
soft shadows, studio lighting, minimalist composition,
commercial photography style
Concept Art
Futuristic city skyline at sunset, flying vehicles between
towering skyscrapers, neon lights reflecting on wet streets,
cyberpunk atmosphere, cinematic composition, detailed
UI Mockup Asset
Abstract gradient background for mobile app, soft purple to pink
transition, subtle geometric patterns, modern and minimal,
suitable for dark text overlay

Generation Modes

Text-to-Image

Generate images from text descriptions:
bash
python scripts/generate_image.py -p "Your description here" -o output.png
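For context, a text-to-image call made directly through the google-genai SDK might look roughly like the sketch below. It assumes the SDK's `genai.Client` and `client.models.generate_content` interface and that image data arrives as an `inline_data` part; the function name `generate_image` is illustrative, and this is not a copy of what scripts/generate_image.py does internally.

```python
import os
from pathlib import Path


def generate_image(prompt, output_path, model="gemini-2.5-flash-image"):
    """Request an image for `prompt` and write the first image part to disk."""
    # Imported lazily so this module can be loaded without google-genai installed.
    from google import genai

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(model=model, contents=[prompt])

    # Assumption: image bytes are returned in a part's inline_data.data field.
    for part in response.candidates[0].content.parts:
        if part.inline_data is not None:
            destination = Path(output_path)
            destination.write_bytes(part.inline_data.data)
            return destination
    raise RuntimeError("No image in response")
```

The bytes are written exactly as returned, so the file extension in `output_path` should match the format the API actually sends back.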

Image Editing

Modify an existing image with a text prompt:
bash
python scripts/generate_image.py \
  -p "Change the background to a tropical beach at sunset" \
  -i original.jpg \
  -o edited.png

Multi-Image Reference

Use up to 14 reference images to guide style or content:
bash
python scripts/generate_image.py \
  -p "Create a new character in this art style" \
  -r style_ref1.png \
  -r style_ref2.png \
  -o new_character.png
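When reference images are collected programmatically, the 14-image limit is worth checking before the request is built. A minimal sketch, with `check_reference_images` as a hypothetical helper name:

```python
MAX_REFERENCE_IMAGES = 14  # limit stated in this document


def check_reference_images(paths):
    """Return the paths as a list, or raise if there are more than the limit."""
    paths = list(paths)
    if len(paths) > MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"Got {len(paths)} reference images; at most "
            f"{MAX_REFERENCE_IMAGES} are supported"
        )
    return paths
```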

Examples

Generate App Icons

bash
# iOS-style weather icon
python scripts/generate_image.py \
  -p "Minimalist weather app icon, blue sky gradient, white fluffy cloud, sun peeking out, flat design, rounded square, iOS 17 style" \
  -a 1:1 \
  -o weather_icon.png

# Fitness app icon
python scripts/generate_image.py \
  -p "Fitness app icon, running figure silhouette, orange to red gradient background, energetic and dynamic, modern flat design" \
  -a 1:1 \
  -o fitness_icon.png

Create Marketing Assets

bash
# Website hero banner
python scripts/generate_image.py \
  -p "Abstract tech hero banner, flowing data visualization, dark blue background with glowing cyan accents, futuristic and professional" \
  -a 21:9 \
  -s 2K \
  -o hero_banner.png

# Social media post
python scripts/generate_image.py \
  -p "Motivational quote background, soft sunrise gradient, minimalist mountain silhouette, peaceful and inspiring" \
  -a 4:5 \
  -o social_post_bg.png

Generate Game Assets

bash
# Character sprite
python scripts/generate_image.py \
  -p "Pixel art hero character, knight with blue cape and silver armor, idle pose, transparent background, 16-bit retro style" \
  -a 1:1 \
  -o knight_sprite.png

# Environment tile
python scripts/generate_image.py \
  -p "Grass tile for top-down RPG, seamless pattern, vibrant green with small flowers, pixel art style, 32x32 aesthetic" \
  -a 1:1 \
  -o grass_tile.png

Edit Photos

bash
# Change background
python scripts/generate_image.py \
  -p "Replace background with a cozy coffee shop interior" \
  -i portrait.jpg \
  -o portrait_coffee_shop.png

# Style enhancement
python scripts/generate_image.py \
  -p "Enhance with dramatic cinematic color grading, increase contrast, add film grain" \
  -i landscape.jpg \
  -o landscape_cinematic.png

Troubleshooting

"GEMINI_API_KEY environment variable not set"

Set your API key:
bash
export GEMINI_API_KEY="your-api-key"

"Rate limit exceeded"

Wait a few minutes and try again. For batch operations, add delays between requests.
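For batch jobs, the "add delays" advice is often implemented as retry with exponential backoff. The sketch below is generic: `RateLimitError` is a hypothetical stand-in for whatever exception your client raises on rate limiting, and `call_with_backoff` is an illustrative name.

```python
import time


class RateLimitError(Exception):
    """Hypothetical placeholder for the client's rate-limit exception."""


def call_with_backoff(fn, retries=4, base_delay=2.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with delays of 2s, 4s, 8s, ...

    `sleep` is injectable so the waiting behavior can be tested without pausing.
    """
    for attempt in range(retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == retries:
                raise  # out of retries; surface the error to the caller
            sleep(base_delay * 2 ** attempt)
```

Doubling the delay after each failure backs off quickly under sustained limiting while keeping the first retry short.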

"Content policy violation"

Modify your prompt to avoid content that violates Google's usage policies. Try:
  • Using more generic descriptions
  • Avoiding specific brand names or copyrighted characters
  • Removing potentially sensitive content

"No image in response"

The model sometimes returns text instead of an image. Try:
  • Making your prompt more specific
  • Adding "generate an image of" to your prompt
  • Using a different aspect ratio
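When debugging this case, it helps to see the text the model returned in place of an image. The sketch below walks a response object and assumes the `candidates -> content -> parts` layout with `inline_data` and `text` attributes, as used by the google-genai SDK; `extract_image_bytes` is a hypothetical helper name.

```python
def extract_image_bytes(response):
    """Return the first image payload in a response, or raise with any text found."""
    texts = []
    for candidate in getattr(response, "candidates", None) or []:
        content = getattr(candidate, "content", None)
        for part in getattr(content, "parts", None) or []:
            inline = getattr(part, "inline_data", None)
            if inline is not None and getattr(inline, "data", None):
                return inline.data
            if getattr(part, "text", None):
                texts.append(part.text)
    # No image part found: include the model's text so the cause is visible.
    detail = ": " + " ".join(texts) if texts else ""
    raise RuntimeError("No image in response" + detail)
```

Including the returned text in the error often reveals why no image was produced, for example a clarifying question from the model.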

"Unsupported image format"

Supported formats for input images: PNG, JPEG, WebP
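A quick pre-flight check on the file extension can catch unsupported inputs before upload. This is a sketch that judges format by extension only, not by file contents; `SUPPORTED_INPUT_TYPES` and `input_mime_type` are hypothetical names.

```python
from pathlib import Path

# Extension-to-MIME mapping for the formats listed above.
SUPPORTED_INPUT_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
}


def input_mime_type(path):
    """Return the MIME type for a supported input image, judged by extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_INPUT_TYPES:
        raise ValueError(f"Unsupported image format: {suffix or '(no extension)'}")
    return SUPPORTED_INPUT_TYPES[suffix]
```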

Size option not working

The size option (2K, 4K) is only available for gemini-3-pro-image-preview. The Flash model always generates 1024px images.

Best Practices

  • Start simple: Begin with clear, concise prompts and iterate
  • Use the right model: Flash for speed, Pro for quality
  • Match aspect ratio to use case: 16:9 for banners, 1:1 for icons
  • Save high-quality versions: Use 4K when you need detailed assets
  • Iterate on prompts: Small changes can significantly affect results
  • Use reference images: For consistent style across multiple generations
  • Add quality modifiers: "high quality", "detailed", "professional"
  • Specify what you don't want: "no text", "simple background", "no people"