nano-banana-pro

Nano Banana Pro Image Generation & Editing

Generate and edit images using Google's Gemini 3 Pro model with advanced transparency support.

Prerequisites

  1. Dependencies:
    bash
    pip install google-genai Pillow numpy python-dotenv
  2. API Key: The script loads the key from .env automatically. Only ask the user if the script fails with "No API key found".
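How scripts/generate.py resolves the key is not shown in this document. As a rough sketch, assuming the key is stored under `GEMINI_API_KEY` or `GOOGLE_API_KEY` (both names are assumptions here) and that python-dotenv's `load_dotenv()` has already copied the `.env` entries into the environment, the lookup and the "No API key found" failure might look like this. `find_api_key` is a hypothetical helper, not a function from the script.

```python
import os

# Assumed variable names; the actual script may look for different ones.
KEY_NAMES = ("GEMINI_API_KEY", "GOOGLE_API_KEY")


def find_api_key(env=None):
    """Return the first non-empty key found, or raise with a clear message."""
    env = os.environ if env is None else env
    for name in KEY_NAMES:
        value = env.get(name, "").strip()
        if value:
            return value
    raise RuntimeError("No API key found. Add GEMINI_API_KEY=... to your .env file.")
```

Passing the environment as a plain mapping keeps the lookup easy to test without touching the real `.env` file.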

CLI Usage (REQUIRED)

ALWAYS use the CLI script. Do NOT write Python code or create .py files.
Run scripts/generate.py directly:
bash
# Basic generation
python scripts/generate.py "a cute banana sticker" -o banana.png

# With transparency (for game assets, stickers, icons)
python scripts/generate.py "pixel art sword" -o sword.png --transparent

# Custom size and aspect ratio
python scripts/generate.py "game logo" -o logo.png --size 4K --ratio 16:9

**Options:**
- `-o, --output` - Output filename (default: output.png)
- `--transparent` - Extract true alpha channel using difference matting
- `--size` - 1K, 2K, or 4K (default: 2K)
- `--ratio` - Aspect ratio: 1:1, 16:9, 9:16, etc. (default: 1:1)
- `--model` - Model override (default: gemini-3-pro-image-preview)

**Note:** The script loads the API key from `.env` automatically. Do not check for API keys manually or ask the user about them - just run the script and it will error with instructions if the key is missing.
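The source of scripts/generate.py is not reproduced here. As an illustration only, the option list above could be declared with `argparse` roughly as follows; the help strings and the `build_parser` name are assumptions, while the flags and defaults are taken from the list.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(description="Generate an image from a prompt.")
    parser.add_argument("prompt", help="Text description of the image")
    parser.add_argument("-o", "--output", default="output.png", help="Output filename")
    parser.add_argument("--transparent", action="store_true",
                        help="Extract a true alpha channel using difference matting")
    parser.add_argument("--size", choices=["1K", "2K", "4K"], default="2K")
    parser.add_argument("--ratio", default="1:1", help="Aspect ratio, e.g. 16:9")
    parser.add_argument("--model", default="gemini-3-pro-image-preview")
    return parser


# Parse the same arguments as the "custom size and aspect ratio" command.
args = build_parser().parse_args(
    ["game logo", "-o", "logo.png", "--size", "4K", "--ratio", "16:9"]
)
print(args.output, args.size, args.ratio, args.transparent)
```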

Intent Detection

Analyze the user's request to determine the intent:

| Intent | Triggers | Action |
|---|---|---|
| Generate | "create", "generate", "make", "draw", "design" | Text-to-image |
| Edit | "edit", "change", "modify", "update", "fix" | Image-to-image |
| Transparency | "transparent", "remove background", "alpha", "cutout", "PNG with transparency" | Use difference matting |
| Text overlay | "add text", "write on", "label", "caption" | Use Gemini 3 Pro for accurate text |
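The trigger words above can be read as a simple keyword lookup. The sketch below is one possible interpretation, not part of the skill: `detect_intents` and the intent keys are hypothetical names, and whole-word matching is an assumption made to avoid false hits such as "fix" inside "prefix".

```python
import re

# Trigger phrases copied from the table; the dictionary keys are illustrative.
TRIGGERS = {
    "generate": ["create", "generate", "make", "draw", "design"],
    "edit": ["edit", "change", "modify", "update", "fix"],
    "transparency": ["transparent", "remove background", "alpha", "cutout",
                     "png with transparency"],
    "text_overlay": ["add text", "write on", "label", "caption"],
}


def detect_intents(request):
    """Return every intent whose trigger phrase appears as whole words."""
    text = request.lower()
    found = []
    for intent, phrases in TRIGGERS.items():
        if any(re.search(r"\b" + re.escape(p) + r"\b", text) for p in phrases):
            found.append(intent)
    return found


print(detect_intents("Create a transparent sticker of a cat"))
```

A request can match several intents at once, which is why the function returns a list rather than a single label.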

Resolution Selection

Choose resolution based on use case:

| Resolution | Best For | Pixel Output |
|---|---|---|
| 1K | Quick previews, thumbnails, web icons | ~1024px |
| 2K | Social media, standard web images | ~2048px |
| 4K | Print, professional assets, sprite sheets | ~4096px |

Heuristics:
  • Sprite sheets, game assets, print materials → 4K
  • Social media, blog images, presentations → 2K
  • Quick tests, thumbnails, prototypes → 1K
When uncertain, ask the user or default to 2K.
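As a small illustration of these heuristics, the mapping can be written as a keyword table with a 2K fallback. The `choose_size` helper and the exact keywords are assumptions made for this sketch; they are not taken from scripts/generate.py.

```python
# Keywords drawn from the heuristics above, mapped to the --size value.
SIZE_BY_USE_CASE = {
    "sprite sheet": "4K", "game asset": "4K", "print": "4K",
    "social media": "2K", "blog": "2K", "presentation": "2K",
    "test": "1K", "thumbnail": "1K", "prototype": "1K",
}


def choose_size(use_case, default="2K"):
    """Return the size for the first keyword found, else the default."""
    text = use_case.lower()
    for keyword, size in SIZE_BY_USE_CASE.items():
        if keyword in text:
            return size
    return default


print(choose_size("Sprite sheet for a platformer"))
```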

Aspect Ratios

Available: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
Selection guide:
  • Square content (icons, avatars, social posts) → 1:1
  • Portrait (mobile, vertical video) → 9:16 or 3:4
  • Landscape (desktop, presentations) → 16:9 or 3:2
  • Cinematic/ultrawide → 21:9
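When the user gives target pixel dimensions rather than a ratio, one option is to snap to the closest supported value. The sketch below assumes that behaviour is wanted; `nearest_ratio` is a hypothetical helper, and nothing here says the script performs this step itself.

```python
import math

SUPPORTED_RATIOS = ["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4",
                    "9:16", "16:9", "21:9"]


def ratio_value(ratio):
    """Convert a 'W:H' string to a float width/height value."""
    width, height = ratio.split(":")
    return int(width) / int(height)


def nearest_ratio(width, height):
    """Pick the supported ratio closest to width/height.

    Distances are compared in log space so that portrait and landscape
    shapes are treated symmetrically.
    """
    target = width / height
    return min(SUPPORTED_RATIOS,
               key=lambda r: abs(math.log(ratio_value(r) / target)))


print(nearest_ratio(1920, 1080))
```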

Core Implementation

Basic Generation

python
from google import genai
from google.genai import types
from PIL import Image
import io

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="Your descriptive prompt here",
    config=types.GenerateContentConfig(
        response_modalities=['IMAGE'],
        image_config=types.ImageConfig(
            aspect_ratio="1:1",  # or other ratio
            image_size="2K"     # 1K, 2K, or 4K
        ),
    ),
)

# Extract image from response
for part in response.parts:
    if part.inline_data is not None:
        image = Image.open(io.BytesIO(part.inline_data.data))
        image.save("output.png")
        break

Image Editing

python
# Load existing image
input_image = Image.open("input.png")

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[
        input_image,
        "Edit instruction: Change the background to sunset colors",
    ],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        image_config=types.ImageConfig(
            aspect_ratio="1:1",
            image_size="2K"
        ),
    ),
)

Multi-Turn Editing

Preserve context across edits using thought signatures:
python
# Assumes `client`, `image`, and `config` are set up as in the examples above.

# First edit
response1 = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[image, "Add a red hat"],
    config=config,
)

# Continue editing (include the previous model turn)
response2 = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[
        image,
        "Add a red hat",
        # Pass the model's Content, not the whole response object,
        # so the earlier turn is preserved as context
        response1.candidates[0].content,
        "Now make the hat blue instead",
    ],
    config=config,
)

Transparency Extraction

When the user needs transparent images, use difference matting. See scripts/transparency.py.
When to use:
  • User explicitly asks for transparency
  • Game sprites, icons, logos
  • Assets that will be composited
  • Cutouts and stickers
Process:
  1. Generate image on pure white background (#FFFFFF)
  2. Edit same image to pure black background (#000000)
  3. Calculate alpha from pixel differences
  4. Recover original colors
Key insight: Opaque pixels appear identical on both backgrounds (distance ≈ 0), while fully transparent pixels show only the background color (maximum distance).
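The arithmetic behind steps 3 and 4 can be sketched for a single pixel. If a pixel with true color C and opacity α is composited over white and over black, the observed values are W = αC + (1 − α)·255 and B = αC, so α = 1 − (W − B)/255 and C = B/α. The function below is an illustrative pure-Python version under that model, averaging the difference over the three channels; `matte_pixel` is a hypothetical name, and the actual implementation in scripts/transparency.py may differ (for example by working on whole arrays with NumPy).

```python
def matte_pixel(on_white, on_black):
    """Recover (r, g, b, a) from one pixel seen over white and over black.

    Both inputs are (r, g, b) tuples in the 0-255 range.
    """
    # Average per-channel difference: 0 for opaque, 255 for fully transparent.
    diff = sum(w - b for w, b in zip(on_white, on_black)) / 3
    alpha = min(1.0, max(0.0, 1.0 - diff / 255.0))
    if alpha == 0.0:
        # Nothing of the subject is visible, so the color is undefined.
        return (0, 0, 0, 0)
    # Over black the observed value is alpha * color, so divide alpha out.
    color = tuple(min(255, round(b / alpha)) for b in on_black)
    return color + (round(alpha * 255),)


print(matte_pixel((255, 0, 0), (255, 0, 0)))      # opaque red pixel
print(matte_pixel((255, 255, 255), (0, 0, 0)))    # fully transparent pixel
```

In practice the two generated images rarely match pixel for pixel, so a real implementation would likely need some tolerance for noise near the extremes.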
python
from scripts.transparency import extract_alpha_difference_matting

# After generating white and black background versions
final_image = extract_alpha_difference_matting(img_on_white, img_on_black)
final_image.save("output.png")  # RGBA with true transparency

Prompt Engineering

Fundamental Principle

"Describe the scene, don't just list keywords."
Narrative paragraphs outperform disconnected word lists.

Effective Prompt Structure

[Style/Medium] of [Subject] in [Context/Setting], [Lighting], [Additional details]
Examples:
  • Photorealistic: A professional studio photograph of a brass steampunk pocket watch, shot with a 50mm lens, soft diffused lighting from the left, shallow depth of field with bokeh background, 4K HDR quality.
  • Illustration: A detailed digital illustration of a medieval blacksmith's forge, isometric perspective, warm orange glow from the furnace, dieselpunk aesthetic with exposed pipes and riveted metal plates.
  • Product mockup: A product photography shot of a ceramic coffee mug on a marble surface, natural window lighting, minimalist Scandinavian style, clean white background.
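The prompt template can also be filled in programmatically. This is only a sketch of the structure: `build_prompt` and its parameter names are invented for illustration, and a hand-written narrative prompt may still read better than a templated one.

```python
def build_prompt(style, subject, setting, lighting, details=""):
    """Assemble '[Style] of [Subject] in [Setting], [Lighting], [details]'."""
    prompt = f"{style} of {subject} in {setting}, {lighting}"
    if details:
        prompt += f", {details}"
    return prompt + "."


print(build_prompt(
    "A detailed digital illustration",
    "a medieval blacksmith's forge",
    "an isometric workshop",
    "warm orange glow from the furnace",
    "exposed pipes and riveted metal plates",
))
```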

Text in Images

For images containing text, use Gemini 3 Pro (not Imagen):
  • Keep text to 25 characters or less per element
  • Use 2-3 distinct text phrases maximum
  • Specify font style generally (bold, elegant, handwritten)
  • Indicate size (small, medium, large)
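The first two limits are easy to check before sending a prompt. The following sketch assumes the text elements are available as a list of strings; `check_text_elements` is a hypothetical helper, and it uses 3 as the upper bound for "2-3 distinct text phrases".

```python
MAX_CHARS = 25    # per text element
MAX_PHRASES = 3   # upper end of "2-3 distinct text phrases"


def check_text_elements(phrases):
    """Return a list of human-readable problems; empty means all limits hold."""
    problems = []
    if len(phrases) > MAX_PHRASES:
        problems.append(f"{len(phrases)} phrases; use at most {MAX_PHRASES}")
    for phrase in phrases:
        if len(phrase) > MAX_CHARS:
            problems.append(
                f"{phrase!r} is {len(phrase)} characters; keep it to {MAX_CHARS} or fewer"
            )
    return problems


print(check_text_elements(["GRAND OPENING", "Saturday 10 AM"]))
```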

Quality Modifiers

Add these for enhanced output:
  • Photography: 4K, HDR, studio photo, professional lighting
  • Art: detailed, by a professional, high-quality illustration
  • General: high-fidelity, crisp details, polished finish

Error Handling

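python
import time

from google.genai import errors

# HTTP status codes worth retrying: rate limiting and transient server errors
RETRYABLE_CODES = (429, 500, 502, 503, 504)

def generate_with_retry(client, *, model, contents, config, max_attempts=5):
    """Call generate_content, retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return client.models.generate_content(
                model=model, contents=contents, config=config
            )
        except errors.APIError as e:
            code = getattr(e, "code", None) or getattr(e, "status", None)
            # Re-raise non-retryable errors, and give up after the last attempt
            if code not in RETRYABLE_CODES or attempt >= max_attempts:
                raise
            delay = min(30, 2 ** (attempt - 1))  # 1s, 2s, 4s, ... capped at 30s
            time.sleep(delay)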
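To make the wait times concrete, the delay expression used in `generate_with_retry` can be evaluated on its own. `backoff_delays` is a helper written just for this illustration; it lists the sleep after each failed attempt, with no sleep after the final attempt because that one re-raises.

```python
def backoff_delays(max_attempts=5, cap=30):
    """Seconds slept after each failed attempt except the last one."""
    return [min(cap, 2 ** (attempt - 1)) for attempt in range(1, max_attempts)]


print(backoff_delays())    # [1, 2, 4, 8]
print(backoff_delays(8))   # [1, 2, 4, 8, 16, 30, 30]
```

With the default of five attempts, a request that keeps failing waits 15 seconds in total before the error is raised.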

Model Selection

| Model | Use Case |
|---|---|
| gemini-3-pro-image-preview | Complex edits, text rendering, multi-turn, transparency workflows |
| gemini-2.5-flash-image | Quick generation, high volume, simple tasks |
| imagen-4.0-generate-001 | Photorealistic images, no editing needed |

Default to gemini-3-pro-image-preview for most tasks.
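One way to read the model table is as a short decision function whose result is passed to `--model`. The flag names and the order of the checks below are assumptions made for this sketch; only the model identifiers come from the table.

```python
def choose_model(needs_editing=False, needs_text=False, needs_transparency=False,
                 high_volume=False, photorealistic_only=False):
    """Map task requirements to a model name, defaulting to Gemini 3 Pro."""
    if needs_editing or needs_text or needs_transparency:
        return "gemini-3-pro-image-preview"
    if high_volume:
        return "gemini-2.5-flash-image"
    if photorealistic_only:
        return "imagen-4.0-generate-001"
    return "gemini-3-pro-image-preview"


print(choose_model(high_volume=True))
```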

File References

  • scripts/generate.py - CLI for image generation (use this instead of writing code)
  • scripts/transparency.py - Difference matting implementation
  • references/prompts.md - Extended prompt examples by category