nano-banana-pro
Nano Banana Pro Image Generation & Editing
Generate and edit images using Google's Gemini 3 Pro model with advanced transparency support.
Prerequisites
- Dependencies:
  ```bash
  pip install google-genai Pillow numpy python-dotenv
  ```
- API Key: The script loads it from `.env` automatically. Only ask the user if the script fails with "No API key found".
CLI Usage (REQUIRED)
ALWAYS use the CLI script. Do NOT write Python code or create `.py` files.
Run `scripts/generate.py` directly:
```bash
# Basic generation
python scripts/generate.py "a cute banana sticker" -o banana.png

# With transparency (for game assets, stickers, icons)
python scripts/generate.py "pixel art sword" -o sword.png --transparent

# Custom size and aspect ratio
python scripts/generate.py "game logo" -o logo.png --size 4K --ratio 16:9
```
**Options:**
- `-o, --output` - Output filename (default: output.png)
- `--transparent` - Extract true alpha channel using difference matting
- `--size` - 1K, 2K, or 4K (default: 2K)
- `--ratio` - Aspect ratio: 1:1, 16:9, 9:16, etc. (default: 1:1)
- `--model` - Model override (default: gemini-3-pro-image-preview)
**Note:** The script loads the API key from `.env` automatically. Do not check for API keys manually or ask the user about them; just run the script, and it will exit with instructions if the key is missing.
Intent Detection
Analyze the user's request to determine the intent:
| Intent | Triggers | Action |
|---|---|---|
| Generate | "create", "generate", "make", "draw", "design" | Text-to-image |
| Edit | "edit", "change", "modify", "update", "fix" | Image-to-image |
| Transparency | "transparent", "remove background", "alpha", "cutout", "PNG with transparency" | Use difference matting |
| Text overlay | "add text", "write on", "label", "caption" | Use Gemini 3 Pro for accurate text |
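
The trigger table above can be read as a simple keyword lookup. The sketch below is only an illustration of that reading: `INTENT_TRIGGERS` and `detect_intents` are hypothetical names that are not part of the skill's scripts, and matching on whole words is an assumption made to avoid false hits such as "fix" inside "prefix".

```python
import re

# Trigger words copied from the table above; the intent keys are illustrative labels.
INTENT_TRIGGERS = {
    "generate": ["create", "generate", "make", "draw", "design"],
    "edit": ["edit", "change", "modify", "update", "fix"],
    "transparency": ["transparent", "remove background", "alpha", "cutout",
                     "png with transparency"],
    "text_overlay": ["add text", "write on", "label", "caption"],
}


def detect_intents(request: str) -> list[str]:
    """Return every intent whose trigger appears as a whole word or phrase."""
    text = request.lower()
    matched = []
    for intent, triggers in INTENT_TRIGGERS.items():
        # \b keeps "alpha" from matching inside "alphabet".
        if any(re.search(rf"\b{re.escape(t)}\b", text) for t in triggers):
            matched.append(intent)
    return matched


print(detect_intents("Create a transparent logo"))
```

A request can match several rows at once, so the sketch returns a list rather than a single intent.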
Resolution Selection
Choose resolution based on use case:
| Resolution | Best For | Pixel Output |
|---|---|---|
| 1K | Quick previews, thumbnails, web icons | ~1024px |
| 2K | Social media, standard web images | ~2048px |
| 4K | Print, professional assets, sprite sheets | ~4096px |
Heuristics:
- Sprite sheets, game assets, print materials → 4K
- Social media, blog images, presentations → 2K
- Quick tests, thumbnails, prototypes → 1K
When uncertain, ask the user or default to 2K.
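
As a rough illustration of the heuristics above, the lookup below maps a free-text use case to a `--size` value. `SIZE_KEYWORDS` and `pick_size` are hypothetical helpers, and plain substring matching is a simplifying assumption rather than the skill's actual logic.

```python
# Keywords taken from the heuristics above, checked from largest size to smallest.
SIZE_KEYWORDS = [
    ("4K", ("sprite", "game asset", "print")),
    ("2K", ("social", "blog", "presentation")),
    ("1K", ("test", "thumbnail", "prototype", "preview")),
]


def pick_size(use_case: str, default: str = "2K") -> str:
    """Return the first size whose keyword occurs in the use case, else the default."""
    text = use_case.lower()
    for size, keywords in SIZE_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return size
    return default


print(pick_size("Sprite sheet for a platformer"))
```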
Aspect Ratios
Available: `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9`

Selection guide:
- Square content (icons, avatars, social posts) → `1:1`
- Portrait (mobile, vertical video) → `9:16` or `3:4`
- Landscape (desktop, presentations) → `16:9` or `3:2`
- Cinematic/ultrawide → `21:9`
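
When the user gives target pixel dimensions instead of a ratio, one possible approach is to snap to the nearest supported ratio. The sketch below assumes the ratio list above is complete; `closest_ratio` is a hypothetical helper, and comparing on a log scale is a design choice so that portrait and landscape deviations are treated symmetrically.

```python
import math

SUPPORTED_RATIOS = ["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4",
                    "9:16", "16:9", "21:9"]


def ratio_value(ratio: str) -> float:
    """Convert a 'W:H' string to width divided by height."""
    width, height = ratio.split(":")
    return int(width) / int(height)


def closest_ratio(width: int, height: int) -> str:
    """Pick the supported ratio closest to width/height on a log scale."""
    target = width / height
    return min(SUPPORTED_RATIOS,
               key=lambda r: abs(math.log(ratio_value(r) / target)))


print(closest_ratio(2560, 1080))
```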
Core Implementation
Basic Generation
```python
from google import genai
from google.genai import types
from PIL import Image
import io

client = genai.Client()
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents="Your descriptive prompt here",
    config=types.GenerateContentConfig(
        response_modalities=['IMAGE'],
        image_config=types.ImageConfig(
            aspect_ratio="1:1",  # or other ratio
            image_size="2K"  # 1K, 2K, or 4K
        ),
    ),
)

# Extract image from response
for part in response.parts:
    if part.inline_data is not None:
        image = Image.open(io.BytesIO(part.inline_data.data))
        image.save("output.png")
        break
```
Image Editing
```python
# Reuses `client`, `types`, and `Image` from the basic generation example.
# Load existing image
input_image = Image.open("input.png")
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[
        input_image,
        "Edit instruction: Change the background to sunset colors"
    ],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        image_config=types.ImageConfig(
            aspect_ratio="1:1",
            image_size="2K"
        ),
    ),
)
```
Multi-Turn Editing
Preserve context across edits using thought signatures:
```python
# Reuses `client` from above; `image` is the loaded input image and
# `config` is a GenerateContentConfig like the one in the editing example.
# First edit
response1 = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[image, "Add a red hat"],
    config=config,
)

# Continue editing (include previous response)
response2 = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[
        image,
        "Add a red hat",
        # Pass the model turn's content, not the raw response object,
        # so the previous output is kept for context preservation.
        response1.candidates[0].content,
        "Now make the hat blue instead"
    ],
    config=config,
)
```
Transparency Extraction
When the user needs transparent images, use difference matting. See `scripts/transparency.py`.
When to use:
- User explicitly asks for transparency
- Game sprites, icons, logos
- Assets that will be composited
- Cutouts and stickers

Process:
- Generate image on pure white background (#FFFFFF)
- Edit same image to pure black background (#000000)
- Calculate alpha from pixel differences
- Recover original colors

Key insight: Opaque pixels appear identical on both backgrounds (distance ≈ 0), while fully transparent pixels show the background color (maximum distance).
```python
from scripts.transparency import extract_alpha_difference_matting

# After generating white and black background versions
final_image = extract_alpha_difference_matting(img_on_white, img_on_black)
final_image.save("output.png")  # RGBA with true transparency
```
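
To make the key insight concrete, here is a minimal NumPy sketch of difference matting. It is not the code in `scripts/transparency.py`, whose implementation is not shown here; `alpha_from_white_black` is a hypothetical name, and the sketch assumes both renders are pixel-aligned HxWx3 `uint8` RGB arrays of the same subject.

```python
import numpy as np


def alpha_from_white_black(on_white: np.ndarray, on_black: np.ndarray) -> np.ndarray:
    """Return an HxWx4 uint8 RGBA array recovered from two aligned RGB renders."""
    white = on_white.astype(np.float64) / 255.0
    black = on_black.astype(np.float64) / 255.0

    # Distance between the two renders per pixel. A fully transparent pixel
    # shows pure white vs pure black, which is a distance of sqrt(3).
    distance = np.linalg.norm(white - black, axis=-1)
    alpha = np.clip(1.0 - distance / np.sqrt(3.0), 0.0, 1.0)

    # On black, observed = alpha * color, so dividing by alpha recovers the color.
    safe_alpha = np.maximum(alpha, 1e-6)[..., None]
    color = np.clip(black / safe_alpha, 0.0, 1.0)

    rgba = np.concatenate([color, alpha[..., None]], axis=-1)
    return np.round(rgba * 255.0).astype(np.uint8)


# One opaque red pixel and one fully transparent pixel.
on_white = np.array([[[255, 0, 0], [255, 255, 255]]], dtype=np.uint8)
on_black = np.array([[[255, 0, 0], [0, 0, 0]]], dtype=np.uint8)
print(alpha_from_white_black(on_white, on_black))
```

In practice the two generated images rarely match pixel for pixel, so a real implementation would likely need some tolerance for noise near edges.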
Prompt Engineering
Fundamental Principle
"Describe the scene, don't just list keywords."
Narrative paragraphs outperform disconnected word lists.
Effective Prompt Structure
```
[Style/Medium] of [Subject] in [Context/Setting], [Lighting], [Additional details]
```
Examples:
```
# Photorealistic
A professional studio photograph of a brass steampunk pocket watch,
shot with a 50mm lens, soft diffused lighting from the left,
shallow depth of field with bokeh background, 4K HDR quality.

# Illustration
A detailed digital illustration of a medieval blacksmith's forge,
isometric perspective, warm orange glow from the furnace,
dieselpunk aesthetic with exposed pipes and riveted metal plates.

# Product mockup
A product photography shot of a ceramic coffee mug on a marble surface,
natural window lighting, minimalist Scandinavian style, clean white background.
```
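
The prompt structure can also be filled in programmatically. The helper below is a small sketch of that idea; `build_prompt` and its parameter names are hypothetical and simply mirror the bracketed slots of the template.

```python
def build_prompt(style: str, subject: str, setting: str,
                 lighting: str, details: str = "") -> str:
    """Assemble '[Style] of [Subject] in [Setting], [Lighting], [Details]'."""
    prompt = f"{style} of {subject} in {setting}, {lighting}"
    if details:
        prompt += f", {details}"
    return prompt + "."


print(build_prompt(
    style="A product photography shot",
    subject="a ceramic coffee mug",
    setting="a minimalist Scandinavian kitchen",
    lighting="natural window lighting",
    details="clean white background",
))
```

The result is still a single descriptive sentence rather than a keyword list, which keeps it in line with the fundamental principle.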
Text in Images
For images containing text, use Gemini 3 Pro (not Imagen):
- Keep text to 25 characters or less per element
- Use 2-3 distinct text phrases maximum
- Specify font style generally (bold, elegant, handwritten)
- Indicate size (small, medium, large)
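
The first two guidelines are easy to check before sending a prompt. The sketch below treats the 25-character and three-phrase limits above as defaults; `check_text_elements` is a hypothetical helper, not part of the skill's scripts.

```python
def check_text_elements(elements: list[str], max_chars: int = 25,
                        max_elements: int = 3) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(elements) > max_elements:
        problems.append(
            f"{len(elements)} text phrases given; use at most {max_elements}"
        )
    for text in elements:
        if len(text) > max_chars:
            problems.append(
                f"'{text}' has {len(text)} characters; keep it to {max_chars} or fewer"
            )
    return problems


print(check_text_elements(["SUMMER SALE", "Up to 50% off everything in store"]))
```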
Quality Modifiers
Add these for enhanced output:
- Photography: 4K, HDR, studio photo, professional lighting
- Art: detailed, by a professional, high-quality illustration
- General: high-fidelity, crisp details, polished finish
Error Handling
```python
import time

from google.genai import errors


def generate_with_retry(client, *, model, contents, config, max_attempts=5):
    """Call generate_content, retrying transient errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return client.models.generate_content(
                model=model, contents=contents, config=config
            )
        except errors.APIError as e:
            code = getattr(e, "code", None) or getattr(e, "status", None)
            # Retry only on rate limits and transient server errors.
            if code not in (429, 500, 502, 503, 504) or attempt >= max_attempts:
                raise
            delay = min(30, 2 ** (attempt - 1))  # 1s, 2s, 4s, ... capped at 30s
            time.sleep(delay)
```
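
To see what the backoff formula in the retry loop produces, the sketch below lists the waits between attempts. `backoff_delays` is a hypothetical helper written only for illustration; it reuses the same `min(30, 2 ** (attempt - 1))` expression and omits the final attempt, after which the error is raised instead of sleeping.

```python
def backoff_delays(max_attempts: int = 5, cap: int = 30) -> list[int]:
    """Seconds slept after each failed attempt except the last one."""
    return [min(cap, 2 ** (attempt - 1)) for attempt in range(1, max_attempts)]


print(backoff_delays())        # waits for the default of five attempts
print(sum(backoff_delays()))   # total worst-case sleep time in seconds
```

Raising `max_attempts` past six only adds 30-second waits, because the cap takes over from that point.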
Model Selection
| Model | Use Case |
|---|---|
| | Complex edits, text rendering, multi-turn, transparency workflows |
| | Quick generation, high volume, simple tasks |
| | Photorealistic images, no editing needed |
Default to gemini-3-pro-image-preview for most tasks.
File References
- `scripts/generate.py` - CLI for image generation (use this instead of writing code)
- `scripts/transparency.py` - Difference matting implementation
- `references/prompts.md` - Extended prompt examples by category