gemini-imagegen

Gemini Image Generation (Nano Banana Pro)

Generate and edit images using Google's Gemini API. The environment variable GEMINI_API_KEY must be set.
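
A minimal sketch of checking the key up front, so a missing variable fails with a readable message instead of a bare KeyError. The helper name `require_api_key` is hypothetical and not part of any SDK.

```python
import os


def require_api_key(env=None, name="GEMINI_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running.")
    return key
```

The returned value could then be passed as `api_key` when constructing the client shown under Core API Pattern.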

Default Model

Model                       Resolution  Best For
gemini-3-pro-image-preview  1K-4K       All image generation (default)

Note: Always use this Pro model. Only use a different model if explicitly requested.

Quick Reference

Default Settings

  • Model: gemini-3-pro-image-preview
  • Resolution: 1K (default, options: 1K, 2K, 4K)
  • Aspect Ratio: 1:1 (default)

Available Aspect Ratios

1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9

Available Resolutions

1K (default), 2K, 4K
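
Since both settings are passed as plain strings, a typo such as "4k" or "7:5" is easy to make. The sketch below checks values against the lists above before any request is sent. `validate_image_options` is a hypothetical helper, and the tuples simply restate the values listed in this section.

```python
SUPPORTED_ASPECT_RATIOS = (
    "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
)
SUPPORTED_IMAGE_SIZES = ("1K", "2K", "4K")


def validate_image_options(aspect_ratio="1:1", image_size="1K"):
    """Return the options as a dict, or raise ValueError on an unlisted value.

    Hypothetical helper; the defaults mirror the documented defaults.
    """
    if aspect_ratio not in SUPPORTED_ASPECT_RATIOS:
        raise ValueError(f"Unsupported aspect ratio: {aspect_ratio!r}")
    if image_size not in SUPPORTED_IMAGE_SIZES:
        raise ValueError(f"Unsupported image size: {image_size!r}")
    return {"aspect_ratio": aspect_ratio, "image_size": image_size}
```

Assuming `types.ImageConfig` accepts these two keyword arguments as in the examples in this document, the result could be unpacked as `types.ImageConfig(**validate_image_options("16:9", "2K"))`.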

Core API Pattern

python
import os
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

# Basic generation (1K, 1:1 - defaults)
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=["Your prompt here"],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
    ),
)

for part in response.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        image = part.as_image()
        image.save("output.jpg")  # .jpg, since Gemini returns JPEG by default

Custom Resolution & Aspect Ratio

python
from google.genai import types

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[prompt],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        image_config=types.ImageConfig(
            aspect_ratio="16:9",  # Wide format
            image_size="2K"       # Higher resolution
        ),
    )
)

Resolution Examples

python
# 1K (default) - Fast, good for previews
image_config=types.ImageConfig(image_size="1K")

# 2K - Balanced quality/speed
image_config=types.ImageConfig(image_size="2K")

# 4K - Maximum quality, slower
image_config=types.ImageConfig(image_size="4K")

Aspect Ratio Examples

python
# Square (default)
image_config=types.ImageConfig(aspect_ratio="1:1")

# Landscape wide
image_config=types.ImageConfig(aspect_ratio="16:9")

# Ultra-wide panoramic
image_config=types.ImageConfig(aspect_ratio="21:9")

# Portrait
image_config=types.ImageConfig(aspect_ratio="9:16")

# Photo standard
image_config=types.ImageConfig(aspect_ratio="4:3")
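
When the target is a known pixel size rather than a named format, one option is to pick the listed ratio nearest to it. The sketch below is an illustration only: `closest_aspect_ratio` is a hypothetical helper, and comparing on a log scale is a design choice made here so that wide and tall mismatches are treated symmetrically.

```python
import math

SUPPORTED_ASPECT_RATIOS = (
    "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
)


def closest_aspect_ratio(width, height):
    """Return the supported ratio string nearest to width/height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)

    def distance(ratio):
        w, h = ratio.split(":")
        return abs(math.log(int(w) / int(h)) - target)

    return min(SUPPORTED_ASPECT_RATIOS, key=distance)
```

For example, a 1920x1080 target maps to "16:9" and a 1080x1920 target maps to "9:16"; the result can be passed as `aspect_ratio` in the config.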

Editing Images

Pass existing images with text prompts:
python
from PIL import Image

img = Image.open("input.png")
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=["Add a sunset to this scene", img],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
    ),
)

Multi-Turn Refinement

Use chat for iterative editing:
python
from google.genai import types

chat = client.chats.create(
    model="gemini-3-pro-image-preview",
    config=types.GenerateContentConfig(response_modalities=['TEXT', 'IMAGE'])
)

response = chat.send_message("Create a logo for 'Acme Corp'")

# Save first image...

response = chat.send_message("Make the text bolder and add a blue gradient")
# Save refined image...

Prompting Best Practices

Photorealistic Scenes

Include camera details: lens type, lighting, angle, mood.
"A photorealistic close-up portrait, 85mm lens, soft golden hour light, shallow depth of field"

Stylized Art

Specify style explicitly:
"A kawaii-style sticker of a happy red panda, bold outlines, cel-shading, white background"

Text in Images

Be explicit about font style and placement:
"Create a logo with text 'Daily Grind' in clean sans-serif, black and white, coffee bean motif"

Product Mockups

Describe lighting setup and surface:
"Studio-lit product photo on polished concrete, three-point softbox setup, 45-degree angle"
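
The prompts above all follow the same shape: a subject followed by comma-separated details. A small sketch of assembling them programmatically is shown below; `build_prompt` is a hypothetical helper, not something the API requires.

```python
def build_prompt(subject, *details):
    """Join a subject and optional detail phrases into one prompt string.

    Empty or whitespace-only details are skipped.
    """
    parts = [subject.strip()]
    parts.extend(d.strip() for d in details if d and d.strip())
    return ", ".join(parts)
```

Calling `build_prompt("A photorealistic close-up portrait", "85mm lens", "soft golden hour light", "shallow depth of field")` reproduces the photorealistic example above.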

Advanced Features

Google Search Grounding

Generate images based on real-time data:
python
response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=["Visualize today's weather in Tokyo as an infographic"],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
        tools=[{"google_search": {}}]
    )
)

Multiple Reference Images (Up to 14)

Combine elements from multiple sources:
python
from PIL import Image

response = client.models.generate_content(
    model="gemini-3-pro-image-preview",
    contents=[
        "Create a group photo of these people in an office",
        Image.open("person1.png"),
        Image.open("person2.png"),
        Image.open("person3.png"),
    ],
    config=types.GenerateContentConfig(
        response_modalities=['TEXT', 'IMAGE'],
    ),
)
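
Because the heading above states a limit of 14 reference images, it can help to check the count before building the request. This is a minimal sketch: `build_contents` is a hypothetical helper, and the limit constant just restates the number from the heading.

```python
MAX_REFERENCE_IMAGES = 14  # limit stated in the section heading


def build_contents(prompt, images, max_images=MAX_REFERENCE_IMAGES):
    """Return [prompt, *images], refusing more images than the stated limit."""
    images = list(images)
    if len(images) > max_images:
        raise ValueError(
            f"Got {len(images)} reference images; at most {max_images} are allowed."
        )
    return [prompt, *images]
```

The returned list has the same shape as the `contents` argument in the example above, with the prompt first and the opened images after it.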

Important: File Format & Media Type

CRITICAL: The Gemini API returns images in JPEG format by default. When saving, always use the .jpg extension to avoid media type mismatches.
python
# CORRECT - Use .jpg extension (Gemini returns JPEG)
image.save("output.jpg")

# WRONG - Will cause "Image does not match media type" errors
image.save("output.png")  # Creates JPEG with PNG extension!
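
One way to avoid hard-coding the extension is to derive it from the MIME type reported with each image part. The sketch below assumes each part exposes `inline_data.data` (raw bytes) and `inline_data.mime_type`; `save_image_parts` and the extension table are hypothetical, introduced here for illustration.

```python
from pathlib import Path

# Assumed mapping; extend it if other MIME types show up in responses.
EXTENSION_BY_MIME = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/webp": ".webp",
}


def save_image_parts(parts, directory, stem="output"):
    """Write each image part's raw bytes to a file named after its MIME type.

    Parts without inline image data (for example, text parts) are skipped.
    Returns the list of paths written.
    """
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    saved = []
    for part in parts:
        blob = getattr(part, "inline_data", None)
        if blob is None or not blob.data:
            continue
        ext = EXTENSION_BY_MIME.get(blob.mime_type, ".bin")
        path = directory / f"{stem}_{len(saved) + 1}{ext}"
        path.write_bytes(blob.data)
        saved.append(path)
    return saved
```

With a response from the earlier examples, usage would look like `save_image_parts(response.parts, "out")`.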

Converting to PNG (if needed)

If you specifically need PNG format:
python
from PIL import Image

# Generate with Gemini
for part in response.parts:
    if part.inline_data:
        img = part.as_image()
        # Convert to PNG by saving with explicit format
        img.save("output.png", format="PNG")

Verifying Image Format

Check actual format vs extension with the file command:
bash
file image.png
# If output shows "JPEG image data" - rename to .jpg!
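
The same check can be done from Python by looking at the leading bytes of the file, which is roughly what the file command does for these two formats. This is a sketch covering only JPEG and PNG; the function names are hypothetical.

```python
from pathlib import Path

JPEG_MAGIC = b"\xff\xd8\xff"        # start-of-image marker for JPEG
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"    # 8-byte PNG signature


def sniff_image_format(data):
    """Return "jpeg", "png", or None based on the leading bytes."""
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    if data.startswith(PNG_MAGIC):
        return "png"
    return None


def extension_matches(path, data):
    """Return True if the file extension agrees with the detected format."""
    expected = {"jpeg": {".jpg", ".jpeg"}, "png": {".png"}}
    fmt = sniff_image_format(data)
    return fmt is not None and Path(path).suffix.lower() in expected[fmt]
```

For a file on disk, `extension_matches("image.png", Path("image.png").read_bytes())` returning False would indicate the mismatch described above.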

Notes

  • All generated images include SynthID watermarks
  • Gemini returns JPEG format by default; always use the .jpg extension
  • Image-only mode (responseModalities: ["IMAGE"]) won't work with Google Search grounding
  • For editing, describe changes conversationally; the model understands semantic masking
  • Default to 1K resolution for speed; use 2K/4K when quality is critical
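
The note about image-only mode and Google Search grounding can be enforced with a small guard before a request is built. This is a sketch under the assumption that "image-only" means the modalities list lacks "TEXT"; `check_modalities` is a hypothetical helper.

```python
def check_modalities(response_modalities, use_search_grounding=False):
    """Return normalized modalities, rejecting image-only mode with grounding."""
    modalities = [m.upper() for m in response_modalities]
    if use_search_grounding and "TEXT" not in modalities:
        raise ValueError(
            "Google Search grounding needs 'TEXT' in response_modalities; "
            "image-only mode is not supported."
        )
    return modalities
```

For instance, `check_modalities(["IMAGE"], use_search_grounding=True)` raises, while `check_modalities(["TEXT", "IMAGE"], use_search_grounding=True)` passes the list through unchanged.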