nanobanana

Nano Banana - AI Image Generation

Generate and edit images using Google Gemini models. Supports two models:
  • Pro (gemini-3-pro-image-preview) - High quality, complex prompts, thinking mode
  • Flash (gemini-2.5-flash-image) - Fast, cheap, good for iteration

Prerequisites

Required:
  • GEMINI_API_KEY - Get from Google AI Studio
  • uv (recommended) or Python 3.10+ with google-genai installed
With uv (recommended, zero setup):
Dependencies are declared inline via PEP 723 and auto-installed on first run. Just use uv run instead of python3.
With pip (fallback):
bash
pip install -r <skill_dir>/requirements.txt

Quick Start

Default output: Images save to ~/Downloads/nanobanana_<timestamp>.png automatically. Do NOT pass -o unless the user specifies where to save. If the user provides a filename without a directory (e.g., "save it as robot.png"), use -o ~/Downloads/robot.png.
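The output rule above can be sketched as a small helper that decides which -o arguments, if any, to pass. This is a minimal illustration, not part of the skill: the function name output_args is hypothetical, and it only encodes the three cases described (no location given, bare filename, explicit path).

```python
from pathlib import Path
from typing import Optional


def output_args(user_path: Optional[str]) -> list[str]:
    """Return the -o arguments to pass to generate.py, or an empty list."""
    if not user_path:
        # No location requested: omit -o so the script applies its own default.
        return []
    path = Path(user_path).expanduser()
    if path.parent == Path("."):
        # Bare filename such as "robot.png": place it in ~/Downloads.
        path = Path.home() / "Downloads" / path.name
    return ["-o", str(path)]
```

A caller would append the returned list to the rest of the command line, so the default case adds nothing at all.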

Generate an image:

bash
uv run <skill_dir>/scripts/generate.py "a cute robot mascot, pixel art style"

Edit an existing image:

bash
uv run <skill_dir>/scripts/generate.py "make the background blue" -i input.jpg

Use Flash model for fast iteration:

bash
uv run <skill_dir>/scripts/generate.py "quick sketch of a cat" --model flash

Multi-image reference (style + subject):

bash
uv run <skill_dir>/scripts/generate.py "apply the style of the first image to the second" \
  -i style_ref.png subject.jpg

Generate with specific aspect ratio and resolution:

bash
uv run <skill_dir>/scripts/generate.py "cinematic landscape" --ratio 21:9 --size 4K

Save to a specific location:

bash
uv run <skill_dir>/scripts/generate.py "logo design" -o ~/Projects/brand/logo.png

Model Selection Guide

                  Pro (default)             Flash
Speed             Slower                    ~2-3x faster
Cost              Higher                    Lower
Text rendering    Good                      Unreliable
Complex scenes    Excellent                 Adequate
Thinking mode     Yes                       No
Best for          Final production images   Exploration, drafts, batch
Rule of thumb: Use Flash for exploration and batch generation, Pro for final output.
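As a rough illustration of the rule of thumb, the sketch below maps the two short names to the model IDs listed at the top of this document and picks one by stage. The names MODEL_ALIASES, resolve_model, and pick_model are hypothetical; the script's own alias handling may differ.

```python
# Short names and the model IDs this document lists for them.
MODEL_ALIASES = {
    "pro": "gemini-3-pro-image-preview",
    "flash": "gemini-2.5-flash-image",
}


def resolve_model(name: str = "pro") -> str:
    """Map 'pro' or 'flash' to a full model ID; pass any other value through."""
    return MODEL_ALIASES.get(name.lower(), name)


def pick_model(final_output: bool) -> str:
    """Rule of thumb: Pro for final output, Flash for exploration and batches."""
    return "pro" if final_output else "flash"
```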

Script Reference

scripts/generate.py

Main image generation script.
Usage: generate.py [OPTIONS] PROMPT

Arguments:
  PROMPT                Text prompt for image generation

Options:
  -o, --output PATH     Output file path (default: ~/Downloads/nanobanana_<timestamp>.png)
  -i, --input PATH...   Input image(s) for editing / reference (up to 14)
  -m, --model MODEL     Model: 'pro' (default), 'flash', or full model ID
  -r, --ratio RATIO     Aspect ratio (1:1, 16:9, 9:16, 21:9, etc.)
  -s, --size SIZE       Image size: 1K, 2K, or 4K (default: standard)
  --search              Enable Google Search grounding for accuracy
  --retries N           Max retries on rate limit (default: 3)
  -v, --verbose         Show detailed output
Supported aspect ratios:
  • 1:1 - Square (default)
  • 2:3, 3:2 - Portrait/Landscape
  • 3:4, 4:3 - Standard
  • 4:5, 5:4 - Photo
  • 9:16, 16:9 - Widescreen
  • 21:9 - Ultra-wide/Cinematic
Image sizes:
  • 1K - Fast, lower detail
  • 2K - Enhanced detail (2048px)
  • 4K - Maximum quality (3840px), best for text rendering
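A caller may want to check options before invoking the script. The sketch below is a hypothetical client-side pre-check built only from the ratios, sizes, and 14-image limit listed above; whether and how generate.py validates these itself is not stated here.

```python
from typing import Optional

SUPPORTED_RATIOS = {"1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"}
SUPPORTED_SIZES = {"1K", "2K", "4K"}
MAX_INPUT_IMAGES = 14


def check_options(ratio: Optional[str] = None,
                  size: Optional[str] = None,
                  input_count: int = 0) -> list[str]:
    """Return a list of problems; an empty list means the options look valid."""
    problems = []
    if ratio is not None and ratio not in SUPPORTED_RATIOS:
        problems.append(f"unsupported aspect ratio: {ratio}")
    if size is not None and size not in SUPPORTED_SIZES:
        problems.append(f"unsupported image size: {size}")
    if input_count > MAX_INPUT_IMAGES:
        problems.append(f"too many input images: {input_count} > {MAX_INPUT_IMAGES}")
    return problems
```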

scripts/batch_generate.py

Generate multiple images with sequential naming.
Usage: batch_generate.py [OPTIONS] PROMPT

Arguments:
  PROMPT                Text prompt for image generation

Options:
  -n, --count N         Number of images to generate (default: 10)
  -d, --dir PATH        Output directory (default: ~/Downloads)
  -p, --prefix STR      Filename prefix (default: "image")
  -m, --model MODEL     Model: 'pro' (default), 'flash', or full model ID
  -r, --ratio RATIO     Aspect ratio
  -s, --size SIZE       Image size (1K/2K/4K)
  --search              Enable Google Search grounding
  --retries N           Max retries per image on rate limit (default: 3)
  --delay SECONDS       Delay between generations (default: 3)
  --parallel N          Concurrent requests (default: 1, max recommended: 5)
  -q, --quiet           Suppress progress output
Example:
bash
uv run <skill_dir>/scripts/batch_generate.py "pixel art logo" -n 20 --model flash -d ./logos -p logo

Python API

Direct import (from another skill's script):

Note: When importing as a Python module, google-genai must be available in the calling script's environment. If using uv run, add a PEP 723 dependencies block to your own script (see example in Pattern 2 below).
python
import sys
from pathlib import Path
sys.path.insert(0, str(Path("<skill_dir>/scripts")))
from generate import generate_image, edit_image, batch_generate

# Generate image
result = generate_image(
    prompt="a futuristic city at night",
    output_path="city.png",
    aspect_ratio="16:9",
    image_size="4K",
    model="pro",
)

# Edit existing image
result = edit_image(
    prompt="add flying cars to the sky",
    input_path="city.png",
    output_path="city_edited.png",
)

# Multi-image reference
result = generate_image(
    prompt="combine the color palette of the first with the composition of the second",
    input_paths=["palette_ref.png", "composition_ref.png"],
    output_path="combined.png",
)

Return structure (always present):

python
{
    "success": True,       # or False
    "path": "/path/to/output.png",  # or None on failure
    "error": None,         # or error message string
    "metadata": {
        "model": "gemini-3-pro-image-preview",
        "prompt": "...",
        "aspect_ratio": "16:9",
        "image_size": "4K",
        "use_search": False,
        "input_images": None,        # or list of paths
        "text_response": "...",      # optional text from model
        "thinking": "...",           # Pro model reasoning (when available)
        "timestamp": "2025-01-26T...",
    }
}
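Because every key in the return structure is always present, a caller can branch on "success" without defensive lookups. The helper below is a minimal sketch of that handling; describe_result is a hypothetical name, and the sample dictionaries in the usage note are made up for illustration.

```python
def describe_result(result: dict) -> str:
    """Summarize a result dict of the shape shown above in one line."""
    if result["success"]:
        model = result["metadata"]["model"]
        return f"saved {result['path']} using {model}"
    # On failure, "path" is None and "error" carries the message.
    return f"generation failed: {result['error']}"
```

For example, a result with success False and error "No image in response" yields "generation failed: No image in response".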

Downstream Skill Integration Guide

Pattern 1: CLI wrapper (recommended for simple use)

bash
# In your skill's script:
uv run <nanobanana_dir>/scripts/generate.py "{prompt}" --model flash --ratio 16:9 -o output.png
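If the calling skill is itself written in Python, the same CLI wrapper can be driven through subprocess. The sketch below builds the argument list from the flags documented in the Script Reference; build_command and run_generate are hypothetical names, and treating exit code 0 as success is an assumption, since the script's exit codes are not documented here.

```python
import subprocess
from pathlib import Path
from typing import Optional, Sequence


def build_command(script: Path, prompt: str, *, model: str = "flash",
                  ratio: Optional[str] = None, output: Optional[str] = None,
                  inputs: Sequence[str] = ()) -> list[str]:
    """Assemble a `uv run generate.py ...` argument list (no shell quoting needed)."""
    cmd = ["uv", "run", str(script), prompt, "--model", model]
    if ratio:
        cmd += ["--ratio", ratio]
    if inputs:
        cmd += ["-i", *inputs]
    if output:
        cmd += ["-o", output]
    return cmd


def run_generate(cmd: list[str]) -> bool:
    """Run the command; assumes (unverified) that exit code 0 means success."""
    return subprocess.run(cmd, check=False).returncode == 0
```

Passing a list rather than a shell string means prompts containing quotes or spaces reach the script unchanged.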

Pattern 2: Python import with custom defaults

python
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "google-genai>=1.0.0",
# ]
# ///
import sys
from pathlib import Path

NANOBANANA_DIR = Path("<nanobanana_dir>/scripts")
sys.path.insert(0, str(NANOBANANA_DIR))
from generate import generate_image

def generate_thumbnail(prompt: str, output_path: str) -> dict:
    """Generate a YouTube thumbnail with project defaults."""
    return generate_image(
        prompt=prompt,
        output_path=output_path,
        aspect_ratio="16:9",
        image_size="2K",
        model="flash",
        max_retries=3,
    )

Pattern 3: Batch with progress tracking

python
from batch_generate import batch_generate

def on_progress(completed, total, result):
    print(f"Progress: {completed}/{total}")

results = batch_generate(
    prompt="logo concept",
    count=20,
    output_dir="./logos",
    prefix="logo",
    model="flash",
    aspect_ratio="1:1",
    on_progress=on_progress,
)

successful = [r for r in results if r["success"]]

Pattern 4: Sequential generation for series

When a downstream skill needs multiple consistently styled images (e.g., newsletter visuals, thumbnail A/B variants), use the anchor-and-reference pattern:
python
from generate import generate_image

# Step 1: Generate the style anchor
anchor = generate_image(
    prompt="warm illustration style, earth tones, soft gradients, clean lines",
    output_path="anchor.png",
    model="pro",
)

# Step 2: Generate each image in the series, referencing the anchor
subjects = ["laptop on desk with coffee", "person reading a book", "sunrise over mountains"]
series_paths = [anchor["path"]]

for i, subject in enumerate(subjects):
    result = generate_image(
        prompt=f"{subject}, matching the visual style and color palette of the reference image exactly",
        input_paths=[anchor["path"]],  # always include the anchor
        output_path=f"series_{i+1:02d}.png",
        model="pro",
    )
    if result["success"]:
        series_paths.append(result["path"])

The full sequential generation patterns are documented in the Sequential Generation section below.

Environment Variables

Variable          Description               Default
GEMINI_API_KEY    Google Gemini API key     Required
IMAGE_OUTPUT_DIR  Default output directory  ~/Downloads
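A downstream script could read these two variables as sketched below. load_settings is a hypothetical helper, not part of the skill; it only mirrors the table (key required, output directory defaulting to ~/Downloads) and accepts an explicit mapping so it can be exercised without touching the real environment.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read GEMINI_API_KEY (required) and IMAGE_OUTPUT_DIR (optional)."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY")
    if not api_key:
        raise RuntimeError("GEMINI_API_KEY environment variable not set")
    # Fall back to the documented default directory and expand "~".
    output_dir = Path(env.get("IMAGE_OUTPUT_DIR", "~/Downloads")).expanduser()
    return {"api_key": api_key, "output_dir": output_dir}
```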

Features

Text-to-Image Generation

Create images from text descriptions. Both models excel at:
  • Photorealistic images
  • Artistic styles (pixel art, illustration, etc.)
  • Product photography
  • Landscapes and scenes

Image Editing

Transform existing images with natural language:
  • Style transfer
  • Object addition/removal
  • Background changes
  • Color adjustments

Multi-Image Reference

Provide up to 14 reference images for:
  • Style consistency across a series
  • Subject consistency (same character, different poses)
  • Brand-consistent generation
  • Style + subject combination

High-Resolution Output

  • 1K — Fast generation, good for drafts
  • 2K — Enhanced detail (2048px)
  • 4K — Maximum quality (3840px), best for text rendering

Google Search Grounding

Enable --search for factually accurate images involving:
  • Real people, places, landmarks
  • Current events
  • Specific products or brands

Automatic Retry

Rate limit errors are automatically retried with exponential backoff (default: 3 retries). No action needed from callers.
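For readers unfamiliar with the technique, the sketch below shows a generic retry loop with exponential backoff. It is not the skill's implementation: RateLimitError is a hypothetical stand-in for whatever error signals a rate limit, and the 1-second base delay with doubling is an assumed schedule.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for the error raised on a rate limit."""


def with_backoff(call: Callable[[], T], max_retries: int = 3,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `call`, retrying on RateLimitError with delays of base, 2x, 4x, ..."""
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    attempt = 0
    while True:
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * 2 ** attempt)
            attempt += 1
```

The sleep parameter is injectable so the loop can be tested without actually waiting.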

SynthID Watermark Notice

All images generated by Gemini contain an invisible SynthID digital watermark. This is automatic, cannot be disabled, and survives common transformations (resize, crop, compression). Be aware of this for any use case requiring watermark-free output.

Sequential Generation

Use sequential generation to maintain visual consistency across a series of images. The core technique: generate an anchor image first, then pass it as a reference (-i) for every subsequent image in the series.

Pattern 1: Style-Board Anchoring

Generate a single anchor image that establishes the visual identity for a series. Reference it for all subsequent images.
When to use: Newsletter visual series, A/B thumbnail variants, brand-consistent image batches.
Workflow:
  1. Generate the anchor image with a prompt emphasizing style, palette, and mood:
bash
uv run <skill_dir>/scripts/generate.py \
  "modern flat illustration style, warm earth tones, soft gradients, clean lines, \
  minimal detail, cozy atmosphere" \
  --model pro -o anchor.png
  2. Generate each subsequent image referencing the anchor:
bash
uv run <skill_dir>/scripts/generate.py \
  "a laptop on a desk with coffee, matching the visual style, color palette, \
  and lighting of the reference image exactly" \
  -i anchor.png --model pro -o image_01.png
  3. Repeat step 2 for each image in the series, always referencing the same anchor.
Tip: Use Flash to draft the anchor quickly, then regenerate with Pro once you find a style you like.

Pattern 2: Subject Consistency

Keep the same character or subject looking consistent across different scenes and poses.
When to use: Mascot in multiple contexts, product photography series, recurring character.
Workflow:
  1. Generate the initial subject with clear, detailed appearance description:
bash
uv run <skill_dir>/scripts/generate.py \
  "a friendly robot mascot with round blue body, orange antenna, large expressive eyes, \
  simple geometric design, standing front-facing on white background" \
  --model pro -o subject_front.png
  2. Generate new scenes referencing the subject:
bash
uv run <skill_dir>/scripts/generate.py \
  "the same robot character from the reference image, now sitting at a desk typing, \
  same proportions and colors, office background" \
  -i subject_front.png --model pro -o subject_office.png
  3. For stronger consistency, reference 2-3 of the best previous outputs:
bash
uv run <skill_dir>/scripts/generate.py \
  "the same robot character from the reference images, now outdoors in a park, \
  same proportions and colors, waving at the viewer" \
  -i subject_front.png subject_office.png --model pro -o subject_park.png

Pattern 3: Progressive Accumulation

Build a reference pool over a long series, adding each successful output as a reference for the next.
When to use: Series of 5+ images where consistency must compound across the full set.
Workflow:
  1. Generate the anchor (same as Pattern 1, step 1).
  2. Generate image 2 referencing the anchor.
  3. Generate image 3 referencing anchor + image 2.
  4. Continue, keeping the 3-4 strongest references in the -i list. Drop weaker outputs.
Why cap at 3-4 references: More references dilute the style signal. The model averages across all inputs; with too many, the result loses coherence. Keep only the images that best represent the target style.
Reference ordering matters: Place the style anchor first in the -i list. The model weights earlier references slightly more.
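The bookkeeping for Progressive Accumulation can be sketched as below. The helper name update_references is hypothetical, and it assumes the caller has already ranked prior outputs from strongest to weakest (for example by manual review); it only enforces the two rules above, anchor first and a cap on the list length.

```python
def update_references(anchor: str, ranked_outputs: list[str],
                      max_refs: int = 4) -> list[str]:
    """Build the -i list: the anchor first, then the strongest prior outputs.

    `ranked_outputs` must be ordered best-first by the caller.
    """
    extras = [p for p in ranked_outputs if p != anchor][: max_refs - 1]
    return [anchor, *extras]
```

With four ranked outputs and the default cap of 4, the weakest output is dropped and the anchor stays in first position.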

Best Practices

Prompt Writing

Good prompts include:
  • Subject description
  • Style/aesthetic
  • Lighting and mood
  • Composition details
  • Color palette
See references/prompts.md for detailed prompt templates by category and model-specific tips.

Batch Generation Tips

  1. Use --model flash for exploration batches (faster, cheaper)
  2. Generate 10-20 variations to explore options
  3. Default 3-second delay between sequential requests avoids rate limits
  4. Review results and iterate on best candidates with Pro model

Rate Limits

  • Gemini API has usage quotas (~10 RPM free tier)
  • Automatic retry with exponential backoff handles transient rate limits
  • For large batches, use --delay 5 or --parallel with modest concurrency
  • Check your quota at Google AI Studio
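To get a feel for the numbers, the arithmetic below converts a requests-per-minute quota into a pacing interval and a lower bound on batch duration. It is a back-of-the-envelope sketch with hypothetical function names: it ignores generation time (which adds spacing in practice) and makes no claim about how --delay and --parallel interact inside batch_generate.py.

```python
def min_delay_seconds(requests_per_minute: float, parallel: int = 1) -> float:
    """Per-worker gap between request starts that keeps the total at or under the quota."""
    return 60.0 * parallel / requests_per_minute


def min_batch_minutes(count: int, requests_per_minute: float) -> float:
    """Lower bound on how long `count` requests take under the quota."""
    return count / requests_per_minute
```

At roughly 10 RPM, a single worker would need about 6 seconds between request starts, and a 20-image batch cannot finish in under about 2 minutes.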

Troubleshooting

"uv: command not found"
  • Install uv with curl -LsSf https://astral.sh/uv/install.sh | sh, or with brew install uv
"Error: google-genai package not installed"
  • Use uv run instead of python3 to auto-install dependencies
  • Or install manually: pip install -r <skill_dir>/requirements.txt
"GEMINI_API_KEY environment variable not set"
  • Set GEMINI_API_KEY in your environment before running
"No image in response"
  • Prompt may have triggered safety filters
  • Try rephrasing to avoid sensitive content
"Rate limit exceeded after N retries"
  • Wait 30-60 seconds and try again
  • Reduce batch parallelism or add longer delays
  • Check your API quota
Import errors in batch_generate.py
  • The script handles its own path setup; run from any directory

Future Capabilities

Multi-turn conversational editing — The Gemini API supports stateful chat sessions for iterative image editing (e.g., "make it bluer" → "now add a hat" → "zoom out"). This requires fundamentally different stateful architecture and is not currently implemented. No downstream skill currently needs this.

References

  • references/prompts.md — Prompt examples, model-specific tips, multi-reference patterns
  • references/gemini-api.md — Curated API reference for agent context