nano-banana-builder

Nano Banana Builder

Operator Context

This skill operates as an operator for building production web applications powered by Google's Nano Banana image generation APIs. It implements the Phased Build architectural pattern -- Scaffold, Integrate, Polish, Verify -- with Domain Intelligence embedded in model selection, conversational editing, and production hardening.

Hardcoded Behaviors (Always Apply)

CLAUDE.md Compliance: Read and follow repository CLAUDE.md before building
Exact Model Names Only: Use `gemini-2.5-flash-image` or `gemini-3-pro-image-preview` exclusively. Never invent model strings, add date suffixes, or guess names.
Server-Side API Calls: All Gemini API calls go through server actions or API routes. Never expose API keys client-side.
Storage Over Base64: Store generated images in object storage (Vercel Blob, S3/R2) and persist URLs, not raw base64 in databases.
Rate Limit Handling: Every production integration must include rate limiting with user-friendly feedback.
Conversational Editing: Leverage multi-turn history for iterative refinement rather than one-shot generation.

Default Behaviors (ON unless disabled)

Model Selection by Use Case: Flash for speed/volume, Pro for quality/text rendering
Loading State UX: Show progress indicators during 5-30s generation time
Error Boundaries: Wrap generation components in error boundaries with retry
Debounced Input: Require explicit user action to generate, never on keystroke
Unique Design Per App: Vary UI style, color, layout, and interaction to fit purpose
Environment Variable Validation: Check for required keys at startup
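
The model-selection default above can be expressed as a small pure function. This is only a sketch: the `GenerationNeeds` shape and the rule that drafts always go to Flash are assumptions made for illustration, not part of the skill itself.

```typescript
// The two model strings named in this document.
const FLASH_MODEL = 'gemini-2.5-flash-image' as const
const PRO_MODEL = 'gemini-3-pro-image-preview' as const
type SelectableModel = typeof FLASH_MODEL | typeof PRO_MODEL

// Hypothetical description of what a single request needs.
interface GenerationNeeds {
  draft: boolean          // quick iteration where speed matters most
  rendersText: boolean    // legible text inside the image
  highResolution: boolean // for example 2K output
}

// Flash for speed and volume, Pro for quality and text rendering.
function selectModel(needs: GenerationNeeds): SelectableModel {
  if (needs.draft) return FLASH_MODEL
  return needs.rendersText || needs.highResolution ? PRO_MODEL : FLASH_MODEL
}
```

Keeping the choice in one function means the model strings appear in exactly one place, which makes the exact-name rule easier to audit.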

Optional Behaviors (OFF unless enabled)

Batch Generation: Generate multiple images in parallel with queue management
Image Composition: Combine multiple generated images into composites
Style Transfer: Apply reference styles across generations
Gallery Persistence: Save generation history with browsing and search
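
For batch generation, one way to keep parallel requests under control is a small worker pool. The helper below is a generic sketch with a hypothetical name (`runWithConcurrency`); each task would wrap one image generation call, and the limit would be tuned to the available quota.

```typescript
// Runs the given tasks with at most `limit` in flight at once and
// returns their results in the original order.
async function runWithConcurrency<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
): Promise<T[]> {
  const results: T[] = new Array(tasks.length)
  let next = 0

  // Each worker repeatedly claims the next unclaimed task index.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++
      results[index] = await tasks[index]()
    }
  }

  const workerCount = Math.max(1, Math.min(limit, tasks.length))
  await Promise.all(Array.from({ length: workerCount }, () => worker()))
  return results
}
```

If any task rejects, the returned promise rejects with that error; a production queue would likely record per-task failures instead.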

What This Skill CAN Do

Build complete image generation web applications with Next.js
Implement server actions and API routes for both Gemini image models
Handle iterative, multi-turn image editing conversations via useChat
Configure object storage (Vercel Blob, S3/R2) for generated images
Implement rate limiting and quota management with Upstash Redis
Select the correct model based on speed, quality, and cost tradeoffs

What This Skill CANNOT Do

Use non-Gemini image models (DALL-E, Midjourney, Stable Diffusion)
Deploy to non-Node.js environments (Python, Go, etc.)
Implement custom model fine-tuning or training
Handle image input/classification (that is Gemini Vision, not Nano Banana)
Skip any of the 4 build phases

Instructions

CRITICAL: Valid Model Names

Only two model strings exist for image generation. Use them exactly as written.

Model String (exact)	Alias	Best For
`gemini-2.5-flash-image`	Nano Banana	Fast iterations, drafts, high volume (2-5s)
`gemini-3-pro-image-preview`	Nano Banana Pro	Quality output, text rendering, 2K resolution

Common wrong names:

`gemini-2.5-flash-preview-05-20` (text model suffix)
`gemini-2.5-pro-image` (2.5 Pro does not generate images)
`gemini-3-flash-image` (does not exist)
`gemini-pro-vision` (image input, not generation)
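
To catch a wrong name before it reaches the API, the two valid strings can be encoded as a type and checked at runtime. The helper names here (`isImageModel`, `assertImageModel`) are hypothetical; this is a minimal sketch rather than a required part of the build.

```typescript
// The only two valid image model strings, exactly as listed above.
const IMAGE_MODELS = ['gemini-2.5-flash-image', 'gemini-3-pro-image-preview'] as const
type ImageModel = (typeof IMAGE_MODELS)[number]

// Type guard: narrows an arbitrary string to ImageModel.
function isImageModel(name: string): name is ImageModel {
  return (IMAGE_MODELS as readonly string[]).indexOf(name) !== -1
}

// Throws with a descriptive message instead of letting a bad name reach the API.
function assertImageModel(name: string): ImageModel {
  if (!isImageModel(name)) {
    throw new Error(`Unknown image model "${name}". Expected one of: ${IMAGE_MODELS.join(', ')}`)
  }
  return name
}
```

Calling `assertImageModel` on any model string that comes from configuration or user input turns a silent failure into an immediate, readable error.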

Phase 1: SCAFFOLD

Goal: Set up the Next.js project structure with dependencies and environment.

Step 1: Initialize project and install dependencies

```bash
npm install @ai-sdk/google ai @ai-sdk/react

# Storage (pick one):
npm install @vercel/blob  # Vercel Blob
# or configure S3/R2 via aws-sdk

# Rate limiting (optional):
npm install @upstash/ratelimit @upstash/redis
```

Step 2: Configure environment variables

```bash
# .env.local
GEMINI_API_KEY=your_api_key_here
# Only if using Vercel Blob:
BLOB_READ_WRITE_TOKEN=your_vercel_token
```

Step 3: Define the application structure

```markdown
App Structure

app/actions/generate.ts -- Server action for image generation
app/api/generate/route.ts -- API route (if using useChat)
app/components/ -- React client components
lib/storage.ts -- Storage abstraction
lib/rate-limit.ts -- Rate limiting (if production)
```

Gate: Project initializes, dependencies install, env vars configured. Proceed only when gate passes.
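
The "env vars configured" part of this gate can be checked in code at startup. The sketch below takes the environment as a parameter so it stays testable; the function names are hypothetical, and in a Next.js app it would presumably be called once with `process.env`.

```typescript
type Env = Record<string, string | undefined>

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env: Env, required: readonly string[]): string[] {
  return required.filter((name) => {
    const value = env[name]
    return value === undefined || value.trim() === ''
  })
}

// Fails fast with one message listing every missing variable.
function assertEnv(env: Env, required: readonly string[]): void {
  const missing = missingEnvVars(env, required)
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`)
  }
}

// Possible usage on the server (assumption): assertEnv(process.env, ['GEMINI_API_KEY'])
```

Reporting all missing names at once avoids the fix-one-restart-repeat loop when several keys are absent.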

Phase 2: INTEGRATE

Goal: Wire up Gemini image generation with server-side API calls and client components.

Step 1: Create server action or API route

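```typescript
// app/actions/generate.ts
'use server'
import { createGoogleGenerativeAI } from '@ai-sdk/google'
import { generateText } from 'ai'

// The default `google` provider reads GOOGLE_GENERATIVE_AI_API_KEY, so pass
// the GEMINI_API_KEY from .env.local explicitly.
const google = createGoogleGenerativeAI({ apiKey: process.env.GEMINI_API_KEY })

export async function generateImage(prompt: string) {
  const result = await generateText({
    model: google('gemini-2.5-flash-image'),
    prompt,
    providerOptions: {
      google: {
        responseModalities: ['IMAGE'], // request image output only
        imageConfig: { aspectRatio: '16:9' }
      }
    }
  })

  // Fail loudly if the response contains no image (for example, a blocked prompt).
  const image = result.files.find((file) => file.mediaType.startsWith('image/'))
  if (!image) throw new Error('No image returned by the model')
  return image // { base64, uint8Array, mediaType }
}
```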

Step 2: Build client component with loading states

Use `useChat` for multi-turn editing or direct server action calls for single-shot generation. Always include loading indicators and error display.

Step 3: Connect storage

Upload generated images to object storage immediately. Return persistent URLs, not base64.

Gate: User can enter a prompt and receive a generated image displayed in the UI. Proceed only when gate passes.
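
Step 3 above could be backed by a thin storage abstraction in `lib/storage.ts`. The `ImageStore` interface, the key layout, and the extension table below are all assumptions for illustration; a Vercel Blob or S3/R2 adapter would implement `put` with the real SDK call.

```typescript
// Hypothetical storage abstraction: put() resolves to a persistent URL.
interface ImageStore {
  put(key: string, bytes: Uint8Array, mediaType: string): Promise<string>
}

// Assumed mapping from media type to file extension.
const EXTENSIONS: Record<string, string> = {
  'image/png': 'png',
  'image/jpeg': 'jpg',
  'image/webp': 'webp',
}

// Builds a predictable object key such as generations/<user>/<id>.png
function imageKey(userId: string, imageId: string, mediaType: string): string {
  const ext = EXTENSIONS[mediaType] ?? 'bin'
  return `generations/${userId}/${imageId}.${ext}`
}

// Uploads the generated bytes and returns only what should be persisted.
async function persistImage(
  store: ImageStore,
  userId: string,
  imageId: string,
  file: { uint8Array: Uint8Array; mediaType: string },
): Promise<{ url: string; mediaType: string }> {
  const key = imageKey(userId, imageId, file.mediaType)
  const url = await store.put(key, file.uint8Array, file.mediaType)
  return { url, mediaType: file.mediaType }
}
```

The server action would then return the `{ url, mediaType }` pair rather than the raw file, so no base64 ever reaches the database or the client state.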

Phase 3: POLISH

Goal: Harden for production with rate limiting, error handling, and design variation.

Step 1: Add rate limiting

Implement per-user rate limits using Upstash Redis or in-memory fallback. Show friendly wait messages on 429 responses.
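
As a sketch of the in-memory fallback, a sliding-window limiter can be written without any dependencies. The class name and result shape are hypothetical, and state lives in a single process, so this would not be shared across serverless instances the way Upstash Redis is.

```typescript
interface RateLimitResult {
  allowed: boolean
  retryAfterSeconds: number // 0 when the request is allowed
}

// Sliding-window limiter: at most `limit` requests per user per `windowMs`.
class InMemoryRateLimiter {
  private readonly limit: number
  private readonly windowMs: number
  private readonly hits = new Map<string, number[]>()

  constructor(limit: number, windowMs: number) {
    this.limit = limit
    this.windowMs = windowMs
  }

  check(userId: string, now: number = Date.now()): RateLimitResult {
    const windowStart = now - this.windowMs
    // Keep only timestamps that are still inside the window.
    const recent = (this.hits.get(userId) ?? []).filter((t) => t > windowStart)
    if (recent.length >= this.limit) {
      this.hits.set(userId, recent)
      // The oldest hit leaving the window frees the next slot.
      const retryAfterMs = recent[0] + this.windowMs - now
      return { allowed: false, retryAfterSeconds: Math.ceil(retryAfterMs / 1000) }
    }
    recent.push(now)
    this.hits.set(userId, recent)
    return { allowed: true, retryAfterSeconds: 0 }
  }
}
```

The `retryAfterSeconds` value is what a friendly wait message would display, for example "Try again in 8 seconds".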

Step 2: Implement error boundaries and retry

Wrap generation in try/catch with specific handling for 429 (rate limit), 401 (bad key), 400 (content policy), and network timeout. Provide user-visible feedback for each case.
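
The per-status handling described here can be centralized in one classifier so that every caller shows consistent feedback. The type names and message wording below are illustrative assumptions; the status-to-cause mapping follows this document.

```typescript
type GenerationErrorKind = 'rate_limit' | 'bad_key' | 'content_policy' | 'timeout' | 'unknown'

interface ClassifiedError {
  kind: GenerationErrorKind
  retryable: boolean
  userMessage: string
}

// Maps an HTTP status (or a timeout flag) to user-facing feedback.
function classifyError(status: number | undefined, timedOut: boolean = false): ClassifiedError {
  if (timedOut) {
    return { kind: 'timeout', retryable: true, userMessage: 'Generation took too long. Please try again.' }
  }
  switch (status) {
    case 429:
      return { kind: 'rate_limit', retryable: true, userMessage: 'Too many requests. Please wait a moment.' }
    case 401:
      // A bad key is a server configuration problem, so do not ask the user to retry.
      return { kind: 'bad_key', retryable: false, userMessage: 'The service is misconfigured. Please contact support.' }
    case 400:
      return { kind: 'content_policy', retryable: false, userMessage: 'This prompt was not accepted. Please rephrase it.' }
    default:
      return { kind: 'unknown', retryable: false, userMessage: 'Something went wrong while generating the image.' }
  }
}
```

The `retryable` flag lets the UI decide whether to offer a retry button, which keeps content-policy failures from being resubmitted unchanged.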

Step 3: Apply unique design

Match UI style to the application purpose. Avoid generic AI startup aesthetics. Vary color scheme, layout, typography, and interaction pattern intentionally.

Gate: App handles all error cases gracefully and has intentional visual design. Proceed only when gate passes.

Phase 4: VERIFY

Goal: Confirm the application works end-to-end and meets production standards.

Step 1: Generate an image successfully with a test prompt

Step 2: Verify model name strings are exactly `gemini-2.5-flash-image` or `gemini-3-pro-image-preview`

Step 3: Confirm no API keys are exposed in client-side code or bundles

Step 4: Test error states (invalid prompt, rate limit simulation, missing env var)

Step 5: Verify images persist in storage with retrievable URLs

Step 6: Run full test suite if tests exist, no regressions

Gate: All verification steps pass. Build is complete.
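
Step 3 of this phase can be partly automated by searching built client assets for secret values. The function below is a hypothetical sketch: it assumes the caller has already read the bundle files (for a Next.js build, typically under `.next/static`) into strings, and it only detects secrets that appear verbatim.

```typescript
// Returns the names of secrets whose values appear verbatim in the bundle text.
// Very short values are skipped to avoid accidental matches.
function findLeakedSecrets(
  bundleText: string,
  secrets: Record<string, string | undefined>,
): string[] {
  return Object.keys(secrets).filter((name) => {
    const value = secrets[name]
    return value !== undefined && value.length >= 8 && bundleText.indexOf(value) !== -1
  })
}
```

A clean result is necessary but not sufficient: an encoded or split key would slip past a literal search, so a manual review of client components is still worthwhile.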

Error Handling

Error: "Rate Limit Exceeded (429)"

Cause: Too many requests to the Gemini API within the quota window
Solution:

Implement rate limiting middleware (Upstash Redis or in-memory)
Show user-friendly wait message with estimated retry time
Queue requests if burst traffic is expected

Error: "Invalid API Key (401)"

Cause: Missing or incorrect GEMINI_API_KEY in environment
Solution:

Verify key exists in `.env.local` and is loaded server-side
Check key has image generation permissions enabled
Never expose key in client components or API responses

Error: "Content Policy Violation (400)"

Cause: Prompt triggers Gemini safety filters
Solution:

Display clear user guidance on acceptable content
Do not retry the same prompt automatically
Log violations for monitoring without storing prompt content

Error: "Network Timeout or Generation Failure"

Cause: Generation exceeds the timeout, or a transient network issue occurs
Solution:

Implement retry with exponential backoff (max 3 attempts)
Show progress indicator during the 5-30s generation window
Fall back to cached/placeholder image if all retries fail
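
The retry step above can be sketched as a generic helper. The name `retryWithBackoff`, the 500 ms base delay, and the injectable `sleep` are assumptions for illustration; the three-attempt cap comes from this document.

```typescript
interface RetryOptions {
  maxAttempts?: number                    // total attempts, including the first
  baseDelayMs?: number                    // delay before the second attempt
  shouldRetry?: (error: unknown) => boolean
  sleep?: (ms: number) => Promise<void>   // injectable so tests need not wait
}

// Retries a failing operation, doubling the delay after each failure.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  options: RetryOptions = {},
): Promise<T> {
  const maxAttempts = options.maxAttempts ?? 3
  const baseDelayMs = options.baseDelayMs ?? 500
  const shouldRetry = options.shouldRetry ?? (() => true)
  const sleep =
    options.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)))

  let lastError: unknown
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation()
    } catch (error) {
      lastError = error
      if (attempt === maxAttempts || !shouldRetry(error)) break
      await sleep(baseDelayMs * 2 ** (attempt - 1)) // 500, 1000, ...
    }
  }
  throw lastError
}
```

Passing a `shouldRetry` that returns false for content-policy errors keeps this helper consistent with the rule that rejected prompts are never retried automatically.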

Anti-Patterns

Anti-Pattern 1: Inventing Model Names

What it looks like: Using `gemini-2.5-flash-preview-05-20` or `gemini-2.5-pro-image` for generation
Why wrong: These model strings do not support image generation. Date suffixes belong to text models, and 2.5 Pro has no image output capability.
Do instead: Use exactly `gemini-2.5-flash-image` or `gemini-3-pro-image-preview`. No variations.

Anti-Pattern 2: Exposing API Keys Client-Side

What it looks like: Calling Gemini directly from React components or embedding keys in client bundles
Why wrong: Credentials are visible in browser DevTools, enabling abuse and billing attacks.
Do instead: Route all API calls through server actions or API routes. Store keys in environment variables accessed only server-side.

Anti-Pattern 3: Storing Base64 in Database

What it looks like: Saving raw base64 image data directly to PostgreSQL or MongoDB
Why wrong: Bloats database size, increases query latency, and makes backups expensive.
Do instead: Upload to object storage (Vercel Blob, S3, R2) immediately after generation. Persist only the URL.
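
To put a number on the bloat: base64 encodes every 3 bytes as 4 characters, so an encoded image is about a third larger than the raw bytes. The 1.5 MB figure below is an arbitrary illustrative size, not a measured Nano Banana output.

```typescript
// Padded base64 length for a payload of `byteCount` bytes: 4 * ceil(n / 3).
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3)
}

const rawBytes = 1500000                      // illustrative 1.5 MB image
const encodedChars = base64Length(rawBytes)   // 2000000 characters
const overhead = encodedChars / rawBytes - 1  // roughly 0.33, i.e. about 33% larger
```

A stored URL, by comparison, is typically well under a kilobyte per row.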

Anti-Pattern 4: Ignoring Multi-Turn Context

What it looks like: Treating every generation as a fresh request with no conversation history
Why wrong: Discards Nano Banana's strongest feature -- conversational editing and iterative refinement.
Do instead: Track generation history as chat messages. Use `useChat` to enable natural language editing of previous results.
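
Tracking history as messages might look like the sketch below. The `EditMessage` shape is hypothetical and deliberately simpler than the real `useChat` message type, which depends on the AI SDK version in use; the point is only that each edit builds on the prior turns instead of starting over.

```typescript
// Hypothetical, simplified message shape for an editing conversation.
interface EditMessage {
  role: 'user' | 'assistant'
  text?: string
  imageUrl?: string
}

// Adds a user instruction such as "make the sky darker" without mutating history.
function appendInstruction(history: readonly EditMessage[], instruction: string): EditMessage[] {
  return [...history, { role: 'user', text: instruction }]
}

// Records the model's resulting image as the assistant turn.
function appendResult(history: readonly EditMessage[], imageUrl: string): EditMessage[] {
  return [...history, { role: 'assistant', imageUrl }]
}

// Finds the most recent generated image, which the next edit should refine.
function latestImageUrl(history: readonly EditMessage[]): string | undefined {
  for (let i = history.length - 1; i >= 0; i--) {
    const url = history[i].imageUrl
    if (url !== undefined) return url
  }
  return undefined
}
```

Sending the full history with each request is what lets a follow-up like "same scene, but at night" refer to the previous result.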

Anti-Pattern 5: No Loading States

What it looks like: Submit button goes disabled with no visual feedback for 5-30 seconds
Why wrong: Users assume the app is broken and spam-click, wasting quota and degrading UX.
Do instead: Show skeleton loaders, progress bars, or estimated wait time during generation.
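
One way to make the loading state explicit is a small state machine that a React component could drive with `useReducer`. The state and event names below are assumptions for illustration; the key behavior is that repeat submits are ignored while a request is in flight.

```typescript
type GenerationState =
  | { status: 'idle' }
  | { status: 'generating'; startedAt: number }
  | { status: 'done'; imageUrl: string }
  | { status: 'error'; message: string }

type GenerationEvent =
  | { type: 'submit'; at: number }
  | { type: 'success'; imageUrl: string }
  | { type: 'failure'; message: string }

function generationReducer(state: GenerationState, event: GenerationEvent): GenerationState {
  switch (event.type) {
    case 'submit':
      // Ignore spam-clicks: a second submit while generating changes nothing.
      return state.status === 'generating' ? state : { status: 'generating', startedAt: event.at }
    case 'success':
      return state.status === 'generating' ? { status: 'done', imageUrl: event.imageUrl } : state
    case 'failure':
      return state.status === 'generating' ? { status: 'error', message: event.message } : state
  }
}

// Seconds elapsed since submit, for an "estimated wait" or progress display.
function elapsedSeconds(state: GenerationState, now: number): number {
  return state.status === 'generating' ? Math.floor((now - state.startedAt) / 1000) : 0
}
```

Rendering a skeleton or progress bar whenever `status` is `'generating'` gives users continuous feedback across the whole wait.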

References

This skill uses these shared patterns:

Anti-Rationalization - Prevents shortcut rationalizations
Verification Checklist - Pre-completion checks

Domain-Specific Anti-Rationalization

Rationalization	Why It's Wrong	Required Action
"I know the model name"	Wrong model strings silently fail or error	Verify against exact list
"Base64 is fine for now"	Technical debt compounds fast with image data	Use object storage from day one
"Rate limiting can wait"	First production spike causes 429 cascade	Implement before deploying
"Loading state is cosmetic"	5-30s silence destroys user trust	Always show generation progress

Reference Files

`${CLAUDE_SKILL_DIR}/references/advanced-patterns.md`: Server actions, API routes, client components, multi-image composition
`${CLAUDE_SKILL_DIR}/references/configuration.md`: Provider options, storage setup, rate limiting, cost optimization
