nano-banana-builder
Nano Banana Builder
Operator Context
This skill operates as an operator for building production web applications powered by Google's Nano Banana image generation APIs. It implements the Phased Build architectural pattern -- Scaffold, Integrate, Polish, Verify -- with Domain Intelligence embedded in model selection, conversational editing, and production hardening.
Hardcoded Behaviors (Always Apply)
- CLAUDE.md Compliance: Read and follow repository CLAUDE.md before building
- Exact Model Names Only: Use `gemini-2.5-flash-image` or `gemini-3-pro-image-preview` exclusively. Never invent model strings, add date suffixes, or guess names.
- Server-Side API Calls: All Gemini API calls go through server actions or API routes. Never expose API keys client-side.
- Storage Over Base64: Store generated images in object storage (Vercel Blob, S3/R2) and persist URLs, not raw base64 in databases.
- Rate Limit Handling: Every production integration must include rate limiting with user-friendly feedback.
- Conversational Editing: Leverage multi-turn history for iterative refinement rather than one-shot generation.
Default Behaviors (ON unless disabled)
- Model Selection by Use Case: Flash for speed/volume, Pro for quality/text rendering
- Loading State UX: Show progress indicators during 5-30s generation time
- Error Boundaries: Wrap generation components in error boundaries with retry
- Debounced Input: Require explicit user action to generate, never on keystroke
- Unique Design Per App: Vary UI style, color, layout, and interaction to fit purpose
- Environment Variable Validation: Check for required keys at startup
Optional Behaviors (OFF unless enabled)
- Batch Generation: Generate multiple images in parallel with queue management
- Image Composition: Combine multiple generated images into composites
- Style Transfer: Apply reference styles across generations
- Gallery Persistence: Save generation history with browsing and search
What This Skill CAN Do
- Build complete image generation web applications with Next.js
- Implement server actions and API routes for both Gemini image models
- Handle iterative, multi-turn image editing conversations via useChat
- Configure object storage (Vercel Blob, S3/R2) for generated images
- Implement rate limiting and quota management with Upstash Redis
- Select the correct model based on speed, quality, and cost tradeoffs
What This Skill CANNOT Do
- Use non-Gemini image models (DALL-E, Midjourney, Stable Diffusion)
- Deploy to non-Node.js environments (Python, Go, etc.)
- Implement custom model fine-tuning or training
- Handle image input/classification (that is Gemini Vision, not Nano Banana)
- Skip any of the 4 build phases
Instructions
CRITICAL: Valid Model Names
Only two model strings exist for image generation. Use them exactly as written.
| Model String (exact) | Alias | Best For |
|---|---|---|
| `gemini-2.5-flash-image` | Nano Banana | Fast iterations, drafts, high volume (2-5s) |
| `gemini-3-pro-image-preview` | Nano Banana Pro | Quality output, text rendering, 2K resolution |
Common wrong names: `gemini-2.5-flash-preview-05-20` (text model suffix), `gemini-2.5-pro-image` (Pro does not generate images), `gemini-3-flash-image` (does not exist), `gemini-pro-vision` (image input, not generation).
Phase 1: SCAFFOLD
Goal: Set up the Next.js project structure with dependencies and environment.
Step 1: Initialize project and install dependencies
```bash
npm install @ai-sdk/google ai @ai-sdk/react

# Storage (pick one):
npm install @vercel/blob    # Vercel Blob
# or configure S3/R2 via aws-sdk

# Rate limiting (optional):
npm install @upstash/ratelimit @upstash/redis
```

Step 2: Configure environment variables

```bash
# .env.local
GEMINI_API_KEY=your_api_key_here
BLOB_READ_WRITE_TOKEN=your_vercel_token  # if using Vercel Blob
```

Step 3: Define the application structure

```markdown
# App Structure
- app/actions/generate.ts    -- Server action for image generation
- app/api/generate/route.ts  -- API route (if using useChat)
- app/components/            -- React client components
- lib/storage.ts             -- Storage abstraction
- lib/rate-limit.ts          -- Rate limiting (if production)
```

Gate: Project initializes, dependencies install, env vars configured. Proceed only when gate passes.
Phase 2: INTEGRATE
Goal: Wire up Gemini image generation with server-side API calls and client components.
Step 1: Create server action or API route
```typescript
// app/actions/generate.ts
'use server'

import { google } from '@ai-sdk/google'
import { generateText } from 'ai'

// Server action: runs only on the server, so GEMINI_API_KEY never reaches the client.
export async function generateImage(prompt: string) {
  const result = await generateText({
    model: google('gemini-2.5-flash-image'), // exact model string, do not alter
    prompt,
    providerOptions: {
      google: {
        responseModalities: ['IMAGE'], // request image output
        imageConfig: { aspectRatio: '16:9' }
      }
    }
  })

  // First generated file; undefined if the model returned no image.
  return result.files[0] // { base64, uint8Array, mediaType }
}
```

Step 2: Build client component with loading states
Use `useChat` for multi-turn editing or direct server action calls for single-shot generation. Always include loading indicators and error display.
Step 3: Connect storage
Upload generated images to object storage immediately. Return persistent URLs, not base64.
Gate: User can enter a prompt and receive a generated image displayed in the UI. Proceed only when gate passes.
Phase 3: POLISH
Goal: Harden for production with rate limiting, error handling, and design variation.
Step 1: Add rate limiting
Implement per-user rate limits using Upstash Redis or in-memory fallback. Show friendly wait messages on 429 responses.
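The in-memory fallback mentioned in this step could look roughly like the sketch below. It is a fixed-window counter, not the Upstash implementation; `createRateLimiter`, `waitMessage`, and the limits used are hypothetical names and values chosen for illustration.

```typescript
// Hypothetical in-memory fixed-window limiter (illustrative names and limits).
interface RateLimitResult {
  allowed: boolean
  retryAfterSeconds: number // 0 when the request is allowed
}

function createRateLimiter(maxRequests: number, windowMs: number) {
  // userId -> start of the current window and requests counted in it
  const windows = new Map<string, { start: number; count: number }>()

  return function check(userId: string, now: number = Date.now()): RateLimitResult {
    const current = windows.get(userId)
    // Start a fresh window on first use or once the previous one has expired.
    if (!current || now - current.start >= windowMs) {
      windows.set(userId, { start: now, count: 1 })
      return { allowed: true, retryAfterSeconds: 0 }
    }
    if (current.count < maxRequests) {
      current.count += 1
      return { allowed: true, retryAfterSeconds: 0 }
    }
    // Over the limit: report how long until the window resets.
    const retryAfterSeconds = Math.ceil((current.start + windowMs - now) / 1000)
    return { allowed: false, retryAfterSeconds }
  }
}

// Friendly text to show alongside a 429 response.
function waitMessage(result: RateLimitResult): string {
  return result.allowed
    ? ''
    : `You're generating images quickly. Please try again in ${result.retryAfterSeconds}s.`
}
```

State held in a `Map` lives in a single process, so this only suits local development or a single long-running server; a shared store such as Upstash Redis is the safer choice when several instances serve traffic.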
Step 2: Implement error boundaries and retry
Wrap generation in try/catch with specific handling for 429 (rate limit), 401 (bad key), 400 (content policy), and network timeout. Provide user-visible feedback for each case.
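One way to keep the per-case handling in this step consistent is a single mapping from failure to user-facing feedback. The sketch below follows the status codes listed above; the function name, the `kind` labels, and the message wording are assumptions, not part of any SDK.

```typescript
type GenerationErrorKind = 'rate_limit' | 'bad_key' | 'content_policy' | 'timeout' | 'unknown'

interface GenerationErrorInfo {
  kind: GenerationErrorKind
  userMessage: string
  retryable: boolean // whether an automatic retry is reasonable
}

// Hypothetical helper: maps an HTTP status (or a timeout flag) to UI feedback.
function classifyGenerationError(status: number | undefined, timedOut = false): GenerationErrorInfo {
  if (timedOut) {
    return { kind: 'timeout', userMessage: 'Generation took too long. Retrying may help.', retryable: true }
  }
  switch (status) {
    case 429:
      return { kind: 'rate_limit', userMessage: 'Too many requests. Please wait a moment and try again.', retryable: true }
    case 401:
      // Do not reveal configuration details to the user.
      return { kind: 'bad_key', userMessage: 'Image generation is temporarily unavailable.', retryable: false }
    case 400:
      // Never retry the same prompt automatically after a policy rejection.
      return { kind: 'content_policy', userMessage: 'This prompt was rejected. Please rephrase it and try again.', retryable: false }
    default:
      return { kind: 'unknown', userMessage: 'Something went wrong while generating the image.', retryable: false }
  }
}
```

The `catch` block in the server action would call this helper and return the `userMessage` to the client, keeping raw error details in server logs only.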
Step 3: Apply unique design
Match UI style to the application purpose. Avoid generic AI startup aesthetics. Vary color scheme, layout, typography, and interaction pattern intentionally.
Gate: App handles all error cases gracefully and has intentional visual design. Proceed only when gate passes.
Phase 4: VERIFY
Goal: Confirm the application works end-to-end and meets production standards.
Step 1: Generate an image successfully with a test prompt
Step 2: Verify model name strings are exactly `gemini-2.5-flash-image` or `gemini-3-pro-image-preview`
Step 3: Confirm no API keys are exposed in client-side code or bundles
Step 4: Test error states (invalid prompt, rate limit simulation, missing env var)
Step 5: Verify images persist in storage with retrievable URLs
Step 6: Run the full test suite if tests exist and confirm there are no regressions
Gate: All verification steps pass. Build is complete.
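The model-name check in Step 2 can be enforced in code rather than by inspection. The following is a minimal sketch: the two strings come from this document, while `VALID_IMAGE_MODELS`, `isValidImageModel`, `assertValidImageModel`, and `selectImageModel` are hypothetical helper names.

```typescript
// The only two image model strings this skill allows.
const VALID_IMAGE_MODELS = ['gemini-2.5-flash-image', 'gemini-3-pro-image-preview'] as const
type ImageModel = (typeof VALID_IMAGE_MODELS)[number]

function isValidImageModel(value: string): value is ImageModel {
  return VALID_IMAGE_MODELS.some((model) => model === value)
}

// Throws early instead of letting a wrong string reach the API.
function assertValidImageModel(value: string): ImageModel {
  if (!isValidImageModel(value)) {
    throw new Error(`Unknown image model "${value}". Expected one of: ${VALID_IMAGE_MODELS.join(', ')}`)
  }
  return value
}

// Flash for speed and volume, Pro for quality and text rendering.
function selectImageModel(priority: 'speed' | 'quality'): ImageModel {
  return priority === 'speed' ? 'gemini-2.5-flash-image' : 'gemini-3-pro-image-preview'
}
```

Typing the model parameter as `ImageModel` lets the compiler reject invented names at build time, and `assertValidImageModel` covers values that arrive at runtime, such as configuration or request input.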
Error Handling
Error: "Rate Limit Exceeded (429)"
Cause: Too many requests to Gemini API within quota window
Solution:
- Implement rate limiting middleware (Upstash Redis or in-memory)
- Show user-friendly wait message with estimated retry time
- Queue requests if burst traffic is expected
Error: "Invalid API Key (401)"
Cause: Missing or incorrect GEMINI_API_KEY in environment
Solution:
- Verify key exists in `.env.local` and is loaded server-side
- Check key has image generation permissions enabled
- Never expose key in client components or API responses
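A missing key is easier to diagnose at startup than on the first failed request. The sketch below shows one way to validate required variables; `findMissingEnvVars` and `assertServerEnv` are hypothetical helpers, and the environment is passed in as a plain object so the check stays testable.

```typescript
// Returns the names of required variables that are unset or blank.
function findMissingEnvVars(env: Record<string, string | undefined>, required: string[]): string[] {
  return required.filter((key) => {
    const value = env[key]
    return value === undefined || value.trim() === ''
  })
}

// Throws with a message that names the variables but never prints their values.
function assertServerEnv(
  env: Record<string, string | undefined>,
  required: string[] = ['GEMINI_API_KEY']
): void {
  const missing = findMissingEnvVars(env, required)
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`)
  }
}
```

In a Next.js app this would typically be called once from server-only code as `assertServerEnv(process.env)`, adding `BLOB_READ_WRITE_TOKEN` to the required list when Vercel Blob is in use.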
Error: "Content Policy Violation (400)"
Cause: Prompt triggers Gemini safety filters
Solution:
- Display clear user guidance on acceptable content
- Do not retry the same prompt automatically
- Log violations for monitoring without storing prompt content
Error: "Network Timeout or Generation Failure"
Cause: Generation exceeding timeout or transient network issue
Solution:
- Implement retry with exponential backoff (max 3 attempts)
- Show progress indicator during the 5-30s generation window
- Fall back to cached/placeholder image if all retries fail
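The retry policy above can be sketched as follows. The 500 ms base delay is an assumed value, and `backoffDelayMs` and `withRetry` are hypothetical helpers; the sleep function is injectable so the logic can be tested without real waiting.

```typescript
// Delay before the next attempt; attempt is 1-based (500ms, 1000ms, 2000ms, ...).
function backoffDelayMs(attempt: number, baseMs = 500): number {
  return baseMs * 2 ** (attempt - 1)
}

// Runs task up to maxAttempts times, waiting longer after each failure.
async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts = 3,
  baseMs = 500,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => { setTimeout(resolve, ms) })
): Promise<T> {
  let lastError: unknown
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task()
    } catch (error) {
      lastError = error
      // No delay after the final attempt; the caller shows the fallback instead.
      if (attempt < maxAttempts) await sleep(backoffDelayMs(attempt, baseMs))
    }
  }
  throw lastError
}
```

This wrapper suits transient failures only. Content policy rejections should bypass it, since the same prompt must not be retried automatically, and the caller is responsible for showing the cached or placeholder image when `withRetry` finally throws.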
Anti-Patterns
Anti-Pattern 1: Inventing Model Names
What it looks like: Using `gemini-2.5-flash-preview-05-20` or `gemini-2.5-pro-image` for generation
Why wrong: These model strings do not support image generation. Date suffixes belong to text models, and 2.5 Pro has no image output capability.
Do instead: Use exactly `gemini-2.5-flash-image` or `gemini-3-pro-image-preview`. No variations.
Anti-Pattern 2: Exposing API Keys Client-Side
What it looks like: Calling Gemini directly from React components or embedding keys in client bundles
Why wrong: Credentials are visible in browser DevTools, enabling abuse and billing attacks.
Do instead: Route all API calls through server actions or API routes. Store keys in environment variables accessed only server-side.
Anti-Pattern 3: Storing Base64 in Database
What it looks like: Saving raw base64 image data directly to PostgreSQL or MongoDB
Why wrong: Bloats database size, increases query latency, and makes backups expensive.
Do instead: Upload to object storage (Vercel Blob, S3, R2) immediately after generation. Persist only the URL.
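As a rough illustration of "persist only the URL", the sketch below uploads the generated bytes through an injected uploader and returns a small record for the database. The `Uploader` signature, the `generations/` key layout, and the helper names are assumptions; with Vercel Blob the uploader would likely wrap the `put` function from `@vercel/blob`, which should be checked against its documentation.

```typescript
interface GeneratedFile {
  uint8Array: Uint8Array
  mediaType: string
}

// Hypothetical uploader contract; adapt it to the storage client in use.
type Uploader = (key: string, body: Uint8Array, contentType: string) => Promise<{ url: string }>

function extensionFor(mediaType: string): string {
  const known: Record<string, string> = {
    'image/png': 'png',
    'image/jpeg': 'jpg',
    'image/webp': 'webp'
  }
  return known[mediaType] ?? 'bin'
}

// Assumed key layout: one folder per user, one object per generation.
function buildImageKey(userId: string, imageId: string, mediaType: string): string {
  return `generations/${userId}/${imageId}.${extensionFor(mediaType)}`
}

async function storeGeneratedImage(
  file: GeneratedFile,
  userId: string,
  imageId: string,
  upload: Uploader
): Promise<{ url: string; mediaType: string }> {
  const key = buildImageKey(userId, imageId, file.mediaType)
  const { url } = await upload(key, file.uint8Array, file.mediaType)
  // This record is what goes in the database; the raw bytes stay in object storage.
  return { url, mediaType: file.mediaType }
}
```

Injecting the uploader keeps `lib/storage.ts` independent of a specific provider, so switching between Vercel Blob and S3/R2 only changes the function passed in.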
Anti-Pattern 4: Ignoring Multi-Turn Context
What it looks like: Treating every generation as a fresh request with no conversation history
Why wrong: Discards Nano Banana's strongest feature -- conversational editing and iterative refinement.
Do instead: Track generation history as chat messages. Use `useChat` to enable natural language editing of previous results.
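Tracking history as messages can be pictured with a plain data structure. The `EditTurn` shape below is a hypothetical illustration and not the message type that `useChat` uses; it only shows how each edit instruction and each generated image become turns that the next request can build on.

```typescript
// Hypothetical turn shape for an editing conversation.
interface EditTurn {
  role: 'user' | 'assistant'
  text: string
  imageUrl?: string // persistent storage URL of the image produced in this turn
}

// Both helpers return a new array so React state updates stay immutable.
function appendUserEdit(history: EditTurn[], instruction: string): EditTurn[] {
  return [...history, { role: 'user', text: instruction }]
}

function appendGeneratedImage(history: EditTurn[], imageUrl: string, text = ''): EditTurn[] {
  return [...history, { role: 'assistant', text, imageUrl }]
}

// The most recent image is the one a follow-up instruction refers to.
function latestImageUrl(history: EditTurn[]): string | undefined {
  for (let i = history.length - 1; i >= 0; i--) {
    const url = history[i]?.imageUrl
    if (url) return url
  }
  return undefined
}
```

A follow-up such as "make the sky darker" is appended with `appendUserEdit`, and the whole history is sent to the server, so the model edits the previous result instead of starting over.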
Anti-Pattern 5: No Loading States
What it looks like: Submit button goes disabled with no visual feedback for 5-30 seconds
Why wrong: Users assume the app is broken and spam-click, wasting quota and degrading UX.
Do instead: Show skeleton loaders, progress bars, or estimated wait time during generation.
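A small state machine makes the loading state explicit and blocks repeat submissions while a generation is in flight. This is a sketch with hypothetical names (`GenerationState`, `generationReducer`, `elapsedSeconds`); it could back React's `useReducer`, but nothing here depends on React.

```typescript
type GenerationState =
  | { status: 'idle' }
  | { status: 'loading'; startedAt: number }
  | { status: 'success'; imageUrl: string }
  | { status: 'error'; message: string }

type GenerationEvent =
  | { type: 'submit'; now: number }
  | { type: 'succeeded'; imageUrl: string }
  | { type: 'failed'; message: string }
  | { type: 'reset' }

function generationReducer(state: GenerationState, event: GenerationEvent): GenerationState {
  switch (event.type) {
    case 'submit':
      // Ignore repeat clicks while a generation is already running.
      return state.status === 'loading' ? state : { status: 'loading', startedAt: event.now }
    case 'succeeded':
      return state.status === 'loading' ? { status: 'success', imageUrl: event.imageUrl } : state
    case 'failed':
      return state.status === 'loading' ? { status: 'error', message: event.message } : state
    case 'reset':
      return { status: 'idle' }
  }
}

// Whole seconds since submission, for an "elapsed" or estimated-wait indicator.
function elapsedSeconds(state: GenerationState, now: number): number {
  return state.status === 'loading' ? Math.floor((now - state.startedAt) / 1000) : 0
}
```

The component renders a skeleton or progress bar whenever `status` is `'loading'`, and because a second `submit` returns the same state object, spam-clicks do not trigger extra requests.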
References
This skill uses these shared patterns:
- Anti-Rationalization - Prevents shortcut rationalizations
- Verification Checklist - Pre-completion checks
Domain-Specific Anti-Rationalization
| Rationalization | Why It's Wrong | Required Action |
|---|---|---|
| "I know the model name" | Wrong model strings silently fail or error | Verify against exact list |
| "Base64 is fine for now" | Technical debt compounds fast with image data | Use object storage from day one |
| "Rate limiting can wait" | First production spike causes 429 cascade | Implement before deploying |
| "Loading state is cosmetic" | 5-30s silence destroys user trust | Always show generation progress |
Reference Files
- `${CLAUDE_SKILL_DIR}/references/advanced-patterns.md`: Server actions, API routes, client components, multi-image composition
- `${CLAUDE_SKILL_DIR}/references/configuration.md`: Provider options, storage setup, rate limiting, cost optimization