nano-banana-image-gen
Nano Banana 2 image generation reference
Nano Banana (NB) is Google's Gemini image generation family. NB2 = Gemini 3.1 Flash.
Model selection
| Model | Best for | Tradeoffs |
|---|---|---|
| NB1 | Existing workflows that already work, lowest cost, speed-critical | No thinking, no visual grounding, no extreme ratios |
| NB2 | Default for new projects (~95% of Pro quality) | ~15% slower than Pro at 2K, weaker arm/leg composition, spelling errors in infographics |
| Pro | Complex multi-layered prompts, extreme logical constraints, interior scale/logic | Most expensive |
Decision: Start with NB2. Step up to Pro only if NB2 consistently fails your specific prompt.
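The decision rule above can be sketched as a small helper. This is only an illustration: the tier labels are plain strings rather than real API model IDs, and `min_attempts` and `failure_rate` are assumed thresholds, since the source does not define what "consistently fails" means.

```python
from dataclasses import dataclass

# Hypothetical labels for the three tiers in the table; not real API model IDs.
NB1, NB2, PRO = "NB1", "NB2", "Pro"


@dataclass
class PromptHistory:
    """Outcome counts for one specific prompt on NB2 (illustrative bookkeeping)."""
    nb2_attempts: int = 0
    nb2_failures: int = 0


def choose_model(history: PromptHistory,
                 existing_nb1_workflow: bool = False,
                 min_attempts: int = 3,
                 failure_rate: float = 0.8) -> str:
    """Start with NB2; step up to Pro only after consistent NB2 failures.

    min_attempts and failure_rate are assumptions standing in for
    "consistently fails"; tune them to your own review process.
    """
    if existing_nb1_workflow:
        return NB1  # keep existing workflows that already work
    if history.nb2_attempts >= min_attempts:
        if history.nb2_failures / history.nb2_attempts >= failure_rate:
            return PRO
    return NB2
```

With these assumed thresholds, a prompt that failed five out of five NB2 attempts is routed to Pro, while a prompt with no history stays on NB2.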
Visual grounding (NB2 only)
NB2 searches the internet for reference images before generating. Useful for:
- Specific real-world locations (churches, bridges, city squares, niche buildings)
- Exact animal species, breeds, insects
- Historically accurate scenes
Example: "Generate a cinematic, golden-hour photograph of the main historical church in Voiron, France. Ensure the architectural details, the spire, the surrounding square, and the landscape (mountains) are accurate to reality."
Cost optimization
512px batch-to-upscale workflow:
- Use Batch API (50% discount) to generate dozens of 512px variations
- Review and pick the best composition
- Upscale that image to 1K/2K/4K
A 512px output generates faster than the larger sizes and costs roughly the same as an NB1 image.
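The arithmetic behind this workflow can be sketched as follows. The source gives no prices, so every price here is a caller-supplied placeholder in arbitrary units. The sketch also assumes that only the drafts receive the 50% batch discount and that the single upscale is billed at its normal price.

```python
def batch_to_upscale_cost(n_drafts: int,
                          draft_price: float,
                          upscale_price: float,
                          batch_discount: float = 0.5) -> float:
    """Total cost: n discounted 512px drafts plus one upscale of the winner.

    All prices are hypothetical inputs; substitute current list prices.
    """
    if n_drafts < 1:
        raise ValueError("need at least one draft")
    drafts = n_drafts * draft_price * (1.0 - batch_discount)
    return drafts + upscale_price


def direct_cost(n_images: int, full_res_price: float) -> float:
    """Baseline: generate every variation directly at the target resolution."""
    return n_images * full_res_price


# Placeholder prices in arbitrary units (assumptions, not real pricing).
workflow = batch_to_upscale_cost(n_drafts=24, draft_price=1.0, upscale_price=4.0)
baseline = direct_cost(n_images=24, full_res_price=4.0)
```

The saving grows with the number of variations reviewed, because only one image is ever produced at the expensive resolution.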
Parameters
| Parameter | Values | Notes |
|---|---|---|
| Resolution | 512px, 1K, 2K, 4K | 512px for drafts, upscale winners |
| Aspect ratio | Standard + 1:4, 1:8, 4:1, 8:1 | Extreme ratios for banners, comics, scrolling |
| Thinking mode | On/Off | Keep OFF by default. Enable for complex infographics or grounding + spatial reasoning |
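One way to keep these parameters consistent is to validate them before sending a request. The sketch below is not tied to any SDK: `ImageRequest` and its field names are hypothetical, and `STANDARD_RATIOS` is a placeholder set, because the source only says "Standard". The NB1 restrictions come from the model selection table.

```python
from dataclasses import dataclass

RESOLUTIONS = {"512px", "1K", "2K", "4K"}
EXTREME_RATIOS = {"1:4", "1:8", "4:1", "8:1"}
# Assumption: the source does not list the standard ratios; placeholder set.
STANDARD_RATIOS = {"1:1", "4:3", "3:4", "16:9", "9:16"}


@dataclass(frozen=True)
class ImageRequest:
    """Hypothetical request object; field names are illustrative only."""
    prompt: str
    model: str = "NB2"
    resolution: str = "512px"   # draft size; upscale the winners later
    aspect_ratio: str = "1:1"
    thinking: bool = False      # keep OFF by default

    def __post_init__(self) -> None:
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"unsupported resolution: {self.resolution}")
        if self.aspect_ratio not in STANDARD_RATIOS | EXTREME_RATIOS:
            raise ValueError(f"unsupported aspect ratio: {self.aspect_ratio}")
        # Per the model table, NB1 has no extreme ratios and no thinking.
        if self.model == "NB1" and self.aspect_ratio in EXTREME_RATIOS:
            raise ValueError("NB1 does not support extreme aspect ratios")
        if self.model == "NB1" and self.thinking:
            raise ValueError("NB1 does not support thinking mode")
```

For example, `ImageRequest(prompt="site banner", resolution="2K", aspect_ratio="8:1")` passes validation, while the same ratio with `model="NB1"` raises `ValueError`.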
Prompt recipes
3D character selfie (requires image upload): Transform a personal photo into a stylized 3D character that interacts with the real person in the same image.
Anime to photorealistic (requires image upload): "Convert this uploaded animated still into an ultra-realistic, cinematic, and fully photorealistic scene. Transform the animated characters into real humans while perfectly preserving their original identities, facial structures, outfits, expressions, and overall likeness."
Historical street view: "Generate a hyper-realistic image of [event] perfectly replicating a Google Maps Street View capture. Include a 123-degree wide-angle barrel distortion..."
Crayon filter: "A child's crayon drawing on white lined notebook paper of [subject]. Use chunky wax-crayon strokes, wobbly outlines, and bright bold colors that messily overflow the lines. Include visible heavy pressure marks, waxy smudges, and uneven scribble shading."
Comic strip: "Create a 4-panel horizontal comic strip (aspect ratio 4:1). [Story]. Use a vibrant, Franco-Belgian comic book style. Keep the [character] design consistent across all panels."
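The bracketed placeholders in these recipes can be filled programmatically. The helper below is a minimal sketch: `fill_recipe` is a hypothetical name, and matching placeholders case-insensitively is a design choice made here, not something the source specifies.

```python
import re

# Comic strip recipe, copied from the list above.
COMIC_STRIP = (
    "Create a 4-panel horizontal comic strip (aspect ratio 4:1). [Story]. "
    "Use a vibrant, Franco-Belgian comic book style. "
    "Keep the [character] design consistent across all panels."
)

# Matches single-word placeholders such as [Story] or [subject].
_PLACEHOLDER = re.compile(r"\[([A-Za-z_]+)\]")


def fill_recipe(template: str, **values: str) -> str:
    """Replace each [name] placeholder with values[name], ignoring case."""
    lowered = {key.lower(): value for key, value in values.items()}

    def substitute(match: re.Match) -> str:
        key = match.group(1).lower()
        if key not in lowered:
            raise KeyError(f"missing value for placeholder [{match.group(1)}]")
        return lowered[key]

    return _PLACEHOLDER.sub(substitute, template)


prompt = fill_recipe(COMIC_STRIP, story="A cat learns to surf", character="cat")
```

Raising on a missing placeholder avoids sending a prompt that still contains a literal `[character]`.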
Known limitations
- Spelling mistakes in infographics (NB1 was better at this)
- Consistency across multiple generations is weaker than Pro
- Arm/leg composition issues are more frequent than with other models
- Knowledge cutoff prevents referencing very recent products even with search on
- No seed value support yet (style changes between generations)
Attribution
Based on "Getting the most out of Nano Banana 2" by @NanoBanana on X (Mar 11, 2026; link: https://x.com/AINanoBanana). The original thread covers model comparisons, visual grounding, parameters, and prompt techniques for Google's Gemini image generation models.