nano-banana-image-gen

Nano Banana 2 image generation reference

Nano Banana (NB) is Google's Gemini image generation family. NB2 = Gemini 3.1 Flash.

Model selection

Model	Best for	Tradeoffs
NB1	Existing workflows that already work, lowest cost, speed-critical	No thinking, no visual grounding, no extreme ratios
NB2	Default for new projects (~95% of Pro quality)	~15% slower than Pro at 2K, weaker arm/leg composition, spelling errors in infographics
Pro	Complex multi-layered prompts, extreme logical constraints, interior scale/logic	Most expensive

Decision: Start with NB2. Step up to Pro only if NB2 consistently fails your specific prompt.
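
The decision rule above can be sketched as a small helper. This is a minimal illustration, not an official API: the `Job` fields, the `choose_model` function, and the failure threshold of 3 (one reading of "consistently fails") are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical description of an image generation job."""
    existing_nb1_workflow: bool = False   # a workflow that already works on NB1
    speed_or_cost_critical: bool = False  # lowest cost or fastest output matters most
    nb2_failures: int = 0                 # failed NB2 attempts on this specific prompt


def choose_model(job: Job, failure_threshold: int = 3) -> str:
    """Return "NB1", "NB2", or "Pro" following the table's guidance."""
    # Keep NB1 where it already works or where cost/speed dominates.
    if job.existing_nb1_workflow or job.speed_or_cost_critical:
        return "NB1"
    # Step up to Pro only after NB2 has repeatedly failed this prompt.
    if job.nb2_failures >= failure_threshold:
        return "Pro"
    # Default for new projects.
    return "NB2"


if __name__ == "__main__":
    print(choose_model(Job()))
    print(choose_model(Job(nb2_failures=4)))
```

The threshold is deliberately a parameter, since how many failures count as "consistent" depends on the prompt and budget.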

Visual grounding (NB2 only)

NB2 searches the internet for reference images before generating. Useful for:

Specific real-world locations (churches, bridges, city squares, niche buildings)
Exact animal species, breeds, insects
Historically accurate scenes

Example: "Generate a cinematic, golden-hour photograph of the main historical church in Voiron, France. Ensure the architectural details, the spire, the surrounding square, and the landscape (mountains) are accurate to reality."

Cost optimization

512px batch-to-upscale workflow:

Use Batch API (50% discount) to generate dozens of 512px variations
Review and pick the best composition
Upscale that image to 1K/2K/4K

At 512px, NB2 generates faster and costs roughly the same as NB1.
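
To see why the draft-then-upscale workflow saves money, the arithmetic can be sketched as follows. The prices are hypothetical relative units, not real pricing, and the function names are made up for illustration; only the 50% batch discount comes from the text above.

```python
def workflow_cost(n_drafts: int, draft_price: float, upscale_price: float,
                  batch_discount: float = 0.5) -> float:
    """Cost of n_drafts 512px images via the Batch API plus one upscale of the winner."""
    drafts = n_drafts * draft_price * (1 - batch_discount)
    return drafts + upscale_price


def direct_cost(n_variations: int, full_price: float) -> float:
    """Cost of generating every variation directly at the target resolution."""
    return n_variations * full_price


if __name__ == "__main__":
    # Hypothetical units: a 512px draft costs 1.0, a high-resolution image costs 4.0,
    # and upscaling is assumed to cost the same as one high-resolution image.
    print(workflow_cost(n_drafts=24, draft_price=1.0, upscale_price=4.0))  # 16.0
    print(direct_cost(n_variations=24, full_price=4.0))                    # 96.0
```

Under these assumed prices, exploring 24 compositions as discounted drafts costs one sixth of generating them all at full resolution.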

Parameters

Parameter	Values	Notes
Resolution	512px, 1K, 2K, 4K	512px for drafts, upscale winners
Aspect ratio	Standard + 1:4, 1:8, 4:1, 8:1	Extreme ratios for banners, comics, scrolling
Thinking mode	On/Off	Keep OFF by default. Enable for complex infographics or grounding + spatial reasoning
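
One way to keep these parameters consistent in client code is a small validated request object. This is a sketch under stated assumptions: `ImageRequest` and its field names are hypothetical and do not mirror any real SDK, and the set of "standard" aspect ratios is illustrative because this reference does not list them.

```python
from dataclasses import dataclass

RESOLUTIONS = {"512px", "1K", "2K", "4K"}
EXTREME_RATIOS = {"1:4", "1:8", "4:1", "8:1"}
# Assumption: illustrative stand-in for the unlisted "standard" ratios.
STANDARD_RATIOS = {"1:1", "2:3", "3:2", "3:4", "4:3", "9:16", "16:9"}


@dataclass(frozen=True)
class ImageRequest:
    """Hypothetical request object; defaults follow the table's notes."""
    prompt: str
    resolution: str = "512px"   # draft first, upscale winners later
    aspect_ratio: str = "1:1"
    thinking: bool = False      # OFF by default

    def __post_init__(self) -> None:
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"unsupported resolution: {self.resolution}")
        if self.aspect_ratio not in STANDARD_RATIOS | EXTREME_RATIOS:
            raise ValueError(f"unsupported aspect ratio: {self.aspect_ratio}")


if __name__ == "__main__":
    banner = ImageRequest("Panoramic city skyline banner", aspect_ratio="8:1")
    print(banner)
```

Validating early means a typo such as `"3K"` fails locally instead of costing an API call.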

Prompt recipes

3D character selfie (requires image upload): Transform a personal photo into a stylized 3D character interacting with the real person from the photo.

Anime to photorealistic (requires image upload): "Convert this uploaded animated still into an ultra-realistic, cinematic, and fully photorealistic scene. Transform the animated characters into real humans while perfectly preserving their original identities, facial structures, outfits, expressions, and overall likeness."

Historical street view: "Generate a hyper-realistic image of [event] perfectly replicating a Google Maps Street View capture. Include a 123-degree wide-angle barrel distortion..."

Crayon filter: "A child's crayon drawing on white lined notebook paper of [subject]. Use chunky wax-crayon strokes, wobbly outlines, and bright bold colors that messily overflow the lines. Include visible heavy pressure marks, waxy smudges, and uneven scribble shading."

Comic strip: "Create a 4-panel horizontal comic strip (aspect ratio 4:1). [Story]. Use a vibrant, Franco-Belgian comic book style. Keep the [character] design consistent across all panels."
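
The bracketed placeholders in these recipes lend themselves to simple templating. The helper below is a minimal sketch: `fill_recipe` is a hypothetical name, and it assumes placeholders are single words in square brackets, matched case-insensitively.

```python
import re

COMIC_STRIP = (
    "Create a 4-panel horizontal comic strip (aspect ratio 4:1). [Story]. "
    "Use a vibrant, Franco-Belgian comic book style. "
    "Keep the [character] design consistent across all panels."
)


def fill_recipe(template: str, **values: str) -> str:
    """Replace each [placeholder] with the matching keyword argument."""
    def substitute(match: re.Match) -> str:
        key = match.group(1).lower()
        if key not in values:
            # Fail loudly rather than send a prompt with a leftover placeholder.
            raise KeyError(f"missing value for [{match.group(1)}]")
        return values[key]

    return re.sub(r"\[([A-Za-z]+)\]", substitute, template)


if __name__ == "__main__":
    print(fill_recipe(COMIC_STRIP, story="A fox learns to surf", character="fox"))
```

Raising on a missing value is a design choice: an unfilled `[Story]` would otherwise be sent to the model as literal text.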

Known limitations

Spelling mistakes in infographics (NB1 was better at this)
Consistency across multiple generations is weaker than Pro
Arm/leg composition issues more frequent than in other models
Knowledge cutoff prevents referencing very recent products, even with search enabled
No seed value support yet (style changes between generations)

Attribution

Based on "Getting the most out of Nano Banana 2" by @NanoBanana on X (Mar 11, 2026; https://x.com/AINanoBanana). The original thread covers model comparisons, visual grounding, parameters, and prompt techniques for Google's Gemini image generation models.