image-gen
Image Generation Skill
Generate and edit website images using Gemini Native Image Generation.
⚠️ Critical: SDK Migration Required
IMPORTANT: The `@google/generative-ai` package is deprecated as of November 30, 2025. All new projects must use `@google/genai`.
Migration Required:
typescript
// ❌ OLD (deprecated, support ended Nov 30, 2025)
import { GoogleGenerativeAI } from "@google/generative-ai";
const genAI = new GoogleGenerativeAI(API_KEY);
// ✅ NEW (required)
import { GoogleGenAI } from "@google/genai";
const ai = new GoogleGenAI({ apiKey: API_KEY });
Source: GitHub Repository Migration Notice
Models
| Model | ID | Status | Best For |
|---|---|---|---|
| Gemini 3 Pro Image | `gemini-3-pro-image-preview` | Preview (Nov 20, 2025) | 4K, complex prompts, text |
| Gemini 2.5 Flash Image | `gemini-2.5-flash-image` | GA (Oct 2, 2025) | Fast iteration, general use |
| Imagen 4.0 | | GA (Aug 14, 2025) | Alternative platform |
Deprecated Models (do not use):
- `gemini-2.0-flash-exp-image-generation` - Shut down Nov 11, 2025
- `gemini-2.0-flash-preview-image-generation` - Shut down Nov 11, 2025
- `gemini-2.5-flash-image-preview` - Scheduled shutdown Jan 15, 2026
Source: Google AI Changelog
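A small guard can catch a deprecated model ID before a request is sent. The sketch below hardcodes the IDs and shutdown dates from the list above; `checkModelId` and its return shape are hypothetical helpers, not part of any SDK.

```typescript
// Deprecated model IDs and their shutdown dates, copied from the list above.
const DEPRECATED_MODELS = new Map<string, string>([
  ["gemini-2.0-flash-exp-image-generation", "2025-11-11"],
  ["gemini-2.0-flash-preview-image-generation", "2025-11-11"],
  ["gemini-2.5-flash-image-preview", "2026-01-15"],
]);

interface ModelCheck {
  ok: boolean;
  message: string;
}

// Hypothetical helper: reports whether a model ID appears on the deprecated list.
function checkModelId(modelId: string): ModelCheck {
  const shutdown = DEPRECATED_MODELS.get(modelId);
  if (shutdown !== undefined) {
    return { ok: false, message: `${modelId} is deprecated (shutdown ${shutdown})` };
  }
  return { ok: true, message: `${modelId} is not on the deprecated list` };
}
```

Calling `checkModelId` once at startup, rather than on every request, is usually enough because the model ID rarely changes at runtime.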
Capabilities
| Feature | Supported |
|---|---|
| Generate from text | ✅ |
| Edit existing images | ✅ |
| Change aspect ratio | ✅ |
| Widen/extend images | ✅ |
| Style transfer | ✅ |
| Change colours | ✅ |
| Add/remove elements | ✅ |
| Text in images | ✅ (legible!) |
| Multiple reference images | ✅ (up to 14: max 5 humans, 9 objects) |
| 4K resolution | ✅ (Pro only) |
Note: Exceeding 5 human reference images causes unpredictable character consistency. Keep human images ≤ 5 for reliable results.
Aspect Ratios
1:1 | 2:3 | 3:2 | 3:4 | 4:3
4:5 | 5:4 | 9:16 | 16:9 | 21:9
Resolutions (Pro only)
| Size | 1:1 | 16:9 | 4:3 |
|---|---|---|---|
| 1K | 1024x1024 | 1376x768 | 1184x880 |
| 2K | 2048x2048 | 2752x1536 | 2368x1760 |
| 4K | 4096x4096 | 5504x3072 | 4736x3520 |
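The table can be encoded as a lookup so that layout code knows which pixel dimensions to expect. This is only a sketch: `outputDimensions` and its types are illustrative names, and it covers just the three aspect ratios listed in the table.

```typescript
type ImageSize = "1K" | "2K" | "4K";
type TableRatio = "1:1" | "16:9" | "4:3";

// 1K dimensions from the table; the 2K and 4K rows are exact 2x and 4x multiples.
const BASE_1K: Record<TableRatio, [number, number]> = {
  "1:1": [1024, 1024],
  "16:9": [1376, 768],
  "4:3": [1184, 880],
};

const SCALE: Record<ImageSize, number> = { "1K": 1, "2K": 2, "4K": 4 };

// Returns the expected output size in pixels for a size/ratio pair from the table.
function outputDimensions(size: ImageSize, ratio: TableRatio): { width: number; height: number } {
  const [baseWidth, baseHeight] = BASE_1K[ratio];
  const factor = SCALE[size];
  return { width: baseWidth * factor, height: baseHeight * factor };
}
```

For example, `outputDimensions("4K", "16:9")` yields 5504x3072, matching the last row of the table.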
Quick Start
typescript
import * as fs from "node:fs";
import { GoogleGenAI } from "@google/genai";
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
// Generate new image
const response = await ai.models.generateContent({
  model: "gemini-2.5-flash-image",
  contents: "A professional plumber in hi-vis working in modern Australian home",
  config: {
    responseModalities: ["TEXT", "IMAGE"], // BOTH required - cannot use ["IMAGE"] alone
    imageGenerationConfig: {
      aspectRatio: "16:9",
    },
  },
});
// Extract image (candidates and parts may be missing, so guard each step)
for (const part of response.candidates?.[0]?.content?.parts ?? []) {
  if (part.inlineData?.data) {
    const buffer = Buffer.from(part.inlineData.data, "base64");
    fs.writeFileSync("hero.png", buffer);
  }
}
Important: `responseModalities` must include both `["TEXT", "IMAGE"]`. Using `["IMAGE"]` alone may fail or produce unexpected results.
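Because a response requested with both modalities can mix text and image parts, it can help to separate them in one pass. The helper below is a sketch that works on a minimal structural type mirroring the `inlineData` field used in the Quick Start; the `text` field, the `"image/png"` fallback, and the name `splitParts` are assumptions for illustration, not SDK functions.

```typescript
// Minimal shape of a response part, limited to the fields this sketch reads.
interface ResponsePart {
  text?: string;
  inlineData?: { data?: string; mimeType?: string };
}

interface SplitResult {
  texts: string[];
  images: { base64: string; mimeType: string }[];
}

// Hypothetical helper: separates text parts from base64-encoded image parts.
function splitParts(parts: ResponsePart[]): SplitResult {
  const result: SplitResult = { texts: [], images: [] };
  for (const part of parts) {
    if (part.inlineData?.data) {
      result.images.push({
        base64: part.inlineData.data,
        mimeType: part.inlineData.mimeType ?? "image/png", // assumed default
      });
    } else if (part.text) {
      result.texts.push(part.text);
    }
  }
  return result;
}
```

Each `base64` string can then be decoded with `Buffer.from(base64, "base64")`, as in the Quick Start.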
Model Selection
| Requirement | Use |
|---|---|
| Fast iteration | Gemini 2.5 Flash Image |
| 4K resolution | Gemini 3 Pro Image Preview |
| Text in images | Gemini 3 Pro (94% legibility at 4K) |
| Simple edits | Gemini 2.5 Flash Image |
| Complex compositions | Gemini 3 Pro Image Preview |
| Infographics/diagrams | Gemini 3 Pro Image Preview |
Text Rendering Benchmarks (4K resolution):
- Gemini 3 Pro Image: 94% legible text
- DALL-E 3: 78% legible text
- Midjourney: Decorative pseudo-text only
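The selection table above can be reduced to a simple rule: use the Pro model when a job needs 4K output, legible text, or a complex composition, and the Flash model otherwise. The sketch below assumes that reading; `ImageJob` and `pickModel` are hypothetical names.

```typescript
// Hypothetical description of what a single image job requires.
interface ImageJob {
  needs4K?: boolean;
  needsLegibleText?: boolean;
  complexComposition?: boolean;
}

const FLASH_MODEL = "gemini-2.5-flash-image";
const PRO_MODEL = "gemini-3-pro-image-preview";

// Picks a model ID following the selection table: Pro for demanding jobs, Flash otherwise.
function pickModel(job: ImageJob): string {
  if (job.needs4K || job.needsLegibleText || job.complexComposition) {
    return PRO_MODEL;
  }
  return FLASH_MODEL;
}
```

Defaulting to Flash keeps iteration fast, and a job opts in to the Pro model only when it has one of the listed requirements.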
When to Use
Use Gemini Image Gen when:
- Stock photos don't fit brand/context
- Need Australian-specific imagery
- Need text in images (infographics, diagrams)
- Need consistent style across multiple images
- Need to edit/modify existing images
- Client has no photos of their work
Don't use when:
- Client has good photos of actual work
- Real team photos needed (discuss first)
- Product shots (use real products)
- Legal/compliance concerns
Known Issues Prevention
This skill prevents 5 documented issues:
Issue #1: Resolution Parameter Case Sensitivity
Error: Request fails with invalid parameter error
Source: Google AI Image Generation Docs
Why It Happens: Resolution values are case-sensitive and must use uppercase 'K'.
Prevention: Always use `"4K"`, `"2K"`, `"1K"` - never lowercase `"4k"`.
typescript
// ❌ WRONG - causes request failure
config: { imageGenerationConfig: { resolution: "4k" } }
// ✅ CORRECT - uppercase required
config: { imageGenerationConfig: { resolution: "4K" } }
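When the resolution comes from user input or a config file, normalizing it before building the request avoids the lowercase failure. A minimal sketch, assuming the three values listed above are the only valid ones; `normalizeResolution` is a hypothetical helper.

```typescript
type Resolution = "1K" | "2K" | "4K";

// Hypothetical helper: accepts input such as "4k" or " 2K " and returns the
// uppercase form, or throws if the value is not one of the supported sizes.
function normalizeResolution(input: string): Resolution {
  const value = input.trim().toUpperCase();
  if (value === "1K" || value === "2K" || value === "4K") {
    return value;
  }
  throw new Error(`Unsupported resolution "${input}"; expected 1K, 2K, or 4K`);
}
```

Throwing on unknown values surfaces a typo locally instead of as a failed API request.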
Issue #2: Aspect Ratio May Be Ignored (Sept 2025+)
Error: Returns 1:1 square image despite requesting 16:9 or other ratios
Source: Google Support Thread
Why It Happens: Backend update in September 2025 affected Gemini 2.5 Flash Image model's aspect ratio handling.
Prevention: Use Gemini 3 Pro Image Preview for reliable aspect ratio control, or generate 1:1 and use multi-turn editing to extend.
typescript
// May ignore aspectRatio on Gemini 2.5 Flash Image
model: "gemini-2.5-flash-image",
config: { imageGenerationConfig: { aspectRatio: "16:9" } }
// More reliable for aspect ratio control
model: "gemini-3-pro-image-preview",
config: { imageGenerationConfig: { aspectRatio: "16:9" } }
Status: Google confirmed working on fix (Sept 2025).
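Since the returned image may not match the requested ratio, a quick check on the decoded bytes can catch a square result before it is published. The sketch below assumes the image is a PNG (as the Quick Start filename suggests) and reads the width and height from the IHDR chunk; both function names and the 2% tolerance are illustrative choices.

```typescript
// Reads width and height from a PNG's IHDR chunk (bytes 16-23, big-endian).
function pngDimensions(bytes: Uint8Array): { width: number; height: number } {
  const signature = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
  if (bytes.length < 24 || !signature.every((b, i) => bytes[i] === b)) {
    throw new Error("Not a PNG file");
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return { width: view.getUint32(16), height: view.getUint32(20) };
}

// True when width/height is within `tolerance` (relative) of the requested "W:H" ratio.
function matchesAspectRatio(
  width: number,
  height: number,
  requested: string,
  tolerance = 0.02,
): boolean {
  const [w, h] = requested.split(":").map(Number);
  if (!w || !h) throw new Error(`Invalid aspect ratio "${requested}"`);
  const expected = w / h;
  return Math.abs(width / height - expected) / expected <= tolerance;
}
```

A Node.js `Buffer` is a `Uint8Array`, so the buffer decoded from `inlineData.data` can be passed to `pngDimensions` directly. The tolerance is needed because the documented sizes are not exact ratios (1376x768 is close to, but not exactly, 16:9).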
Issue #3: Exceeding 5 Human Reference Images
Error: Unpredictable character consistency in generated images
Source: Google AI Image Generation Docs
Why It Happens: Gemini 3 Pro Image supports up to 14 reference images total, but only 5 can be human images for character consistency.
Prevention: Limit human images to 5 or fewer. Use remaining slots (up to 14 total) for objects/scenes.
typescript
// ❌ WRONG - 7 human images exceeds limit
const tooManyHumans = [img1, img2, img3, img4, img5, img6, img7];
const badPrompt = [
  { text: "Generate consistent characters" },
  ...tooManyHumans.map(img => ({ inlineData: { data: img, mimeType: "image/png" } })),
];
// ✅ CORRECT - max 5 human images
// Assumes `images` holds base64 PNG strings with human images listed first
const humanImages = images.slice(0, 5); // Limit to 5
const objectImages = images.slice(5, 14); // Up to 9 more for objects
const prompt = [
  { text: "Generate consistent characters" },
  ...humanImages.map(img => ({ inlineData: { data: img, mimeType: "image/png" } })),
  ...objectImages.map(img => ({ inlineData: { data: img, mimeType: "image/png" } })),
];
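Slicing by position only works when the human images happen to come first. If each reference image is tagged with its kind, the limits can be checked explicitly before the prompt is built. This is a sketch under that assumption; `ReferenceImage` and `validateReferences` are hypothetical names, and the limits are the ones stated above.

```typescript
type ReferenceKind = "human" | "object";

// Hypothetical tagged reference image; `base64` holds the encoded image data.
interface ReferenceImage {
  kind: ReferenceKind;
  base64: string;
}

const MAX_HUMAN_IMAGES = 5;
const MAX_TOTAL_IMAGES = 14;

// Returns a list of problems; an empty list means the set is within both limits.
function validateReferences(images: ReferenceImage[]): string[] {
  const problems: string[] = [];
  const humanCount = images.filter((img) => img.kind === "human").length;
  if (humanCount > MAX_HUMAN_IMAGES) {
    problems.push(`${humanCount} human images exceeds the limit of ${MAX_HUMAN_IMAGES}`);
  }
  if (images.length > MAX_TOTAL_IMAGES) {
    problems.push(`${images.length} reference images exceeds the limit of ${MAX_TOTAL_IMAGES}`);
  }
  return problems;
}
```

Returning messages rather than throwing lets the caller decide whether to trim the set or abort the request.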
Issue #4: SynthID Watermark Cannot Be Disabled
Error: N/A (documented limitation)
Source: Google AI Image Generation Docs
Why It Happens: All generated images automatically include a SynthID watermark for content authenticity tracking.
Prevention: Be aware of this limitation for commercial use cases. Watermark cannot be disabled by developers.
Issue #5: Google Search Grounding Excludes Image Results
Error: Generated images don't reflect visual search results, only text
Source: Google AI Image Generation Docs
Why It Happens: When using Google Search tool with image generation, "image-based search results are not passed to the generation model."
Prevention: Only text-based search results inform the visual output. Don't expect the model to reference images from search results.
typescript
// Google Search tool enabled
const response = await ai.models.generateContent({
  model: "gemini-3-pro-image-preview",
  contents: "Generate image of latest iPhone design",
  config: {
    responseModalities: ["TEXT", "IMAGE"],
    tools: [{ googleSearch: {} }], // in @google/genai, tools are passed inside config
  },
});
// Result: Only text search results used, not image results from web search
Pricing
Current Pricing (as of November 2025):
- Gemini 2.5 Flash Image: ~$0.039 per image
  - Input: 258 tokens per image
  - Output: 1290 tokens per image
  - Rate: $30.00 per 1M output tokens (1290 tokens × $30.00 / 1M ≈ $0.039)
Note: The `generateImages` API (Imagen models) does not return `usageMetadata` in responses. Track costs manually based on pricing above.
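For manual tracking, the per-image output cost follows directly from the token count and rate listed above. The sketch below covers output tokens only and does not include input tokens for prompts or reference images; `estimateOutputCostUsd` is a hypothetical helper.

```typescript
// Figures taken from the pricing list above (Gemini 2.5 Flash Image).
const OUTPUT_TOKENS_PER_IMAGE = 1290;
const USD_PER_MILLION_OUTPUT_TOKENS = 30.0;

// Estimated output-token cost in USD for a given number of generated images.
function estimateOutputCostUsd(imageCount: number): number {
  const totalTokens = imageCount * OUTPUT_TOKENS_PER_IMAGE;
  return (totalTokens * USD_PER_MILLION_OUTPUT_TOKENS) / 1_000_000;
}
```

With these figures, one image costs about $0.0387 in output tokens and 100 images about $3.87.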
Reference Files
- `references/prompting.md` - Effective prompt patterns
- `references/website-images.md` - Hero, service, background templates
- `references/editing.md` - Multi-turn editing patterns
- `references/local-imagery.md` - Australian-specific details
- `references/integration.md` - API code examples
Last verified: 2026-01-21 | Skill version: 2.0.0 | Changes: Added SDK migration notice (critical), updated to current model names (gemini-3-pro-image-preview, gemini-2.5-flash-image), added 5 Known Issues (resolution case sensitivity, aspect ratio bug, reference image limits, SynthID watermark, Google Search grounding), added pricing section, added text rendering benchmarks.