image-prompt-generator

Image Prompt Generator

Generate professional, non-generic images using Google's Gemini API for image generation.

Prerequisites & Setup

Getting Your Gemini API Key

  1. Go to Google AI Studio
  2. Sign in with your Google account
  3. Click "Create API Key"
  4. Copy the generated key

Configuring the API Key

Option 1: Environment file (recommended)
Create a `.env` file in your project root:
```bash
GEMINI_API_KEY=your_api_key_here
```
Option 2: Direct environment variable
```bash
export GEMINI_API_KEY=your_api_key_here
```
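
To illustrate how a script might pick the key up from either option, here is a minimal standard-library sketch. The helper name `load_api_key` is hypothetical and is not taken from `scripts/generate_image.py`; in practice `load_dotenv()` from `python-dotenv` (listed under Install Dependencies) covers the `.env` case.

```python
import os
from pathlib import Path


def load_api_key(env_path=".env", var="GEMINI_API_KEY"):
    """Return the API key from the environment, else from a .env file.

    Hypothetical helper for illustration only.
    """
    # An exported variable (Option 2) takes precedence over the file.
    value = os.environ.get(var)
    if value:
        return value
    path = Path(env_path)
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip blanks and comments; accept simple KEY=value lines only.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, raw = line.partition("=")
            if key.strip() == var:
                return raw.strip().strip("'\"")
    raise RuntimeError(f"{var} is not set; add it to {env_path} or export it")
```

Failing loudly when the key is missing tends to be easier to debug than an authentication error from the API later on.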

Install Dependencies

```bash
pip install google-generativeai python-dotenv pillow
```

Available Models

| Model | API Name | Best For |
|---|---|---|
| Flash | `gemini-2.5-flash-image` | Speed, drafts, iteration |
| Pro | `gemini-3-pro-image-preview` | Final assets, 16:9 aspect ratio, quality |

CRITICAL: Use `gemini-3-pro-image-preview` for:
  • Thumbnails (need 16:9 aspect ratio)
  • Final production images
  • Any image where aspect_ratio config is needed
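
The selection rule above can be sketched as a small lookup. The API names come from the model list; the function `choose_model` and its parameters are hypothetical names introduced only for this illustration.

```python
# Alias -> API name, as given in the model list above.
MODELS = {
    "flash": "gemini-2.5-flash-image",
    "pro": "gemini-3-pro-image-preview",
}


def choose_model(needs_aspect_ratio=False, final_asset=False):
    """Pick a model alias following the guidance above (illustrative)."""
    # Thumbnails, final production images, and anything that needs an
    # aspect_ratio config go to Pro; drafts and iteration can use Flash.
    return "pro" if (needs_aspect_ratio or final_asset) else "flash"


# Example: a 16:9 thumbnail resolves to the Pro API name.
thumbnail_model = MODELS[choose_model(needs_aspect_ratio=True)]
```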

Workflow Overview

  1. Brainstorm Concepts - Generate 4-6 high-level visual ideas
  2. Select Direction - User picks the concept they like
  3. Optimize Prompt - Refine into a strong, detailed prompt
  4. Style Variations - Adapt to 2-3 different visual styles
  5. Generate Images - Run via Gemini API

Step 1: Brainstorm Concepts

When the user provides a topic or use case, generate 4-6 high-level visual concepts. Each concept should be:
  • One sentence describing the visual idea
  • Concrete and immediate - you can picture it instantly
  • Conceptual but not abstract - a clear object/scene with meaning
  • Non-generic - avoid cliches (no lightbulbs for ideas, no handshakes for partnership)
Format:
1. **[Short label]** - One sentence description of the visual concept and why it works.

2. **[Short label]** - One sentence description...
Example for "newsletter about personal productivity":
1. **Compass with coffee stain** - A vintage compass where the needle points toward a coffee ring stain on a map, suggesting direction emerges from daily rituals.

2. **Clock face with seasons** - A clock where the 12 hours show seasonal changes, suggesting time management over long arcs, not just hours.

3. **Empty desk with shadow** - A minimalist desk in morning light, but the shadow shows a cluttered desk - the gap between intention and reality.

4. **Single key on many keychains** - One small key attached to dozens of decorative keychains, suggesting we overcomplicate simple solutions.
Wait for user to select before proceeding.

Step 2: Optimize the Prompt

Once the user selects a concept, develop it into a full prompt. Structure:
Create a [style type] illustration of [subject].

CONCEPT: [Expand the one-sentence idea into a clear visual description]

STYLE: [Artistic approach - load from references/styles/ if brand-specific]

COMPOSITION: [Framing, focal point, negative space, balance]

COLORS: [Palette - describe by name, not hex codes which may render as text]

TEXTURE: [Surface qualities, analog/digital feel]

AVOID: [What should NOT appear - be specific]

FORMAT: [Aspect ratio]
Key principles:
  • Natural language, full sentences - no tag soup
  • Describe colors by name (burnt orange, sky blue, near-black) not hex codes
  • Maximum 2-3 elements - if it feels busy, remove something
  • Favor metaphor over literal depiction
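
As a rough sketch of how the structure above could be assembled in code, the helper below joins the sections in the documented order and rejects hex color codes. The function name `build_prompt`, its keyword arguments, and the hex-code regex are assumptions made for illustration; the regex is deliberately simple and may flag other `#`-prefixed tokens.

```python
import re

# Matches #RGB or #RRGGBB tokens (a simple heuristic, not exhaustive).
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b")

# Section labels in the order used by the prompt structure above.
SECTIONS = ("CONCEPT", "STYLE", "COMPOSITION", "COLORS", "TEXTURE", "AVOID", "FORMAT")


def build_prompt(style_type, subject, **sections):
    """Assemble the sections in the documented order, skipping empty ones."""
    parts = [f"Create a {style_type} illustration of {subject}."]
    for name in SECTIONS:
        text = sections.get(name.lower())
        if text:
            parts.append(f"{name}: {text}")
    prompt = "\n\n".join(parts)
    # Hex codes may be rendered as literal text in the image, so reject them.
    if HEX_COLOR.search(prompt):
        raise ValueError("Describe colors by name instead of hex codes")
    return prompt
```

For example, `build_prompt("risograph", "a vintage compass", colors="burnt orange and near-black", format="16:9")` yields the opening sentence followed by the COLORS and FORMAT sections.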

Step 3: Style Variations

Default style: Risograph - Use `references/styles/risograph.md` unless the content calls for something different.
Available styles in `references/styles/`:
  • risograph.md - DEFAULT. Halftone dots, misregistration, indie printmaking aesthetic. Warm, tactile, analog.
  • minimalist-ink.md - High-contrast black and white, crosshatching. For craft/mastery posts.
  • watercolor-line.md - Ink linework with watercolor washes, warm. For organic topics.
  • editorial-conceptual.md - Conceptual, sophisticated, editorial wit. For abstract/philosophical posts.
Present style options to user, recommending risograph as default.
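
One way to load a style definition into the STYLE section of a prompt is sketched below. The directory layout and file names follow the list above; the helper `load_style` is a hypothetical name, and the internal format of the style files is not assumed beyond being readable text.

```python
from pathlib import Path

DEFAULT_STYLE = "risograph"


def load_style(name=None, styles_dir="references/styles"):
    """Read a style definition; use the default when no name is given."""
    root = Path(styles_dir)
    path = root / f"{name or DEFAULT_STYLE}.md"
    if not path.is_file():
        # List what is actually available so the error is actionable.
        available = sorted(p.stem for p in root.glob("*.md"))
        raise FileNotFoundError(f"Unknown style {name!r}; available: {available}")
    return path.read_text(encoding="utf-8")
```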

Step 4: Generate via API

Running the Script

```bash
# Load key from .env and generate
export $(grep GEMINI_API_KEY .env) &&
python scripts/generate_image.py "prompt here" --model pro --aspect 16:9

# Save to specific folder
python scripts/generate_image.py "prompt" --output "./images" --name "my_image"
```

**Options:**
- `--model flash` (faster, cheaper) or `--model pro` (higher quality)
- `--aspect 16:9`, `1:1`, or `9:16` (**PRO MODEL ONLY** - for flash, you MUST include ratio in prompt text)
- `--variations N` - generate N versions
- `--output ./path` - save location
- `--name prefix` - filename prefix

**Output location:** Save images alongside the content they belong to - not a generic images dump.
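
The source of `scripts/generate_image.py` is not shown here, so the following is only a sketch of how the documented flags could be parsed with `argparse` and how the Pro-only aspect rule might be enforced. The defaults, the function names, and the wording appended to Flash prompts are all assumptions.

```python
import argparse

ASPECTS = ("16:9", "1:1", "9:16")


def build_parser():
    """Parser mirroring the documented flags (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Generate images (CLI sketch)")
    parser.add_argument("prompt")
    parser.add_argument("--model", choices=("flash", "pro"), default="pro")
    parser.add_argument("--aspect", choices=ASPECTS)
    parser.add_argument("--variations", type=int, default=1)
    parser.add_argument("--output", default=".")
    parser.add_argument("--name", default="image")
    return parser


def resolve_request(args):
    """Return (prompt, aspect_config) honoring the Pro-only aspect rule."""
    if args.aspect and args.model == "flash":
        # Flash takes no aspect config, so state the ratio in the prompt text.
        return f"{args.prompt} The image has a {args.aspect} aspect ratio.", None
    return args.prompt, args.aspect
```

Folding the ratio into the prompt for Flash keeps a single code path for callers, who can pass `--aspect` regardless of model.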

Step 5: Iterate

After user reviews generated images:
  • 80% good? Request specific edits conversationally rather than regenerating
  • Composition off? Adjust framing or element placement in prompt
  • Wrong style? Try a different style reference
  • Too busy? Simplify to fewer elements
  • Colors wrong? Be more explicit about palette

Prompting Principles

Write Like a Creative Director

Brief the model like a human artist. Use proper grammar, full sentences, and descriptive adjectives.
| Don't | Do |
|---|---|
| "Cool car, neon, city, night, 8k" | "A cinematic wide shot of a futuristic sports car speeding through a rainy Tokyo street at night. The neon signs reflect off the wet pavement and the car's metallic chassis." |
Be specific about:
  • Subject: Instead of "a woman," say "a sophisticated elderly woman wearing a vintage Chanel-style suit"
  • Materiality: Describe textures - "matte finish," "brushed steel," "soft velvet," "crumpled paper"
  • Setting: Define location, time of day, weather
  • Lighting: Specify mood and light source
  • Mood: Emotional tone of the image

Provide Context

Context helps the model make logical artistic decisions. Include the "why" or "for whom."
Example: "Create an image of a sandwich for a Brazilian high-end gourmet cookbook." (Model infers: professional plating, shallow depth of field, perfect lighting)

Keep It Simple

  • One clear focal point
  • Maximum 2-3 elements total
  • Generous negative space
  • If it feels busy, remove something

Avoid the Generic

  • No lightbulbs for "ideas"
  • No handshakes for "partnership"
  • No happy stock photo poses
  • No glossy AI aesthetic

Resources

references/styles/

Aesthetic style definitions:
  • risograph.md - DEFAULT - Halftone, misregistration, indie printmaking
  • minimalist-ink.md - Black and white ink illustration
  • watercolor-line.md - Ink with watercolor washes
  • editorial-conceptual.md - Conceptual editorial style

scripts/

  • generate_image.py - Gemini API image generation

Prompt Modifiers Reference

| Category | Examples |
|---|---|
| Lighting | golden hour, dramatic shadows, soft diffused light, neon glow, overcast |
| Style | cinematic, editorial, technical diagram, hand-drawn, photorealistic |
| Texture | matte finish, brushed steel, soft velvet, crumpled paper, weathered wood |
| Composition | wide shot, close-up, bird's eye view, dutch angle, symmetrical |
| Mood | energetic, serene, dramatic, playful, sophisticated |
| Quality | 4K, high-fidelity, pixel-perfect, professional grade |

Advanced Capabilities

Text Rendering & Infographics

Put exact text in quotes. Specify style: "polished editorial," "technical diagram," or "hand-drawn whiteboard."
Example prompts:
Earnings Report Infographic:
"Generate a clean, modern infographic summarizing the key financial highlights from this earnings report. Include charts for 'Revenue Growth' and 'Net Income', and highlight the CEO's key quote in a stylized pull-quote box."
Whiteboard Summary:
"Summarize the concept of 'Transformer Neural Network Architecture' as a hand-drawn whiteboard diagram suitable for a university lecture. Use different colored markers for the Encoder and Decoder blocks, and include legible labels for 'Self-Attention' and 'Feed Forward'."

Character Consistency & Thumbnails

Use reference images and state "Keep the person's facial features exactly the same as Image 1." Describe expression/action changes while maintaining identity.
Example prompt:
Viral Thumbnail:
"Design a viral video thumbnail using the person from Image 1.
Face Consistency: Keep the person's facial features exactly the same as Image 1, but change their expression to look excited and surprised.
Action: Pose the person on the left side, pointing their finger towards the right side of the frame.
Subject: On the right side, place a high-quality image of a delicious avocado toast.
Graphics: Add a bold yellow arrow connecting the person's finger to the toast.
Text: Overlay massive, pop-style text in the middle: 'Done in 3 mins!'. Use a thick white outline and drop shadow.
Background: A blurred, bright kitchen background. High saturation and contrast."

Image Reworking (Edit Existing Images)

The `--input` flag enables "rework mode" - pass an existing image to Gemini and describe the changes you want.
Key use cases:
  • Small tweaks - Adjust colors, add/remove elements, change lighting
  • Style transfer - Keep composition but change artistic style
  • Object manipulation - Remove, add, or modify specific objects
  • Seasonal/temporal changes - Same scene, different time/season
Running in rework mode:
```bash
# Basic edit - add something
python scripts/generate_image.py "Add snow to the roof and yard" \
  --input ./house.png \
  --model pro

# Color adjustment
python scripts/generate_image.py "Change the accent color from red to teal, keep everything else identical" \
  --input ./thumbnail.png \
  --model pro

# Style transfer - keep composition, change aesthetic
python scripts/generate_image.py "Convert this to risograph style with halftone dots and slight color misregistration" \
  --input ./photo.png \
  --model pro

# Generate variations of an edit
python scripts/generate_image.py "Make the lighting warmer, like golden hour" \
  --input ./portrait.png \
  --variations 3 \
  --model pro
```

**Prompting tips for rework mode:**

1. **Be specific about what to preserve:**
   - "Keep the person's facial features exactly the same"
   - "Maintain the composition and framing"
   - "Don't change the background"

2. **Be explicit about what to change:**
   - "Change ONLY the color of the shirt from blue to red"
   - "Add snow to the roof and nothing else"
   - "Remove the text overlay"

3. **Use comparative language:**
   - "Make the colors more vibrant"
   - "Increase the contrast slightly"
   - "Make the lighting softer and more diffused"

**Output naming:** Files from rework mode are named `{prefix}_{timestamp}_edit_{model}.png` to distinguish from generated images (`_gen_`).
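
The naming pattern can be sketched as follows. The timestamp layout is an assumption (the pattern above does not specify one), the assumption that generated images use the same layout with `_gen_` is inferred from the note above, and `output_filename` is a hypothetical helper name.

```python
from datetime import datetime


def output_filename(prefix, model, edited=False, now=None):
    """Build a name like {prefix}_{timestamp}_edit_{model}.png."""
    # The timestamp layout is an assumption; the source does not specify it.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    # Rework-mode outputs are tagged "edit"; fresh generations are tagged "gen".
    kind = "edit" if edited else "gen"
    return f"{prefix}_{stamp}_{kind}_{model}.png"
```

Passing `now` explicitly makes the function deterministic, which is convenient when several variations of one run should share a timestamp.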

Advanced Editing Examples

Object Removal:
```bash
python scripts/generate_image.py \
  "Remove the tourists from the background and fill with matching cobblestones and storefronts" \
  --input ./street-photo.png \
  --model pro
```
Seasonal Control:
```bash
python scripts/generate_image.py \
  "Turn this into winter. Add snow to the roof and yard. Change lighting to cold, overcast afternoon. Keep architecture identical." \
  --input ./house-summer.png \
  --model pro
```
Character Consistency (thumbnail series):
```bash
python scripts/generate_image.py \
  "Keep the person's face exactly the same. Change expression to surprised. Add a pointing gesture toward the right side of the frame." \
  --input ./person-reference.png \
  --model pro
```

Related Skills

  • youtube-title-creator - Pair generated images with optimized titles
  • social-content-creation - Use images in platform-optimized posts

For custom brand styles, create new style files in `references/styles/` following the existing format.