image-prompt-generator

Image Prompt Generator

Generate professional, non-generic images with Google's Gemini image generation API.

Prerequisites & Setup

Getting Your Gemini API Key

1. Go to Google AI Studio
2. Sign in with your Google account
3. Click "Create API Key"
4. Copy the generated key

Configuring the API Key

Option 1: Environment file (recommended)

Create a `.env` file in your project root:

```bash
GEMINI_API_KEY=your_api_key_here
```

Option 2: Direct environment variable

```bash
export GEMINI_API_KEY=your_api_key_here
```
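
Either option ends with `GEMINI_API_KEY` in the process environment. As a minimal sketch of how a Python script might read it (the helper name `load_gemini_api_key` is hypothetical, and how `scripts/generate_image.py` actually loads the key is not shown in this document), the following reads `.env` from the current directory when `python-dotenv` is installed and otherwise falls back to the exported variable:

```python
import os


def load_gemini_api_key() -> str:
    """Return GEMINI_API_KEY from .env (if python-dotenv is available) or the shell."""
    try:
        # python-dotenv is optional here; it is listed under Install Dependencies.
        from dotenv import load_dotenv

        # Reads ./.env if it exists; variables already set in the shell win.
        load_dotenv(".env")
    except ImportError:
        pass  # no python-dotenv: rely on the exported variable only

    key = os.environ.get("GEMINI_API_KEY", "").strip()
    if not key or key == "your_api_key_here":
        raise RuntimeError("GEMINI_API_KEY is not set; add it to .env or export it.")
    return key
```

Rejecting the `your_api_key_here` placeholder is a design choice of this sketch, so that a copied-but-unedited `.env` fails early rather than at the first API call.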

Install Dependencies

```bash
pip install google-generativeai python-dotenv pillow
```

Available Models

| Model | API Name | Best For |
| --- | --- | --- |
| Flash | `gemini-2.5-flash-image` | Speed, drafts, iteration |
| Pro | `gemini-3-pro-image-preview` | Final assets, 16:9 aspect ratio, quality |

CRITICAL: Use `gemini-3-pro-image-preview` for:

- Thumbnails (need 16:9 aspect ratio)
- Final production images
- Any image where aspect_ratio config is needed
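
The selection rule above can be written down as a small helper. This is only an illustrative sketch: the function `choose_model` and its parameters are hypothetical names, and only the two API model names come from the table.

```python
from typing import Optional

FLASH = "gemini-2.5-flash-image"
PRO = "gemini-3-pro-image-preview"


def choose_model(*, final_asset: bool = False, aspect_ratio: Optional[str] = None) -> str:
    """Pick an API model name: Pro for final assets or any explicit aspect ratio."""
    if final_asset or aspect_ratio is not None:
        return PRO
    # Drafts and quick iterations without an aspect-ratio config can use Flash.
    return FLASH
```

For example, `choose_model(aspect_ratio="16:9")` selects the Pro model for a thumbnail, while `choose_model()` selects Flash for a draft.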

Workflow Overview

1. Brainstorm Concepts - Generate 4-6 high-level visual ideas
2. Select Direction - User picks the concept they like
3. Optimize Prompt - Refine into a strong, detailed prompt
4. Style Variations - Adapt to 2-3 different visual styles
5. Generate Images - Run via Gemini API

Step 1: Brainstorm Concepts

When the user provides a topic or use case, generate 4-6 high-level visual concepts. Each concept should be:

One sentence describing the visual idea
Concrete and immediate - you can picture it instantly
Conceptual but not abstract - a clear object/scene with meaning
Non-generic - avoid cliches (no lightbulbs for ideas, no handshakes for partnership)

Format:

1. **[Short label]** - One sentence description of the visual concept and why it works.

2. **[Short label]** - One sentence description...

Example for "newsletter about personal productivity":

1. **Compass with coffee stain** - A vintage compass where the needle points toward a coffee ring stain on a map, suggesting direction emerges from daily rituals.

2. **Clock face with seasons** - A clock where the 12 hours show seasonal changes, suggesting time management over long arcs, not just hours.

3. **Empty desk with shadow** - A minimalist desk in morning light, but the shadow shows a cluttered desk - the gap between intention and reality.

4. **Single key on many keychains** - One small key attached to dozens of decorative keychains, suggesting we overcomplicate simple solutions.

Wait for user to select before proceeding.
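
If the concept list is produced programmatically, the format above could be rendered along these lines. This is a sketch under assumptions: the `Concept` dataclass and `format_concepts` function are hypothetical and not part of the skill's scripts.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Concept:
    label: str        # short label, shown in bold
    description: str  # one sentence: the visual idea and why it works


def format_concepts(concepts: List[Concept]) -> str:
    """Render 4-6 concepts as a numbered list in the format shown above."""
    if not 4 <= len(concepts) <= 6:
        raise ValueError("Step 1 expects 4-6 concepts")
    return "\n\n".join(
        f"{i}. **{c.label}** - {c.description}"
        for i, c in enumerate(concepts, start=1)
    )
```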

Step 2: Optimize the Prompt

Once the user selects a concept, develop it into a full prompt. Structure:

Create a [style type] illustration of [subject].

CONCEPT: [Expand the one-sentence idea into a clear visual description]

STYLE: [Artistic approach - load from references/styles/ if brand-specific]

COMPOSITION: [Framing, focal point, negative space, balance]

COLORS: [Palette - describe by name, not hex codes which may render as text]

TEXTURE: [Surface qualities, analog/digital feel]

AVOID: [What should NOT appear - be specific]

FORMAT: [Aspect ratio]

Key principles:

Natural language, full sentences - no tag soup
Describe colors by name (burnt orange, sky blue, near-black) not hex codes
Maximum 2-3 elements - if it feels busy, remove something
Favor metaphor over literal depiction
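
The template and the color-naming principle can be combined in a small builder. The following is a minimal sketch, assuming a hypothetical `build_prompt` helper. The section labels come from the structure above, and the hex-code check is one possible way to enforce the "describe colors by name" rule.

```python
import re

# Matches #RGB or #RRGGBB hex color codes.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{3}){1,2}\b")


def build_prompt(style_type, subject, *, concept, style, composition,
                 colors, texture, avoid, fmt):
    """Assemble the Step 2 prompt; refuse hex codes, which may render as text."""
    if HEX_COLOR.search(colors):
        raise ValueError("Describe colors by name, not hex codes.")
    sections = [
        f"Create a {style_type} illustration of {subject}.",
        f"CONCEPT: {concept}",
        f"STYLE: {style}",
        f"COMPOSITION: {composition}",
        f"COLORS: {colors}",
        f"TEXTURE: {texture}",
        f"AVOID: {avoid}",
        f"FORMAT: {fmt}",
    ]
    # Blank lines between sections keep the prompt readable for review.
    return "\n\n".join(sections)
```

Each argument should still be written as full sentences; the builder only fixes the order and labels of the sections.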

Step 3: Style Variations

Default style: Risograph - Use `references/styles/risograph.md` unless the content calls for something different.

Available styles in `references/styles/`:

- `risograph.md` - DEFAULT. Halftone dots, misregistration, indie printmaking aesthetic. Warm, tactile, analog.
- `minimalist-ink.md` - High-contrast black and white, crosshatching. For craft/mastery posts.
- `watercolor-line.md` - Ink linework with watercolor washes, warm. For organic topics.
- `editorial-conceptual.md` - Conceptual, sophisticated, editorial wit. For abstract/philosophical posts.

Present style options to user, recommending risograph as default.
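
Loading a style reference with risograph as the fallback might look like the sketch below. The `load_style` function is a hypothetical helper. It assumes the style files are plain markdown text under `references/styles/`, as listed above, and that the returned text is pasted into the STYLE section of the prompt.

```python
from pathlib import Path
from typing import Optional

DEFAULT_STYLE = "risograph"


def load_style(name: Optional[str] = None, styles_dir: str = "references/styles") -> str:
    """Return the text of a style file, defaulting to risograph.md."""
    path = Path(styles_dir) / f"{name or DEFAULT_STYLE}.md"
    if not path.is_file():
        # List what is actually available to make the error actionable.
        available = sorted(p.stem for p in Path(styles_dir).glob("*.md"))
        raise FileNotFoundError(f"No style file {path.name!r}; available: {available}")
    return path.read_text(encoding="utf-8")
```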

Step 4: Generate via API

Running the Script

```bash
# Load key from .env and generate
export $(grep GEMINI_API_KEY .env) &&
python scripts/generate_image.py "prompt here" --model pro --aspect 16:9

# Save to specific folder
python scripts/generate_image.py "prompt" --output "./images" --name "my_image"
```

**Options:**
- `--model flash` (faster, cheaper) or `--model pro` (higher quality)
- `--aspect 16:9`, `1:1`, or `9:16` (**PRO MODEL ONLY** - for flash, you MUST include ratio in prompt text)
- `--variations N` - generate N versions
- `--output ./path` - save location
- `--name prefix` - filename prefix

**Output location:** Save images alongside the content they belong to - not a generic images dump.
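
When the script is driven from Python, the option rules above (in particular, `--aspect` only with the pro model) can be encoded once. This is a sketch, not the script's own interface: `build_command` is a hypothetical wrapper that uses only the flags listed above, and the sentence it appends to the prompt for flash is an assumed wording.

```python
import sys
from typing import List, Optional

ASPECTS = ("16:9", "1:1", "9:16")


def build_command(prompt: str, *, model: str = "flash", aspect: Optional[str] = None,
                  variations: Optional[int] = None, output: Optional[str] = None,
                  name: Optional[str] = None) -> List[str]:
    """Build the argument list for scripts/generate_image.py."""
    if model not in ("flash", "pro"):
        raise ValueError("model must be 'flash' or 'pro'")
    if aspect is not None and aspect not in ASPECTS:
        raise ValueError(f"aspect must be one of {ASPECTS}")
    if aspect and model == "flash":
        # --aspect is pro-only, so state the ratio in the prompt text instead.
        prompt = f"{prompt} FORMAT: {aspect} aspect ratio."
        aspect = None
    cmd = [sys.executable, "scripts/generate_image.py", prompt, "--model", model]
    if aspect:
        cmd += ["--aspect", aspect]
    if variations:
        cmd += ["--variations", str(variations)]
    if output:
        cmd += ["--output", output]
    if name:
        cmd += ["--name", name]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a shell string avoids quoting problems with long prompts.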

Step 5: Iterate

After user reviews generated images:

80% good? Request specific edits conversationally rather than regenerating
Composition off? Adjust framing or element placement in prompt
Wrong style? Try a different style reference
Too busy? Simplify to fewer elements
Colors wrong? Be more explicit about palette

Prompting Principles

Write Like a Creative Director

Brief the model like a human artist. Use proper grammar, full sentences, and descriptive adjectives.

| Don't | Do |
| --- | --- |
| "Cool car, neon, city, night, 8k" | "A cinematic wide shot of a futuristic sports car speeding through a rainy Tokyo street at night. The neon signs reflect off the wet pavement and the car's metallic chassis." |

Be specific about:

Subject: Instead of "a woman," say "a sophisticated elderly woman wearing a vintage Chanel-style suit"
Materiality: Describe textures - "matte finish," "brushed steel," "soft velvet," "crumpled paper"
Setting: Define location, time of day, weather
Lighting: Specify mood and light source
Mood: Emotional tone of the image

Provide Context

Context helps the model make logical artistic decisions. Include the "why" or "for whom."

Example: "Create an image of a sandwich for a Brazilian high-end gourmet cookbook." (Model infers: professional plating, shallow depth of field, perfect lighting)

Keep It Simple

One clear focal point
Maximum 2-3 elements total
Generous negative space
If it feels busy, remove something

Avoid the Generic

No lightbulbs for "ideas"
No handshakes for "partnership"
No happy stock photo poses
No glossy AI aesthetic

Resources

references/styles/

Aesthetic style definitions:

- `risograph.md` - DEFAULT - Halftone, misregistration, indie printmaking
- `minimalist-ink.md` - Black and white ink illustration
- `watercolor-line.md` - Ink with watercolor washes
- `editorial-conceptual.md` - Conceptual editorial style

scripts/

- `generate_image.py` - Gemini API image generation

Prompt Modifiers Reference

| Category | Examples |
| --- | --- |
| Lighting | golden hour, dramatic shadows, soft diffused light, neon glow, overcast |
| Style | cinematic, editorial, technical diagram, hand-drawn, photorealistic |
| Texture | matte finish, brushed steel, soft velvet, crumpled paper, weathered wood |
| Composition | wide shot, close-up, bird's eye view, dutch angle, symmetrical |
| Mood | energetic, serene, dramatic, playful, sophisticated |
| Quality | 4K, high-fidelity, pixel-perfect, professional grade |

Advanced Capabilities

Text Rendering & Infographics

Put exact text in quotes. Specify style: "polished editorial," "technical diagram," or "hand-drawn whiteboard."

Example prompts:

Earnings Report Infographic:
"Generate a clean, modern infographic summarizing the key financial highlights from this earnings report. Include charts for 'Revenue Growth' and 'Net Income', and highlight the CEO's key quote in a stylized pull-quote box."

Whiteboard Summary:
"Summarize the concept of 'Transformer Neural Network Architecture' as a hand-drawn whiteboard diagram suitable for a university lecture. Use different colored markers for the Encoder and Decoder blocks, and include legible labels for 'Self-Attention' and 'Feed Forward'."

Character Consistency & Thumbnails

Use reference images and state "Keep the person's facial features exactly the same as Image 1." Describe expression/action changes while maintaining identity.

Example prompt:

Viral Thumbnail:
"Design a viral video thumbnail using the person from Image 1.
Face Consistency: Keep the person's facial features exactly the same as Image 1, but change their expression to look excited and surprised.
Action: Pose the person on the left side, pointing their finger towards the right side of the frame.
Subject: On the right side, place a high-quality image of a delicious avocado toast.
Graphics: Add a bold yellow arrow connecting the person's finger to the toast.
Text: Overlay massive, pop-style text in the middle: 'Done in 3 mins!'. Use a thick white outline and drop shadow.
Background: A blurred, bright kitchen background. High saturation and contrast."

Image Reworking (Edit Existing Images)

The `--input` flag enables "rework mode" - pass an existing image to Gemini and describe the changes you want.

Key use cases:

Small tweaks - Adjust colors, add/remove elements, change lighting
Style transfer - Keep composition but change artistic style
Object manipulation - Remove, add, or modify specific objects
Seasonal/temporal changes - Same scene, different time/season

Running in rework mode:

```bash
# Basic edit - add something
python scripts/generate_image.py "Add snow to the roof and yard" \
  --input ./house.png \
  --model pro

# Color adjustment
python scripts/generate_image.py "Change the accent color from red to teal, keep everything else identical" \
  --input ./thumbnail.png \
  --model pro

# Style transfer - keep composition, change aesthetic
python scripts/generate_image.py "Convert this to risograph style with halftone dots and slight color misregistration" \
  --input ./photo.png \
  --model pro

# Generate variations of an edit
python scripts/generate_image.py "Make the lighting warmer, like golden hour" \
  --input ./portrait.png \
  --variations 3 \
  --model pro
```

**Prompting tips for rework mode:**

1. **Be specific about what to preserve:**
   - "Keep the person's facial features exactly the same"
   - "Maintain the composition and framing"
   - "Don't change the background"

2. **Be explicit about what to change:**
   - "Change ONLY the color of the shirt from blue to red"
   - "Add snow to the roof and nothing else"
   - "Remove the text overlay"

3. **Use comparative language:**
   - "Make the colors more vibrant"
   - "Increase the contrast slightly"
   - "Make the lighting softer and more diffused"

**Output naming:** Files from rework mode are named `{prefix}_{timestamp}_edit_{model}.png` to distinguish from generated images (`_gen_`).
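
The naming pattern can be illustrated with a short sketch. The `output_path` helper is hypothetical, and the timestamp format (`YYYYMMDD_HHMMSS`) is an assumption. Only the `{prefix}_{timestamp}_edit_{model}.png` shape and the `_gen_` / `_edit_` distinction come from the description above.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def output_path(output_dir: str, prefix: str, model: str, *,
                edited: bool, now: Optional[datetime] = None) -> Path:
    """Build an output filename; 'edit' marks rework mode, 'gen' a fresh generation."""
    # Assumed timestamp format; the real script may use a different one.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    kind = "edit" if edited else "gen"
    return Path(output_dir) / f"{prefix}_{stamp}_{kind}_{model}.png"
```

Keeping the mode in the filename makes it easy to tell reworked images from originals when both are saved next to the same piece of content.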

Advanced Editing Examples

Object Removal:

```bash
python scripts/generate_image.py \
  "Remove the tourists from the background and fill with matching cobblestones and storefronts" \
  --input ./street-photo.png \
  --model pro
```

Seasonal Control:

```bash
python scripts/generate_image.py \
  "Turn this into winter. Add snow to the roof and yard. Change lighting to cold, overcast afternoon. Keep architecture identical." \
  --input ./house-summer.png \
  --model pro
```

Character Consistency (thumbnail series):

```bash
python scripts/generate_image.py \
  "Keep the person's face exactly the same. Change expression to surprised. Add a pointing gesture toward the right side of the frame." \
  --input ./person-reference.png \
  --model pro
```
