gemini-image

Gemini Image Analysis

Analyze images using Gemini Pro's vision capabilities.

Prerequisites

bash
pip install google-generativeai
export GEMINI_API_KEY=your_api_key
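A missing key is an easy mistake to make after opening a new terminal. The sketch below fails early with a clear message when the variable is unset or empty. The function name `require_gemini_key` is hypothetical and not part of any tool described here.

```bash
# Hypothetical helper (not part of the gemini CLI): check the environment
# variable from the export step above before running any analysis.
require_gemini_key() {
  if [ -z "${GEMINI_API_KEY:-}" ]; then
    echo "GEMINI_API_KEY is not set; export it before running gemini" >&2
    return 1
  fi
  return 0
}
```

Calling `require_gemini_key || exit 1` at the top of a script keeps later commands from failing with a less obvious authentication error.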

CLI Reference

Basic Image Analysis

bash
# Analyze an image
gemini -m pro -f /path/to/image.png "Describe this image in detail"

# With specific question
gemini -m pro -f screenshot.png "What error message is shown?"

# Multiple images
gemini -m pro -f image1.png -f image2.png "Compare these two images"
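When the number of images varies, repeating `-f` by hand gets tedious. The following is a minimal sketch of a wrapper that turns a list of paths into repeated `-f` arguments. It assumes the `gemini` command accepts `-m` and `-f` exactly as in the examples above; `gemini_multi` is a hypothetical name.

```bash
# Hypothetical wrapper: gemini_multi "prompt" img1 [img2 ...]
# Assumes the gemini CLI takes -m and repeated -f flags as shown above.
# Note: POSIX sh has no local variables, so prompt and img are global.
gemini_multi() {
  if [ "$#" -lt 2 ]; then
    echo "usage: gemini_multi PROMPT IMAGE..." >&2
    return 2
  fi
  prompt=$1
  shift
  # Rewrite the positional parameters from "a b" to "-f a -f b".
  for img in "$@"; do
    set -- "$@" -f "$img"
    shift
  done
  gemini -m pro "$@" "$prompt"
}
```

For example, `gemini_multi "Compare these two images" image1.png image2.png` runs the same command as the multiple-images example.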

Analysis Operations

General Description

bash
gemini -m pro -f image.png "Describe this image comprehensively:
1. Main subject/content
2. Colors and composition
3. Text visible (if any)
4. Context and purpose
5. Notable details"

Extract Text (OCR)

bash
gemini -m pro -f screenshot.png "Extract all text from this image.
Format as plain text, preserving layout where possible.
Include any text in buttons, labels, or UI elements."

Code from Screenshot

bash
gemini -m pro -f code-screenshot.png "Extract the code from this screenshot.
Provide as properly formatted code with correct indentation.
Note any parts that are unclear or partially visible."

UI Analysis

bash
gemini -m pro -f ui-screenshot.png "Analyze this UI:
1. What application/website is this?
2. What page/screen is shown?
3. Main UI elements and their purpose
4. User flow/actions available
5. Any UX issues or suggestions"

Error Analysis

bash
gemini -m pro -f error-screenshot.png "Analyze this error:
1. What error is shown?
2. What is the likely cause?
3. How to fix it?
4. Any related information visible?"

Diagram Understanding

bash
gemini -m pro -f diagram.png "Explain this diagram:
1. What type of diagram is this?
2. Main components and their relationships
3. Data/process flow
4. Key takeaways"

Specific Use Cases

Debug Screenshot

bash
gemini -m pro -f debug-screen.png "I'm debugging an issue. From this screenshot:
1. What is the current state?
2. What errors or warnings are visible?
3. What should I look at?
4. Suggested next steps"

Compare Before/After

bash
gemini -m pro -f before.png -f after.png "Compare these before and after images:
1. What changed?
2. Is this an improvement?
3. Any issues in the 'after' version?
4. Anything missing?"

Design Feedback

bash
gemini -m pro -f design.png "Provide design feedback:
1. Visual hierarchy
2. Color usage
3. Typography
4. Spacing and alignment
5. Accessibility concerns
6. Suggestions for improvement"

Data Extraction

bash
gemini -m pro -f chart.png "Extract data from this chart:
1. Chart type
2. Data series and values
3. Axes labels and ranges
4. Key trends or insights
5. Output as structured data if possible"

Form Analysis

bash
gemini -m pro -f form.png "Analyze this form:
1. Form purpose
2. Fields and their types
3. Required vs optional
4. Validation rules visible
5. UX suggestions"

Workflow Patterns

Screenshot to Issue

bash
# Capture screenshot (macOS)
screencapture -i /tmp/bug.png

# Analyze and format as issue
gemini -m pro -f /tmp/bug.png "Create a bug report from this screenshot:

Summary
[One-line description]

Steps to Reproduce
[Inferred from screenshot]

Expected Behavior
[What should happen]

Actual Behavior
[What the screenshot shows]

Environment
[Any visible system info]"
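If the same bug-report template is used often, it may be easier to keep the prompt in one place. The sketch below stores it in a shell function and prints it on demand. `bug_report_prompt` is a hypothetical name, and the usage line assumes the `gemini` flags shown above.

```bash
# Hypothetical helper: print the bug-report prompt used in this workflow.
# The quoted 'EOF' delimiter keeps the text literal (no variable expansion).
bug_report_prompt() {
  cat <<'EOF'
Create a bug report from this screenshot:

Summary
[One-line description]

Steps to Reproduce
[Inferred from screenshot]

Expected Behavior
[What should happen]

Actual Behavior
[What the screenshot shows]

Environment
[Any visible system info]
EOF
}

# Usage (assumes the gemini CLI and flags shown above):
#   gemini -m pro -f /tmp/bug.png "$(bug_report_prompt)"
```

Keeping the template in a function means a change to the report format is made once rather than in every command.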

UI to Code

bash
gemini -m pro -f ui-design.png "Generate React component code that recreates this UI:
- Use Tailwind CSS for styling
- Make it responsive
- Include proper TypeScript types
- Add appropriate accessibility attributes"

Documentation

bash
gemini -m pro -f app-screen.png "Write user documentation for this screen:
- What this screen is for
- How to use each feature
- Common tasks
- Tips and notes"

Image Types Supported

  • PNG, JPEG, GIF, WebP
  • Screenshots
  • Photos
  • Diagrams and charts
  • UI mockups
  • Code snippets
  • Documents
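A quick filename check can catch an unsupported file before it is sent for analysis. This is a minimal sketch that looks only at the extension, not the file contents. The function name `is_supported_image` is hypothetical, and treating both `.jpg` and `.jpeg` as JPEG is an assumption.

```bash
# Hypothetical helper: succeed if the filename ends in one of the listed
# formats (PNG, JPEG, GIF, WebP). Checks the name only, not the contents.
is_supported_image() {
  # Lowercase the name so .PNG and .png are treated alike.
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    *.png|*.jpg|*.jpeg|*.gif|*.webp) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_supported_image screenshot.PNG` succeeds, while `is_supported_image notes.txt` returns a non-zero status.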

Best Practices

  1. Use clear images - Higher quality = better analysis
  2. Crop to relevant area - Remove unnecessary context
  3. Ask specific questions - Vague prompts get vague answers
  4. Provide context - Tell Gemini what you're looking for
  5. Verify extracted text - OCR isn't perfect
  6. Multiple angles - Use multiple images for complex subjects