muapi-nano-banana

🍌 Nano-Banana Expert Skill (Gemini 3 Style)

A specialized skill for AI Agents to leverage "Reasoning-Driven" image generation. Based on the advanced prompting architecture of Google's Gemini 3 (Nano Banana Pro), this skill moves beyond keyword stuffing to structured, logic-based creative briefs.

Core Competencies

Reasoning-Driven Prompting: Using natural language logic to define physics, lighting, and spatial relationships.
Structured Creative Briefs: Implementing the "Perfect Prompt" formula:
```
Subject + Action + Context + Composition + Lighting
```
Text Rendering Precision: Explicitly defining typography and signifiers for legible text integration.
Contextual Grounding: Using "Search Grounding" logic (simulated) to anchor generations in real-world accuracy.

🏗️ Technical Specification

1. The "Perfect Prompt" Formula

| Component | Description | Example |
| --- | --- | --- |
| Subject | Detailed entity description | "A stoic robot barista with exposed copper wiring" |
| Action | Dynamic interaction | "Pouring a latte art leaf with mechanical precision" |
| Context | Environment & Atmosphere | "Inside a neon-lit cyberpunk cafe at midnight" |
| Composition | Camera & Lens choice | "Close-up, 85mm lens, f/1.8 aperture" |
| Lighting | Mood & Direction | "Volumetric blue rim light, warm cafe glow" |
| Style | Aesthetic anchor | "Cinematic, photorealistic, 4K production value" |
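
As a rough illustration of how the components in the table might be joined into a single sentence-based brief, here is a minimal sketch. The function name `build_brief` and the sentence template are assumptions made for this example; they are not taken from the skill's scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the skill): joins the six components
# into full sentences instead of a comma-separated keyword list.
build_brief() {
  local subject="$1" action="$2" context="$3"
  local composition="$4" lighting="$5" style="$6"
  printf '%s is %s %s. The framing is %s. The lighting is %s. The style is %s.\n' \
    "$subject" "$action" "$context" "$composition" "$lighting" "$style"
}

# Example values adapted from the table above.
build_brief \
  "A stoic robot barista with exposed copper wiring" \
  "pouring a latte art leaf with mechanical precision" \
  "inside a neon-lit cyberpunk cafe at midnight" \
  "close-up, 85mm lens, f/1.8 aperture" \
  "volumetric blue rim light, warm cafe glow" \
  "cinematic, photorealistic, 4K production value"
```

Keeping each component as a separate argument makes it easy to swap one element, such as the lighting, without rewriting the rest of the brief.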

2. Advanced Features

Negative Constraint Logic: Instead of "no blurry," use "Ensure sharp focus on the subject's eyes."
Identity Consistency: (Simulated) "Maintain consistent facial structure across variations."
Text Integration: Use double quotes for specific text:
```
The sign reads "OPEN 24/7"
```
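
The two rewriting habits above can be sketched as small shell helpers. Both function names are hypothetical, and only the "no blurry" mapping comes from the text; the fallback branch is an assumption that simply passes the phrase on for manual rewording.

```bash
#!/usr/bin/env bash
# Hypothetical helper: rewrites a negative constraint as a positive instruction.
# Only the first mapping is from the document; the fallback is an assumption.
positive_constraint() {
  case "$1" in
    "no blurry") printf '%s\n' "Ensure sharp focus on the subject's eyes." ;;
    *) printf 'Rephrase as a positive instruction: %s\n' "$1" ;;
  esac
}

# Hypothetical helper: wraps the literal text to render in double quotes.
sign_clause() {
  printf 'The sign reads "%s"\n' "$1"
}

positive_constraint "no blurry"
sign_clause "OPEN 24/7"
```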

🧠 Prompt Optimization Protocol (Agent Instruction)

Before calling the script, the Agent MUST rewrite the user's prompt into a logic-driven Reasoning Brief:

NO KEYWORD SOUP: Remove "8k, masterpiece, ultra-detailed." Use full, descriptive sentences.
PHYSICAL CONSISTENCY: Describe how elements interact (e.g., "The light from the crystal shards casts caustic patterns across the obsidian floor").
TEXT PRECISION: If the user wants text, define it precisely, for example: featuring a sign that says "STORE NAME" in a weathered serif font.
OPTICAL DIRECTIVES: Specify lens behavior, such as shallow depth of field (f/1.8), a macro lens, or anamorphic flare.
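
The "no keyword soup" step could be approximated with a naive filter like the one below. This is a sketch under assumptions: the function name `strip_keyword_soup` is hypothetical, the banned list contains only terms quoted in this document, and matching is case-sensitive plain substring removal, so it would also hit a term embedded in a longer word.

```bash
#!/usr/bin/env bash
# Hypothetical helper: removes known filler keywords from a user prompt.
# Naive substring removal; the banned list is limited to terms named in this document.
strip_keyword_soup() {
  local prompt="$1" term
  local banned=("8k" "masterpiece" "ultra-detailed" "trending on artstation")
  for term in "${banned[@]}"; do
    prompt="${prompt//"$term"/}"   # quoted pattern is matched literally
  done
  # Collapse leftover comma runs, trim stray separators, squeeze double spaces.
  printf '%s\n' "$prompt" \
    | sed -E 's/( *,)+/,/g; s/^[ ,]+//; s/[ ,]+$//; s/  +/ /g'
}

strip_keyword_soup "a red fox in snow, 8k, masterpiece, ultra-detailed"
```

A filter like this only removes filler; the Agent would still need to expand what remains into full descriptive sentences.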

🚀 Protocol: Using Nano-Banana

Step 1: Define the Creative Logic

Provide the agent with a subject and a specific scenario.

Step 2: Invoke the Script

The `generate-nano-art.sh` script translates the logic into a structured Gemini 3-style prompt.

Generating a reasoning-driven image:

bash scripts/generate-nano-art.sh \
  --subject "a glass chess piece" \
  --action "shattering into liquid shards" \
  --context "on an obsidian table" \
  --style "macro photography"

---

⚠️ Constraints & Guardrails

No Keyword Soup: MANDATORY - Do not use "trending on artstation, masterpiece, 8k". Use natural language descriptions.
Physics Logic: Ensure the prompt describes physically possible lighting and reflection interactions.
Full Sentences: The model parses relationships; use "light reflecting off the water" instead of "water, reflection".
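
One way an Agent might pre-check a prompt against the "Full Sentences" guardrail is a simple word-per-segment heuristic. The sketch below is an assumption, not part of the skill: the function name `looks_like_keyword_soup` and the threshold (fewer than three words per comma-separated segment on average) are chosen purely for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical check: succeeds (exit 0) when a prompt looks like comma-separated
# fragments. Threshold of < 3 words per segment on average is an assumption.
looks_like_keyword_soup() {
  local prompt="$1" words segments
  words=$(printf '%s\n' "$prompt" | wc -w)
  segments=$(printf '%s\n' "$prompt" | awk -F',' '{ n += NF } END { print n + 0 }')
  (( segments > 0 && words < 3 * segments ))
}

for p in "water, reflection" "light reflecting off the water"; do
  if looks_like_keyword_soup "$p"; then
    echo "rewrite needed: $p"
  else
    echo "ok: $p"
  fi
done
```

The heuristic is deliberately crude: it catches fragment lists but cannot judge whether a full sentence actually describes physically plausible lighting.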

⚙️ Implementation Details

This skill applies a "Logic Wrapper" around the `core/media/generate-image.sh` primitive, converting fragmented inputs into a coherent, reasoning-ready narrative prompt.
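
To make the idea of a wrapper concrete, here is a minimal sketch of how fragmented flag values might be turned into one narrative sentence. It is not the actual `generate-nano-art.sh`: the function name `wrap_logic` and the sentence template are assumptions, and because the arguments accepted by `core/media/generate-image.sh` are not documented here, the sketch only prints the prompt rather than calling that primitive.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a "Logic Wrapper"; not the real generate-nano-art.sh.
wrap_logic() {
  local subject="" action="" context="" style=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --subject|--action|--context|--style)
        # Every flag needs a value; bail out instead of looping forever.
        [ "$#" -ge 2 ] || { echo "missing value for $1" >&2; return 1; }
        case "$1" in
          --subject) subject="$2" ;;
          --action)  action="$2" ;;
          --context) context="$2" ;;
          --style)   style="$2" ;;
        esac
        shift 2 ;;
      *) echo "unknown option: $1" >&2; return 1 ;;
    esac
  done
  # Assumed template: one sentence relating subject, action, context, and style.
  printf 'The image shows %s %s %s, rendered as %s.\n' \
    "$subject" "$action" "$context" "$style"
}

wrap_logic \
  --subject "a glass chess piece" \
  --action "shattering into liquid shards" \
  --context "on an obsidian table" \
  --style "macro photography"
```

In a real wrapper, the printed sentence would presumably be handed to the image-generation primitive in place of the raw fragments.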
