muapi-nano-banana
🍌 Nano-Banana Expert Skill (Gemini 3 Style)
A specialized skill for AI Agents to leverage "Reasoning-Driven" image generation.
Based on the advanced prompting architecture of Google's Gemini 3 (Nano Banana Pro), this skill moves beyond keyword stuffing to structured, logic-based creative briefs.
Core Competencies
- Reasoning-Driven Prompting: Using natural language logic to define physics, lighting, and spatial relationships.
- Structured Creative Briefs: Implementing the "Perfect Prompt" formula: `Subject + Action + Context + Composition + Lighting`.
- Text Rendering Precision: Explicitly defining typography and signifiers for legible text integration.
- Contextual Grounding: Using "Search Grounding" logic (simulated) to anchor generations in real-world accuracy.
🏗️ Technical Specification
1. The "Perfect Prompt" Formula
| Component | Description | Example |
|---|---|---|
| Subject | Detailed entity description | "A stoic robot barista with exposed copper wiring" |
| Action | Dynamic interaction | "Pouring a latte art leaf with mechanical precision" |
| Context | Environment & Atmosphere | "Inside a neon-lit cyberpunk cafe at midnight" |
| Composition | Camera & Lens choice | "Close-up, 85mm lens, f/1.8 aperture" |
| Lighting | Mood & Direction | "Volumetric blue rim light, warm cafe glow" |
| Style | Aesthetic anchor | "Cinematic, photorealistic, 4K production value" |
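
As a rough illustration of how the table's components could be joined into full sentences rather than a keyword list, here is a minimal sketch. The function name `build_prompt` and the sentence template are assumptions made for this example; they are not taken from `scripts/generate-nano-art.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: joins the six components from the table into one
# descriptive paragraph. The template wording is an assumption, not the
# skill's actual output format.
build_prompt() {
  local subject="$1" action="$2" context="$3"
  local composition="$4" lighting="$5" style="$6"
  printf '%s is %s %s. The shot is framed as %s. The scene is lit with %s. The overall style is %s.\n' \
    "$subject" "$action" "$context" "$composition" "$lighting" "$style"
}

# Usage with the example values from the table above.
build_prompt \
  "A stoic robot barista with exposed copper wiring" \
  "pouring a latte art leaf with mechanical precision" \
  "inside a neon-lit cyberpunk cafe at midnight" \
  "close-up, 85mm lens, f/1.8 aperture" \
  "volumetric blue rim light, warm cafe glow" \
  "cinematic, photorealistic, 4K production value"
```

Keeping each component as a separate argument makes it easy to swap a single element, such as the lighting, without rewriting the whole brief.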
2. Advanced Features
- Negative Constraint Logic: Instead of "no blurry," use "Ensure sharp focus on the subject's eyes."
- Identity Consistency: (Simulated) "Maintain consistent facial structure across variations."
- Text Integration: Use double quotes for specific text: `The sign reads "OPEN 24/7"`.
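
The two rewriting habits above can be sketched as small shell helpers. Both function names are hypothetical. The negative-to-positive mapping contains only the single example given in this section; any other input is passed through unchanged.

```bash
#!/usr/bin/env bash
# Hypothetical helper: rewrites a negative constraint as a positive directive.
# Only the one mapping named in this document is included.
positive_constraint() {
  case "$1" in
    "no blurry") printf '%s\n' "Ensure sharp focus on the subject's eyes." ;;
    *)           printf '%s\n' "$1" ;;
  esac
}

# Hypothetical helper: wraps the literal text in double quotes inside a sentence.
quoted_text_clause() {
  local carrier="$1" text="$2"
  printf 'The %s reads "%s"\n' "$carrier" "$text"
}

positive_constraint "no blurry"
quoted_text_clause "sign" "OPEN 24/7"
```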
🧠 Prompt Optimization Protocol (Agent Instruction)
Before calling the script, the Agent MUST rewrite the user's prompt into a logic-driven Reasoning Brief:
- NO KEYWORD SOUP: Remove "8k, masterpiece, ultra-detailed." Use full, descriptive sentences.
- PHYSICAL CONSISTENCY: Describe how elements interact (e.g., "The light from the crystal shards casts caustic patterns across the obsidian floor").
- TEXT PRECISION: If the user wants text, define it precisely: `featuring a sign that says "STORE NAME" in a weathered serif font`.
- OPTICAL DIRECTIVES: Specify lens behavior: Shallow Depth of Field (f/1.8), Macro Lens, Anamorphic Flare.
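
The first rule in the list above (removing filler keywords) could be approximated with a small filter like the one below. This is only a sketch: `strip_keyword_soup` is a hypothetical name, it uses naive substring matching, and it covers only the three tokens named in the rule. Expanding what remains into full descriptive sentences is still the agent's job.

```bash
#!/usr/bin/env bash
# Hypothetical pre-processing step: drops the filler tokens "8k",
# "masterpiece", and "ultra-detailed" together with the comma that
# precedes them, then trims leftover commas and spaces at both ends.
strip_keyword_soup() {
  printf '%s\n' "$1" | sed -E \
    -e 's/[[:space:]]*,?[[:space:]]*(8[Kk]|[Mm]asterpiece|[Uu]ltra-detailed)//g' \
    -e 's/^[[:space:],]+//' \
    -e 's/[[:space:],]+$//'
}

strip_keyword_soup "a glass chess piece, 8k, masterpiece, ultra-detailed"
```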
🚀 Protocol: Using Nano-Banana
Step 1: Define the Creative Logic
Provide the agent with a subject and a specific scenario.
Step 2: Invoke the Script
The script translates the logic into a structured Gemini 3-style prompt.
```bash
# Generating a reasoning-driven image
bash scripts/generate-nano-art.sh \
  --subject "a glass chess piece" \
  --action "shattering into liquid shards" \
  --context "on an obsidian table" \
  --style "macro photography"
```

⚠️ Constraints & Guardrails
- No Keyword Soup: MANDATORY - Do not use "trending on artstation, masterpiece, 8k". Use natural language descriptions.
- Physics Logic: Ensure the prompt describes physically possible lighting and reflection interactions.
- Full Sentences: The model parses relationships; use "light reflecting off the water" instead of "water, reflection".
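
One way an agent might enforce the first guardrail before invoking the script is a simple pre-flight check. The sketch below assumes a hypothetical `check_prompt` function and a banned list limited to the three phrases quoted above. It cannot verify physical plausibility or sentence structure.

```bash
#!/usr/bin/env bash
# Hypothetical guardrail: returns non-zero when the prompt still contains
# one of the banned phrases named above (case-insensitive substring match).
check_prompt() {
  local prompt="$1"
  if grep -qiE 'trending on artstation|masterpiece|8k' <<<"$prompt"; then
    echo "rejected: prompt contains keyword soup" >&2
    return 1
  fi
  return 0
}

if check_prompt "Light reflecting off the water at dusk"; then
  echo "prompt accepted"
fi
```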
⚙️ Implementation Details
This skill applies a "Logic Wrapper" around the `core/media/generate-image.sh` primitive, converting fragmented inputs into a coherent, reasoning-ready narrative prompt.
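
To make the wrapper idea concrete, the following sketch parses the four flags from the usage example and folds them into a single sentence. The function name `compose_brief` and the sentence template are assumptions. The hand-off to `core/media/generate-image.sh` is deliberately left out, because its arguments are not documented here.

```bash
#!/usr/bin/env bash
# Hypothetical "Logic Wrapper": turns fragmented flag values into one
# narrative sentence. Flag names follow the usage example in this document;
# everything else is illustrative.
compose_brief() {
  local subject="" action="" context="" style=""
  while [ "$#" -gt 0 ]; do
    if [ "$#" -lt 2 ]; then
      echo "missing value for $1" >&2
      return 2
    fi
    case "$1" in
      --subject) subject="$2" ;;
      --action)  action="$2" ;;
      --context) context="$2" ;;
      --style)   style="$2" ;;
      *) echo "unknown option: $1" >&2; return 2 ;;
    esac
    shift 2
  done
  printf '%s is %s %s, rendered as %s.\n' "$subject" "$action" "$context" "$style"
}

prompt="$(compose_brief \
  --subject "A glass chess piece" \
  --action "shattering into liquid shards" \
  --context "on an obsidian table" \
  --style "macro photography")"
echo "$prompt"
# The resulting string would then be passed to the image primitive.
```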