muapi-nano-banana

🍌 Nano-Banana Expert Skill (Gemini 3 Style)

A specialized skill for AI Agents to leverage "Reasoning-Driven" image generation. Based on the advanced prompting architecture of Google's Gemini 3 (Nano Banana Pro), this skill moves beyond keyword stuffing to structured, logic-based creative briefs.

Core Competencies

  1. Reasoning-Driven Prompting: Using natural language logic to define physics, lighting, and spatial relationships.
  2. Structured Creative Briefs: Implementing the "Perfect Prompt" formula: Subject + Action + Context + Composition + Lighting.
  3. Text Rendering Precision: Explicitly defining typography and signifiers for legible text integration.
  4. Contextual Grounding: Using "Search Grounding" logic (simulated) to anchor generations in real-world accuracy.

🏗️ Technical Specification

1. The "Perfect Prompt" Formula

| Component | Description | Example |
|---|---|---|
| Subject | Detailed entity description | "A stoic robot barista with exposed copper wiring" |
| Action | Dynamic interaction | "Pouring a latte art leaf with mechanical precision" |
| Context | Environment & Atmosphere | "Inside a neon-lit cyberpunk cafe at midnight" |
| Composition | Camera & Lens choice | "Close-up, 85mm lens, f/1.8 aperture" |
| Lighting | Mood & Direction | "Volumetric blue rim light, warm cafe glow" |
| Style | Aesthetic anchor | "Cinematic, photorealistic, 4K production value" |
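
The components in the table can be joined into a single descriptive brief. The following is a minimal sketch, not the skill's actual implementation: the function name `build_brief` and the sentence template are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: joins the six table components into one brief.
# The sentence template is an illustrative assumption, not the skill's real output.
build_brief() {
  local subject="$1" action="$2" context="$3"
  local composition="$4" lighting="$5" style="$6"
  printf '%s is %s, %s. The shot is framed as %s. The scene is lit by %s. The overall style is %s.\n' \
    "$subject" "$action" "$context" "$composition" "$lighting" "$style"
}

# Usage with the example values from the table.
build_brief \
  "A stoic robot barista with exposed copper wiring" \
  "pouring a latte art leaf with mechanical precision" \
  "inside a neon-lit cyberpunk cafe at midnight" \
  "a close-up with an 85mm lens at f/1.8" \
  "volumetric blue rim light and a warm cafe glow" \
  "cinematic and photorealistic"
```

Writing each component as a clause of a full sentence, instead of a comma-separated tag, keeps the relationships between subject, camera, and light explicit.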

2. Advanced Features

  • Negative Constraint Logic: Instead of "no blurry," use "Ensure sharp focus on the subject's eyes."
  • Identity Consistency: (Simulated) "Maintain consistent facial structure across variations."
  • Text Integration: Use double quotes for specific text: The sign reads "OPEN 24/7".
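
The text-integration rule can be made mechanical with a small formatter. This is a sketch under assumptions: `text_clause` is a hypothetical name, and rejecting embedded double quotes is a design choice made here so the quoted span stays unambiguous.

```bash
#!/usr/bin/env bash
# Hypothetical helper: wraps literal sign text in double quotes.
# Rejecting embedded double quotes is an assumption of this sketch.
text_clause() {
  local text="$1" font="${2:-}"
  case "$text" in
    *'"'*)
      printf 'text must not contain double quotes\n' >&2
      return 1
      ;;
  esac
  if [ -n "$font" ]; then
    printf 'The sign reads "%s" in a %s font.\n' "$text" "$font"
  else
    printf 'The sign reads "%s".\n' "$text"
  fi
}

text_clause "OPEN 24/7"
text_clause "STORE NAME" "weathered serif"
```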

🧠 Prompt Optimization Protocol (Agent Instruction)

Before calling the script, the Agent MUST rewrite the user's prompt into a logic-driven Reasoning Brief:
  1. NO KEYWORD SOUP: Remove "8k, masterpiece, ultra-detailed." Use full, descriptive sentences.
  2. PHYSICAL CONSISTENCY: Describe how elements interact (e.g., "The light from the crystal shards casts caustic patterns across the obsidian floor").
  3. TEXT PRECISION: If the user wants text, define it precisely: featuring a sign that says "STORE NAME" in a weathered serif font.
  4. OPTICAL DIRECTIVES: Specify lens behavior: Shallow Depth of Field (f/1.8), Macro Lens, Anamorphic Flare.
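
The first rule can be partly automated before the agent rewrites the prompt. The sketch below only drops comma-separated quality tags; it does not write the descriptive sentences, which remains the agent's job. The function name `strip_keyword_soup` is hypothetical, and the banned list contains only the tags named in this document.

```bash
#!/usr/bin/env bash
# Hypothetical pre-filter: removes comma-separated quality tags from a prompt.
# Only the first line of the prompt is read; multi-line input is out of scope here.
strip_keyword_soup() {
  local prompt="$1" fragment lowered kept=""
  local -a fragments
  IFS=',' read -r -a fragments <<< "$prompt"
  for fragment in "${fragments[@]}"; do
    # Trim leading and trailing whitespace.
    fragment="${fragment#"${fragment%%[![:space:]]*}"}"
    fragment="${fragment%"${fragment##*[![:space:]]}"}"
    # Compare case-insensitively and ignore a trailing period.
    lowered="$(printf '%s' "$fragment" | tr '[:upper:]' '[:lower:]')"
    lowered="${lowered%.}"
    case "$lowered" in
      ""|8k|masterpiece|ultra-detailed|"trending on artstation") continue ;;
    esac
    kept="${kept:+$kept, }$fragment"
  done
  printf '%s\n' "$kept"
}

strip_keyword_soup "a robot barista pouring latte art, 8k, masterpiece, ultra-detailed"
```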

🚀 Protocol: Using Nano-Banana

Step 1: Define the Creative Logic

Provide the agent with a subject and a specific scenario.

Step 2: Invoke the Script

The generate-nano-art.sh script translates the logic into a structured Gemini 3-style prompt.

    # Generating a reasoning-driven image
    bash scripts/generate-nano-art.sh \
      --subject "a glass chess piece" \
      --action "shattering into liquid shards" \
      --context "on an obsidian table" \
      --style "macro photography"
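
The internals of generate-nano-art.sh are not shown in this document. As a rough sketch of how a script might read the four flags used in the invocation, the hypothetical function `parse_nano_args` below stores each value in a shell variable of the same name. Treating `--subject` as required is an assumption of this sketch.

```bash
#!/usr/bin/env bash
# Hypothetical flag parser; the real script's behavior may differ.
parse_nano_args() {
  subject="" action="" context="" style=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --subject|--action|--context|--style)
        if [ "$#" -lt 2 ]; then
          printf 'missing value for %s\n' "$1" >&2
          return 1
        fi
        # Strip the leading "--" and assign to the variable of that name.
        printf -v "${1#--}" '%s' "$2"
        shift 2
        ;;
      *)
        printf 'unknown option: %s\n' "$1" >&2
        return 1
        ;;
    esac
  done
  # Assumption: a subject is the minimum needed to build a brief.
  [ -n "$subject" ] || { printf 'a --subject is required\n' >&2; return 1; }
}

parse_nano_args \
  --subject "a glass chess piece" \
  --action "shattering into liquid shards" \
  --context "on an obsidian table" \
  --style "macro photography"
printf '%s, %s, %s, rendered as %s.\n' "$subject" "$action" "$context" "$style"
```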

⚠️ Constraints & Guardrails

  • No Keyword Soup: MANDATORY - Do not use "trending on artstation, masterpiece, 8k". Use natural language descriptions.
  • Physics Logic: Ensure the prompt describes physically possible lighting and reflection interactions.
  • Full Sentences: The model parses relationships; use "light reflecting off the water" instead of "water, reflection".

⚙️ Implementation Details

This skill applies a "Logic Wrapper" around the core/media/generate-image.sh primitive, converting fragmented inputs into a coherent, reasoning-ready narrative prompt.
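
The wrapper pattern can be sketched as follows. The interface of core/media/generate-image.sh is not documented here, so this example uses a stand-in function, `generate_image_stub`, in its place. The names `nano_wrapper` and `NANO_BACKEND` and the sentence template are likewise assumptions for illustration.

```bash
#!/usr/bin/env bash
# Stand-in for core/media/generate-image.sh, whose real interface is not shown here.
generate_image_stub() {
  printf 'PROMPT: %s\n' "$1"
}

# Hypothetical wrapper: turns fragmented inputs into one narrative prompt,
# then hands it to the backend named by NANO_BACKEND (default: the stub).
nano_wrapper() {
  local subject="$1" action="$2" context="$3" style="$4"
  local brief
  brief="${subject} ${action} ${context}. The image is rendered as ${style}."
  "${NANO_BACKEND:-generate_image_stub}" "$brief"
}

nano_wrapper \
  "a glass chess piece" \
  "shattering into liquid shards" \
  "on an obsidian table" \
  "macro photography"
```

Keeping the backend behind a single variable means the prompt-building logic can be checked in isolation before any image request is made.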