Found 425 Skills
Expert guidance for writing Python code using the official Google GenAI SDK (google-genai) for Gemini API and Vertex AI. Use for text generation, multimodal inputs, reasoning, tools, and media generation.
AI image generation using Google Gemini and OpenAI GPT-Image. Generate, edit, iterate, and create assets.
Use Gemini to find existing solutions before building from scratch. Leverages Google Search grounding to discover code examples, libraries, and best practices to avoid reinventing the wheel.
Nano Banana Pro (nano-banana-pro) image generation skill. Use this skill when the user asks to "generate an image", "generate images", "create an image", "make an image", uses "nano banana", or requests multiple images like "generate 5 images". Generates images using Google's Gemini 2.5 Flash for any purpose - frontend designs, web projects, illustrations, graphics, hero images, icons, backgrounds, or standalone artwork. Invoke this skill for ANY image generation request.
Generate and edit high-quality images using Gemini 2.5 Flash Image and Gemini 3 Pro Image (Nano Banana). Supports Text-to-Image, Style Transfer, Virtual Try-On, and Character Consistency.
Process and generate multimedia content using the Google Gemini API for better vision capabilities. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understanding images (better image analysis than Claude models, captioning, reasoning, object detection, design extraction, OCR, visual Q&A, segmentation, handling multiple images), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), generating images (text-to-image with Imagen 4, editing, composition, refinement), and generating videos (text-to-video with Veo 3, 8-second clips with native audio). Use when working with audio/video files, analyzing images or screenshots (instead of Claude's default vision capabilities; fall back to Claude's vision capabilities only if needed), processing PDF documents, extracting structured data from media, creating images/videos from text prompts, or implementing multimodal AI features. Supports Gemini 3/2.5, Imagen 4, and Veo 3 models with context windows up to 2M tokens.
REQUIRED for all image generation requests. Generate and edit images using Nano Banana (Gemini CLI). Handles blog featured images, YouTube thumbnails, icons, diagrams, patterns, illustrations, photos, visual assets, graphics, artwork, pictures. Use this skill whenever the user asks to create, generate, make, draw, design, or edit any image or visual content.
Generate or edit images via Gemini 3 Pro Image (Nano Banana Pro).
Combine multiple images using Gemini 2.5 Flash (Nano Banana) via OpenRouter. Use when merging 2-8 images with AI-guided composition.
Review web animations by recording the browser and sending the video to Gemini for frame-level analysis.
AI image generation and editing built on Nano Banana (Gemini Image), supporting text-to-image, image-to-image, and image editing. Suited to creative design, marketing materials, social media content, and presentation illustrations. Supports multiple styles, high-resolution output (up to 4K), text rendering, and character consistency preservation.
Orchestrate multi-agent workflows from a Kiro spec using codex (code) + Gemini (UI), including dispatch, review, and state sync via AGENT_STATE.json + PROJECT_PULSE.md. Triggers when the user says "Start orchestration from spec at <path>" or "Run orchestration for <feature>", or mentions multi-agent execution.