Found 314 Skills
Optimize token usage when delegating to Gemini CLI. Covers token caching, batch queries, model selection (Flash vs Pro), and cost tracking. Use when planning bulk Gemini operations.
Image generation and editing using Google Gemini's Nano Banana Pro (gemini-3-pro-image-preview) model. Use when user requests: "Generate an image", "Create an image", "Make me a picture", "Draw", "Edit that image", "Change the color", "Remove background", "Add transparency", "Modify this image", "Make it transparent", "Change the style", "Add text to image", or any image creation/manipulation task. Supports text-to-image generation, image editing, multi-turn conversations, and transparency extraction via difference matting technique.
Convert a YouTube video into infographic slides. Extracts transcript, segments into sections, summarizes, and generates stylized infographic images using Gemini AI. 5 styles: davinci, magazine, comic, geek, chalkboard. Use when user wants slide summaries from YouTube.
Use this skill when the user asks about current events, real-time information, recent news, live scores, financial data, price updates, recent changes, or any question that may require up-to-date web information. This skill first determines if web search is necessary using Gemini, then attempts Google Search Grounding via Gemini, and automatically falls back to Tavily if any failure occurs.
Generate images using Google's Gemini API — hero backgrounds, OG images, placeholder photos, textures, and style-matched variants. Uses free-tier models for drafts, paid for finals. No dependencies beyond Python 3. Trigger with 'generate image', 'gemini image', 'make a hero background', 'create placeholder photo', 'generate OG image', 'AI image', or 'need an image for'.
Use this skill to query your Google NotebookLM notebooks directly from Claude Code for source-grounded, citation-backed answers from Gemini. Browser automation, library management, persistent auth. Drastically reduced hallucinations through document-only responses.
AI image generation and editing using Google Gemini models (Nano Banana). Use when the user asks to generate an image, create an image, edit an image, or references "nano banana", "nanobanana", or "gemini image". Supports text-to-image, image editing, multi-image references, and 1K/2K/4K resolution.
Read, watch, and listen to video/audio files. Use Gemini for native video understanding, or extract key frames + Whisper transcription as fallback. Use when a user sends a video or audio file and asks about its content, what's in it, what someone said, etc.
Model Context Protocol (MCP) server development and tool management. Languages: Python, TypeScript. Capabilities: build MCP servers, integrate external APIs, discover/execute MCP tools, manage multi-server configs, design agent-centric tools. Actions: create, build, integrate, discover, execute, configure MCP servers/tools. Keywords: MCP, Model Context Protocol, MCP server, MCP tool, stdio transport, SSE transport, tool discovery, resource provider, prompt template, external API integration, Gemini CLI MCP, Claude MCP, agent tools, tool execution, server config. Use when: building MCP servers, integrating external APIs as MCP tools, discovering available MCP tools, executing MCP capabilities, configuring multi-server setups, designing tools for AI agents.
Multimodal AI processing via Google Gemini API (2M tokens context). Capabilities: audio (transcription, 9.5hr max, summarization, music analysis), images (captioning, OCR, object detection, segmentation, visual Q&A), video (scene detection, 6hr max, YouTube URLs, temporal analysis), documents (PDF extraction, tables, forms, charts), image generation (text-to-image, editing). Actions: transcribe, analyze, extract, caption, detect, segment, generate from media. Keywords: Gemini API, audio transcription, image captioning, OCR, object detection, video analysis, PDF extraction, text-to-image, multimodal, speech recognition, visual Q&A, scene detection, YouTube transcription, table extraction, form processing, image generation, Imagen. Use when: transcribing audio/video, analyzing images/screenshots, extracting data from PDFs, processing YouTube videos, generating images from text, implementing multimodal AI features.
AI image generation skill powered by Google Gemini, enabling seamless visual content creation for UI placeholders, documentation, and design assets.
Generate or modify images using the Gemini image generation API. Supports a custom API Key and Base URL.