Found 11 Skills
Use this skill when building real-time, bidirectional streaming applications with the Gemini Live API. Covers WebSocket-based audio/video/text streaming, voice activity detection (VAD), native audio features, function calling, session management, ephemeral tokens for client-side auth, and all Live API configuration options. SDKs covered: google-genai (Python), @google/genai (JavaScript/TypeScript).
Look up Gemini API documentation, SDK patterns, and current best practices when building with Google Gemini. Maps topics to local cached docs and live sources, provides correct @google/genai patterns, and highlights deprecated vs current API usage. Trigger with 'gemini docs', 'gemini guide', 'how to use gemini', 'gemini SDK', '@google/genai', or when building code that imports from @google/genai or google-genai.
Complete Google Gemini API reference for 2026. Use whenever writing code that calls Gemini models. Covers the google-genai SDK, Gemini 3/3.1 models, thought signatures, thinking config, Interactions API, File Search (managed RAG), Computer Use, URL Context, Nano Banana image gen, Live API, ephemeral tokens, TTS, Veo video gen, Lyria music gen, and all tools. ALWAYS prefer `from google import genai` over any legacy import. Use this skill for ANY Gemini API question, even simple ones.
Create and edit videos using Google's Veo 2 and Veo 3 models. Supports Text-to-Video, Image-to-Video, Reference-to-Video, Inpainting, and Video Extension. Available parameters: prompt, image, mask, mode, duration, aspect-ratio. Always confirm parameters with the user or explicitly state defaults before running.
Master of LLM Economic Orchestration, specialized in Google GenAI (Gemini 3), Context Caching, and High-Fidelity Token Engineering.
Generate and transcribe speech using Google's Gemini-TTS and Chirp 3 models. Supports Text-to-Speech (Single/Multi-speaker), Instant Custom Voice, and Speech-to-Text (Transcription/Diarization).
Run Gemini CLI for AI-powered tasks, code understanding, file operations, and automation. Free tier with Google OAuth (included in Gemini Advanced). Use for fast generation, bulk content, debugging, and research. Preferred for load balancing sub-agent work (35% weight).
Generate and edit high-quality images using Gemini 2.5 Flash Image and Gemini 3 Pro Image (Nano Banana). Supports Text-to-Image, Style Transfer, Virtual Try-On, and Character Consistency.
Generate (TTS), Transcribe (STT), and Clone voices using Google's GenAI and Cloud Speech SDKs. Supports Gemini-TTS, Chirp 3, and Instant Custom Voice.
Create and edit videos using Google's Veo 2 and Veo 3 models. Supports Text-to-Video, Image-to-Video, Inpainting, and Advanced Controls.
Upload and manage files using Google Gemini File API via scripts/. Use for uploading images, audio, video, PDFs, and other files for use with Gemini models. Supports file upload, status checking, and file management. Triggers on "upload file", "file API", "upload image", "upload PDF", "upload video", "file management".