Loading...
Loading...
Found 293 Skills
Google Gemini API with @google/genai SDK. Use for multimodal AI, thinking mode, function calling, or encountering SDK deprecation warnings, context errors, multimodal format errors.
The meta skill. Turn any raw feature into a properly-skilled, tested, resolvable unit of agent capability. Cross-modal eval is the recommended Phase 3 quality gate: 3 frontier models from different providers critique the output, you iterate to quality, THEN write tests that lock in the proven-good behavior.
Create and optimize popups, modals, overlays, slide-ins, and banners to increase conversions without harming user experience or brand trust.
This skill should be used when working with pre-trained transformer models for natural language processing, computer vision, audio, or multimodal tasks. Use for text generation, classification, question answering, translation, summarization, image classification, object detection, speech recognition, and fine-tuning models on custom datasets.
Vision-language pre-training framework bridging frozen image encoders and LLMs. Use when you need image captioning, visual question answering, image-text retrieval, or multimodal chat with state-of-the-art zero-shot performance.
Use Google Gemini API for text generation, multimodal analysis, image generation (Nano Banana), function calling, and search grounding. Invoke when user wants to use Gemini, ask Gemini, generate images with Gemini, or analyze content with Gemini.
Semantic and multi-modal search across documents using LanceDB vector embeddings. Use when searching knowledge bases, finding information semantically, ingesting documents for RAG, or performing vector similarity search. Triggers on "search documents", "semantic search", "find in knowledge base", "vector search", "index documents", "LanceDB", or RAG/embedding operations.
Upgrade any skill to v5 Hybrid format using decision theory + modal logic
#1 on DeepResearch Bench (Feb 2026). Any-to-Any AI for agents. Combines deep reasoning with all modalities through sophisticated multi-agent orchestration. Research, videos, images, audio, dashboards, presentations, spreadsheets, and more.
Practical guidance for training MoE VLMs in Megatron Bridge. Compares FSDP and 3D-parallel approaches, using rounded lessons from Qwen3-VL, Qwen3-Next, and other multimodal experiments.
Recipes and patterns for common Inertia Rails use cases. Includes modal dialogs, shadcn/ui integration, search with filters, wizard flows, and other advanced patterns.
Invokes Google Gemini models for structured outputs, multi-modal tasks, and Google-specific features. Use when users request Gemini, structured JSON output, Google API integration, or cost-effective parallel processing.