Total 30,668 skills; AI & Machine Learning has 4,953 skills
Showing 12 of 4,953 skills
Build AI agents with structured access to Sanity content via Context MCP. Covers Studio setup, agent implementation, and advanced patterns like client-side tools and custom rendering.
Guide AI-assisted learning that empowers learners while maintaining appropriate boundaries. Use when teaching, explaining concepts, or helping someone who is struggling to understand.
GPU-optimized OCR using Surya. Use when: (1) Extracting text from images/screenshots, (2) Processing PDFs with embedded images, (3) Multi-language document OCR, (4) Layout analysis and table detection. Supports 90+ languages with 2x accuracy over Tesseract.
Generate or edit images with Nano Banana Pro (Gemini 3 Pro Image). Use for image creation and modification requests, including edits. Supports text-to-image and image-to-image; 1K/2K/4K; use --input-image.
Analyze short-form videos with Gemini AI to extract hooks, content structure, and replicable patterns. Supports Instagram Reels, TikTok, and YouTube Shorts. Use when asked to analyze video content for hooks and structure, extract replicable formulas from viral videos, understand why a video performed well, or get AI analysis of video content patterns. Triggers: "analyze videos", "extract hooks", "video analysis", "analyze reels", "what makes this video work", "hook analysis", "content structure analysis".
Text-to-speech and speech-to-text using fal.ai audio models. Use when the user requests "Convert text to speech", "Transcribe audio", "Generate voice", "Speech to text", "TTS", "STT", or similar audio tasks.
MiniMax API via curl. Use this skill for Chinese LLM chat, text-to-speech, and AI video generation.
Meta's 86M-parameter prompt injection and jailbreak detector. Filters malicious prompts and third-party data for LLM apps. 99%+ TPR, <1% FPR. Fast (<2ms on GPU). Multilingual (8 languages). Deploy with HuggingFace or batch processing for RAG security.
Aggregate news from popular cryptocurrency RSS feeds, analyze sentiment of articles, and calculate an overall market sentiment score with detailed explanation. Use when assessing crypto market sentiment for trading decisions, research, or monitoring trends from RSS sources.
Generate human-like speech audio with Model Studio DashScope Qwen TTS (qwen3-tts-flash). Use when converting text to speech, producing voice lines for short drama/news videos, or documenting TTS request/response fields for DashScope.
Deploy, configure, and integrate Sandbox Agent - a universal API for orchestrating AI coding agents (Claude Code, Codex, OpenCode, Amp) in sandboxed environments. Use when setting up sandbox-agent server locally or in cloud sandboxes (E2B, Daytona, Docker), creating and managing agent sessions via SDK or API, streaming agent events and handling human-in-the-loop interactions, building chat UIs for coding agents, or understanding the universal schema for agent responses.
Connect Claude to any app. Send emails, create issues, post messages, update databases - take real actions across Gmail, Slack, GitHub, Notion, and 1000+ services.