Total 31,080 skills; AI & Machine Learning has 5,026 skills
Showing 12 of 5,026 skills
AI agents: autonomous agents, multi-agent systems, LangChain, LlamaIndex, MCP.
Quick-start guide and API overview for the OpenServ Ideaboard - a platform where AI agents can submit ideas, pick up work, collaborate with multiple agents, and deliver x402 payable services. Use when interacting with the Ideaboard or building agents that find and ship ideas. Read reference.md for the full API reference. Read openserv-agent-sdk and openserv-client for building and running agents.
Master the AI tools that handle administrative work and boost personal productivity. From meeting notes to email management, get more done with less effort. Use when any of the following are mentioned: meeting notes, email management, calendar optimization, productivity, time management, meetings, email, calendar, personal.
Create and work with Meta SAM 3 (facebookresearch/sam3) for open-vocabulary image and video segmentation with text, point, box, and mask prompts. Use when setting up SAM3 environments, requesting Hugging Face checkpoint access, generating inference scripts, integrating SAM3 into Python apps, fine-tuning with sam3/train configs, running SA-Co or custom evaluations, or debugging CUDA/checkpoint/prompt pipeline issues.
Use when creating or refining SKILL.md-based skills, or diagnosing weak triggering (under/over-triggering, vague descriptions, bloated context, or missing workflow guidance).
xAI Grok API authentication and setup. Use when configuring xAI API access, setting up API keys, or troubleshooting authentication issues.
Process and generate multimedia content using the Google Gemini API for better vision capabilities. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis, up to 9.5 hours), understanding images (better image analysis than Claude models, captioning, reasoning, object detection, design extraction, OCR, visual Q&A, segmentation, handling multiple images), processing videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extracting from documents (PDF tables, forms, charts, diagrams, multi-page), generating images (text-to-image with Imagen 4, editing, composition, refinement), and generating videos (text-to-video with Veo 3, 8-second clips with native audio). Use when working with audio/video files, analyzing images or screenshots (in place of Claude's default vision capabilities; fall back to Claude's vision capabilities only if needed), processing PDF documents, extracting structured data from media, creating images/videos from text prompts, or implementing multimodal AI features. Supports Gemini 3/2.5, Imagen 4, and Veo 3 models with context windows up to 2M tokens.
The Meta-Skill. Use this to create NEW skills (tools) for the agent.
Get real-time stock prices and financial info for US stocks (like AAPL, TSLA, NVDA).
Use when the user has complex multi-agent workflows, needs to coordinate sequential or parallel agent execution, wants workflow visualization and control, or mentions automating repetitive multi-agent processes. Guides discovery and usage of the orchestration system.
Expert prompt optimization for LLMs and AI systems. Use PROACTIVELY when building AI features, improving agent performance, or crafting system prompts. Masters prompt patterns and techniques.
Refine prompts for Claude models (Opus, Sonnet, Haiku) using Anthropic's best practices. Use when preparing complex tasks for Claude.