Found 1,204 Skills
Compiles and extracts session knowledge into a living, interconnected LLM-Wiki. Instead of writing isolated logs, it identifies key entities, updates cross-referenced topic files in docs/knowledgelib/, and maintains an index and chronological log. Use this to ensure persistent, compounding project knowledge.
Quickly install and deploy vLLM, start serving a simple LLM, and test the OpenAI-compatible API.
Extract frames from video files using ffmpeg for AI/LLM analysis. Use when (1) the user asks to analyze, describe, or summarize a video file, (2) the user wants to extract frames or screenshots from a video, (3) the user provides a video file (.mp4, .mov, .avi, .mkv, .webm, etc.) and asks questions about its visual content, (4) the user wants to identify scenes, objects, or events in a video, (5) the user wants timestamps overlaid on extracted frames for temporal reference. Converts video into JPEG frames that can be attached to LLM prompts as images. Requires ffmpeg on PATH. Supports scene-change detection, model-aware optimization (Claude/OpenAI/Gemini), quality presets (efficient/balanced/detailed/ocr), grayscale and high-contrast OCR mode, and automatic FPS calculation via --max-frames.
End-to-end SGLang SOTA performance workflow. Use when a user names an LLM model and wants SGLang to match or beat the best observed vLLM and TensorRT-LLM serving performance by searching each framework's best deployment command, benchmarking them fairly, profiling SGLang if it is slower, identifying kernel/overlap/fusion bottlenecks, patching SGLang code, and revalidating with real model runs.
Expert skill for using TileKernels, a library of optimized GPU kernels for LLM operations (MoE routing, quantization, transpose, engram gating, Manifold HyperConnection) built with TileLang.
Expert guidance for building conversational AI applications with Chainlit framework in Python. Use when (1) creating chat interfaces for LLM applications, (2) building apps with OpenAI, LangChain, LlamaIndex, or Mistral AI, (3) implementing streaming responses, (4) adding UI elements like images, files, charts, (5) handling user file uploads, (6) implementing authentication (OAuth, password), (7) creating multi-step workflows with visible steps, (8) building RAG applications with document upload, or (9) deploying chat apps to web, Slack, Discord, or Teams.
Generative Engine Optimization review: evaluate your content's visibility to AI-powered search engines — citation-worthiness, content structure, authority signals, llms.txt, entity clarity, and AI retrieval readiness.
Investigate LLM analytics clusters — understand usage patterns in AI/LLM traffic, compare cluster behavior, compute cost/latency metrics, and drill into individual traces within clusters.
Use when you need comprehensive security scanning across applications, infrastructure, and dependencies with LLM-based analysis.
Read every docs/benchmarks/runs/*.json and surface drift in win rate, latency, escalation rate, and LLM-baseline cost over time.
FastAPI OpenTelemetry style: native FastAPIInstrumentor, centralized observability init, Python decorators, OTLP logs, and LLM cost metrics.
A minimal teaching framework for understanding AI agent architecture, with a core loop, a fake LLM interface, and a skill discovery system.