Found 45 Skills
Deploy applications and websites to Vercel instantly. Use when asked to "Deploy my app", "Deploy this to production", "Create a preview deployment", or "Push this live". No authentication required; returns a preview URL and a claimable deployment link.
Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching. Use when deploying production LLM APIs, optimizing inference latency/throughput, or serving models with limited GPU memory. Supports OpenAI-compatible endpoints, quantization (GPTQ/AWQ/FP8), and tensor parallelism.
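For reference, a minimal sketch of the offline batched-inference path this skill builds on; the model id and parameter values below are illustrative assumptions, not values the skill prescribes:

```python
# Minimal sketch of offline batched inference with vLLM.
# Model id, GPU count, and sampling values are illustrative.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # any Hugging Face model id
    tensor_parallel_size=2,                    # shard weights across 2 GPUs
    quantization="awq",                        # requires an AWQ-quantized checkpoint
    gpu_memory_utilization=0.90,               # VRAM fraction for the PagedAttention KV cache
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Summarize PagedAttention in one sentence."], params)
for out in outputs:
    print(out.outputs[0].text)
```

The same engine can also be exposed as an OpenAI-compatible HTTP server (e.g. via vLLM's API-server entrypoint), which is how the latency/throughput knobs above typically reach production.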
Container orchestration with Docker Compose for multi-container applications, networking, volumes, and production deployment.
Accelerate LLM inference using speculative decoding, Medusa multi-head decoding, and lookahead decoding. Use when optimizing inference speed (1.5-3.6× speedup), reducing latency for real-time applications, or deploying models with limited compute. Covers draft models, tree-based attention, Jacobi iteration, parallel token generation, and production deployment strategies.
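The draft-and-verify loop at the core of these techniques fits in a few lines. Below is a toy sketch of the greedy variant; `draft_next` and `target_argmax` are hypothetical stand-ins for real model calls, and production systems add probabilistic acceptance, KV-cache reuse, and tree attention on top:

```python
# Toy sketch of one greedy speculative-decoding step.
# draft_next(ctx)    -> the cheap draft model's greedy next token.
# target_argmax(seq) -> the target model's greedy next token after each of the
#                       prefixes seq[:len(tokens)], ..., seq[:len(tokens)+k]
#                       (k+1 values, all from ONE forward pass).
from typing import Callable, List

def speculative_step(
    tokens: List[int],
    draft_next: Callable[[List[int]], int],
    target_argmax: Callable[[List[int]], List[int]],
    k: int = 4,  # number of drafted tokens per step
) -> List[int]:
    # 1. Draft k tokens autoregressively with the cheap model.
    drafted: List[int] = []
    ctx = list(tokens)
    for _ in range(k):
        t = draft_next(ctx)
        drafted.append(t)
        ctx.append(t)

    # 2. Score the whole drafted run with a single target-model pass.
    preferred = target_argmax(tokens + drafted)  # k+1 greedy choices

    # 3. Accept drafted tokens while they match the target's greedy choice;
    #    on the first mismatch, substitute the target's correction and stop.
    accepted: List[int] = []
    for i, t in enumerate(drafted):
        if preferred[i] == t:
            accepted.append(t)
        else:
            accepted.append(preferred[i])
            return tokens + accepted

    # All k drafted tokens accepted: the verify pass yields one bonus token.
    accepted.append(preferred[k])
    return tokens + accepted
```

Because the target model verifies k drafted positions in one pass, each step emits between 1 and k+1 tokens for roughly the cost of a single target forward pass, which is where the quoted speedups come from.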
Comprehensive Docker best practices for images, containers, and production deployments.
Production-ready Express.js development covering middleware architecture, error handling, security hardening, testing strategies, and deployment patterns.
Deploy applications and websites to Vercel. Use when the user requests deployment actions like "deploy my app", "deploy and give me the link", "push this live", or "create a preview deployment".
Master Temporal workflow orchestration with the Python SDK. Implements durable workflows, saga patterns, and distributed transactions. Covers async/await, testing strategies, and production deployment. Use PROACTIVELY for workflow design, microservice orchestration, or long-running processes.
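A minimal sketch of the durable-workflow shape this skill implements, assuming a local Temporal dev server on the default port; the workflow, activity, and task-queue names are illustrative:

```python
# Minimal durable workflow with the Temporal Python SDK (temporalio).
# Assumes a local dev server at localhost:7233; names are illustrative.
import asyncio
from datetime import timedelta

from temporalio import activity, workflow
from temporalio.client import Client
from temporalio.worker import Worker

@activity.defn
async def charge_card(order_id: str) -> str:
    # Real side effects go here; Temporal retries failed activities automatically.
    return f"charged:{order_id}"

@workflow.defn
class OrderWorkflow:
    @workflow.run
    async def run(self, order_id: str) -> str:
        # Activity calls are durable: workflow state survives worker restarts.
        return await workflow.execute_activity(
            charge_card,
            order_id,
            start_to_close_timeout=timedelta(seconds=30),
        )

async def main() -> None:
    client = await Client.connect("localhost:7233")
    # A worker must poll the task queue with the workflow/activity registered.
    async with Worker(
        client, task_queue="orders",
        workflows=[OrderWorkflow], activities=[charge_card],
    ):
        result = await client.execute_workflow(
            OrderWorkflow.run, "order-123",
            id="order-123", task_queue="orders",
        )
        print(result)

asyncio.run(main())
```

Saga patterns extend this shape by pairing each activity with a compensating activity that the workflow invokes on failure.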
Operational patterns, templates, and decision rules for modern time series forecasting: tree-based methods (LightGBM), deep learning (Transformers, RNNs), future-guided learning, temporal validation, feature engineering, generative time series models (Chronos), and production deployment. Emphasizes explainability, long-term dependency handling, and adaptive forecasting.
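A minimal sketch of the tree-based path with temporal validation; the synthetic series, column names, and lag choices are illustrative assumptions:

```python
# Lag-feature engineering + LightGBM with leakage-free temporal validation.
# The synthetic series and lag/window choices are illustrative.
import numpy as np
import pandas as pd
from lightgbm import LGBMRegressor
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({"y": np.sin(np.arange(n) / 10) + rng.normal(0, 0.1, n)})

# Lag and rolling features: the model only ever sees past values.
for lag in (1, 7, 14):
    df[f"lag_{lag}"] = df["y"].shift(lag)
df["roll_mean_7"] = df["y"].shift(1).rolling(7).mean()
df = df.dropna()

X, y = df.drop(columns="y"), df["y"]

# TimeSeriesSplit keeps every validation fold strictly after its training
# fold, avoiding the look-ahead leakage a random split would introduce.
for train_idx, val_idx in TimeSeriesSplit(n_splits=3).split(X):
    model = LGBMRegressor(n_estimators=200, learning_rate=0.05)
    model.fit(X.iloc[train_idx], y.iloc[train_idx])
    preds = model.predict(X.iloc[val_idx])
    mae = np.mean(np.abs(preds - y.iloc[val_idx].to_numpy()))
    print(f"fold MAE: {mae:.3f}")
```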
OpenInference semantic conventions and instrumentation for Phoenix AI observability. Use when implementing LLM tracing, creating custom spans, or deploying to production.
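A minimal hand-instrumented span under the OpenInference conventions, sketched with the plain OpenTelemetry SDK; the span contents are illustrative, and a production setup would export to a Phoenix OTLP endpoint rather than the console:

```python
# Manual LLM span with OpenInference-style semantic-convention attributes.
# Console exporter for demonstration; Phoenix would receive these via OTLP.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("demo")

with tracer.start_as_current_span("llm-call") as span:
    span.set_attribute("openinference.span.kind", "LLM")
    span.set_attribute("llm.model_name", "gpt-4o-mini")          # illustrative
    span.set_attribute("input.value", "What is Phoenix?")
    # ... invoke the model here ...
    span.set_attribute("output.value", "Phoenix is an AI observability platform.")
```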
Universal platform artifact cleaner: Replit, StackBlitz, CodeSandbox, Glitch.
This skill should be used when users need to sync/promote configuration from staging (aws-staging) to production (aws-prod) environment. It handles image tag synchronization, identifies configuration differences, and manages the promotion workflow. Triggers on requests mentioning "sync to prod", "promote to production", "update prod images", or comparing staging vs production.
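A hypothetical sketch of the "identify configuration differences" step: the file layout, paths, and keys below are invented for illustration and are not the skill's actual schema:

```python
# Hypothetical: diff image tags between staging and prod config files.
# Paths and the {services: {name: {image: repo:tag}}} layout are invented.
import yaml  # pip install pyyaml

def image_tags(path: str) -> dict:
    """Map service name -> image tag from a simple services config file."""
    with open(path) as f:
        cfg = yaml.safe_load(f)
    return {
        name: svc["image"].rsplit(":", 1)[-1]
        for name, svc in cfg.get("services", {}).items()
    }

staging = image_tags("aws-staging/services.yaml")  # hypothetical path
prod = image_tags("aws-prod/services.yaml")        # hypothetical path

for name, tag in staging.items():
    if prod.get(name) != tag:
        print(f"{name}: prod={prod.get(name)} -> staging={tag} (promote)")
```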