Search Results: model-serving

Found 24 Skills

AI & Machine Learning | databricks/databricks-age...

databricks-model-serving

Manage Databricks Model Serving endpoints via CLI. Use when asked to create, configure, query, or manage model serving endpoints for LLM inference, custom models, or external models.


AI & Machine Learning | eyadsibai/ltk

llm-inference

Use when "LLM inference", "serving LLM", "vLLM", "llama.cpp", "GGUF", "text generation", "model serving", "inference optimization", "KV cache", "continuous batching", "speculative decoding", "local LLM", "CPU inference"


AI & Machine Learning | jeremylongshore/claude-co...

flask-ml-api-creator

Flask ML API Creator - Auto-activating skill for ML Deployment. Triggers on: flask ml api creator. Part of the ML Deployment skill category.


AI & Machine Learning | jeremylongshore/claude-co...

vertex-ai-deployer

Vertex AI Deployer - Auto-activating skill for ML Deployment. Triggers on: vertex ai deployer. Part of the ML Deployment skill category.


AI & Machine Learning | sickn33/antigravity-aweso...

ml-engineer

Build production ML systems with PyTorch 2.x, TensorFlow, and modern ML frameworks. Implements model serving, feature engineering, A/B testing, and monitoring. Use PROACTIVELY for ML model deployment, inference optimization, or production ML infrastructure.


AI & Machine Learning | jeremylongshore/claude-co...

onnx-converter

ONNX Converter - Auto-activating skill for ML Deployment. Triggers on: onnx converter. Part of the ML Deployment skill category.


AI & Machine Learning | jeremylongshore/claude-co...

model-export-helper

Model Export Helper - Auto-activating skill for ML Deployment. Triggers on: model export helper. Part of the ML Deployment skill category.


AI & Machine Learning | buiphucminhtam/forgewrigh...

ai-engineer

Builds production AI/ML systems — model training, fine-tuning, MLOps pipelines, model serving, evaluation frameworks, RAG optimization, and agent orchestration at scale. Use when the user asks to build, train, or deploy ML models, set up MLOps pipelines, optimize RAG systems, create inference endpoints, or design production AI agents.


AI & Machine Learning | jeremylongshore/claude-co...

triton-inference-config

Triton Inference Config - Auto-activating skill for ML Deployment. Triggers on: triton inference config. Part of the ML Deployment skill category.


AI & Machine Learning | dengineproblem/agents-mon...

ml-api-endpoint

ML API expert. Use for model serving, inference endpoints, FastAPI, and ML deployment.


AI & Machine Learning | erichowens/some_claude_sk...

ml-system-design-interview

Coaches end-to-end ML system design interviews covering inference pipelines, recommendation systems, RAG, feature stores, and monitoring. Use for L6+ design rounds, ML architecture whiteboarding, system design practice, serving tradeoff analysis. Activate on "ML system design", "ML interview", "recommendation system design", "RAG architecture", "feature store design", "model serving". NOT for coding interviews, behavioral questions, ML theory quizzes, or paper implementations.


AI & Machine Learning | nvidia/skills

tao-run-inference-service

Start, query, and stop a network-specific TAO inference microservice ({network_arch}-inference-microservice) by delegating container execution to the appropriate platform skill. Handles container image resolution, job-payload JSON construction, and the service registry. Use when the user wants to run inference on a TAO model checkpoint using a microservice container, deploy a TAO inference endpoint, or stop a running inference container.
