Loading...
Loading...
Found 10 Skills
Generate chat completions using Sarvam AI's Sarvam-M model. Use when the user needs AI chat, text generation, question answering, or reasoning in Indian languages. Sarvam-M is a 24B parameter model with hybrid thinking, superior Indic language understanding, and OpenAI-compatible API. Free to use.
Autonomous research review loop using any OpenAI-compatible LLM API. Configure via llm-chat MCP server or environment variables. Trigger with "auto review loop llm" or "llm review".
[QwenCloud] Generate text, have conversations, write code, reason, and call functions with Qwen models. TRIGGER when: user asks to chat with Qwen, generate text, write code with Qwen, use Qwen function calling, or explicitly invokes this skill by name (e.g. use qwencloud-text). DO NOT TRIGGER when: general coding questions without Qwen, non-Qwen AI model usage (OpenAI, Gemini, etc.), image/video understanding (use qwencloud-vision), image/video/audio generation.
Deploy vLLM to Kubernetes (K8s) with GPU support, health probes, and OpenAI-compatible API endpoint. Use this skill whenever the user wants to deploy, run, or serve vLLM on a Kubernetes cluster, including creating deployments, services, checking existing deployments, or managing vLLM on K8s.
Benchmark vLLM or OpenAI-compatible serving endpoints using vllm bench serve. Supports multiple datasets (random, sharegpt, sonnet, HF), backends (openai, openai-chat, vllm-pooling, embeddings), throughput/latency testing with request-rate control, and result saving. Use when benchmarking LLM serving performance, measuring TTFT/TPOT, or load testing inference APIs.
Evaluates LLMs across 100+ benchmarks from 18+ harnesses (MMLU, HumanEval, GSM8K, safety, VLM) with multi-backend execution. Use when needing scalable evaluation on local Docker, Slurm HPC, or cloud platforms. NVIDIA's enterprise-grade platform with container-first architecture for reproducible benchmarking.
Recipes and configs for serving LLMs locally on RTX 3090 GPUs using vLLM, llama.cpp, and SGLang with OpenAI-compatible API
Generate images with Venice. Covers POST /image/generate (Venice-native), POST /images/generations (OpenAI-compatible), GET /image/styles (style presets), request fields (prompt, dimensions, cfg_scale, seed, variants, style_preset, aspect_ratio, resolution, safe_mode, watermark), and response formats.
Use major AI models (Claude, ChatGPT, Gemini, DeepSeek, Qwen, etc.) without API tokens by leveraging browser authentication instead of paid API keys
Configure MeiGen plugin provider and API keys. Use this when the user runs /meigen:setup, asks to "configure meigen", "set up image generation", "add API key", or needs help configuring the plugin.