Loading...
Loading...
Found 38 Skills
Connects to and performs inference with Google Cloud Agent Platform GenAI models, including First-Party Gemini models and Third-Party OpenMaaS models (Llama, DeepSeek, Qwen, etc.). Use when you need to generate code for calling Gemini or OpenMaaS models, authenticate with GenAI SDK, OpenAI SDK, or legacy Agent Platform SDK, configure base URLs and global/regional endpoints, or troubleshoot 429 Resource Exhausted (DSQ), 400 User Validation, or 404 Not Found errors. Don't use for deploying models to endpoints or for running model evaluations.
Run Claude Code CLI, VS Code, or JetBrains ACP through a local proxy that routes to NVIDIA NIM, Kimi, OpenRouter, DeepSeek, or local LLMs
Chat with LLM models using ModelsLab's OpenAI-compatible Chat Completions API. Supports 60+ models including DeepSeek R1, Meta Llama, Google Gemini, Qwen, and Mistral with streaming, function calling, and structured outputs.
This skill should be used when users want to route LLM requests to different AI providers (OpenAI, Grok/xAI, Groq, DeepSeek, OpenRouter) using SwiftOpenAI-CLI. Use this skill when users ask to "use grok", "ask grok", "use groq", "ask deepseek", or any similar request to query a specific LLM provider in agent mode.
Check and improve your brand's visibility across AI search engines (ChatGPT, Perplexity, Gemini, Grok, Claude, DeepSeek). Set up tracking, run visibility analyses, audit your website for AI readability, and get actionable recommendations. Uses the npx goose-aeo@latest CLI.
Provides PyTorch-native distributed LLM pretraining using torchtitan with 4D parallelism (FSDP2, TP, PP, CP). Use when pretraining Llama 3.1, DeepSeek V3, or custom models at scale from 8 to 512+ GPUs with Float8, torch.compile, and distributed checkpointing.
BYOK — register a custom LLM endpoint (Anthropic, OpenAI, Qwen, DeepSeek, etc.) with your own API key
Use major AI models (Claude, ChatGPT, Gemini, DeepSeek, Qwen, etc.) without API tokens by leveraging browser authentication instead of paid API keys
Local proxy that lets OpenAI Codex CLI/desktop talk to MiMo, DeepSeek, and other LLMs via Responses API translation
Write, review, and improve prompts for any LLM — Claude, GPT, Gemini, Llama, DeepSeek, Mistral, Cohere, Qwen, Grok, Nova, and more. Use when the user asks to "write a system prompt", "improve this prompt", "review my prompt", "make a prompt for", "optimize my prompt", "fix my prompt", "why isn't my prompt working", or wants help writing better prompts for any AI model. Also use when building agents, chatbots, or AI assistants that need system-level instructions, or when the user has a bad prompt they want rewritten. Covers system prompts, task prompts, tool descriptions, and general prompt improvement across all major model families.
Train Mixture of Experts (MoE) models using DeepSpeed or HuggingFace. Use when training large-scale models with limited compute (5× cost reduction vs dense models), implementing sparse architectures like Mixtral 8x7B or DeepSeek-V3, or scaling model capacity without proportional compute increase. Covers MoE architectures, routing mechanisms, load balancing, expert parallelism, and inference optimization.
vLLM Ascend plugin for LLM inference serving on Huawei Ascend NPU. Use for offline batch inference, API server deployment, quantization inference (with msmodelslim quantized models), tensor/pipeline parallelism for distributed serving, and OpenAI-compatible API endpoints. Supports Qwen, DeepSeek, GLM, LLaMA models with Ascend-optimized kernels.