Found 32 Skills
Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching. Use when deploying production LLM APIs, optimizing inference latency/throughput, or serving models with limited GPU memory. Supports OpenAI-compatible endpoints, quantization (GPTQ/AWQ/FP8), and tensor parallelism.
Use for AI-grounded answers via an OpenAI-compatible /chat/completions endpoint. Two modes: single-search (fast) or deep research (enable_research=true, thorough multi-search). Supports streaming and blocking responses, with citations.
Autonomous novel writing CLI agent - use for creative fiction writing, novel generation, style imitation, chapter continuation/import, EPUB export, and AIGC detection. Supports Chinese web novel genres (xuanhuan, xianxia, urban, horror, other) with multi-agent pipeline, two-phase writer (creative + settlement), 33-dimension auditing, token usage analytics, creative brief input, structured logging (JSON Lines), and custom OpenAI-compatible provider support.
Use Claude Code's full tool system with any OpenAI-compatible LLM — GPT-4o, DeepSeek, Gemini, Ollama, and 200+ models via environment variable configuration.
LangGraph-based agent framework for consistent tool calling with automatic tool loops. Use when you need reliable multi-step task execution with OpenAI-compatible providers (Z.AI/GLM-5, OpenRouter, Groq, DeepSeek, Ollama).
Multimodal UI understanding and single-step planning via OpenAI-compatible Responses APIs. Use when you need AIQuery/AIAssert and plan-next to extract UI element coordinates, validate UI assertions, summarize screenshots, or decide the next UI action from an image. External agents handle execution via adb/hdc and multi-step loops. Defaults to Doubao models but can be pointed at other multimodal providers via base URL, API key, and model name.
Command-line interface for Novita AI, an OpenAI-compatible AI API client for DeepSeek, GLM, and other models.
An image generation/editing Skill for GPT Image 2. It works in three environments: (A) Garden Local Mode: generates and saves images directly via OpenAI-compatible APIs; (B) Host-Native Mode: treats this Skill as a prompt-engineering guide and passes the rendered prompt to the host Agent's built-in image tool; (C) Advisor Mode: falls back to acting as a high-quality prompt consultant when the host has no image tools. It covers 18 major categories and more than 80 structured templates, including posters, UI, products, infographics, academic figures, technical architecture diagrams, comics, avatars, process boards, storyboards, IP peripherals, and editing workflows.
Use when working on vLLM Studio backend architecture (controller runtime, Pi-mono agent loop, OpenAI-compatible endpoints, LiteLLM gateway, inference process, and debugging commands).
Guide developers integrating EUrouter into their applications. EUrouter is an OpenAI-compatible AI gateway for EU/GDPR compliance. Use when integrating EUrouter, switching from OpenRouter or OpenAI, configuring EU data residency, routing AI requests to EU providers, managing API keys, or asking about EUrouter's API for chat completions, embeddings, streaming, tool calling, vision, model routing, or GDPR compliance features.
DeepSeek AI large language model API via curl. Use this skill for chat completions, reasoning, and code generation with OpenAI-compatible endpoints.
vLLM Ascend plugin for LLM inference serving on Huawei Ascend NPU. Use for offline batch inference, API server deployment, quantization inference (with msmodelslim quantized models), tensor/pipeline parallelism for distributed serving, and OpenAI-compatible API endpoints. Supports Qwen, DeepSeek, GLM, LLaMA models with Ascend-optimized kernels.