Loading...
Loading...
Found 2 Skills
LLM inference via paid API: OpenAI-compatible chat completions proxied through x402 providers. Supports Kimi K2.5, MiniMax M2.5. Uses x_payment tool for automatic USDC micropayments ($0.001-$0.003/call). Use when: (1) generating text with a specific model, (2) running chat completions through a pay-per-request LLM endpoint, (3) comparing outputs across models.
PR-backed and current-main optimization manual for `moonshotai/Kimi-K2*` and `moonshotai/Kimi-K2.5*` in SGLang. Use when Codex needs to recover, extend, or audit Kimi optimizations, including K2 router/MoE fast paths, K2 thinking Marlin paths, K2.5 wrapper/multimodal/runtime plumbing, W4AFP8/W4A16 quant tracks, parser contracts, LoRA coverage, and backend-specific validation.