Loading...
Loading...
Found 425 Skills
improve a CLAUDE.md file using <important if> blocks to improve instruction adherence
Query NVIDIA PTX ISA 9.1, CUDA Runtime API 13.1, Driver API 13.1, Programming Guide v13.1, Best Practices Guide, Nsight Compute, Nsight Systems local documentation. Debug and optimize GPU kernels with nsys/ncu/compute-sanitizer workflows. Use when writing, debugging, or optimizing CUDA code, GPU kernels, PTX instructions, inline PTX, TensorCore operations (WMMA, WGMMA, TMA, tcgen05), or when the user mentions CUDA API functions, error codes, device properties, memory management, profiling, GPU performance, compute capabilities, CUDA Graphs, Cooperative Groups, Unified Memory, dynamic parallelism, or CUDA programming model concepts.
Develop, debug, and optimize SGLang LLM serving engine. Use when the user mentions SGLang, sglang, srt, sgl-kernel, LLM serving, model inference, KV cache, attention backend, FlashInfer, MLA, MoE routing, speculative decoding, disaggregated serving, TP/PP/EP, radix cache, continuous batching, chunked prefill, CUDA graph, model loading, quantization FP8/GPTQ/AWQ, JIT kernel, triton kernel SGLang, or asks about serving LLMs with SGLang.
Write, debug, and optimize CUTLASS and CuTeDSL GPU kernels using local source code, examples, and header references. Use when the user mentions CUTLASS, CuTe, CuTeDSL, cute::Layout, cute::Tensor, TiledMMA, TiledCopy, CollectiveMainloop, CollectiveEpilogue, GEMM kernel, grouped GEMM, sparse GEMM, flash attention CUTLASS, blackwell GEMM, hopper GEMM, FP8 GEMM, blockwise scaling, MoE GEMM, StreamK, warp specialization CUTLASS, TMA CUTLASS, or asks about writing high-performance CUDA kernels with CUTLASS/CuTe templates.
Write, debug, and optimize Triton and Gluon GPU kernels using local source code, tutorials, and kernel references. Use when the user mentions Triton, Gluon, tl.load, tl.store, tl.dot, triton.jit, gluon.jit, wgmma, tcgen05, TMA, tensor descriptor, persistent kernel, warp specialization, fused attention, matmul kernel, kernel fusion, tl.program_id, triton autotune, MXFP, FP8, FP4, block-scaled matmul, SwiGLU, top-k, or asks about writing GPU kernels in Python.
Rapid Symfony admin panel development with EasyAdminBundle. Capabilities: CRUD generation, dashboard creation, entity management, form customization, menu configuration, field configuration, action customization, filters, permissions, batch actions, custom themes, file uploads, image handling, associations, submenus, custom queries, event listeners. Use for: admin panels, backend interfaces, content management, data administration, CRUD operations, user management, product catalogs, blog administration, e-commerce backends. Triggers: easyadmin, symfony admin, crud, admin panel, dashboard, backend, entity crud, admin interface, symfony backend, easyadmin, easyadmin bundle, admin generator, backend panel
Enforces an opinionated UI baseline to prevent AI-generated interface slop.
Elite frontend image-direction skill for generating premium, artistic, implementation-friendly website design references. Uses combinatorial variation to avoid repetitive AI aesthetics, enforces cinematic hero minimalism, strong hierarchy, generous spacing, image-led composition, and anti-slop visual discipline. For visual frontend tasks, this skill must first generate the design image(s) itself, deeply analyze them, then implement the frontend to match them as closely as possible.
Use this skill when encountering errors, bugs, performance issues, or unexpected behavior in an InsForge project — from frontend SDK errors to backend infrastructure problems. Trigger on: SDK returning error objects, HTTP 4xx/5xx responses, edge function failures or timeouts, slow database queries, authentication/authorization failures, realtime channel issues, backend performance degradation (high CPU/memory/slow responses), edge function deploy failures, or frontend Vercel deploy failures. This skill guides diagnostic command execution to locate problems; it does not provide fix suggestions.
Deep Core Web Vitals and page speed audit. Use when the user asks about page speed, Core Web Vitals, LCP, CLS, INP, FCP, TTFB, Lighthouse scores, why a page is slow, performance optimization, or resource size analysis. For broader technical SEO issues, see diagnose-seo.
Query resource usage metrics for Railway services. Use when user asks about resource usage, CPU, memory, network, disk, or service performance like "how much memory is my service using" or "is my service slow".
Fast in-memory DataFrame library for datasets that fit in RAM. Use when pandas is too slow but data still fits in memory. Lazy evaluation, parallel execution, Apache Arrow backend. Best for 1-100GB datasets, ETL pipelines, faster pandas replacement. For larger-than-RAM data use dask or vaex.