Search Results: GPU-acceleration

Found 19 Skills

kernel-triton-writing

ONLY for OpenAI Triton (@triton.jit) kernel development. NEVER use for CUDA C++ kernels, TileIR, or profiling tools (ncu, nsys). The user's request must involve Triton explicitly. Covers Triton-specific patterns: fused elementwise, reductions (softmax, LayerNorm, RMSNorm), tiled GEMM with triton.autotune, and flash attention. Workflow: design, write, verify (with fast-path for explicit requests).

3 scripts/Attention

Data Processing · g1joshi/agent-skills

polars

Polars, a fast DataFrame library. Use for fast data processing.

AI & Machine Learning · modular/skills

mojo-gpu-fundamentals

The basics of how to program GPUs using Mojo. Use this skill in addition to mojo-syntax when writing Mojo code that targets GPUs or other accelerators. Use it when targeting NVIDIA, AMD, or Apple silicon GPUs, or other accelerators. Use this skill to overcome misconceptions about how Mojo GPU code is written.

Tools & Utilities · davila7/claude-code-templ...

get-available-resources

This skill should be used at the start of any computationally intensive scientific task to detect and report available system resources (CPU cores, GPUs, memory, disk space). It creates a JSON file with resource information and strategic recommendations that inform computational approach decisions such as whether to use parallel processing (joblib, multiprocessing), out-of-core computing (Dask, Zarr), GPU acceleration (PyTorch, JAX), or memory-efficient strategies. Use this skill before running analyses, training models, processing large datasets, or any task where resource constraints matter.

1 scripts/Attention

Data Processing · alphaonedev/openclaw-grap...

coding-julia

Julia: multiple dispatch, type system, metaprogramming, Pkg, scientific computing, GPU programming with CUDA.jl

AI & Machine Learning · nvidia/skills

perf-parallelism-strategies

Operational guide for choosing and combining parallelism strategies in Megatron Bridge, including sizing rules, hardware topology mapping, and combined parallelism configuration.

AI & Machine Learning · actionbook/rust-skills

domain-ml

Use when building ML/AI apps in Rust. Keywords: machine learning, ML, AI, tensor, model, inference, neural network, deep learning, training, prediction, ndarray, tch-rs, burn, candle, 机器学习 (machine learning), 人工智能 (artificial intelligence), 模型推理 (model inference)
