Search Results: pytorch

Found 98 Skills

AI & Machine Learningk-dense-ai/claude-scienti...

pytorch-lightning

Deep learning framework (PyTorch Lightning). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard), distributed training (DDP, FSDP, DeepSpeed), for scalable neural network training.

🇺🇸|EnglishTranslated

3 scripts/Checked

AI & Machine Learningmindrally/skills

computer-vision-opencv

Expert guidance for computer vision development using OpenCV, PyTorch, and modern deep learning techniques for image and video processing.

🇺🇸|EnglishTranslated

AI & Machine Learningdavila7/claude-code-templ...

pytorch-fsdp

Expert guidance for Fully Sharded Data Parallel training with PyTorch FSDP - parameter sharding, mixed precision, CPU offloading, FSDP2

🇺🇸|EnglishTranslated

AI & Machine Learningorchestra-research/ai-res...

fine-tuning-serving-openpi

Fine-tune and serve Physical Intelligence OpenPI models (pi0, pi0-fast, pi0.5) using JAX or PyTorch backends for robot policy inference across ALOHA, DROID, and LIBERO environments. Use when adapting pi0 models to custom datasets, converting JAX checkpoints to PyTorch, running policy inference servers, or debugging norm stats and GPU memory issues.

🇺🇸|EnglishTranslated

AI & Machine Learningpytorch/pytorch

at-dispatch-v2

Convert PyTorch AT_DISPATCH macros to AT_DISPATCH_V2 format in ATen C++ code. Use when porting AT_DISPATCH_ALL_TYPES_AND*, AT_DISPATCH_FLOATING_TYPES*, or other dispatch macros to the new v2 API. For ATen kernel files, CUDA kernels, and native operator implementations.

🇺🇸|EnglishTranslated

AI & Machine Learningtondevrel/scientific-agen...

pytorch-research

Advanced sub-skill for PyTorch focused on deep research and production engineering. Covers custom Autograd functions, module hooks, advanced initialization, Distributed Data Parallel (DDP), and performance profiling.

🇺🇸|EnglishTranslated

AI & Machine Learningorchestra-research/ai-res...

ml-training-recipes

Battle-tested PyTorch training recipes for all domains — LLMs, vision, diffusion, medical imaging, protein/drug discovery, spatial omics, genomics. Covers training loops, optimizer selection (AdamW, Muon), LR scheduling, mixed precision, debugging, and systematic experimentation. Use when training or fine-tuning neural networks, debugging loss spikes or OOM, choosing architectures, or optimizing GPU throughput.

🇺🇸|EnglishTranslated

DevOps & Cloud Servicesnvidia/skills

bump-base-image

Bump the NVIDIA PyTorch base image (`nvcr.io/nvidia/pytorch:<YY.MM>-py3`) used by Megatron-LM CI. Covers the two pin sites (GitHub CI in `docker/.ngc_version.dev` and GitLab CI in `.gitlab/stages/01.build.yml`), the post-bump CI loop (re-run functional tests, refresh golden values, mark broken tests), and the gotchas that bit PRs

🇺🇸|EnglishTranslated

AI & Machine Learningpytorch/pytorch

add-uint-support

Add unsigned integer (uint) type support to PyTorch operators by updating AT_DISPATCH macros. Use when adding support for uint16, uint32, uint64 types to operators, kernels, or when user mentions enabling unsigned types, barebones unsigned types, or uint support.

🇺🇸|EnglishTranslated

AI & Machine Learningletta-ai/skills

torch-pipeline-parallelism

Guidance for implementing PyTorch pipeline parallelism for distributed model training. This skill should be used when tasks involve implementing pipeline parallelism, distributed training with model partitioning across GPUs/ranks, AFAB (All-Forward-All-Backward) scheduling, or inter-rank tensor communication using torch.distributed.

🇺🇸|EnglishTranslated

AI & Machine Learningdavila7/claude-code-templ...

huggingface-accelerate

Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. Automatic device placement, mixed precision (FP16/BF16/FP8). Interactive config, single launch command. HuggingFace ecosystem standard.

🇺🇸|EnglishTranslated

AI & Machine Learningdavila7/claude-code-templ...

nnsight-remote-interpretability

Provides guidance for interpreting and manipulating neural network internals using nnsight with optional NDIF remote execution. Use when needing to run interpretability experiments on massive models (70B+) without local GPU resources, or when working with any PyTorch architecture.

🇺🇸|EnglishTranslated