Found 80 Skills
Guidance for creating standalone CLI tools that perform neural network inference by extracting PyTorch model weights and reimplementing inference in C/C++. This skill applies when tasks involve converting PyTorch models to standalone executables, extracting model weights to portable formats (JSON), implementing neural network forward passes in C/C++, or creating CLI tools that load images and run inference without Python dependencies.
Provides guidance for interpreting and manipulating neural network internals using nnsight with optional NDIF remote execution. Use when you need to run interpretability experiments on massive models (70B+) without local GPU resources, or when working with any PyTorch architecture.
Write docstrings for PyTorch functions and methods following PyTorch conventions. Use when writing or updating docstrings in PyTorch code.
Add unsigned integer (uint) type support to PyTorch operators by updating AT_DISPATCH macros. Use when adding support for uint16, uint32, or uint64 types to operators or kernels, or when the user mentions enabling unsigned types, barebones unsigned types, or uint support.
Guidance for recovering PyTorch model architectures from state dictionaries, retraining specific layers, and saving models in TorchScript format. This skill should be used when tasks involve reconstructing model architectures from saved weights, fine-tuning specific layers while freezing others, or converting models to TorchScript format.
Guidance for implementing PyTorch pipeline parallelism for distributed model training. This skill should be used when tasks involve implementing pipeline parallelism, distributed training with model partitioning across GPUs/ranks, AFAB (All-Forward-All-Backward) scheduling, or inter-rank tensor communication using torch.distributed.
Train ML models with scikit-learn, PyTorch, or TensorFlow. Use for classification/regression, neural networks, and hyperparameter tuning, or when encountering overfitting, underfitting, or convergence issues.
Triages GitHub issues by routing to oncall teams, applying labels, and closing questions. Use when processing new PyTorch issues or when asked to triage an issue.
Document undocumented public APIs in PyTorch by removing functions from coverage_ignore_functions and coverage_ignore_classes in docs/source/conf.py, running Sphinx coverage, and adding the appropriate autodoc directives to the correct .md or .rst doc files. Use when a user asks to remove functions from conf.py ignore lists.
LeetCode-style PyTorch interview practice environment with auto-grading for implementing softmax, attention, GPT-2 and more from scratch.
Generate PyTorch-style interface documentation (README.md) for AscendC operators. Use when interface documentation needs to be generated after compilation and debugging are complete, or when the user mentions "generate operator documentation", "create README", "document operator", "help me write documentation" (in an operator context), or "operator documentation".
Accepts Triton operator implementations, automatically invokes the corresponding Torch small-operator implementations (CPU or NPU) for precision comparison, and generates precision reports. Use when you need to verify the correctness and precision of a Triton operator implementation, compare its precision against a PyTorch implementation, or generate a standardized precision report.