Found 36 Skills
Loading and using pretrained models with Hugging Face Transformers. Use when working with pretrained models from the Hub, running inference with Pipeline API, fine-tuning models with Trainer, or handling text, vision, audio, and multimodal tasks.
Work with state-of-the-art machine learning models for NLP, computer vision, audio, and multimodal tasks using HuggingFace Transformers. This skill should be used when fine-tuning pre-trained models, performing inference with pipelines, generating text, training sequence models, or working with BERT, GPT, T5, ViT, and other transformer architectures. Covers model loading, tokenization, training with Trainer API, text generation strategies, and task-specific patterns for classification, NER, QA, summarization, translation, and image tasks. (plugin:scientific-packages@claude-scientific-skills)
Hugging Face Transformers best practices including model loading, tokenization, fine-tuning workflows, and inference optimization. Use when working with transformer models, fine-tuning LLMs, implementing NLP tasks, or optimizing transformer inference.
Convert HuggingFace transformer models to ONNX format for browser inference with Transformers.js and WebGPU. Use when given a HuggingFace model link to convert to ONNX, when setting up optimum-cli for ONNX export, when quantizing models (fp16, q8, q4) for web deployment, when configuring Transformers.js with WebGPU acceleration, or when troubleshooting ONNX conversion errors. Triggers on mentions of ONNX conversion, Transformers.js, WebGPU inference, optimum export, model quantization for browser, or running ML models in the browser.
Use when the request mentions "HuggingFace Transformers", "pre-trained models", or "pipeline API", or asks about "text generation", "text classification", "question answering", "NER", "fine-tuning transformers", "AutoModel", or "Trainer API".
Use when the request mentions "RAG", "retrieval augmented generation", "LangChain", "LlamaIndex", "sentence transformers", "embeddings", "document QA", "chatbot with documents", or "semantic search".
Fine-tune LLMs using reinforcement learning with TRL: SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when you need RLHF, want to align a model with preferences, or are training from human feedback. Works with HuggingFace Transformers.
Optimizes transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Use when training or running transformers with long sequences (>512 tokens), encountering GPU memory issues with attention, or needing faster inference. Supports PyTorch native SDPA, flash-attn library, H100 FP8, and sliding window attention.
Post-training 4-bit quantization for LLMs with minimal accuracy loss. Use for deploying large models (70B, 405B) on consumer GPUs, when you need 4× memory reduction with <2% perplexity degradation, or for faster inference (3-4× speedup) vs FP16. Integrates with transformers and PEFT for QLoRA fine-tuning.
Half-Quadratic Quantization for LLMs without calibration data. Use when quantizing models to 4/3/2-bit precision without needing calibration datasets, for fast quantization workflows, or when deploying with vLLM or HuggingFace Transformers.
World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision transformers. Includes 3D vision, video analysis, real-time processing, and production deployment. Use when building vision AI systems, implementing object detection, training custom vision models, or optimizing inference pipelines.
Elite AI/ML Senior Engineer with 20+ years experience. Transforms Claude into a world-class AI researcher and engineer capable of building production-grade ML systems, LLMs, transformers, and computer vision solutions. Use when: (1) Building ML/DL models from scratch or fine-tuning, (2) Designing neural network architectures, (3) Implementing LLMs, transformers, attention mechanisms, (4) Computer vision tasks (object detection, segmentation, GANs), (5) NLP tasks (NER, sentiment, embeddings), (6) MLOps and production deployment, (7) Data preprocessing and feature engineering, (8) Model optimization and debugging, (9) Clean code review for ML projects, (10) Choosing optimal libraries and frameworks. Triggers: "ML", "AI", "deep learning", "neural network", "transformer", "LLM", "computer vision", "NLP", "TensorFlow", "PyTorch", "sklearn", "train model", "fine-tune", "embedding", "CNN", "RNN", "LSTM", "attention", "GPT", "BERT", "diffusion", "GAN", "object detection", "segmentation".