Found 80 Skills
World-class ML engineering skill for productionizing ML models, implementing MLOps, and building scalable ML systems. Expertise in PyTorch, TensorFlow, model deployment, feature stores, model monitoring, and ML infrastructure. Includes LLM integration, fine-tuning, RAG systems, and agentic AI. Use when deploying ML models, building ML platforms, implementing MLOps, or integrating LLMs into production systems.
CLIP, OpenAI's model connecting vision and language. Enables zero-shot image classification, image-text matching, and cross-modal retrieval. Trained on 400M image-text pairs. Use for image search, content moderation, or vision-language tasks without fine-tuning. Best for general-purpose image understanding.
Guidance for recovering PyTorch model architectures from state dictionaries, retraining specific layers, and saving models in TorchScript format. This skill should be used when tasks involve reconstructing model architectures from saved weights, fine-tuning specific layers while freezing others, or converting models to TorchScript format.
Train custom TTS voices for Piper (ONNX format) using fine-tuning or from-scratch approaches. Use when creating new synthetic voices, fine-tuning existing Piper checkpoints, preparing audio datasets for TTS training, or deploying voice models to devices like Raspberry Pi or Home Assistant. Covers dataset preparation, Whisper-based validation, training configuration, and ONNX export.
Analyze AI/ML technical content (papers, articles, blog posts) and extract actionable insights filtered through an enterprise AI engineering lens. Use when the user provides a URL or document for AI/ML content analysis, asks to "review this paper", or mentions technical content in domains like RAG, embeddings, fine-tuning, prompt engineering, or LLM deployment.
Receive and verify OpenAI webhooks. Use when setting up OpenAI webhook handlers for fine-tuning jobs, batch completions, or async events like fine_tuning.job.completed, batch.completed, or realtime.call.incoming.
Battle-tested PyTorch training recipes for all domains — LLMs, vision, diffusion, medical imaging, protein/drug discovery, spatial omics, genomics. Covers training loops, optimizer selection (AdamW, Muon), LR scheduling, mixed precision, debugging, and systematic experimentation. Use when training or fine-tuning neural networks, debugging loss spikes or OOM, choosing architectures, or optimizing GPU throughput.
Run GPU workloads on Modal — training, fine-tuning, inference, batch processing. Zero-config serverless: no SSH, no Docker, auto scale-to-zero. Use when user says "modal run", "modal training", "modal inference", "deploy to modal", "need a GPU", "run on modal", "serverless GPU", or needs remote GPU compute.
Collaboration process for UI style modifications. Use when users request page style changes, layout adjustments, or UI detail tweaks. The structured process of "Screenshot Localization → Current Status Description → Option Selection → Code Modification → Fine-tuning" reduces miscommunication and avoids wasted tokens.
Build identity-preserving character generation workflows and pipelines in ComfyUI. Selects the optimal identity method (InfiniteYou, FLUX Kontext, PuLID, InstantID, IP-Adapter) based on use case requirements. Handles face preservation, likeness transfer, cross-domain conversion (3D to photo), multi-reference consistency, iterative character editing, and character variation generation. Triggers on requests to generate consistent characters, preserve identity across images, create face-swapping workflows, or convert 3D renders to photorealistic portraits. Does NOT cover general image generation without identity preservation, model training/LoRA fine-tuning, animation, technical explanations, or workflow debugging.
Chief AI Officer advisory for startups: model build-vs-buy decisions (API vs fine-tune vs in-house), AI risk classification under EU AI Act + US state patchwork, AI cost economics (API-to-self-hosted breakeven), and AI team org evolution. Use when deciding whether to call an API or fine-tune, classifying AI use cases for regulatory risk, calculating when self-hosting pays off, sequencing AI hires, or when user mentions CAIO, AI strategy, model selection, foundation model, fine-tuning, EU AI Act, NIST AI RMF, AI governance, model risk, or AI economics. Strategic only — does not duplicate engineering AI/ML skills.
Build text-to-speech applications using Qwen3-TTS, a speech generation system supporting voice cloning, voice design, and custom voice synthesis. Use when creating TTS applications, generating speech from text, cloning voices from audio samples, designing new voices via natural language descriptions, or fine-tuning TTS models. Supports 10 languages (Chinese, English, Japanese, Korean, German, French, Russian, Portuguese, Spanish, Italian).