Search Results: tao-toolkit

Found 12 Skills

tao-train-nvpanoptix3d

NVPanoptix3D for panoptic 3D scene reconstruction from posed RGB images. Produces 3D panoptic segmentation (semantic, instance, and panoptic masks) with occupancy completion. Built on a VGGT backbone with a Mask2Former-style head and 3D frustum reconstruction. Use when training, evaluating, exporting, or running inference for a TAO NVPanoptix3D model. Trigger phrases include "train NVPanoptix3D", "panoptic 3D reconstruction", "3D scene segmentation", "occupancy completion".

AI & Machine Learning | nvidia/skills

tao-train-centerpose

CenterPose for keypoint / pose estimation. Detects object centers and regresses keypoint locations for 6-DoF object pose estimation. Use when training, evaluating, exporting, or running inference for a TAO CenterPose model. Trigger phrases include "train CenterPose", "6-DoF object pose", "keypoint estimation", "object pose regression".

AI & Machine Learning | promptingcompany/nv-skill...

tao-train-nvdinov2

NVDINOv2 for self-supervised visual representation learning. Trains vision transformers via self-distillation (teacher-student) without labels and produces general-purpose visual features. Use when training, distilling, exporting, or running inference for a TAO NVDINOv2 backbone. Trigger phrases include "train NVDINOv2", "self-supervised ViT pretraining", "DINOv2 backbone", "visual representation learning".

AI & Machine Learning | promptingcompany/nv-skill...

tao-train-fast-foundation-stereo

Real-time stereo depth estimation using FastFoundationStereo (FFS), the distilled bp2 commercial variant of FoundationStereo. Predicts disparity maps from stereo image pairs with ~10× lower latency than full FoundationStereo. Use when training, evaluating, exporting, or running inference for a TAO FastFoundationStereo (FFS) model. Trigger phrases include "train fast stereo", "real-time stereo disparity", "FastFoundationStereo", "distilled stereo depth".

AI & Machine Learning | nvidia/skills

tao-train-mask-grounding-dino

Mask Grounding DINO for grounded instance segmentation. Extends Grounding DINO with a mask-prediction head for open-set segmentation guided by text prompts. Use when training, evaluating, exporting, quantizing, or running inference for a TAO Mask-Grounding-DINO model. Trigger phrases include "train Mask Grounding DINO", "open-vocabulary segmentation", "text-prompted instance segmentation", "grounded mask DETR".

AI & Machine Learning | nvidia/skills

tao-train-ocdnet

OCDNet for scene text detection. Detects arbitrary-oriented text regions in natural images using a differentiable binarization approach. Use when training, evaluating, exporting, pruning, quantizing, retraining, or running inference for a TAO OCDNet model. Trigger phrases include "train OCDNet", "scene text detection", "arbitrary-oriented text boxes", "differentiable binarization detector".

AI & Machine Learning | nvidia/skills

tao-analyze-gaps-vlm-bcq

Extract false-positive and false-negative gaps from VLM binary-classification-question (BCQ, yes/no) predictions. Use after running VLM evaluation when you have a predictions JSON and need to identify failure cases for DEFT root cause analysis on a binary-classification VLM workflow.

AI & Machine Learning | nvidia/skills

tao-run-on-kubernetes

Kubernetes execution platform — submits TAO container jobs as single-pod k8s Jobs with NVIDIA GPU scheduling. Use when running on EKS / GKE / AKS / on-prem clusters with the NVIDIA GPU Operator installed, or when integrating TAO into an existing k8s-native ML platform.

AI & Machine Learning | nvidia/skills

tao-train-segformer

SegFormer for semantic segmentation. Lightweight transformer-based architecture with hierarchical feature extraction, efficient for real-time segmentation tasks. Use when training, evaluating, exporting, quantizing, or running inference for a TAO SegFormer model. Trigger phrases include "train SegFormer", "semantic segmentation", "lightweight transformer segmenter", "real-time semantic segmentation".

AI & Machine Learning | nvidia/skills

tao-train-depth-anything-v2

Monocular depth estimation using Metric Depth Anything v2 or Relative Depth Anything architectures. Predicts per-pixel depth from single RGB images. Use when training, evaluating, exporting, or running inference for a TAO monocular depth model. Trigger phrases include "train monocular depth", "DepthAnything v2", "metric depth from single image", "monocular depth estimation".

AI & Machine Learning | nvidia/skills

tao-mine-aoi-images

Runs the DEFT embed-then-mine workflow for VCN AOI iterations — embeds the gap-analysis target parquet, embeds a source pool, and mines nearest-neighbour source images for downstream augmentation. Use as the immediate next step after `tao-route-visual-changenet-samples` when expanding a real-image augmentation queue from the mining subset.

AI & Machine Learning | nvidia/skills

tao-train-oneformer

OneFormer for universal image segmentation. Unifies panoptic, instance, and semantic segmentation with a single architecture using task-conditioned queries. Use when training, evaluating, exporting, quantizing, or running inference for a TAO OneFormer model. Trigger phrases include "train OneFormer", "universal segmentation", "task-conditioned segmentation", "panoptic / instance / semantic in one model".
