Mask2Former for universal image segmentation (panoptic, instance, and semantic). Transformer-based with masked attention for high-quality segmentation results. Use when training, evaluating, exporting, quantizing, or running inference for a TAO Mask2Former model. Trigger phrases include "train Mask2Former", "universal segmentation", "panoptic / instance / semantic segmentation", "masked-attention transformer segmenter".
npx skill4agent add nvidia/skills tao-train-mask2former

gen_trt_engine, evaluate, inference, references/tao-deploy-mask2former.md, references/spec_template_deploy_*.yaml, schemas/<action>.schema.json, schemas/manifest.json, references/spec_template_<action>.yaml, default, references/skill_info.yaml, automl_enabled, schemas/train.schema.json, references/spec_template_train.yaml, automl_default_parameters, automl_disabled_parameters, ~/tao-core, references/skill_info.yaml, automl_policy, automl_policy: off, auto, automl_policy: auto, automl_enabled: true, schemas/train.schema.json, references/spec_template_train.yaml, tao-skill-bank:tao-run-automl, skill_dir, automl_policy, automl_policy: off, evaluate, inference, export, automl_policy

| Action | Spec Key | Source | Files | List? |
|---|---|---|---|---|
| evaluate | dataset.train.img_dir | train_datasets | images.tar.gz | No |
| evaluate | dataset.label_map | train_datasets | coco_panoptic: label_map_panoptic.json; *: label_map.json | No |
| evaluate | dataset.train.instance_json | train_datasets | annotations.json | No |
| evaluate | dataset.train.panoptic_json | train_datasets | annotations_panoptic.json | No |
| evaluate | dataset.train.panoptic_dir | train_datasets | images_panoptic.tar.gz | No |
| evaluate | dataset.val.img_dir | eval_dataset | images.tar.gz | No |
| evaluate | dataset.val.instance_json | eval_dataset | annotations.json | No |
| evaluate | dataset.val.panoptic_json | eval_dataset | annotations_panoptic.json | No |
| evaluate | dataset.val.panoptic_dir | eval_dataset | images_panoptic.tar.gz | No |
| evaluate | dataset.test.img_dir | eval_dataset | images.tar.gz | No |
| inference | dataset.train.img_dir | train_datasets | images.tar.gz | No |
| inference | dataset.label_map | train_datasets | coco_panoptic: label_map_panoptic.json; *: label_map.json | No |
| inference | dataset.train.instance_json | train_datasets | annotations.json | No |
| inference | dataset.train.panoptic_json | train_datasets | annotations_panoptic.json | No |
| inference | dataset.train.panoptic_dir | train_datasets | images_panoptic.tar.gz | No |
| inference | dataset.val.img_dir | eval_dataset | images.tar.gz | No |
| inference | dataset.val.instance_json | eval_dataset | annotations.json | No |
| inference | dataset.val.panoptic_json | eval_dataset | annotations_panoptic.json | No |
| inference | dataset.val.panoptic_dir | eval_dataset | images_panoptic.tar.gz | No |
| inference | dataset.test.img_dir | eval_dataset | images.tar.gz | No |
| quantize | dataset.train.img_dir | train_datasets | images.tar.gz | No |
| quantize | dataset.label_map | train_datasets | coco_panoptic: label_map_panoptic.json; *: label_map.json | No |
| quantize | dataset.train.instance_json | train_datasets | annotations.json | No |
| quantize | dataset.train.panoptic_json | train_datasets | annotations_panoptic.json | No |
| quantize | dataset.train.panoptic_dir | train_datasets | images_panoptic.tar.gz | No |
| quantize | dataset.val.img_dir | eval_dataset | images.tar.gz | No |
| quantize | dataset.val.instance_json | eval_dataset | annotations.json | No |
| quantize | dataset.val.panoptic_json | eval_dataset | annotations_panoptic.json | No |
| quantize | dataset.val.panoptic_dir | eval_dataset | images_panoptic.tar.gz | No |
| quantize | dataset.test.img_dir | eval_dataset | images.tar.gz | No |
| quantize | dataset.quant_calibration_dataset.images_dir | train_datasets | images.tar.gz | No |
| train | dataset.train.img_dir | train_datasets | images.tar.gz | No |
| train | dataset.label_map | train_datasets | coco_panoptic: label_map_panoptic.json; *: label_map.json | No |
| train | dataset.train.instance_json | train_datasets | annotations.json | No |
| train | dataset.train.panoptic_json | train_datasets | annotations_panoptic.json | No |
| train | dataset.train.panoptic_dir | train_datasets | images_panoptic.tar.gz | No |
| train | dataset.val.img_dir | eval_dataset | images.tar.gz | No |
| train | dataset.val.instance_json | eval_dataset | annotations.json | No |
| train | dataset.val.panoptic_json | eval_dataset | annotations_panoptic.json | No |
| train | dataset.val.panoptic_dir | eval_dataset | images_panoptic.tar.gz | No |
| train | dataset.test.img_dir | eval_dataset | images.tar.gz | No |
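The table above can be read as a lookup from action to path overrides. The following is a minimal sketch of that reading, not part of the skill itself: the helper names `label_map_file` and `build_dataset_overrides` are hypothetical, and it assumes that `coco_panoptic` in the Files column refers to a dataset type, with `*` meaning every other type.

```python
# Spec keys and file names come from the table above.
# The helper functions and the dataset_type argument are hypothetical.
TRAIN_FILES = {
    "dataset.train.img_dir": "images.tar.gz",
    "dataset.train.instance_json": "annotations.json",
    "dataset.train.panoptic_json": "annotations_panoptic.json",
    "dataset.train.panoptic_dir": "images_panoptic.tar.gz",
}
EVAL_FILES = {
    "dataset.val.img_dir": "images.tar.gz",
    "dataset.val.instance_json": "annotations.json",
    "dataset.val.panoptic_json": "annotations_panoptic.json",
    "dataset.val.panoptic_dir": "images_panoptic.tar.gz",
    "dataset.test.img_dir": "images.tar.gz",
}
ACTIONS = ("train", "evaluate", "inference", "quantize")


def label_map_file(dataset_type):
    """Assumed reading of 'coco_panoptic: label_map_panoptic.json; *: label_map.json'."""
    if dataset_type == "coco_panoptic":
        return "label_map_panoptic.json"
    return "label_map.json"


def build_dataset_overrides(action, train_root, eval_root, dataset_type="coco_panoptic"):
    """Build the dataset path overrides listed in the table for one action."""
    if action not in ACTIONS:
        raise ValueError(f"no dataset rows listed for action {action!r}")
    train_root = train_root.rstrip("/")
    eval_root = eval_root.rstrip("/")

    # Rows whose Source column is train_datasets.
    overrides = {key: f"{train_root}/{name}" for key, name in TRAIN_FILES.items()}
    overrides["dataset.label_map"] = f"{train_root}/{label_map_file(dataset_type)}"

    # Rows whose Source column is eval_dataset.
    overrides.update({key: f"{eval_root}/{name}" for key, name in EVAL_FILES.items()})

    # Only quantize lists a calibration image archive, sourced from train_datasets.
    if action == "quantize":
        overrides["dataset.quant_calibration_dataset.images_dir"] = f"{train_root}/images.tar.gz"
    return overrides
```

Calling `build_dataset_overrides("quantize", "s3://bucket/data/train", "s3://bucket/data/eval")` would return the ten shared path keys plus the calibration key; other spec values such as `model.sem_seg_head.num_classes` would still need to be set separately.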
spec_overrides

S3_TRAIN = "s3://bucket/data/train"
S3_EVAL = "s3://bucket/data/eval"

{
"train.num_gpus": 1,
"train.num_epochs": 10,
"train.checkpoint_interval": 10,
"train.validation_interval": 10,
"model.sem_seg_head.num_classes": 90,
"dataset.contiguous_id": True,
"dataset.train.img_dir": f"{S3_TRAIN}/images.tar.gz",
"dataset.label_map": f"{S3_TRAIN}/label_map_panoptic.json",  # coco_panoptic; other dataset types use label_map.json
"dataset.train.instance_json": f"{S3_TRAIN}/annotations.json",
"dataset.train.panoptic_json": f"{S3_TRAIN}/annotations_panoptic.json",
"dataset.train.panoptic_dir": f"{S3_TRAIN}/images_panoptic.tar.gz",
"dataset.val.img_dir": f"{S3_EVAL}/images.tar.gz",
"dataset.val.instance_json": f"{S3_EVAL}/annotations.json",
"dataset.val.panoptic_json": f"{S3_EVAL}/annotations_panoptic.json",
"dataset.val.panoptic_dir": f"{S3_EVAL}/images_panoptic.tar.gz",
"dataset.test.img_dir": f"{S3_EVAL}/images.tar.gz",
}

{
"model.sem_seg_head.num_classes": 90,
"dataset.contiguous_id": True,
"dataset.train.img_dir": f"{S3_TRAIN}/images.tar.gz",
"dataset.label_map": f"{S3_TRAIN}/label_map_panoptic.json",  # coco_panoptic; other dataset types use label_map.json
"dataset.train.instance_json": f"{S3_TRAIN}/annotations.json",
"dataset.train.panoptic_json": f"{S3_TRAIN}/annotations_panoptic.json",
"dataset.train.panoptic_dir": f"{S3_TRAIN}/images_panoptic.tar.gz",
"dataset.val.img_dir": f"{S3_EVAL}/images.tar.gz",
"dataset.val.instance_json": f"{S3_EVAL}/annotations.json",
"dataset.val.panoptic_json": f"{S3_EVAL}/annotations_panoptic.json",
"dataset.val.panoptic_dir": f"{S3_EVAL}/images_panoptic.tar.gz",
"dataset.test.img_dir": f"{S3_EVAL}/images.tar.gz",
}

{
"model.sem_seg_head.num_classes": 90,
}

{
"model.sem_seg_head.num_classes": 90,
"dataset.contiguous_id": True,
"dataset.train.img_dir": f"{S3_TRAIN}/images.tar.gz",
"dataset.label_map": f"{S3_TRAIN}/label_map_panoptic.json",  # coco_panoptic; other dataset types use label_map.json
"dataset.train.instance_json": f"{S3_TRAIN}/annotations.json",
"dataset.train.panoptic_json": f"{S3_TRAIN}/annotations_panoptic.json",
"dataset.train.panoptic_dir": f"{S3_TRAIN}/images_panoptic.tar.gz",
"dataset.val.img_dir": f"{S3_EVAL}/images.tar.gz",
"dataset.val.instance_json": f"{S3_EVAL}/annotations.json",
"dataset.val.panoptic_json": f"{S3_EVAL}/annotations_panoptic.json",
"dataset.val.panoptic_dir": f"{S3_EVAL}/images_panoptic.tar.gz",
"dataset.test.img_dir": f"{S3_EVAL}/images.tar.gz",
}

{
"dataset.train.img_dir": f"{S3_TRAIN}/images.tar.gz",
"dataset.label_map": f"{S3_TRAIN}/label_map_panoptic.json",  # coco_panoptic; other dataset types use label_map.json
"dataset.train.instance_json": f"{S3_TRAIN}/annotations.json",
"dataset.train.panoptic_json": f"{S3_TRAIN}/annotations_panoptic.json",
"dataset.train.panoptic_dir": f"{S3_TRAIN}/images_panoptic.tar.gz",
"dataset.val.img_dir": f"{S3_EVAL}/images.tar.gz",
"dataset.val.instance_json": f"{S3_EVAL}/annotations.json",
"dataset.val.panoptic_json": f"{S3_EVAL}/annotations_panoptic.json",
"dataset.val.panoptic_dir": f"{S3_EVAL}/images_panoptic.tar.gz",
"dataset.test.img_dir": f"{S3_EVAL}/images.tar.gz",
"dataset.quant_calibration_dataset.images_dir": f"{S3_TRAIN}/images.tar.gz",
}

| Spec Key | Description | Default |
|---|---|---|
| | Number of GPUs | 1 |
| | GPU device indices | [0] |
| | Number of nodes | 1 |
| | | |
sync_batchnorm, fsdp, WORLD_SIZE, NODE_RANK, MASTER_ADDR, MASTER_PORT, NUM_GPU_PER_NODE, config.json, create_job(), infer_params.py, mask2former.config.json

| Action | Spec Field | Inference Function | Meaning |
|---|---|---|---|
| evaluate | | | encryption key |
| evaluate | | | model file inferred from the parent job results folder |
| evaluate | | | model file inferred from the parent job results folder |
| evaluate | | | current job results directory |
| export | | | encryption key |
| export | | | model file inferred from the parent job results folder |
| export | | | output ONNX path |
| export | | | current job results directory |
| gen_trt_engine | | | encryption key |
| gen_trt_engine | | | model file inferred from the parent job results folder |
| gen_trt_engine | | | output TensorRT engine path |
| gen_trt_engine | | | current job results directory |
| inference | | | encryption key |
| inference | | | model file inferred from the parent job results folder |
| inference | | | model file inferred from the parent job results folder |
| inference | | | current job results directory |
| quantize | | | encryption key |
| quantize | | | model file inferred from the parent job results folder |
| quantize | | | current job results directory |
| train | | | encryption key |
| train | | | {'link': 'https://github.com/SwinTransformer/storage/releases/download/v1.0.8/swin_tiny_patch4_window7_224_22k.pth', 'destination_path': '/ptm/mask2former/swin_tiny_patch4_window7_224_22k/swin_tiny_patch4_window7_224_22k.pth'} |
| train | | | current job results directory |
| train | | | model file inferred from the current job results folder |
parent_model, parent_model_folder, parent_job_id, config.json
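Several rows in the table above describe a field as "model file inferred from the parent job results folder". The sketch below shows one plausible way such an inference could work; it is not the skill's actual implementation. The function name `find_parent_model`, the `.pth` suffix default, and the "most recently modified file wins" rule are all assumptions made for illustration.

```python
from pathlib import Path


def find_parent_model(parent_results_dir, suffixes=(".pth",)):
    """Return the newest file under parent_results_dir with a matching suffix.

    Hypothetical helper: the suffix list and the newest-file rule are
    assumptions, not behavior confirmed by the source.
    Returns None when no matching file exists.
    """
    root = Path(parent_results_dir)
    candidates = [
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix in suffixes
    ]
    if not candidates:
        return None
    # Pick the file with the latest modification time.
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

A caller could pass `suffixes=(".onnx",)` to look for an exported model instead of a checkpoint; the right suffix for each action depends on what the parent job actually wrote.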