nv-segment-ct-finetune

Used for smoke or dataset finetuning of NV-Segment-CT VISTA3D on CT NIfTI labels. Not for clinical validation.

NPX Install

npx skill4agent add nvidia/skills nv-segment-ct-finetune

NV-Segment-CT Finetune

Purpose

  • Used for smoke or dataset finetuning of NV-Segment-CT VISTA3D on CT NIfTI labels. Not for clinical validation.
  • Wraps the upstream MONAI bundle entrypoint; do not replace it with handwritten training or inference code.
  • Manifest inputs are dataset_dir, datalist, target_anatomy, label_mapping, smoke, sanity, auto_seg, and skip_formal_eval.
  • Manifest outputs are finetuned_ckpt and schema-checked result_json.

Instructions

  • Run scripts/run_finetune.py; do not patch files under bundle/ or upstream checkouts during normal skill use.
  • For standalone Bash, include the fresh-environment setup line before the wrapper; benchmark venvs start empty.
  • Run the committed script in place from the repo root. Do not copy this skill to a runtime directory, and do not use rm or cleanup commands in generated invocations.
  • If a host exposes run_script, use run_script("scripts/run_finetune.py", args=[...]); otherwise run from the repo root.
  • For the shortest workflow check, use --smoke; for MSD Task06 Lung Tumor reproduction, use --sanity.
  • Read references/task06-and-results.md only when you need Task06 reference details, output-field definitions, or manual bundle setup notes.

Available Scripts

| Script | Purpose | Arguments |
| --- | --- | --- |
| scripts/run_finetune.py | Primary entrypoint declared by skill_manifest.yaml; stages configs, runs MONAI, and writes output.json. | [FIXTURE_OR_DATASET] --output-dir OUT_DIR [--smoke] [--sanity] [--auto-seg] [--dataset-dir DIR] [--datalist JSON] [--target-anatomy TEXT] [--label-mapping JSON] [--patch-size JSON] |
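When the wrapper is launched from Python rather than from a shell, the argument list in the table can be assembled programmatically. The following is a minimal sketch, not part of the skill: the helper name build_finetune_argv is hypothetical, and it uses only flags listed in the table above.

```python
import json
import sys
from typing import List, Optional

# Path of the committed entrypoint, relative to the repo root.
WRAPPER = "skills/nv-segment-ct-finetune/scripts/run_finetune.py"


def build_finetune_argv(
    output_dir: str,
    dataset: Optional[str] = None,
    smoke: bool = False,
    sanity: bool = False,
    auto_seg: bool = False,
    patch_size: Optional[List[int]] = None,
) -> List[str]:
    """Assemble an argv list for the wrapper without executing it (hypothetical helper)."""
    argv = [sys.executable, WRAPPER]
    if dataset is not None:
        argv.append(dataset)  # optional positional FIXTURE_OR_DATASET
    if smoke:
        argv.append("--smoke")
    if sanity:
        argv.append("--sanity")
    if auto_seg:
        argv.append("--auto-seg")
    if patch_size is not None:
        # --patch-size takes a JSON list such as [64,64,64]
        argv += ["--patch-size", json.dumps(patch_size, separators=(",", ":"))]
    argv += ["--output-dir", output_dir]
    return argv
```

The resulting list can be passed to subprocess.run(argv, check=True) with the repo root as the working directory, which avoids shell-quoting issues around the JSON-valued flags.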

Prerequisites

  • Python 3.10+ with CUDA-capable Torch for GPU runs.
  • Runtime packages from skill_manifest.yaml, especially monai==1.4.0, numpy<2, nibabel, scipy, typer, PyYAML, fire, pytorch-ignite, einops, and huggingface_hub.
  • Optional environment variables: CUDA_VISIBLE_DEVICES restricts visible GPUs; NPROC_PER_NODE overrides the GPU count, and values >=2 select multi-GPU mode for non-sanity runs.
  • Side effects: writes generated bundle configs under skills/nv-segment-ct-finetune/bundle/configs/, including auto_override.json, train_continual_task06_lung.json, and dfw_no_logging.json in that directory; writes checkpoints and evidence under --output-dir; may cache model assets under ~/.cache/huggingface/; and may contact https://huggingface.co or https://raw.githubusercontent.com.
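The interaction between NPROC_PER_NODE and the sanity preset can be sketched as a small pure function. This is an illustration under stated assumptions, not the wrapper's actual logic: the function name select_launch_mode is hypothetical, and falling back to the detected GPU count when the variable is unset is an assumption.

```python
from typing import Dict, Mapping, Union


def select_launch_mode(
    env: Mapping[str, str], detected_gpus: int, sanity: bool
) -> Dict[str, Union[int, str]]:
    """Return the process count and launch mode implied by the environment (sketch)."""
    if sanity:
        # The Task06 sanity recipe is documented as single-GPU.
        return {"nproc": 1, "mode": "single-gpu"}
    raw = env.get("NPROC_PER_NODE")
    # Assumption: when the variable is unset, the detected GPU count is used.
    nproc = int(raw) if raw else detected_gpus
    mode = "multi-gpu" if nproc >= 2 else "single-gpu"
    return {"nproc": nproc, "mode": mode}
```

In practice the first argument would be os.environ; passing a plain mapping keeps the function easy to test.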
Fresh environment setup:
bash
python -m pip install "monai==1.4.0" "numpy<2" pytorch-ignite einops nibabel scipy typer PyYAML fire huggingface_hub
Known upstream compatibility constraints:
  • DFW Task06 reference: Python 3.10.16, MONAI 1.4.0, Torch 2.7.0+cu126.
  • Use exact monai==1.4.0 for smoke, sanity, and evidence runs; MONAI 1.5.x can crash the upstream finetune loss on boolean labels.
  • Do not float the dependency as monai>=1.4,<1.6 in generated commands.
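Before a run, the two version constraints above can be checked from Python. The sketch below is one possible pre-flight check, not something the skill ships: the helper names installed_version and pin_problems are hypothetical, and only the standard-library importlib.metadata API is used.

```python
from importlib import metadata
from typing import Dict, List, Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None when the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def pin_problems(versions: Dict[str, Optional[str]]) -> List[str]:
    """Compare found versions against the documented pins (monai==1.4.0, numpy<2)."""
    problems = []
    monai = versions.get("monai")
    if monai != "1.4.0":
        problems.append(f"monai must be exactly 1.4.0, found {monai}")
    numpy = versions.get("numpy")
    if numpy is None or int(numpy.split(".")[0]) >= 2:
        problems.append(f"numpy must be <2, found {numpy}")
    return problems
```

A caller might run pin_problems({name: installed_version(name) for name in ("monai", "numpy")}) and stop before training if the returned list is non-empty.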

Usage

Smoke-scale workflow check:
bash
python -m pip install "monai==1.4.0" "numpy<2" pytorch-ignite einops nibabel scipy typer PyYAML fire huggingface_hub && \
python skills/nv-segment-ct-finetune/scripts/run_finetune.py \
  PATH_TO_DATASET \
  --smoke \
  --patch-size '[64,64,64]' \
  --output-dir runs/nvseg_smoke
Use the staged dataset as PATH_TO_DATASET. For the micro fixture, use skills/nv-segment-ct-finetune/fixtures/spleen_micro. Smoke mode proves wiring, config generation, checkpoint loading, and runtime compatibility; it is not a quality bar.
MSD Task06 Lung Tumor sanity reproduction:
bash
python skills/nv-segment-ct-finetune/scripts/run_finetune.py \
  /path/to/Task06 \
  --sanity \
  --output-dir runs/nvseg_task06_sanity
The sanity preset follows the single-GPU DFW recipe: fold-0 validation, label mapping [[1, 23]] for lung tumor, automatic class-prompt segmentation, patch [128,128,128], 5 epochs, and original-spacing configs/evaluate.json scoring before and after training. Expected reference values are pretrained Dice about 0.6697, training-best Dice about 0.6905, and fine-tuned formal Dice about 0.6836.
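A reproduction can be compared against these reference values with a short helper. This is a sketch only: the reference numbers come from the paragraph above, but the function name compare_to_reference, the dictionary keys, and the default tolerance of 0.02 are hypothetical choices, since the document gives no acceptance band.

```python
from typing import Dict

# Reference Dice values quoted for the Task06 sanity recipe.
REFERENCE_DICE = {
    "pretrained": 0.6697,
    "training_best": 0.6905,
    "finetuned_formal": 0.6836,
}


def compare_to_reference(observed: Dict[str, float], tolerance: float = 0.02) -> Dict[str, dict]:
    """Report the signed gap to each reference value.

    Assumption: the 0.02 tolerance is illustrative, not a documented threshold.
    """
    report = {}
    for name, reference in REFERENCE_DICE.items():
        if name not in observed:
            continue  # skip metrics the run did not produce
        delta = observed[name] - reference
        report[name] = {"delta": round(delta, 4), "within": abs(delta) <= tolerance}
    return report
```

A large negative gap on the pretrained score points at setup drift rather than at training, which matches the troubleshooting advice for low Task06 pretrained Dice.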
User-data finetune:
bash
python skills/nv-segment-ct-finetune/scripts/run_finetune.py \
  --dataset-dir /path/to/dataset \
  --datalist /path/to/datalist.json \
  --target-anatomy "lung tumor" \
  --auto-seg \
  --epochs 5 \
  --patch-size '[128,128,128]' \
  --output-dir runs/nvseg_user_finetune
Use --label-mapping '[[1, 23]]' when local label values are custom or the anatomy name is ambiguous.
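The value passed to --label-mapping is a JSON list of two-element lists. A minimal validation sketch follows; the function name parse_label_mapping is hypothetical, and reading each pair as [local label value, VISTA3D global class id] is an inference from the Task06 example, not a documented guarantee.

```python
import json
from typing import List, Tuple


def parse_label_mapping(text: str) -> List[Tuple[int, int]]:
    """Parse a --label-mapping JSON string into integer pairs.

    Assumption: each pair is [local_label, global_class_id].
    """
    data = json.loads(text)
    if not isinstance(data, list) or not data:
        raise ValueError("label mapping must be a non-empty JSON list")
    pairs = []
    for item in data:
        ok = (
            isinstance(item, list)
            and len(item) == 2
            # bool is a subclass of int in Python, so exclude it explicitly
            and all(isinstance(v, int) and not isinstance(v, bool) for v in item)
        )
        if not ok:
            raise ValueError(f"expected a two-integer pair, got {item!r}")
        pairs.append((item[0], item[1]))
    return pairs
```

Validating the string before launching a run surfaces quoting mistakes early, for example a shell that stripped the brackets.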

Examples

Smoke run on a staged tiny dataset:
bash
python skills/nv-segment-ct-finetune/scripts/run_finetune.py \
  runs/with_vs_without_nv/_inputs/nv_segment_ct_finetune/input_dataset \
  --smoke \
  --patch-size '[64,64,64]' \
  --output-dir runs/nvseg_smoke
Task06 sanity run on a local MSD cache:
bash
python skills/nv-segment-ct-finetune/scripts/run_finetune.py \
  .workbench_data/datasets/Task06_Lung \
  --sanity \
  --output-dir runs/nvseg_task06_sanity

Data Contract

  • Preferred layout: dataset/imagesTr/*.nii.gz and dataset/labelsTr/*.nii.gz.
  • Labels must align one-to-one with images by basename.
  • The target label value must be present in the training labels.
  • Use a datalist when patient-level splitting matters. The bundle default fold is 0, so fold: 0 entries are validation and all other folds are training.
  • Every trained foreground label must map to an existing VISTA3D global class id from bundle/label_dict.json; this skill cannot invent a new class.
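Two of the rules above, basename pairing and the global class id requirement, can be checked before any training starts. The sketch below is illustrative: all function names are hypothetical, and the name-to-id shape assumed for bundle/label_dict.json should be confirmed against the actual file.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Tuple


def unpaired(image_names: Iterable[str], label_names: Iterable[str]) -> Dict[str, List[str]]:
    """Report file names that appear on only one side of the image/label pairing."""
    images, labels = set(image_names), set(label_names)
    return {
        "images_without_label": sorted(images - labels),
        "labels_without_image": sorted(labels - images),
    }


def scan_dataset(dataset_dir: str) -> Dict[str, List[str]]:
    """Apply the basename check to the preferred imagesTr/labelsTr layout."""
    root = Path(dataset_dir)
    images = (p.name for p in (root / "imagesTr").glob("*.nii.gz"))
    labels = (p.name for p in (root / "labelsTr").glob("*.nii.gz"))
    return unpaired(images, labels)


def unknown_global_ids(pairs: List[Tuple[int, int]], label_dict: Dict[str, int]) -> List[int]:
    """List mapped global class ids that are absent from the label dictionary.

    Assumption: label_dict maps anatomy name -> global class id.
    """
    known = set(label_dict.values())
    return sorted({global_id for _, global_id in pairs if global_id not in known})
```

Both lists from scan_dataset should be empty for a dataset that satisfies the contract, and unknown_global_ids should return an empty list for a usable label mapping.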

Results

Check output.json in the run directory first:
  • formal_pretrained_val_dice and formal_finetuned_val_dice: original-spacing pre/post scores when formal eval is enabled.
  • training_start_val_dice, val_dice_per_epoch, and training_best_val_dice: training-time validation trace.
  • finetuned_ckpt_matches_pretrained_weights: detects the epoch-0 checkpoint trap when val_at_start=true.
  • recommended_ckpt: checkpoint to keep. Do not blindly use the last epoch or model_finetune.pt.
  • runtime.oom, runtime.peak_gpu_mb, and phase logs: distinguish OOM, slow validation, and process failure.
Decision rule: prefer formal original-spacing pre/post scores when present; reject tensor-identical "fine-tuned" checkpoints for sanity recovery; treat improved: false as valid evidence rather than a wrapper failure.
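The decision rule can be sketched as a function over the parsed output.json. The field names come from the list above; the function names, the flat dictionary layout, the numeric types, and the fallback to the training-time trace when formal scores are missing are assumptions.

```python
import json
from pathlib import Path
from typing import Any, Dict


def load_result(run_dir: str) -> Dict[str, Any]:
    """Read output.json from a run directory."""
    return json.loads((Path(run_dir) / "output.json").read_text())


def summarize_run(result: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the documented decision rule to one result (illustrative sketch)."""
    pre = result.get("formal_pretrained_val_dice")
    post = result.get("formal_finetuned_val_dice")
    basis = "formal"
    if pre is None or post is None:
        # Assumption: fall back to the training-time trace when formal eval is absent.
        pre = result.get("training_start_val_dice")
        post = result.get("training_best_val_dice")
        basis = "training"
    identical = bool(result.get("finetuned_ckpt_matches_pretrained_weights"))
    improved = pre is not None and post is not None and post > pre
    return {
        "basis": basis,
        "pretrained_dice": pre,
        "finetuned_dice": post,
        "improved": improved,  # False is a valid outcome, not an error
        "accept_for_sanity": improved and not identical,
        "checkpoint": result.get("recommended_ckpt"),
    }
```

The function never raises on a run that did not improve, which mirrors the rule that improved: false is evidence to report rather than a wrapper failure.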

Limitations

  • Thin wrapper. Training, validation, transforms, and checkpointing are delegated to the upstream bundle in bundle/.
  • The auto-derived plan is heuristic; caller-provided --patch-size, --cache-rate, --epochs, and --learning-rate win.
  • The Task06 sanity recipe intentionally forces single-GPU execution to match the DFW reference. Multi-GPU mode for other datasets requires host torchrun support.
  • The paired verifier is CPU-only and audits the evidence pack; it does not re-run GPU segmentation.
  • Not for clinical deployment, clinical interpretation, autonomous diagnosis, or regulatory submission.
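The precedence between the auto-derived plan and caller-provided flags amounts to a dictionary merge in which explicit values win. The sketch below shows that idea only; the function name resolve_plan and the key names are hypothetical and do not describe the wrapper's internals.

```python
from typing import Any, Dict, Optional


def resolve_plan(auto_plan: Dict[str, Any], **overrides: Optional[Any]) -> Dict[str, Any]:
    """Merge caller overrides into a heuristic plan; explicit values win.

    Assumption: None means the caller did not pass that flag.
    """
    plan = dict(auto_plan)  # copy so the heuristic plan is left untouched
    for key, value in overrides.items():
        if value is not None:
            plan[key] = value
    return plan
```

For example, resolve_plan(plan, patch_size=[64, 64, 64], epochs=None) replaces only the patch size and keeps the heuristic epoch count.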

Troubleshooting

| Error | Cause | Fix |
| --- | --- | --- |
| Missing dependency or import error | Runtime drift from skill_manifest.yaml. | Install the packages above or use the documented environment. |
| Low Task06 pretrained Dice | Wrong config, wrong checkpoint, data split drift, or dependency drift. | Compare environment fields and staged configs before changing training logic. |
| model_finetune.pt matches pretrained | val_at_start=true selected epoch 0 as best. | Use recommended_ckpt; treat sanity recovery as failed unless a changed checkpoint improves formal Dice. |
| Missing formal Dice fields | Formal eval failed or was skipped. | Inspect eval_pretrained.log, eval_finetuned.log, and metrics.csv. |
| GPU out of memory | Patch/cache settings too large. | Reduce --patch-size, lower --cache-rate, or reduce workers. |
| No validation cases | Datalist lacks fold: 0. | Provide at least one validation entry. |
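The "No validation cases" row can be caught before launch by splitting the datalist entries the same way the data contract describes. This is a sketch under assumptions: the function name split_by_fold is hypothetical, entries are taken to be dictionaries with an integer fold key, and entries without a fold key are treated as training.

```python
from typing import Any, Dict, List, Tuple


def split_by_fold(
    entries: List[Dict[str, Any]], fold: int = 0
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Split datalist entries into (training, validation) by fold index.

    Assumption: entries lacking a "fold" key count as training.
    """
    validation = [e for e in entries if e.get("fold") == fold]
    training = [e for e in entries if e.get("fold") != fold]
    if not validation:
        raise ValueError(f"no validation cases: datalist has no entries with fold {fold}")
    return training, validation
```

Raising early keeps the failure next to the datalist that caused it rather than deep inside a training log.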

Verification

Run the implemented verifier when quality gates matter:
bash
python -m eval_engine.run_trusted skills/nv-segment-ct-finetune \
  --fixture skills/nv-segment-ct-finetune/fixtures/spleen_micro \
  --out runs/nvseg_trusted