tao-analyze-gaps-visual-changenet
TAO VCN Classify Gap Analysis Skill
You are an analyst for NVIDIA TAO VCN Classify (Visual ChangeNet) inference results. Your job is to identify the weakest samples per ground-truth label by measuring signed distance from the decision threshold in the wrong direction, then surface them for downstream augmentation or relabeling.
This skill is intentionally lightweight. VCN's classify head is a single-score binary boundary (PASS vs NO_PASS by `siamese_score`), so the analysis is computational, not investigative. The whole computation lives behind one direct `docker run` invocation against the `tao_toolkit.data_services` image declared in `versions.yaml` (resolved at runtime; see Setup). The container's entrypoint takes `<category> <action> [hydra overrides...]`; we pass `gap_analysis vcn_aoi key=value …`. You do not need subagents, multi-phase image audits, or component-type clustering, because VCN does not expose those dimensions. View only a small set of representative weak samples to qualify the gaps after the container returns.

The CLI surface can shift between data-services container builds. If a `gap_analysis vcn_aoi` invocation fails on argument parsing, introspect the actual schema once per image with `docker run --rm "$DS_IMAGE" gap_analysis vcn_aoi --cfg=job` and reconcile any renamed keys before retrying. See `references/troubleshooting.md` for the key-rename reconciliation and the full pitfalls list. The output parquet name is `kpi_gaps.parquet`.
Inputs
- Experiment result directory: contains `inference/inference.csv` (required columns `input_path`, `object_name`, `label`, `siamese_score`). Pass the directory (e.g. `inference/latest/`), not the CSV file.
- Training code/config directory: contains the VCN train YAML; the container reads `dataset.classify.input_map` and `dataset.classify.image_ext` for per-lighting expansion.
- Dataset directory: image root (`kpi_media_path`) prepended to each row's relative `input_path`.
- Schema overrides: `min_recall`, `top_k_per_label`, and optionally a hard-pinned `threshold`, passed as Hydra overrides (defaults: `min_recall=1.0`, `top_k_per_label=50`, `threshold=-1.0` meaning sweep). `top_k_per_label` must be a positive integer. Omitting it flips the container into "below-threshold filter" mode, which at `min_recall=1.0` returns only PASS misclassifications and zero NO_PASS rows.

See `references/parameters-and-artifacts.md` for the full input detail, the `GapAnalysisConfig` override semantics, and the per-default explanation.
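Before launching the container it can help to confirm that `inference.csv` carries the four required columns listed above. The following is a minimal sketch rather than part of the skill: the helper name `missing_columns` is hypothetical, and it assumes the CSV starts with a header row.

```python
import csv
import io

# Required columns as listed in the Inputs section.
REQUIRED_COLUMNS = ("input_path", "object_name", "label", "siamese_score")


def missing_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header.

    `csv_file` is any open text file object; only the header row is read.
    Hypothetical helper for illustration, not a TAO API.
    """
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Example with an in-memory CSV whose header lacks `siamese_score`.
sample = io.StringIO("input_path,object_name,label\nimg/a,C1,PASS\n")
print(missing_columns(sample))
```

On a real run, open `<experiment_result_dir>/inference/inference.csv` with `open(path, newline="")` and stop before the `docker run` call if the returned list is non-empty.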
Setup
The threshold sweep, weakness ranking, and per-lighting expansion all run inside the `tao_toolkit.data_services` image declared in `versions.yaml`. Resolve the concrete URI once at the top of the run, then confirm Docker, the NVIDIA container toolkit, and a GPU are present and ensure the image is cached:

```bash
# Resolve tao_toolkit.data_services → concrete nvcr.io/... URI from versions.yaml
DS_IMAGE=$(python3 -c "import yaml,os; print(yaml.safe_load(open(os.environ['TAO_SKILL_BANK_PATH']+'/versions.yaml'))['images']['tao_toolkit']['data_services'])")
echo "DS_IMAGE=$DS_IMAGE"
docker info > /dev/null && echo "OK: docker"
nvidia-smi > /dev/null && echo "OK: GPU"
# Pull only when the image is not already cached locally
docker image inspect "$DS_IMAGE" > /dev/null \
  || docker pull "$DS_IMAGE"
```

`TAO_SKILL_BANK_PATH` is exported by the plugin's `session_start` hook. If it is unset (e.g. running outside the Claude Code plugin), point it at the skill-bank repo root before resolving.
A GPU is required (the same image is used across the AOI loop and other actions assume CUDA is present). Aborting early on a GPU-less host saves a confusing late error.
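The early-abort idea can also be expressed in Python when the surrounding tooling is a script rather than a shell session. This is a minimal sketch under stated assumptions: `missing_tools` is a hypothetical helper, and it only checks that the executables are on `PATH`, which is a weaker test than `docker info` and `nvidia-smi` actually succeeding.

```python
import shutil


def missing_tools(required=("docker", "nvidia-smi"), which=shutil.which):
    """Return the executables from `required` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without Docker or a
    GPU driver installed. Hypothetical helper for illustration only.
    """
    return [tool for tool in required if which(tool) is None]
```

A caller would abort with a clear message when `missing_tools()` returns a non-empty list, and otherwise continue with the `docker info` and `nvidia-smi` checks shown above.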
**Path mounting.** Every host path the container reads or writes — `inference.csv`, the train YAML, the dataset image root, and the output dir — must be bind-mounted. The simplest pattern is to mount the workspace root with **identical paths** inside and outside the container so absolute paths in args resolve the same on both sides:
```bash
WORKSPACE=<absolute path that contains inference.csv, train YAML, dataset images, and the output dir>
DOCKER="docker run --gpus all --rm --ipc=host --user $(id -u):$(id -g) -v $WORKSPACE:$WORKSPACE -w $WORKSPACE $DS_IMAGE"
```

If `inference.csv`, the train YAML, and the dataset images live in different roots, pass multiple `-v` flags, but every absolute path you pass in args must resolve inside the container.

CLI overrides cover the common case. `min_recall`, `top_k_per_label`, and optionally `threshold` are passed as Hydra overrides on the command line; defaults baked into the container (`min_recall=1.0`, `top_k_per_label=50`, `threshold=-1.0` to sweep) handle most runs. If the container also accepts a spec file via `-e <spec>` (verify with `--cfg=job`), passing one is a convenience, not a requirement: override only what you need.
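For the multi-root case described above, the `-v` flags can be derived from the list of host directories instead of being typed by hand. The sketch below is illustrative only: `mount_flags` is a hypothetical helper, and it assumes absolute POSIX paths that should appear at the same location inside the container.

```python
from pathlib import PurePosixPath


def mount_flags(host_dirs):
    """Build `-v src:dst` flags that mount each directory at an identical path.

    Directories already covered by a mounted parent are skipped, so nested
    inputs do not produce redundant flags. Hypothetical helper, not a TAO API.
    """
    # Parents have fewer path parts than their children, so they sort first.
    paths = sorted({PurePosixPath(d) for d in host_dirs},
                   key=lambda p: (len(p.parts), str(p)))
    roots = []
    for path in paths:
        if not path.is_absolute():
            raise ValueError(f"not an absolute path: {path}")
        if any(path == root or root in path.parents for root in roots):
            continue  # already reachable through a mounted parent
        roots.append(path)
    flags = []
    for root in roots:
        flags += ["-v", f"{root}:{root}"]
    return flags


print(mount_flags(["/data/exp", "/data/exp/inference", "/datasets/aoi"]))
```

The returned list can be spliced into the `docker run` argument list in place of the single `-v $WORKSPACE:$WORKSPACE` pair.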
Method
The whole skill is a single `docker run` invocation followed by a small visual spot-check. The container does Steps 1–4 internally (threshold sweep, weakness scoring, top-K selection, per-lighting expansion). You handle Step 5 (visual spot-check) directly with the Read tool.
Step 1–4: Run the container
```bash
$DOCKER gap_analysis vcn_aoi \
  inference_results_dir=<exp_dir>/inference/<label>/ \
  train_config=<exp_dir>/train.yaml \
  kpi_media_path=<dataset_root> \
  results_dir=<rca_results_dir> \
  top_k_per_label=50
```

Always pass `top_k_per_label`. This is the argument that switches the container from the default "samples below threshold" filter into proper top-K-per-label ranking. At `min_recall=1.0` the threshold is by construction at or below every NO_PASS score, so the below-threshold filter returns ONLY misclassified PASS rows and zero NO_PASS rows, which is useless as an augmentation queue. With `top_k_per_label` set to a positive integer (either in the spec or as a Hydra override), the container computes signed weakness against the threshold for every row and surfaces the K weakest per ground-truth label, which is the per-label ranked output downstream steps consume.
The container sweeps every unique `siamese_score` (plus one value just below the minimum), keeps candidates with NO_PASS recall ≥ `min_recall` (tolerance `1e-12`), picks the best-F1 threshold (tie-break: precision, then threshold value), scores signed weakness per row, takes the top `top_k_per_label` per ground-truth label, and expands each into one row per lighting. See `references/parameters-and-artifacts.md` for the exact computation, the override defaults, and the artifact table.

If no candidate threshold meets the recall target, the container exits non-zero and writes `unreachable_kpi.txt` into `results_dir` explaining which recall the model can actually achieve. In that case, stop the analysis after the docker call, write a one-section report explaining the model fundamentally cannot reach the KPI at any operating point, and recommend retraining or relabeling. Skip the visual spot-check.

Container writes into `results_dir`: `kpi_gaps.parquet` (top-K weakest per label, expanded per lighting; columns `filepath`, `label`, `siamese_score`, `weakness`), `threshold.txt`, `metrics.json`, `weak_samples_breakdown.txt`, and `unreachable_kpi.txt` (only when the recall target is unreachable). See `references/parameters-and-artifacts.md` for the per-artifact contents. Print the container's stdout summary (chosen threshold, kept-row counts, per-label breakdown) to your own stdout so the script-check hook can verify the run produced output.
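The threshold selection and weakness scoring described above can be sketched in plain Python. This is not the container's code; the exact computation lives in `references/parameters-and-artifacts.md`. The sketch makes three assumptions that the text does not confirm: a row is predicted NO_PASS when `siamese_score >= threshold`, NO_PASS is the positive class for precision and F1, and the final tie-break prefers the higher threshold. All function names are hypothetical.

```python
import math


def choose_threshold(rows, min_recall=1.0, tol=1e-12):
    """Pick the best-F1 threshold whose NO_PASS recall meets `min_recall`.

    `rows` is a list of (label, siamese_score) pairs. ASSUMPTION: a row is
    predicted NO_PASS when score >= threshold. Returns None when no candidate
    reaches the recall target.
    """
    if not rows:
        return None
    scores = sorted({score for _, score in rows})
    # Every unique score, plus one value just below the minimum.
    candidates = [math.nextafter(scores[0], -math.inf)] + scores
    best = None
    for t in candidates:
        tp = sum(1 for lab, s in rows if lab == "NO_PASS" and s >= t)
        fp = sum(1 for lab, s in rows if lab != "NO_PASS" and s >= t)
        fn = sum(1 for lab, s in rows if lab == "NO_PASS" and s < t)
        recall = tp / (tp + fn) if tp + fn else 0.0
        if recall + tol < min_recall:
            continue  # candidate misses the recall target
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # ASSUMPTION: the higher threshold wins the last tie-break.
        key = (f1, precision, t)
        if best is None or key > best[0]:
            best = (key, t)
    return None if best is None else best[1]


def weakness(label, score, threshold):
    """Signed distance past the threshold in the wrong direction.

    Positive values lie on the misclassified side under the assumption above.
    """
    return score - threshold if label == "PASS" else threshold - score


def top_k_weakest(rows, threshold, k):
    """Return up to k (label, score, weakness) tuples per label, weakest first."""
    by_label = {}
    for label, score in rows:
        entry = (label, score, weakness(label, score, threshold))
        by_label.setdefault(label, []).append(entry)
    return {
        label: sorted(items, key=lambda r: r[2], reverse=True)[:k]
        for label, items in by_label.items()
    }
```

Under these assumptions, a PASS row scoring above the threshold and a NO_PASS row scoring below it both receive positive weakness, which is why ranking by weakness in descending order surfaces the misclassified rows first for each label.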
Step 5: Visual spot check (small, fixed)
Skip this step if `unreachable_kpi.txt` exists in `results_dir`: there is nothing meaningful to spot-check when the model can't reach the KPI at any threshold.

Otherwise, use the Read tool to view the test images for:
- The 5 weakest PASS samples (the top of the "PASS misclassified as NO_PASS" pile): pick by sorting `kpi_gaps.parquet` rows where `label == 'PASS'` by `weakness` descending.
- The 5 weakest NO_PASS samples (the top of the "NO_PASS misclassified as PASS" pile): same, with `label != 'PASS'`.

Take each image path from the `filepath` column of `kpi_gaps.parquet`. Classify each viewed sample as exactly one of:
- mislabeled — visual content disagrees with the CSV label
- edge case — genuinely ambiguous boundary case
- data quality — corrupted, dark, wrong crop, bad framing
- systematic — model has learned the wrong feature (the image looks "obviously PASS/NO_PASS" but the model disagrees)
Copy each viewed image (resized to 128×128 if PIL is available, otherwise just copy) into `<results_dir>/rca_images/` so it can be embedded inline in the report.

This is the only image inspection required. Do not view dozens of images, do not run failure mode clustering, do not audit goldens, because VCN does not have golden images.
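The selection and copy steps above can be sketched as follows. This is a minimal illustration, not the skill's recipe: both helper names are hypothetical, rows are assumed to be dictionaries with `label`, `weakness`, and `filepath` keys, and files sharing a basename would overwrite each other in the destination folder.

```python
import shutil
from pathlib import Path


def pick_spot_check_rows(rows, per_label=5):
    """Return (weakest PASS rows, weakest non-PASS rows), weakest first."""
    def weakest(subset):
        return sorted(subset, key=lambda r: r["weakness"], reverse=True)[:per_label]

    pass_rows = [r for r in rows if r["label"] == "PASS"]
    other_rows = [r for r in rows if r["label"] != "PASS"]
    return weakest(pass_rows), weakest(other_rows)


def copy_for_report(src, dest_dir, size=(128, 128)):
    """Copy an image into dest_dir, resizing it when Pillow is installed."""
    dest = Path(dest_dir) / Path(src).name
    dest.parent.mkdir(parents=True, exist_ok=True)
    try:
        from PIL import Image  # optional dependency
    except ImportError:
        shutil.copy2(src, dest)  # fall back to a plain copy
        return dest
    with Image.open(src) as img:
        img.resize(size).save(dest)
    return dest
```

Assuming pandas and a parquet engine are installed, the rows can be loaded with `pandas.read_parquet(path).to_dict("records")`. Because `kpi_gaps.parquet` is expanded per lighting, the same sample may appear more than once among the selected rows.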
Reference invocation
The paste-and-edit end-to-end recipe (workspace, four paths, two numeric knobs, spec-file write, docker run, and the stdout sanity print that surfaces row counts for the script-check hook) lives in `references/recipe.md`. Use it verbatim, editing only the workspace, paths, and knobs.
Outputs
Write everything into a timestamped folder under the experiment result directory: `<experiment_result_dir>/rca_results/YYYY-MM-DD_HHMMSS/`. The container's outputs (`kpi_gaps.parquet`, `threshold.txt`, `metrics.json`, `weak_samples_breakdown.txt`, and `unreachable_kpi.txt` when applicable) go straight there; the visual spot-check writes `rca_images/`; the packaging hook adds `rca_config/` and `claude_session.jsonl` automatically when `RCA_Report.md` is written. See `references/parameters-and-artifacts.md` for the full folder tree.

At the start of the run, get the real timestamp by running `date +%Y-%m-%d_%H%M%S` in Bash. Do NOT hardcode or guess. If the user specifies a custom output path, use that instead but maintain the same internal structure.
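Where the run is driven from Python rather than Bash, the same folder name can be built with the standard library. This is a sketch under that assumption; `rca_output_dir` is a hypothetical helper, and it uses local time in the same format as the `date` command above.

```python
from datetime import datetime
from pathlib import Path


def rca_output_dir(experiment_result_dir, now=None):
    """Return <experiment_result_dir>/rca_results/YYYY-MM-DD_HHMMSS.

    The directory is not created here. `now` defaults to the current local
    time and is injectable so the naming can be checked deterministically.
    """
    stamp = (now or datetime.now()).strftime("%Y-%m-%d_%H%M%S")
    return Path(experiment_result_dir) / "rca_results" / stamp
```

Calling `rca_output_dir(exp_dir).mkdir(parents=True, exist_ok=True)` once at the start of the run creates the folder; reuse the returned path for every later write so all artifacts share one timestamp.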
Common pitfalls
The most consequential failure is forgetting `top_k_per_label` when `min_recall=1.0`. At that recall the chosen threshold sits at or below every NO_PASS score, so the fallback below-threshold filter matches ONLY misclassified PASS rows and `kpi_gaps.parquet` ends up with zero NO_PASS rows. Always include an explicit positive `top_k_per_label`. The full pitfalls list (spec file outside `$WORKSPACE`, unresolved `???` sentinels, wrong/unpulled image tag, path-mount mismatch, `unreachable_kpi.txt` handling, missing `inference.csv` columns, missing train-YAML keys, `kpi_media_path` prefix mismatch, no GPU inside the container) and the CLI-drift reconciliation are in `references/troubleshooting.md`.
Report Structure
Write the RCA report into the timestamped output folder. It is a 7-section computational gap analysis (Verdict, Threshold Selection, Weakness Distribution, Top-K Weakest Samples, Visual Spot Check, Per-Label Breakdown, Recommended Actions), 1000–1800 words, with the confusion-matrix and per-label tables filled from `metrics.json` and the spot-check rows from `kpi_gaps.parquet`. When `unreachable_kpi.txt` exists, replace sections 3–6 with one short section quoting that file and collapse section 7 to a single retrain-or-relabel recommendation. See `references/rca-report-structure.md` for the complete skeleton with every section heading, table layout, and the unreachable-KPI variant.
Execution Order
1. Resolve `DS_IMAGE` from `versions.yaml` (`images.tao_toolkit.data_services`), then run `docker info`, `nvidia-smi`, and `docker image inspect "$DS_IMAGE"` (pulling if missing) once to confirm the environment. Abort with a clear message if any fail.
2. Run `date +%Y-%m-%d_%H%M%S` to get the timestamp; create `<experiment_result_dir>/rca_results/<timestamp>/`.
3. Write `vcn_aoi_spec.yaml` into the timestamped dir with `min_recall` and `top_k_per_label` filled in. Keep it under `$WORKSPACE` so the `-e` path resolves inside the container.
4. Run `docker run … "$DS_IMAGE" gap_analysis vcn_aoi -e vcn_aoi_spec.yaml inference_results_dir=… train_config=… kpi_media_path=… results_dir=…`. The container writes `kpi_gaps.parquet`, `threshold.txt`, `metrics.json`, and `weak_samples_breakdown.txt` into `results_dir`. Print the chosen threshold and kept-row counts to stdout so the script-check hook can verify the run produced output.
5. If `unreachable_kpi.txt` exists, skip Step 6 and write the abridged report. Otherwise continue.
6. Pick 10 weak samples (5 weakest PASS + 5 weakest NO_PASS) from `kpi_gaps.parquet`, view each test image with Read, classify, and copy each into `rca_images/`.
7. Write `RCA_Report.md` last. Writing it triggers the packaging hook, which copies session logs and skill config alongside.