tao-analyze-gaps-vlm-bcq

VLM Binary Classification Gap Analysis

Reads a VLM predictions JSON, compares each model response against ground truth, and writes FP/FN failure cases to a JSONL file with a summary report.

Purpose

After running a VLM on a binary yes/no evaluation task, the predictions need to be compared against ground truth to identify failure cases. This skill produces a structured list of FP (false positive) and FN (false negative) samples that downstream RCCA stages (e.g., cosmos generation, root cause analysis) consume to drive a DEFT iteration.

Usage

Invoke the `vlm_bcq` action inside the TAO Toolkit data services container with Hydra-style `key=value` overrides:

```bash
gap_analysis vlm_bcq \
  predictions_json=/path/to/results.json \
  results_dir=/path/to/output/gaps
```

Include `videos_dir` when `video_id` values in the predictions are relative paths:

```bash
gap_analysis vlm_bcq \
  predictions_json=/path/to/results.json \
  results_dir=/path/to/output/gaps \
  videos_dir=/path/to/videos/root
```

After the run, surface the FP/FN counts from `kpi_gaps_report.txt` and point downstream stages at `kpi_gaps.jsonl`.
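As a rough illustration of how a downstream stage might consume the gaps file, the sketch below reads `kpi_gaps.jsonl` and tallies the `error_type` values. The helper names `load_gaps` and `count_error_types` are hypothetical and not part of the toolkit; only the file name and the `error_type` field come from this document.

```python
import json
from collections import Counter
from pathlib import Path


def load_gaps(jsonl_path):
    """Read one JSON object per non-empty line from a gaps file."""
    records = []
    with open(jsonl_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def count_error_types(records):
    """Count records per error_type (expected values: 'FP' and 'FN')."""
    return Counter(rec["error_type"] for rec in records)


if __name__ == "__main__":
    # Placeholder path matching the results_dir used in the commands above.
    gaps_file = Path("/path/to/output/gaps") / "kpi_gaps.jsonl"
    if gaps_file.exists():
        counts = count_error_types(load_gaps(gaps_file))
        print(f"FP={counts['FP']} FN={counts['FN']}")
    else:
        # The skill writes no files when no gaps are found.
        print("No gaps file found; nothing to hand downstream.")
```

Checking that the file exists before reading it matters here, because a run with no failure cases produces no output files at all.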

Inputs

predictions_json: Path to the predictions JSON file. Must be a JSON array where each item has `video_id`, `response`, and `gt` fields. `response` and `gt` are parsed with word-boundary matching: `'yes'` or `'no'` anywhere in the string is recognized. Samples where both or neither are present are skipped with a warning.
videos_dir (optional): Base directory for resolving relative `video_id` paths. If omitted, `video_id` values are used as absolute paths.

Predictions JSON format:

```json
[
  {
    "video_id": "/path/to/video.mp4",
    "response": "Yes, there is a collision.",
    "gt": "B. No",
    "question": "Is there a collision?"
  }
]
```
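The word-boundary parsing of `response` and `gt` described above can be sketched roughly as follows. This is an illustration, not the toolkit's implementation: the function name `parse_yes_no` is hypothetical, and case-insensitive matching is an assumption inferred from the example values `"Yes, ..."` and `"B. No"`.

```python
import re

# Assumption: matching is case-insensitive, so "Yes" and "NO" both count.
YES_RE = re.compile(r"\byes\b", re.IGNORECASE)
NO_RE = re.compile(r"\bno\b", re.IGNORECASE)


def parse_yes_no(text):
    """Return True for yes, False for no, None when ambiguous or absent."""
    has_yes = bool(YES_RE.search(text))
    has_no = bool(NO_RE.search(text))
    if has_yes == has_no:
        # Both words present, or neither: such samples would be skipped.
        return None
    return has_yes


if __name__ == "__main__":
    for sample in ("Yes, there is a collision.", "B. No", "Yes and no", "Nothing"):
        print(repr(sample), "->", parse_yes_no(sample))
```

The `\b` anchors keep words such as "Nothing" or "yesterday" from being read as an answer, which is why a response with no standalone yes/no ends up in the skipped set.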

Outputs

kpi_gaps.jsonl: One JSON object per line for each FP/FN case. Fields: `video_id` (absolute path), `error_type` (`FP` or `FN`), `question`, `ground_truth`, `response`.
kpi_gaps_report.txt: Human-readable table with total FP/FN counts.

If no gaps are found, no files are written and a message is logged.
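To make the record layout concrete, here is a minimal sketch of how one failure case could be classified and turned into a `kpi_gaps.jsonl` line. All function names are hypothetical. The sketch assumes "yes" is the positive class (so a yes prediction against a no label is an FP), that `videos_dir` is only joined onto relative `video_id` values, and that a missing `question` falls back to an empty string; none of these details are confirmed by the source.

```python
import json
import os


def classify(pred_yes, gt_yes):
    """Return 'FP', 'FN', or None for a correct prediction.

    Assumes 'yes' is the positive class.
    """
    if pred_yes and not gt_yes:
        return "FP"
    if gt_yes and not pred_yes:
        return "FN"
    return None


def resolve_video_path(video_id, videos_dir=None):
    """Join relative video_id values onto videos_dir (assumed behavior)."""
    if videos_dir and not os.path.isabs(video_id):
        return os.path.join(videos_dir, video_id)
    return video_id


def make_gap_record(item, error_type, videos_dir=None):
    """Build one output record with the fields listed above."""
    return {
        "video_id": resolve_video_path(item["video_id"], videos_dir),
        "error_type": error_type,
        "question": item.get("question", ""),  # assumed default
        "ground_truth": item["gt"],
        "response": item["response"],
    }


if __name__ == "__main__":
    item = {
        "video_id": "/path/to/video.mp4",
        "response": "Yes, there is a collision.",
        "gt": "B. No",
        "question": "Is there a collision?",
    }
    # Model said yes, ground truth says no: a false positive.
    error_type = classify(pred_yes=True, gt_yes=False)
    print(json.dumps(make_gap_record(item, error_type)))
```

Each record is serialized with `json.dumps` on its own line, which is what makes the output valid JSONL.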

Key Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| `predictions_json` | Yes | Path to the predictions JSON file |
| `results_dir` | Yes | Output directory; created if it does not exist |
| `videos_dir` | No | Base directory for resolving relative `video_id` paths |

Error Patterns

| Error | Cause | Fix |
| --- | --- | --- |
| `FileNotFoundError` | `predictions_json` does not exist | Check the path |
| `ValueError: must be a JSON array` | Predictions file is not a list | Wrap predictions in `[...]` |
| `ValueError: missing 'gt'/'response'/'video_id'` | A prediction item is missing a required field | Inspect and fix the predictions JSON |
| Samples skipped with only a warning | `response` or `gt` contains both or neither of 'yes'/'no' | Check logs for warnings; inspect those samples |
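Most of the errors in the table can be caught before launching the action with a quick pre-flight check. The sketch below is a standalone illustration under that assumption; `check_predictions` is a hypothetical helper, and its messages are not the toolkit's actual error strings.

```python
import json

REQUIRED_FIELDS = ("video_id", "response", "gt")


def check_predictions(path):
    """Return a list of human-readable problems; empty if none are found."""
    # open() raises FileNotFoundError if the path does not exist.
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, list):
        return ["top-level value is not a JSON array"]
    problems = []
    for index, item in enumerate(data):
        if not isinstance(item, dict):
            problems.append(f"item {index}: not a JSON object")
            continue
        missing = [name for name in REQUIRED_FIELDS if name not in item]
        if missing:
            problems.append(f"item {index}: missing {', '.join(missing)}")
    return problems


if __name__ == "__main__":
    try:
        issues = check_predictions("/path/to/results.json")
    except FileNotFoundError:
        issues = ["predictions file not found"]
    print("\n".join(issues) if issues else "predictions file looks well-formed")
```

Collecting every problem instead of stopping at the first one makes it easier to repair a large predictions file in a single pass.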
