tao-train-fast-foundation-stereo

Depth Net Fast Stereo

Real-time stereo depth estimation using FastFoundationStereo (FFS), the bp2 commercial distilled variant of FoundationStereo. FFS predicts disparity maps from rectified stereo image pairs, using pruned per-layer widths to reach real-time inference.

The mono / stereo / fast-stereo skills share the unified TAO `depth_net` CLI; FFS is selected via `model.model_type: FastFoundationStereo`. FFS differs from `FoundationStereo` only in its pruned per-layer widths and a serialized forward path; everything else (entrypoint, action verbs, dataset classes, deploy chain) is identical to `depth-net-stereo`.

For TAO Deploy TensorRT actions (`gen_trt_engine`, TensorRT `evaluate`, TensorRT `inference`), read `references/tao-deploy-fast-foundation-stereo.md` first. The deploy spec template lives at `references/spec_template_deploy.yaml`.

Train Action Policy

This model is AutoML-enabled at the model layer. Before handling any train-stage request, read `references/skill_info.yaml` and resolve the run override from either an explicit `automl_policy` value or the user's workflow request. Treat phrases like "turn off AutoML", "disable AutoML", "no HPO", or "plain training" as `automl_policy: off` for this run only; otherwise default to `auto`. When `automl_policy: auto`, `automl_enabled: true`, and both `schemas/train.schema.json` and `references/spec_template_train.yaml` are packaged, route the train action through `tao-skill-bank:tao-run-automl` by default with this model's `skill_dir`. Preserve workflow/application overrides for datasets, specs, output directories, GPU/platform settings, parent checkpoints, and `automl_policy`. Use direct model training only when `automl_policy: off` or the packaged train schema/template is missing; in the missing-schema case, report that AutoML is enabled but not runnable for this model until schemas are generated.

Non-train actions such as `evaluate`, `inference`, `export`, and deploy flows stay in this model skill. The per-run `automl_policy` override does not change model metadata.
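
The routing rule above can be sketched in a few lines of Python. This is only an illustration of the decision logic as described in this section: the function names `resolve_automl_policy` and `train_route` are hypothetical, the phrase matching is a naive substring check, and nothing here is part of the TAO SDK.

```python
# Hypothetical helpers that mirror the policy text; not a TAO API.
OFF_PHRASES = ("turn off automl", "disable automl", "no hpo", "plain training")


def resolve_automl_policy(request_text, explicit_policy=None):
    """Return the per-run policy: an explicit value wins, else phrase match, else 'auto'."""
    if explicit_policy is not None:
        return explicit_policy
    lowered = request_text.lower()
    if any(phrase in lowered for phrase in OFF_PHRASES):
        return "off"
    return "auto"


def train_route(policy, automl_enabled, has_schema, has_template):
    """Pick where the train action goes and whether a missing-schema report is needed."""
    if policy == "off" or not automl_enabled:
        return "direct"
    if has_schema and has_template:
        return "automl"
    # AutoML is enabled but not runnable until the schema/template are packaged.
    return "direct-report-missing-schema"
```

For example, `resolve_automl_policy("plain training on my KITTI data")` yields `"off"`, so `train_route` falls back to direct training for that run without touching model metadata.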

Two Use Cases

FFS ships with a pre-trained bp2 commercial checkpoint (`model_best_bp2_serialize.pth`).
  1. Raw deploy: use the bp2 ckpt as-is. Skip `train`; run `inference` / `evaluate` / `export` / `gen_trt_engine` directly with the bp2 file as the action's checkpoint.
  2. Finetune on user data: set `train.pretrained_model_path` to the bp2 file, train on user data, then verify and deploy the resulting ckpt. The full 7-action sequence (train → evaluate pyt → inference pyt → export → gen_trt_engine → inference deploy → evaluate deploy) is supported.

Workflow

Prerequisites — data accessibility

Your dataset (left + right images + GT disparity for train / evaluate, left + right only for inference) must be reachable from inside the container:
  • SDK runner: place files at the S3 paths the runner resolves (the `S3_TRAIN` / `S3_EVAL` placeholders shown in the spec overrides).
  • Direct `docker run` (e.g. local testing): mount the host dataset root read-only at the same in-container path:
docker run ... -v <host_data_root>:<host_data_root>:ro <container> ...
The same accessibility requirement applies to the `<output_dir>` written by all actions, and to the bp2 checkpoint path.

Step 1 — Annotation file

Per-line annotation file referenced by `data_sources[*].data_file`. The schema is identical to `depth-net-stereo`:

| Columns | Format | Use |
|---|---|---|
| 2 | `<left> <right>` | Stereo inference (no GT) |
| 3 | `<left> <right> <disparity>` | Stereo with GT |
| 4 | `<left> <right> <disparity> <occlusion_mask>` | Stereo with GT and occlusion mask |

Generate via `depth_net convert` if needed; see the `depth-net-stereo` skill for the `convert_spec.yaml` template.
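
Before launching an action it can help to sanity-check the annotation file against the column schema above. The sketch below assumes columns are separated by whitespace and that paths contain no spaces; both are assumptions, and `classify_annotation_line` is a hypothetical helper rather than part of `depth_net`.

```python
# Hypothetical pre-flight check; assumes whitespace-separated columns.
COLUMN_KINDS = {
    2: "inference (no GT)",
    3: "GT",
    4: "GT + occlusion mask",
}


def classify_annotation_line(line):
    """Return the schema kind for one annotation line, or raise on a bad column count."""
    fields = line.split()
    if len(fields) not in COLUMN_KINDS:
        raise ValueError(f"expected 2, 3 or 4 columns, got {len(fields)}: {line!r}")
    return COLUMN_KINDS[len(fields)]


def check_annotation_lines(lines):
    """Ensure every non-empty line uses the same column count; return that kind."""
    kinds = {classify_annotation_line(line) for line in lines if line.strip()}
    if len(kinds) != 1:
        raise ValueError(f"mixed or empty annotation schema: {sorted(kinds)}")
    return kinds.pop()
```

Requiring a single column count per file is a design choice of this sketch, not a documented rule; relax it if your files legitimately mix schemas.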

Step 2 — Pair `model_type` and `dataset_name` based on your data

Use `model_type: FastFoundationStereo` for FFS. The `dataset_name` choice mirrors the stereo skill: pick the dataset-specific class when your layout matches a registered one, otherwise `GenericDataset`.

| Data category | `model_type` | `dataset_name` |
|---|---|---|
| Middlebury | `FastFoundationStereo` | `Middlebury` |
| KITTI | `FastFoundationStereo` | `Kitti` |
| ETH3D | `FastFoundationStereo` | `Eth3d` |
| FSD synthetic | `FastFoundationStereo` | `FSD` |
| IsaacReal synthetic | `FastFoundationStereo` | `IsaacRealDataset` |
| Crestereo synthetic | `FastFoundationStereo` | `Crestereo` |
| Other / non-canonical | `FastFoundationStereo` | `GenericDataset` |

For inference with 2-column annotations (left + right, no GT), use `dataset_name: GenericDataset` regardless of layout.

Step 3 — Set the bp2 distilled width overrides

FFS requires 15 model-section width override fields whose values match the bp2 commercial checkpoint exactly. Omitting any field falls back to TAO defaults that do not match the bp2 ckpt and produce shape-mismatch errors at forward time.
yaml
model:
  model_type: FastFoundationStereo
  encoder: vitl
  hidden_dims: [128]                    # 1-layer GRU; NOT [128,128,128]
  n_gru_layers: 1                       # bp2 single-GRU
  corr_radius: 4
  corr_levels: 2
  n_downsample: 2
  valid_iters: 8
  max_disparity: 192                    # bp2 commercial; NOT 416 (full FS default)
  volume_dim: 28                       # bp2 ckpt invariant; NOT 32 (full FS default)
  mixed_precision: false                # see references/parameters.md
  gwc_feature_normalize: true           # see references/parameters.md

  # 15 bp2 distilled width overrides — copy as-is
  motion_encoder_widths: [56, 96, 16, 12]
  motion_encoder_final: 48
  gru_hidden: 60
  gru_gating_conv_widths: [100, 168]
  disp_head_input_dim: 60
  disp_head_intermediate: 36
  disp_head_pwconv1_widths: [212, 244]
  mask_widths: [32, 16]
  stem_2_widths: [12, 16]
  spx_2_gru_widths: [16, 12, 16, 24]
  spx_gru_out: 9
  classifier_mid: 14
  cnet_conv04_widths: [60, 48]
  cam_mid_channels: 8
  cost_agg_conv_patch_padding: [0, 0, 0]
The spec templates at `references/spec_template_*.yaml` carry this block as the canonical source.
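
Because a single omitted width field only surfaces as a shape mismatch at forward time, a cheap pre-flight check on the `model` section can catch it earlier. The sketch below takes an already-parsed spec as a plain dict (how you load the YAML is up to you); `missing_width_fields` is a hypothetical helper, and the field names are copied from the block above.

```python
# The 15 bp2 width override fields listed in Step 3.
REQUIRED_WIDTH_FIELDS = (
    "motion_encoder_widths",
    "motion_encoder_final",
    "gru_hidden",
    "gru_gating_conv_widths",
    "disp_head_input_dim",
    "disp_head_intermediate",
    "disp_head_pwconv1_widths",
    "mask_widths",
    "stem_2_widths",
    "spx_2_gru_widths",
    "spx_gru_out",
    "classifier_mid",
    "cnet_conv04_widths",
    "cam_mid_channels",
    "cost_agg_conv_patch_padding",
)


def missing_width_fields(model_section):
    """Return the width override fields absent from a parsed `model:` dict."""
    return [name for name in REQUIRED_WIDTH_FIELDS if name not in model_section]
```

An empty list means every override is present; it does not verify that the values match the bp2 checkpoint, so still copy them from the spec template.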

Step 4 — Write spec yaml from the spec overrides

Copy the action block from `references/spec-overrides.md` (per-action Python override dicts plus the shared `FFS_MODEL_BLOCK`). Replace:
  • `model.model_type: FastFoundationStereo` (already set)
  • `dataset.<...>.data_sources[*].dataset_name` with the value from Step 2
  • `dataset.<...>.data_sources[*].data_file` with the path from Step 1
  • For raw deploy use cases (no train): set `<action>.checkpoint` to the bp2 file path
  • For finetune use cases: set `train.pretrained_model_path` to the bp2 file path

Chained train → next action checkpoint path: for local Docker chaining (no SDK runner), the trained checkpoint lives at `<train.results_dir>/<task>/dn_model_latest.pth`, because Lightning `ModelCheckpoint` nests under the task name. Example: `train.results_dir: /workspace/results/finetune/train` produces `/workspace/results/finetune/train/train/dn_model_latest.pth`. Use that nested path for the next action's `<action>.checkpoint`. SDK-runner deploys resolve this automatically via `parent_job_id`; see `references/parent-model-inference.md`.

Shape consistency: `crop_size` in `dataset.test_dataset.augmentation.crop_size` should match `export.input_height` / `input_width` for end-to-end pyt-vs-deploy comparability; see the shape table in `references/tao-deploy-fast-foundation-stereo.md`.
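
The nested checkpoint path for local Docker chaining is easy to get wrong by one directory level. A minimal sketch of the path rule stated above, using a hypothetical helper name `chained_checkpoint_path` and assuming POSIX paths inside the container:

```python
from pathlib import PurePosixPath


def chained_checkpoint_path(results_dir, task="train"):
    """Build <results_dir>/<task>/dn_model_latest.pth for the next action's checkpoint."""
    return str(PurePosixPath(results_dir) / task / "dn_model_latest.pth")


# Matches the example in the text: the task name is appended even though
# results_dir already ends in "train".
print(chained_checkpoint_path("/workspace/results/finetune/train"))
# → /workspace/results/finetune/train/train/dn_model_latest.pth
```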

Step 5 — Run

docker run --gpus 'device=0' --shm-size 16G --ipc=host \
  --user $(id -u):$(id -g) \
  -v <data_root>:<data_root>:ro \
  -v <output_dir>:<output_dir> \
  -v <bp2_ckpt_dir>:<bp2_ckpt_dir>:ro \
  <container> \
  depth_net <action> -e <spec.yaml>
Without `--user $(id -u):$(id -g)` the container writes outputs as `nobody:nogroup`, blocking host-side cleanup / retry.

For the local bind-mount `__pycache__` caveat (QA / development only: clearing stale `.pyc` files that shadow patched source), see `references/troubleshooting.md` → "Local bind-mount tip".
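
When the command is launched from a script rather than typed by hand, building it as an argument list avoids shell-quoting mistakes. The sketch below only assembles the same flags shown in the `docker run` command above; `build_docker_cmd` and its parameters are hypothetical names, and the caller is expected to supply the uid/gid (for example from `os.getuid()` / `os.getgid()` on Linux).

```python
def build_docker_cmd(container, action, spec, data_root, output_dir, ckpt_dir, uid, gid):
    """Assemble the depth_net docker invocation as an argv list (no shell involved)."""
    return [
        "docker", "run",
        "--gpus", "device=0",
        "--shm-size", "16G",
        "--ipc=host",
        # Run as the host user so outputs are not owned by nobody:nogroup.
        "--user", f"{uid}:{gid}",
        # Read-only mounts for inputs, read-write for the output directory.
        "-v", f"{data_root}:{data_root}:ro",
        "-v", f"{output_dir}:{output_dir}",
        "-v", f"{ckpt_dir}:{ckpt_dir}:ro",
        container,
        "depth_net", action, "-e", spec,
    ]
```

The resulting list can be passed directly to `subprocess.run(cmd, check=True)`; the container image name is left as a caller-supplied argument because this document does not name it.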

Step 6 — Verify

  • Container exit code 0
  • `status.json` `kpi` block populated
  • For `train`: inspect per-step `train_loss` directly (the entrypoint reports `Execution status: PASS` even when loss is NaN)
  • For `evaluate`: rely on `epe` / `bp1` / `bp2` / `bp3` / `d1` / `rmse` (the evaluator also emits `abs_rel` / `sq_rel` / `rmse_log`, which are not meaningful for stereo)
  • For `inference`: artifacts under `results_dir`
  • KPI namespace difference between pyt and deploy: pyt `evaluate` writes the metric set under `kpi.val/epe`, `kpi.val/bp1`, etc. (namespaced by Lightning's `val/` prefix). Deploy `evaluate` (TRT engine path) writes the same metric set under `kpi.epe`, `kpi.bp1`, etc. (no `val/` prefix). Downstream verification scripts that read `status.json` need to handle both shapes.
  • Validate drift on your own dataset: if you compare TAO FFS deploy (`gen_trt_engine` + TRT `evaluate`) against the upstream FFS deploy path on the same input, expect a small residual mean_abs disparity drift (TAO export graph + TRT 10.13 interaction; not improvable at the source-code level). The exact magnitude is dataset and hardware dependent, so measure on your own data and decide whether the drift is acceptable for your downstream task.
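
A verification script that has to cope with both KPI namespaces, and with the NaN-loss case that still reports PASS, might look like the sketch below. It assumes the parsed status record is a dict whose `kpi` entry is a flat mapping keyed by `val/epe` or `epe`; the exact on-disk layout of `status.json` is not specified here, so treat that shape and the helper names as assumptions.

```python
import math


def read_kpi(status, metric):
    """Return a metric from the kpi block, accepting both 'val/<metric>' and '<metric>'."""
    kpi = status.get("kpi", {})
    for key in (f"val/{metric}", metric):
        if key in kpi:
            return kpi[key]
    raise KeyError(f"metric {metric!r} not found in kpi block")


def loss_is_finite(train_losses):
    """True only if every logged train_loss is a finite number (catches NaN/inf runs)."""
    return all(math.isfinite(value) for value in train_losses)
```

Checking the `val/` key first is arbitrary; a given status record is expected to contain only one of the two shapes.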

7-action deploy flow

train (optional)            → finetuned ckpt
evaluate (pyt)              → PyT eager EPE / bp on val GT
inference (pyt)             → PyT eager disparity samples (visual sanity)
export                      → static fp32 ONNX (recommended at 480×736 or 320×736)
gen_trt_engine             → fp16 TRT engine on static ONNX path
inference (deploy)         → TRT disparity samples
evaluate (deploy)          → TRT EPE / bp drift vs PyT eager fp32
Skip `train` for raw-bp2 deploy. The remaining 6 actions (or the 4 deploy-only verbs starting from `export`) cover both use cases.

Full TAO Deploy reference: tao-deploy-fast-foundation-stereo.

Training Requirements

  • Valid `dataset_name` values for stereo `data_sources` (case-insensitive): `FSD`, `IsaacRealDataset`, `Crestereo`, `Middlebury`, `Eth3d`, `Kitti`, `GenericDataset`
  • Monitoring metric: val/loss
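
Since `dataset_name` matching is case-insensitive, a spec generator can normalize user input to the spellings listed above and reject anything else early. A minimal sketch, with `canonical_dataset_name` as a hypothetical helper:

```python
VALID_DATASET_NAMES = (
    "FSD", "IsaacRealDataset", "Crestereo",
    "Middlebury", "Eth3d", "Kitti", "GenericDataset",
)


def canonical_dataset_name(name):
    """Map any casing of a valid stereo dataset_name to the spelling listed above."""
    lookup = {valid.lower(): valid for valid in VALID_DATASET_NAMES}
    try:
        return lookup[name.lower()]
    except KeyError:
        raise ValueError(f"unknown dataset_name {name!r}") from None
```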

Per-Action Dataset Requirements

| Action | Spec Key | Source | Files | List? |
|---|---|---|---|---|
| evaluate | `dataset.test_dataset.data_sources` | `eval_dataset` | `data_file: annotations.txt` + `dataset_name` | Yes |
| inference | `dataset.infer_dataset.data_sources` | `inference_dataset` | `data_file: annotations.txt` + `dataset_name` | Yes |
| train | `dataset.train_dataset.data_sources` | `train_datasets` | `data_file: annotations.txt` + `dataset_name` | Yes |
| train | `dataset.val_dataset.data_sources` | `eval_dataset` | `data_file: annotations.txt` + `dataset_name` | Yes |

Data source overrides are mandatory for every action. Each `data_sources` entry needs both `data_file` and `dataset_name`. The `model.*` width fields from Step 3 are also mandatory. See `references/spec-overrides.md` for the complete per-action override dicts (train finetune, raw-bp2 evaluate / inference / export) and the shared `FFS_MODEL_BLOCK`.

Eval Dataset

Optional. Val dataset configured via `dataset.val_dataset.data_sources` (each entry needs `data_file` and `dataset_name`).

Parameters, Metrics, Hardware

See `references/parameters.md` for the full parameter glossary (`model.*` / `dataset.*` / `train.*` knobs including `max_disparity: 192`, `gwc_feature_normalize: true`, `mixed_precision: false`, `volume_dim: 28`, `valid_iters`, `save_raw_pfm`), the evaluation-metric table (`epe` / `bp1` / `bp2` / `bp3` / `d1` / `rmse` are meaningful; `abs_rel` / `sq_rel` / `rmse_log` are not), multi-GPU / multi-node spec keys, and hardware requirements.

Export / TRT Defaults

`export` always emits an fp32 ONNX regardless of `model.mixed_precision`; the fp16 vs fp32 selection happens at `gen_trt_engine` via `gen_trt_engine.tensorrt.data_type`. Recommended TRT precision for FFS-bp2 is `fp16` on the static-shape ONNX path (lowest drift). The dynamic-shape path supports both `fp32` (default; static-fp32 parity) and `fp16` (latency-critical multi-resolution; higher drift, and may produce NaN under some checkpoint states, in which case fall back to fp32).

See `references/export-trt-defaults.md` for the full TRT/ONNX defaults and the four-way export use-case matrix (`export.batch_size` × `export.dynamic_hw`; dynamic H/W is FFS-only). See `references/tao-deploy-fast-foundation-stereo.md` for the deployment matrix and static-vs-dynamic shape guidance.

Troubleshooting

See `references/troubleshooting.md` for error patterns and fixes, including `shape mismatch` at forward (missing width override), missing `gwc_feature_normalize` (TAO Core too old), `dynamic_hw: true` warning on FS / mono export, `Key 'encoder' not in 'StereoBackBone'`, missing `dataset_name` in `data_sources`, negative disparity, larger-than-expected disparity drift (missing `max_disparity: 192`), `depth_net_stereo: not found`, decorative pyt-eval `crop_size`, the cosmetic `Failed to import SAM3` warning, and silent dynamic-deploy stride-incompatibility.

Spec Param / Parent Model Inference

Model-specific inference mappings belong in this skill, not in `config.json`. Generated runners should apply the mappings with SDK helpers before `create_job()`. See `references/parent-model-inference.md` for the full per-action spec-field → inference-function mapping table.

For `parent_model` or `parent_model_folder`, pass the upstream train / export / AutoML child job id as `parent_job_id`. The SDK lists the parent result folder, filters checkpoint artifacts, and returns the selected model file or folder. For raw-bp2 use cases without a parent train job, set the `<action>.checkpoint` field explicitly to the bp2 file path. Do not patch generated runner scripts to guess checkpoint paths.