tao-train-fast-foundation-stereo
Depth Net Fast Stereo
Real-time stereo depth estimation using FastFoundationStereo (FFS), the bp2 commercial distilled variant of FoundationStereo. It predicts disparity maps from rectified stereo image pairs, using per-layer pruned widths for real-time inference.
The mono / stereo / fast-stereo skills share the unified TAO `depth_net` CLI; FFS is selected via `model.model_type: FastFoundationStereo`. FFS differs from `FoundationStereo` only in its pruned per-layer widths and a serialized forward path; everything else (entrypoint, action verbs, dataset classes, deploy chain) is identical to `depth-net-stereo`.
For TAO Deploy TensorRT actions (`gen_trt_engine`, TensorRT `evaluate`, TensorRT `inference`), read `references/tao-deploy-fast-foundation-stereo.md` first. The deploy spec template lives at `references/spec_template_deploy.yaml`.
Train Action Policy
This model is AutoML-enabled at the model layer. Before handling any train-stage request, read `references/skill_info.yaml` and resolve the run override from either an explicit `automl_policy` value or the user's workflow request. Treat phrases like "turn off AutoML", "disable AutoML", "no HPO", or "plain training" as `automl_policy: off` for this run only; otherwise default to `auto`. When `automl_policy: auto`, `automl_enabled: true`, and both `schemas/train.schema.json` and `references/spec_template_train.yaml` are packaged, route the train action through `tao-skill-bank:tao-run-automl` by default with this model's `skill_dir`. Preserve workflow/application overrides for datasets, specs, output directories, GPU/platform settings, parent checkpoints, and `automl_policy`. Use direct model training only when `automl_policy: off` or the packaged train schema/template is missing; in the missing-schema case, report that AutoML is enabled but not runnable for this model until schemas are generated.
Non-train actions such as `evaluate`, `inference`, `export`, and deploy flows stay in this model skill. The per-run `automl_policy` override does not change model metadata.
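The routing rule above can be read as a small decision function. The Python sketch below is illustrative only: `resolve_automl_policy`, `choose_train_route`, the substring matching, and the boolean arguments are hypothetical names and simplifications, not part of the skill bank. The phrases, the policy values, and the `tao-skill-bank:tao-run-automl` target come from the text; treating `automl_enabled: false` as a direct-training case is an assumption.

```python
from typing import Optional, Tuple

# Phrases the policy text says should disable AutoML for this run only.
OFF_PHRASES = ("turn off automl", "disable automl", "no hpo", "plain training")


def resolve_automl_policy(explicit: Optional[str], request_text: str) -> str:
    """Return "off" or "auto" for a single run (hypothetical helper)."""
    if explicit is not None:
        # An explicit automl_policy value wins over the request wording.
        return explicit.strip().lower()
    lowered = request_text.lower()
    if any(phrase in lowered for phrase in OFF_PHRASES):
        return "off"
    return "auto"


def choose_train_route(policy: str, automl_enabled: bool,
                       has_schema: bool, has_template: bool) -> Tuple[str, str]:
    """Return (route, note) following the policy paragraph."""
    # ASSUMPTION: a model without AutoML support trains directly.
    if policy == "off" or not automl_enabled:
        return ("direct", "")
    if has_schema and has_template:
        return ("tao-skill-bank:tao-run-automl", "")
    # schemas/train.schema.json or references/spec_template_train.yaml
    # is not packaged, so AutoML cannot run yet.
    return ("direct",
            "AutoML is enabled but not runnable until schemas are generated")
```

The override is resolved per run, so neither function writes anything back to the model metadata.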
Two Use Cases
FFS ships with a pre-trained bp2 commercial checkpoint (`model_best_bp2_serialize.pth`).
- Raw deploy: use the bp2 ckpt as-is. Skip `train`; run `inference` / `evaluate` / `export` / `gen_trt_engine` directly with the bp2 file as the action's checkpoint.
- Finetune on user data: set `train.pretrained_model_path` to the bp2 file, train on user data, then verify and deploy the resulting ckpt. The full 7-action sequence (train → evaluate pyt → inference pyt → export → gen_trt_engine → inference deploy → evaluate deploy) is supported.
Workflow
Prerequisites: data accessibility
Your dataset (left + right images + GT disparity for train / evaluate, left + right only for inference) must be reachable from inside the container:
- SDK runner: place files at the S3 paths the runner resolves (`S3_TRAIN` / `S3_EVAL` placeholders shown in the spec overrides).
- Direct `docker run` (e.g. local testing): mount the host dataset root read-only at the same in-container path:

```
docker run ... -v <host_data_root>:<host_data_root>:ro <container> ...
```

The same accessibility requirement applies to the `<output_dir>` written by all actions, and to the bp2 checkpoint path.
Step 1: Annotation file
Per-line annotation file referenced by `data_sources[*].data_file`. Schema is identical to `depth-net-stereo`:
| Columns | Format | Use |
|---|---|---|
| 2 | | Stereo inference (no GT) |
| 3 | | Stereo with GT |
| 4 | | Stereo with GT and occlusion mask |
Generate via `depth_net convert` if needed; see the `depth-net-stereo` skill for the `convert_spec.yaml` template.
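Before pointing a spec at an annotation file, it can help to confirm that every line uses the same column count and that the count is one of the three supported layouts. The sketch below is a hypothetical helper, not a TAO utility, and it assumes whitespace-separated fields; the column counts and their uses are taken from the table above.

```python
from collections import Counter
from typing import Iterable

# Column counts from the annotation table and what each one supports.
COLUMN_USES = {
    2: "stereo inference (no GT)",
    3: "stereo with GT",
    4: "stereo with GT and occlusion mask",
}


def annotation_column_count(lines: Iterable[str]) -> int:
    """Return the single column count shared by all non-empty lines.

    ASSUMPTION: fields are whitespace-separated; change the split if your
    annotation files use another delimiter.
    """
    counts = Counter(len(line.split()) for line in lines if line.strip())
    if len(counts) != 1:
        raise ValueError(f"inconsistent or empty annotation file: {dict(counts)}")
    (columns,) = counts  # exactly one distinct column count remains
    if columns not in COLUMN_USES:
        raise ValueError(f"expected 2, 3, or 4 columns, got {columns}")
    return columns
```

Passing an open file object works as well as a list of strings, since the function only iterates over lines.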
Step 2: Pair `model_type` and `dataset_name` based on your data
Use `model_type: FastFoundationStereo` for FFS. The `dataset_name` choice mirrors the stereo skill: pick the dataset-specific class when your layout matches a registered one, otherwise `GenericDataset`.
| Data category | `model_type` | `dataset_name` |
|---|---|---|
| Middlebury | `FastFoundationStereo` | `Middlebury` |
| KITTI | `FastFoundationStereo` | `Kitti` |
| ETH3D | `FastFoundationStereo` | `Eth3d` |
| FSD synthetic | `FastFoundationStereo` | `FSD` |
| IsaacReal synthetic | `FastFoundationStereo` | `IsaacRealDataset` |
| Crestereo synthetic | `FastFoundationStereo` | `Crestereo` |
| Other / non-canonical | `FastFoundationStereo` | `GenericDataset` |
For inference with 2-column annotations (left + right, no GT), use `dataset_name: GenericDataset` regardless of layout.
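The selection rule for `dataset_name` can be sketched as a lookup with a fallback. `pick_dataset_name` is a hypothetical helper written for illustration; the valid names are the ones this document lists under Training Requirements, and the fallback to `GenericDataset` follows the rule stated above.

```python
# Valid stereo dataset_name values listed in this skill (case-insensitive).
VALID_DATASET_NAMES = (
    "FSD", "IsaacRealDataset", "Crestereo",
    "Middlebury", "Eth3d", "Kitti", "GenericDataset",
)


def pick_dataset_name(requested: str, annotation_columns: int) -> str:
    """Return the canonical dataset_name for a data source (hypothetical)."""
    # 2-column annotations (no GT) always use GenericDataset.
    if annotation_columns == 2:
        return "GenericDataset"
    by_lower = {name.lower(): name for name in VALID_DATASET_NAMES}
    # Layouts that match no registered class fall back to GenericDataset.
    return by_lower.get(requested.lower(), "GenericDataset")
```

For example, `pick_dataset_name("kitti", 3)` resolves to the registered `Kitti` class, while an unregistered layout name resolves to `GenericDataset`.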
Step 3: Set the bp2 distilled width overrides
FFS requires 15 model-section width override fields whose values match the bp2 commercial checkpoint exactly. Omitting any field falls back to TAO defaults that do not match the bp2 ckpt and produce shape-mismatch errors at forward time.
```yaml
model:
  model_type: FastFoundationStereo
  encoder: vitl
  hidden_dims: [128]            # 1-layer GRU; NOT [128,128,128]
  n_gru_layers: 1               # bp2 single-GRU
  corr_radius: 4
  corr_levels: 2
  n_downsample: 2
  valid_iters: 8
  max_disparity: 192            # bp2 commercial; NOT 416 (full FS default)
  volume_dim: 28                # bp2 ckpt invariant; NOT 32 (full FS default)
  mixed_precision: false        # see references/parameters.md
  gwc_feature_normalize: true   # see references/parameters.md
  # 15 bp2 distilled width overrides; copy as-is
  motion_encoder_widths: [56, 96, 16, 12]
  motion_encoder_final: 48
  gru_hidden: 60
  gru_gating_conv_widths: [100, 168]
  disp_head_input_dim: 60
  disp_head_intermediate: 36
  disp_head_pwconv1_widths: [212, 244]
  mask_widths: [32, 16]
  stem_2_widths: [12, 16]
  spx_2_gru_widths: [16, 12, 16, 24]
  spx_gru_out: 9
  classifier_mid: 14
  cnet_conv04_widths: [60, 48]
  cam_mid_channels: 8
  cost_agg_conv_patch_padding: [0, 0, 0]
```

The spec templates at `references/spec_template_*.yaml` carry this block as the canonical source.
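Because a single missing or altered width field only shows up as a shape mismatch at forward time, a pre-flight check on the parsed `model` section can catch it earlier. The following is a minimal sketch: `check_bp2_widths` is a hypothetical helper, and it assumes the spec has already been loaded into a plain Python dict. The expected values are copied from the model block above.

```python
# The 15 bp2 width overrides, copied from the model block above.
BP2_WIDTHS = {
    "motion_encoder_widths": [56, 96, 16, 12],
    "motion_encoder_final": 48,
    "gru_hidden": 60,
    "gru_gating_conv_widths": [100, 168],
    "disp_head_input_dim": 60,
    "disp_head_intermediate": 36,
    "disp_head_pwconv1_widths": [212, 244],
    "mask_widths": [32, 16],
    "stem_2_widths": [12, 16],
    "spx_2_gru_widths": [16, 12, 16, 24],
    "spx_gru_out": 9,
    "classifier_mid": 14,
    "cnet_conv04_widths": [60, 48],
    "cam_mid_channels": 8,
    "cost_agg_conv_patch_padding": [0, 0, 0],
}


def check_bp2_widths(model_section: dict) -> list:
    """Return a list of problems; an empty list means all 15 fields match."""
    problems = []
    for key, expected in BP2_WIDTHS.items():
        if key not in model_section:
            problems.append(f"missing model.{key}")
        elif model_section[key] != expected:
            problems.append(
                f"model.{key}={model_section[key]!r}, expected {expected!r}")
    return problems
```

Running the check on the `model` dict before launching an action turns a late forward-time failure into an immediate, named field error.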
Step 4: Write spec yaml from the spec overrides
Copy the action block from `references/spec-overrides.md` (per-action Python override dicts plus the shared `FFS_MODEL_BLOCK`). Replace:
- `model.model_type: FastFoundationStereo` (already set)
- `dataset.<...>.data_sources[*].dataset_name` from Step 2
- `dataset.<...>.data_sources[*].data_file` with the path from Step 1
- For raw deploy use cases (no train): set `<action>.checkpoint` to the bp2 file path
- For finetune use cases: set `train.pretrained_model_path` to the bp2 file path

Chained train → next action checkpoint path: For local Docker chaining (no SDK runner), the trained checkpoint lives at `<train.results_dir>/<task>/dn_model_latest.pth`, because Lightning `ModelCheckpoint` nests it under the task name. Example: `train.results_dir: /workspace/results/finetune/train` produces `/workspace/results/finetune/train/train/dn_model_latest.pth`. Use that nested path for the next action's `<action>.checkpoint`. SDK-runner deploys resolve this automatically via `parent_job_id`; see `references/parent-model-inference.md`.
Shape consistency: `dataset.test_dataset.augmentation.crop_size` should match `export.input_height` / `input_width` for end-to-end pyt-vs-deploy comparability; see the shape table in `references/tao-deploy-fast-foundation-stereo.md`.
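For local Docker chaining, the nested checkpoint path is easy to get wrong by one directory level. A small sketch of the path rule follows; `chained_checkpoint` is a hypothetical helper, and the default task name `train` is inferred from the example path in this step.

```python
import posixpath


def chained_checkpoint(results_dir: str, task: str = "train") -> str:
    """Build <results_dir>/<task>/dn_model_latest.pth for the next action."""
    # posixpath keeps forward slashes, matching in-container paths.
    return posixpath.join(results_dir, task, "dn_model_latest.pth")


print(chained_checkpoint("/workspace/results/finetune/train"))
# /workspace/results/finetune/train/train/dn_model_latest.pth
```

The printed value is what goes into the next action's `<action>.checkpoint` when no SDK runner resolves it for you.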
Step 5: Run
```
docker run --gpus 'device=0' --shm-size 16G --ipc=host \
  --user $(id -u):$(id -g) \
  -v <data_root>:<data_root>:ro \
  -v <output_dir>:<output_dir> \
  -v <bp2_ckpt_dir>:<bp2_ckpt_dir>:ro \
  <container> \
  depth_net <action> -e <spec.yaml>
```

Without `--user $(id -u):$(id -g)` the container writes outputs as `nobody:nogroup`, blocking host-side cleanup / retry.
For the local bind-mount caveat (QA / development only: clearing stale `__pycache__` `.pyc` files that shadow patched source), see `references/troubleshooting.md` → "Local bind-mount tip".
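When the same command is launched from a script for several actions, building it as an argument list avoids shell-quoting mistakes. The sketch below mirrors the flags shown in this step; `build_docker_argv` is a hypothetical helper, `os.getuid()` / `os.getgid()` stand in for `$(id -u):$(id -g)` and exist only on Unix-like hosts, and the container name passed in is whatever image you already use.

```python
import os
import shlex


def build_docker_argv(container, action, spec, data_root, output_dir,
                      ckpt_dir, gpu="0"):
    """Assemble the Step 5 docker run command as an argv list."""
    return [
        "docker", "run", "--gpus", f"device={gpu}",
        "--shm-size", "16G", "--ipc=host",
        # Run as the host user so outputs are not owned by nobody:nogroup.
        "--user", f"{os.getuid()}:{os.getgid()}",
        # Data and checkpoint mounts are read-only; outputs are writable.
        "-v", f"{data_root}:{data_root}:ro",
        "-v", f"{output_dir}:{output_dir}",
        "-v", f"{ckpt_dir}:{ckpt_dir}:ro",
        container, "depth_net", action, "-e", spec,
    ]


def as_shell(argv):
    """Render the argv list as a copy-pasteable shell command."""
    return shlex.join(argv)
```

The resulting list can be passed to `subprocess.run(argv, check=True)` so that a non-zero container exit code raises immediately.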
Step 6: Verify
- Container exit code 0
- `status.json` `kpi` block populated
- For `train`: inspect per-step `train_loss` directly (the entrypoint reports `Execution status: PASS` even when loss is NaN)
- For `evaluate`: rely on `epe` / `bp1` / `bp2` / `bp3` / `d1` / `rmse` (the evaluator also emits `abs_rel` / `sq_rel` / `rmse_log`, which are not meaningful for stereo)
- For `inference`: artifacts under `results_dir`
- KPI namespace difference between pyt and deploy: pyt `evaluate` writes the metric set under `kpi.val/epe`, `kpi.val/bp1`, etc. (namespaced by Lightning's `val/` prefix). Deploy `evaluate` (TRT engine path) writes the same metric set under `kpi.epe`, `kpi.bp1`, etc. (no `val/` prefix). Downstream verification scripts that read `status.json` need to handle both shapes.
- Validate drift on your own dataset: if you compare TAO FFS deploy (`gen_trt_engine` + TRT `evaluate`) against the upstream FFS deploy path on the same input, expect a small residual mean_abs disparity drift (TAO export graph + TRT 10.13 interaction; not improvable at the source-code level). The exact magnitude is dataset and hardware dependent, so measure it on your own data and decide whether the drift is acceptable for your downstream task.
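Two of the checks above lend themselves to small helpers: normalizing the KPI keys so that pyt and deploy results can be compared, and rejecting NaN training losses that the exit status hides. This is a sketch under assumptions: it treats the `kpi` block as a flat dict keyed by metric name, which the text implies but does not spell out, and both function names are hypothetical.

```python
import math


def normalize_kpi(kpi: dict) -> dict:
    """Strip Lightning's "val/" prefix so pyt and deploy KPIs share keys."""
    prefix = "val/"
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in kpi.items()
    }


def train_loss_is_finite(losses) -> bool:
    """True only if at least one loss was logged and none is NaN or inf."""
    losses = list(losses)
    return len(losses) > 0 and all(math.isfinite(x) for x in losses)
```

After normalization, the same comparison code can read `epe` or `bp1` from either a pyt or a deploy `status.json`.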
7-action deploy flow
```
train (optional)    → finetuned ckpt
evaluate (pyt)      → PyT eager EPE / bp on val GT
inference (pyt)     → PyT eager disparity samples (visual sanity)
export              → static fp32 ONNX (recommended at 480×736 or 320×736)
gen_trt_engine      → fp16 TRT engine on static ONNX path
inference (deploy)  → TRT disparity samples
evaluate (deploy)   → TRT EPE / bp drift vs PyT eager fp32
```

Skip `train` for raw-bp2 deploy. The remaining 6 actions (or the 4 deploy-only verbs starting from `export`) cover both use cases.
Full TAO Deploy reference: tao-deploy-fast-foundation-stereo.
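The relationship between the two use cases and the action list can be written down directly. The sketch below is illustrative: `actions_for` and the labels `"finetune"` / `"raw"` are hypothetical names, while the action order is the one listed in this section.

```python
# The 7-action flow, in the order listed in this section.
FULL_SEQUENCE = [
    "train", "evaluate (pyt)", "inference (pyt)", "export",
    "gen_trt_engine", "inference (deploy)", "evaluate (deploy)",
]


def actions_for(use_case: str, deploy_only: bool = False) -> list:
    """Return the ordered actions for the "finetune" or "raw" use case."""
    if use_case not in ("finetune", "raw"):
        raise ValueError(f"unknown use case: {use_case}")
    if deploy_only:
        # The 4 deploy-only verbs start at export.
        return FULL_SEQUENCE[FULL_SEQUENCE.index("export"):]
    if use_case == "raw":
        # Raw-bp2 deploy skips train and keeps the remaining 6 actions.
        return FULL_SEQUENCE[1:]
    return list(FULL_SEQUENCE)
```

Keeping the sequence in one list makes it harder for a runner script to reorder `export` and `gen_trt_engine` by accident.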
Training Requirements
- Valid `dataset_name` values for stereo `data_sources` (case-insensitive): `FSD`, `IsaacRealDataset`, `Crestereo`, `Middlebury`, `Eth3d`, `Kitti`, `GenericDataset`
- Monitoring metric: val/loss
Per-Action Dataset Requirements
| Action | Spec Key | Source | Files | List? |
|---|---|---|---|---|
| evaluate | dataset.test_dataset.data_sources | eval_dataset | data_file: annotations.txt + dataset_name | Yes |
| inference | dataset.infer_dataset.data_sources | inference_dataset | data_file: annotations.txt + dataset_name | Yes |
| train | dataset.train_dataset.data_sources | train_datasets | data_file: annotations.txt + dataset_name | Yes |
| train | dataset.val_dataset.data_sources | eval_dataset | data_file: annotations.txt + dataset_name | Yes |
Data source overrides are mandatory for every action. Each `data_sources` entry needs both `data_file` and `dataset_name`. The `model.*` width fields from Step 3 are also mandatory. See `references/spec-overrides.md` for the complete per-action override dicts (train finetune, raw-bp2 evaluate / inference / export) and the shared `FFS_MODEL_BLOCK`.
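The per-action table can be turned into a small builder that refuses incomplete entries. This is only a sketch: `data_sources_override` is a hypothetical helper, and expressing overrides as flat dotted keys is an assumption made for brevity; the canonical override dicts are the ones in `references/spec-overrides.md`. The spec keys themselves come from the table above.

```python
# Required data_sources spec key per action, from the table above.
REQUIRED_KEY = {
    "train": "dataset.train_dataset.data_sources",
    "evaluate": "dataset.test_dataset.data_sources",
    "inference": "dataset.infer_dataset.data_sources",
}
VAL_KEY = "dataset.val_dataset.data_sources"


def _entries(pairs):
    """Turn (data_file, dataset_name) pairs into data_sources entries."""
    entries = []
    for data_file, dataset_name in pairs:
        if not data_file or not dataset_name:
            raise ValueError("each entry needs both data_file and dataset_name")
        entries.append({"data_file": data_file, "dataset_name": dataset_name})
    if not entries:
        raise ValueError("data_sources must contain at least one entry")
    return entries


def data_sources_override(action, pairs, val_pairs=None) -> dict:
    """Build the data_sources overrides for one action (hypothetical)."""
    overrides = {REQUIRED_KEY[action]: _entries(pairs)}
    # The validation dataset only applies to train.
    if action == "train" and val_pairs:
        overrides[VAL_KEY] = _entries(val_pairs)
    return overrides
```

Every value is a list, matching the "List?" column of the table, even when a single annotation file is used.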
Eval Dataset
Optional. Val dataset configured via `dataset.val_dataset.data_sources` (each entry needs `data_file` and `dataset_name`).
Parameters, Metrics, Hardware
See `references/parameters.md` for the full parameter glossary (`model.*` / `dataset.*` / `train.*` knobs including `max_disparity: 192`, `gwc_feature_normalize: true`, `mixed_precision: false`, `volume_dim: 28`, `valid_iters`, `save_raw_pfm`), the evaluation-metric table (`epe` / `bp1` / `bp2` / `bp3` / `d1` / `rmse` are meaningful; `abs_rel` / `sq_rel` / `rmse_log` are not), multi-GPU / multi-node spec keys, and hardware requirements.
Export / TRT Defaults
`export` always emits fp32 ONNX regardless of the `model.mixed_precision` setting; the fp16 vs fp32 choice is made at the `gen_trt_engine` stage via `gen_trt_engine.tensorrt.data_type`. The recommended TRT precision for FFS-bp2 is `fp16` on the static-shape ONNX path (lowest drift). The dynamic-shape path supports `fp32` (default; on par with static fp32) and `fp16` (latency-sensitive multi-resolution use cases; higher drift, and it may produce NaN for some checkpoint states, in which case fall back to fp32).
See `references/export-trt-defaults.md` for the full TRT/ONNX defaults and the four-way export use-case matrix (`export.batch_size` × `export.dynamic_hw`; dynamic H/W is FFS-only). See `references/tao-deploy-fast-foundation-stereo.md` for the deployment matrix and static-vs-dynamic shape guidance.
Troubleshooting
See `references/troubleshooting.md` for error patterns and fixes, including `shape mismatch` at forward (missing width override), missing `gwc_feature_normalize` (TAO Core too old), `dynamic_hw: true` warning on FS / mono export, `Key 'encoder' not in 'StereoBackBone'`, missing `dataset_name` in `data_sources`, negative disparity, larger-than-expected disparity drift (missing `max_disparity: 192`), `depth_net_stereo: not found`, decorative pyt-eval `crop_size`, the cosmetic `Failed to import SAM3` warning, and silent dynamic-deploy stride-incompatibility.
Spec Param / Parent Model Inference
Model-specific inference mappings belong in this skill, not in `config.json`. Generated runners should apply the mappings with SDK helpers before `create_job()`. See `references/parent-model-inference.md` for the full per-action spec-field → inference-function mapping table.
For `parent_model` or `parent_model_folder`, pass the upstream train / export / AutoML child job id as `parent_job_id`. The SDK lists the parent result folder, filters checkpoint artifacts, and returns the selected model file or folder. For raw-bp2 use cases without a parent train job, set the `<action>.checkpoint` field explicitly to the bp2 file path. Do not patch generated runner scripts to guess checkpoint paths.