tao-port-huggingface-model
<!--
Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
TAO-HF Integration Skill
Integrate a HuggingFace (HF) Computer Vision model into the NVIDIA TAO Toolkit ecosystem. Work the phases iteratively — not purely linearly — following a build → test → debug → fix → retest loop at every step.
This SKILL.md is the workflow coordinator. Each phase has a dedicated reference file under `references/` with the full step-by-step content, code blocks, docker invocations, and gates. Read the matching reference at the start of each phase — the summaries below are not sufficient on their own.
Local-Only Rule
All work is strictly local. You may only read/clone from remotes; all file edits, Docker builds, and test runs stay on the local machine. Do NOT `git commit`/`git push`/create remote branches (GitLab, GitHub, HuggingFace), create merge requests / pull requests / issues, or upload/publish/push Docker images to any registry or artifact store. This follows from the bind-mounted local-clone layout in `references/execution-and-debugging.md`.
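As a rough illustration of the rule above, a command could be screened before it is run. This is only a sketch: the deny-list and the helper name `violates_local_only` are hypothetical, not part of the skill, and prefix matching will miss variants such as `git -C repo push`.

```python
import shlex

# Hypothetical deny-list derived from the Local-Only Rule; extend as needed.
FORBIDDEN_PREFIXES = (
    ("git", "commit"),
    ("git", "push"),
    ("docker", "push"),
)


def violates_local_only(command: str) -> bool:
    """Return True if the shell command starts with a forbidden operation."""
    tokens = tuple(shlex.split(command))
    return any(tokens[: len(prefix)] == prefix for prefix in FORBIDDEN_PREFIXES)
```

Read-only operations such as `git clone` pass through this check, which matches the "read/clone from remotes" allowance.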
Submodule Override & Execution Platform
Every test, smoke run, and end-to-end validation runs inside a locally prepared TAO Toolkit container (`tao-pytorch-base:latest`, `tao-deploy-base:latest`, optionally `tao-dataservices-base:latest`, all from Phase 0), with local clones bind-mounted at `/workspace` and installed via `pip install /workspace/tao-core` + `setup.py develop`. All Python work runs in containers — no host venvs, no host `pip install`s. The platform skills own the how of running containers — host GPU runtime via `tao-setup-nvidia-gpu-host`; `docker run` flags / NGC auth / mounts / env passthrough / `--ipc=host` / `--shm-size` / inspection / error modes via `tao-run-on-docker` and `tao-run-on-local-docker`. This workflow specifies only what to run inside them and never forks those conventions. The annotated working-directory tree, canonical `docker run` flag set with the workflow-specific `-w`/`PYTHONPATH`/install-shell additions, three isolation contexts, four isolation rules, the Development Loop, and the Debugging Playbook table: `references/execution-and-debugging.md`.
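To make the shape of such an invocation concrete, here is a minimal sketch that only assembles and prints a `docker run` command line. The helper name `build_docker_run`, the `--gpus all` flag, the default `16G` shared-memory size, and the inner command are assumptions for illustration. The canonical flag set lives in the platform skills and `references/execution-and-debugging.md`, not here.

```python
import os
import shlex


def build_docker_run(image, inner_cmd, workdir="/workspace", shm_size="16G"):
    """Assemble a `docker run` argv that bind-mounts the current directory."""
    return [
        "docker", "run", "--rm", "--gpus", "all",
        "--ipc=host", f"--shm-size={shm_size}",
        "-v", f"{os.getcwd()}:/workspace",
        "-w", workdir,
        "-e", "PYTHONPATH=/workspace/tao-core:/workspace/tao-pytorch",
        image, "bash", "-c", inner_cmd,
    ]


argv = build_docker_run(
    "tao-pytorch-base:latest",
    "pip install /workspace/tao-core && python -c 'import nvidia_tao_core'",
)
print(shlex.join(argv))  # print only; nothing is executed here
```

Installation and the command share one `bash -c` string because a container started with `--rm` is discarded on exit, so packages installed in one invocation are not available in the next.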
Phase Map
The phases, numbered 0 through 7 (full goals + gates below; references per phase):
- Phase 0 — Prerequisites + TAO Toolkit images + local image tags: phase-0-prereqs.md
- Phase 1 — HF-inspection environment, validate HF model + dataset: phase-1-inspection.md, hf-inspection.md
- Phase 2 — Closest existing TAO reference model: phase-2-codebase.md, task-type-guide.md
- Phase 3 — tao-core config + tao-pytorch trainer / native eval / inference: phase-3-implementation.md, tao-patterns.md, repo-structure.md
- Phase 4 — ONNX export + tao-deploy TRT engine, inference, evaluation: phase-4-deploy.md
- Phase 5 — Packaging (`setup.py` console_scripts) + L0 tests: phase-5-packaging.md
- Phase 6 — Container-based testing + end-to-end pipeline validation: phase-6-container-tests.md, docker-patterns.md
- Phase 7 — (conditional) Accuracy / latency / size tuning: phase-7-optimization.md
IMPORTANT — Continuous Execution Through Phase 6: Do NOT stop after implementation (Phases 3–5) to wait for the user to run tests; immediately proceed to the mandatory Phase 6. The implementation is not complete until tests pass inside the TAO Toolkit containers and the end-to-end pipeline is validated. Apply the build-test-debug loop at every step — write, test immediately, fix on failure, never accumulate untested code.
Phase 0 — Prerequisites Check
Goal: verify Python 3.10+ and `git`; delegate the NVIDIA driver / CUDA / Docker / NVIDIA Container Toolkit host check to `tao-setup-nvidia-gpu-host`; verify NGC `docker login` for `nvcr.io`. Then ask the user for the TAO Toolkit image references (tao-pytorch, tao-deploy, optionally tao-dataservices), pull them, and prepare local image tags `tao-pytorch-base:latest`, `tao-deploy-base:latest`, `tao-dataservices-base:latest` for Phases 3–6. Preparation strips the released TAO packages already in those images so the user's local clones (mounted at `/workspace/...`) install and get picked up at run time. Hard stop if any check fails. Full commands, user-prompt wording, and per-image preparation `Dockerfile` snippets: phase-0-prereqs.md.
Gate: all prerequisite checks pass; the user has supplied the required image references; `tao-pytorch-base:latest` and `tao-deploy-base:latest` exist locally; `tao-dataservices-base:latest` exists if dataservices work is expected.
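The Phase 0 gate can be expressed as a small check that returns a list of problems. This is a sketch only: `missing_prereqs` is a hypothetical helper, and it deliberately skips the driver / CUDA / Docker checks that the text delegates to `tao-setup-nvidia-gpu-host`.

```python
import shutil
import sys

REQUIRED_TAGS = ("tao-pytorch-base:latest", "tao-deploy-base:latest")


def missing_prereqs(local_tags, python_version=sys.version_info, which=shutil.which):
    """Return human-readable problems; an empty list means the gate passes."""
    problems = []
    if tuple(python_version[:2]) < (3, 10):
        problems.append("Python 3.10+ required")
    if which("git") is None:
        problems.append("git not found on PATH")
    for tag in REQUIRED_TAGS:
        if tag not in local_tags:
            problems.append(f"local image tag missing: {tag}")
    return problems
```

The `local_tags` argument could be collected from `docker image ls --format '{{.Repository}}:{{.Tag}}'`. A non-empty result corresponds to the hard stop described above.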
Phase 1 — Information Gathering & Validation
Goal: decide whether to proceed. Gather credentials, locate (or clone) the four TAO repos and create a consistent local working branch across them, launch the long-lived `tao-hf-inspect` container (isolation Context A), validate that the HF model is a CV model with a supported `pipeline_tag`, extract config + state-dict schema, sanity-check ONNX export, and clean up. Full step-by-step (1.1–1.7): phase-1-inspection.md; generic patterns: hf-inspection.md.
Reject if `pipeline_tag` is NLP / audio / LLM (out of CV scope), `AutoConfig` raises, or ONNX export fundamentally cannot work and has no rewrite path.
Gate: all 4 TAO repos located/cloned with a consistent working branch; `pipeline_tag` confirmed CV; `model_type`, `image_size`, `hidden_size`, `num_labels` extracted; state-dict keys documented and the HF→TAO remapping plan drafted; ONNX sanity check passed (or failure mode understood); user confirmed `model_short_name` and task type. Present findings and confirm before proceeding.
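One way to draft the HF→TAO remapping plan is as an ordered list of prefix rewrites applied to the documented state-dict keys. The sketch below assumes that approach. The prefixes in `RENAME_RULES` are invented placeholders, not real TAO or HF key names, and the real rules come from the keys recorded during inspection.

```python
# Hypothetical prefix rules; replace with the mapping drafted in Phase 1.
RENAME_RULES = [
    ("vit.embeddings.", "backbone.embeddings."),
    ("vit.encoder.layer.", "backbone.blocks."),
    ("classifier.", "head.fc."),
]


def remap_keys(hf_state_dict, rules=RENAME_RULES):
    """Rename HF keys to TAO-style keys; collect anything no rule covers."""
    remapped, unmatched = {}, []
    for key, value in hf_state_dict.items():
        for old, new in rules:
            if key.startswith(old):
                remapped[new + key[len(old):]] = value
                break
        else:
            # No rule matched: surface the key instead of dropping it silently.
            unmatched.append(key)
    return remapped, unmatched
```

Returning the unmatched keys separately makes gaps in the plan visible before any weights are loaded.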
Phase 2 — Codebase Exploration
Goal: find the closest existing TAO reference model for the detected `pipeline_tag` (classification → `classification_pyt`, detection → `dino`/`rtdetr`, segmentation → `segformer`, instance → `mask2former`, panoptic → `oneformer`, zero-shot → `grounding_dino`, depth → `mono_depth`), read its full implementation across `tao-core`, `tao-pytorch`, and `tao-deploy`, and decide whether the backbone already exists in `backbone_v2/`. The chosen reference drives everything downstream — config structure, architecture, loss, ONNX export shape, TRT builder, deploy inferencer/loader, metrics, dataset format. The full reference list (12 files per model), the `backbone_v2/` coverage check (it already provides `vit`, `swin`, `resnet`, `dino_v2`, and others), and the `tao-dataservices` coverage check: phase-2-codebase.md; per-task details: task-type-guide.md.
If a new backbone is needed, decide the strategy (timm wrap > re-implement from scratch > HF black-box wrap) before Phase 3 — it changes weight loading, ONNX export, and the deploy pipeline. Never dual-inherit from `transformers.PreTrainedModel` and `BackboneBase` (metaclass conflict).
Gate: reference TAO model identified and all 12 locations read; task-type implications understood (architecture, loss, ONNX outputs, deploy classes, metrics, dataset); backbone coverage decided (reuse / wrap timm / new); dataservices coverage checked.
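The task-to-reference mapping in the goal can be written down as a lookup table. In this sketch, `REFERENCE_MODELS` mirrors the list above, while `PIPELINE_TAG_TO_TASK` is an assumed translation from HuggingFace `pipeline_tag` values to those task types. Instance and panoptic segmentation share the `image-segmentation` tag on the Hub, so they are left for manual selection.

```python
# Task type -> candidate TAO reference models, as listed in the Phase 2 goal.
REFERENCE_MODELS = {
    "classification": ["classification_pyt"],
    "detection": ["dino", "rtdetr"],
    "segmentation": ["segformer"],
    "instance": ["mask2former"],
    "panoptic": ["oneformer"],
    "zero-shot": ["grounding_dino"],
    "depth": ["mono_depth"],
}

# Assumed translation from HF `pipeline_tag` values to the task types above.
PIPELINE_TAG_TO_TASK = {
    "image-classification": "classification",
    "object-detection": "detection",
    "image-segmentation": "segmentation",
    "zero-shot-object-detection": "zero-shot",
    "depth-estimation": "depth",
}


def candidate_references(pipeline_tag):
    """Return candidate reference models, or [] when the tag is unmapped."""
    task = PIPELINE_TAG_TO_TASK.get(pipeline_tag)
    return REFERENCE_MODELS.get(task, [])
```

An empty result is a prompt to stop and confirm the task type with the user rather than a final verdict.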
Phase 3 — TAO Core Configuration & Native Implementation
Goal: write the tao-core config schema and the tao-pytorch trainer + native inference + native evaluation, smoke-testing in between. Use `<model_name>` (`snake_case` from Phase 1) and `<ModelName>` (`PascalCase`). Seven steps: (1) `tao-core` config under `config/<model_name>/` — `ExperimentConfig(CommonExperimentConfig)` MUST contain `model`, `dataset`, `train`, `evaluate`, `inference`, `export`, `gen_trt_engine`, `quantize`; (2) `tao-pytorch` trainer under `cv/<model_name>/` (`build_model()`, `<ModelName>PlModel(TAOLightningModule)`, `train.py`, entrypoint, `experiment_spec.yaml`; new backbone → add+register `cv/backbone_v2/<backbone_name>.py`); (3) multi-GPU/multi-node via the entrypoint's `launch()`; (4) native inference → `result.csv`; (5) native evaluation → `results.json`; (6–7) MLOps wiring (`@monitor_status` → `status.json`). Consistency rules (including `export.onnx_file` vs `gen_trt_engine.onnx_file` and `???` = required `MISSING`) are enforced by the Cross-Phase checklist below.
Full per-step code and the canonical `experiment_spec.yaml`: phase-3-implementation.md (with snippets tao-patterns.md, layout repo-structure.md, per-task task-type-guide.md).
Gates: Step 1 — `ExperimentConfig` imports cleanly in the container; Step 2 — `build_model(cfg)` runs and the PLModel instantiates; overall — all 7 steps complete, smoke tests pass, no missing `__init__.py`.
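Before the in-container import check, a loaded spec can be screened for the eight required sections and for fields still set to the `???` placeholder (the string OmegaConf uses for mandatory missing values). The sketch below works on a plain nested dict and does not use the real `ExperimentConfig` dataclass. Both helper names are hypothetical.

```python
REQUIRED_SECTIONS = (
    "model", "dataset", "train", "evaluate",
    "inference", "export", "gen_trt_engine", "quantize",
)


def missing_sections(spec):
    """Top-level sections the spec lacks."""
    return [name for name in REQUIRED_SECTIONS if name not in spec]


def unresolved_fields(spec, prefix=""):
    """Dotted paths whose value is still the '???' placeholder."""
    paths = []
    for key, value in spec.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            paths.extend(unresolved_fields(value, path + "."))
        elif value == "???":
            paths.append(path)
    return paths
```

A field reported by `unresolved_fields` must be supplied in the YAML or on the command line before the corresponding step can run.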
Phase 4 — Export, Deployment & TensorRT Integration
Goal: ship ONNX export from tao-pytorch, then a TRT engine builder + TRT inference + TRT evaluation in tao-deploy that reuse the tao-core `ExperimentConfig`. Four steps (8–11): ONNX export (`scripts/export.py`, per-task input/output names, `batch_size=-1` ⇒ dynamic batch); TRT engine builder (`gen_trt_engine.py`, subclasses `EngineBuilder` or reuses `ClassificationEngineBuilder`, writes `specs/{gen_trt_engine,inference,evaluate}.yaml`); TRT inference (NumPy-only `ClassificationLoader` → `result.csv`); TRT evaluation (sklearn/pycocotools → `results.json`). Full code and the Phase 3+4 gate: phase-4-deploy.md.
Module pitfall: tao-pytorch and tao-deploy have separate `hydra_runner` and `monitor_status` implementations — use the deploy versions in deploy scripts; `ExperimentConfig` is imported from `nvidia_tao_core` in both repos (same schema, same field paths).
Phase 3+4 gate: all three in-container checks pass — `tao-pytorch` imports + model + ONNX export, and `tao-deploy` imports.
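The `batch_size=-1` ⇒ dynamic batch convention maps naturally onto the `dynamic_axes` argument of `torch.onnx.export`. The helper below is a sketch of that mapping only. Its name and the tensor names `input` and `logits` are placeholders, and the actual export script is the one described in phase-4-deploy.md.

```python
def dynamic_axes_for(input_names, output_names, batch_size):
    """Return a `dynamic_axes` mapping for torch.onnx.export, or None if static."""
    if batch_size != -1:
        return None
    # Mark dimension 0 of every named tensor as a symbolic "batch" axis.
    return {name: {0: "batch"} for name in [*input_names, *output_names]}


axes = dynamic_axes_for(["input"], ["logits"], batch_size=-1)

# Intended use inside an export script (requires torch, so left as a comment):
# torch.onnx.export(model, dummy_input, onnx_path,
#                   input_names=["input"], output_names=["logits"],
#                   dynamic_axes=axes)
```

Whatever names are chosen here must match the input/output names the TRT engine builder expects, which is one of the items on the cross-phase checklist.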
Phase 5 — Packaging & L0 Testing
Goal: register the model as a console_script `'<model_name>=...entrypoint.<model_name>:main'` in both `tao-pytorch/setup.py` and `tao-deploy/setup.py` (deploy entrypoint uses `nvidia_tao_deploy.cv.common.entrypoint.entrypoint_hydra`), and add L0 tests — deploy tests (`tao-deploy/tests/<model_name>/`, subprocess + `--buildOnly` `trtexec`) and trainer tests (`tao-pytorch/tests/cv_unit_test/<model_name>/`, `Trainer(..., fast_dev_run=True)`, markers `@pytest.mark.cv_unit @pytest.mark.<model_name>`). Full code and test layout: phase-5-packaging.md.
Gate: entrypoints registered; pytest files exist and follow the marker convention. Do NOT stop here — proceed directly to Phase 6.
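The marker convention in the gate can be spot-checked mechanically. The sketch below scans a test file's source text for both required markers. `has_required_markers` and the model name `my_model` are hypothetical, and the check is textual, so it does not confirm that the markers are registered with pytest.

```python
import re


def has_required_markers(test_source, model_name):
    """True if the source applies both the cv_unit and <model_name> markers."""
    found = set(re.findall(r"@pytest\.mark\.(\w+)", test_source))
    return {"cv_unit", model_name} <= found


example = (
    "@pytest.mark.cv_unit\n"
    "@pytest.mark.my_model\n"
    "def test_forward():\n"
    "    pass\n"
)
print(has_required_markers(example, "my_model"))
```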
Cross-Phase Data Flow & Consistency Verification
Before Docker testing, verify the artifact chain — `train` produces `<results_dir>/train/<model_name>_model_latest.pth` → `export.checkpoint` → `<results_dir>/export/<model_name>.onnx` → `gen_trt_engine` → `<results_dir>/trt/<model_name>.engine` → `inference.trt_engine` / `evaluate.trt_engine`. Then confirm the consistency checklist: the `*_latest.pth` name; `augmentation.mean`/`std` matching across the training spec, `inference.yaml`, `evaluate.yaml`, and builder `preprocess_mode`; ONNX `input_names`/`output_names`; `export.input_width`/`input_height` vs `dataset.img_size`; `model.head.in_channels` vs `model_params_mapping.py`; shared `classes.txt`; and an `__init__.py` in every package dir (including `scripts/__init__.py` for `get_subtasks()` `pkgutil` discovery). Full interpolation paths, itemized checklist, and config field paths: workflow-consistency.md.
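The artifact chain lends itself to a mechanical check: derive the expected paths once and confirm each stage's config points at what the previous stage produces. The sketch below assumes the spec has already been loaded into a plain dict with interpolations resolved. The helper names are hypothetical and only the field paths named in the text are checked.

```python
from pathlib import PurePosixPath


def artifact_chain(results_dir, model_name):
    """Expected artifact paths, following the layout named in the text."""
    root = PurePosixPath(results_dir)
    return {
        "checkpoint": str(root / "train" / f"{model_name}_model_latest.pth"),
        "onnx": str(root / "export" / f"{model_name}.onnx"),
        "engine": str(root / "trt" / f"{model_name}.engine"),
    }


def chain_is_consistent(spec, results_dir, model_name):
    """Check that each stage consumes what the previous stage produces."""
    expected = artifact_chain(results_dir, model_name)
    return (
        spec["export"]["checkpoint"] == expected["checkpoint"]
        and spec["export"]["onnx_file"] == expected["onnx"]
        and spec["gen_trt_engine"]["onnx_file"] == expected["onnx"]
        and spec["inference"]["trt_engine"] == expected["engine"]
        and spec["evaluate"]["trt_engine"] == expected["engine"]
    )
```

A mismatch between `export.onnx_file` and `gen_trt_engine.onnx_file` is exactly the kind of drift this catches before any container is started.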
Phase 6 — Container Testing & End-to-End Validation
Mandatory — start immediately after Phase 5. All TAO models ship as Docker images; code that only works outside a container is incomplete. Testing runs directly inside the TAO Toolkit container (no Docker image build in the test loop): mount the local source into the Phase-0 image tags, install via `setup.py develop`, and invoke `pytest` / `pylint` / `pydocstyle` / `flake8` directly — use vanilla `pytest` + lint binaries, NOT any `ci/run_functional_tests.py` / `ci/run_static_tests.py` wrappers (those exist only in NVIDIA's internal mirrors; the public `github.com/NVIDIA-TAO/` mirrors have no `ci/` directory).
Steps 16–25, in order: verify the local image tags (16); container `pytest` for tao-core (17), tao-pytorch (18, `-m cv_unit`, `--shm-size=16G`), tao-deploy (19); static/lint tests (20, `pylint --errors-only` + optional `pydocstyle`/`flake8`); wheel builds (21); the end-to-end pipeline (22 — train dry-run + export in one tao-pytorch session, then gen_trt_engine + inference + evaluate in one tao-deploy session, since `--rm` discards installed packages); native-vs-TRT cross-check (23 — FP32 ≈ exact, FP16 ≈ small delta, divergence ⇒ ONNX/TRT issue); interactive debug shells (24); optional release Docker image build (25, distribution-only). Full per-step commands and the fix-and-retest loop: phase-6-container-tests.md; build scripts, runner patterns, requirements, CI conventions: docker-patterns.md.
Phase 6 gate (Done criteria): tao-core / tao-pytorch / tao-deploy unit tests pass in their TAO Toolkit containers; static tests pass (or only legacy lint warnings); wheels build; end-to-end `<model_name>_model_latest.pth` → `model.onnx` → `model.engine` → non-empty `result.csv` and `results.json`; native vs TRT predictions agree within tolerance.
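The native-vs-TRT cross-check in step 23 amounts to an element-wise comparison under a precision-dependent tolerance. In the sketch below the tolerance values are assumptions chosen for illustration. The text only states that FP32 should be near-exact and FP16 should show a small delta, and `predictions_agree` is a hypothetical helper.

```python
import math

# Assumed tolerances; tune per task and metric.
TOLERANCES = {"fp32": 1e-4, "fp16": 1e-2}


def predictions_agree(native, trt, precision="fp32"):
    """Compare per-sample scores from native and TRT runs within a tolerance."""
    if len(native) != len(trt):
        return False
    tol = TOLERANCES[precision]
    return all(math.isclose(a, b, abs_tol=tol) for a, b in zip(native, trt))
```

A failure at FP32 tolerance points at the ONNX export or engine build rather than at reduced-precision rounding.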
Phase 7 — Optimization & Tuning (conditional)
Enter only if Phase 6 passes but accuracy / latency / model size needs improvement. Ask the user for target metrics first. Diagnose (Step 26) across four categories — accuracy too low, TRT-vs-native gap, training too slow, inference too slow — then apply the relevant technique: hyperparameter tuning (27), INT8 quantization (28), channel pruning + retrain (29), knowledge distillation (30), or resolution tuning (31). Full diagnostics, config blocks, YAML overrides, and decision tree: phase-7-optimization.md.
Argument
`$ARGUMENTS`
If provided, interpret `$ARGUMENTS` as the HuggingFace model ID or URL to use as the starting point for Phase 1. If credentials or model short-name are not included, ask the user for them before proceeding.
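Since the argument may be either a bare model ID or a URL, it helps to normalize it once up front. The sketch below is one possible approach. `hf_model_id` is a hypothetical helper, and the handling of `/tree/...`, `/blob/...`, and `/resolve/...` suffixes is an assumption about common Hub URL shapes.

```python
from urllib.parse import urlparse


def hf_model_id(argument):
    """Normalize a HuggingFace URL or bare ID to an 'org/name' (or 'name') ID."""
    text = argument.strip()
    if "://" not in text:
        return text.strip("/")
    parsed = urlparse(text)
    if parsed.netloc not in ("huggingface.co", "www.huggingface.co"):
        raise ValueError(f"not a HuggingFace URL: {text}")
    parts = [p for p in parsed.path.split("/") if p]
    # Drop browsing suffixes such as /tree/main or /blob/main/config.json.
    for marker in ("tree", "blob", "resolve"):
        if marker in parts:
            parts = parts[: parts.index(marker)]
    return "/".join(parts[:2])
```

The normalized ID is what Phase 1 would pass to the inspection step; the model short-name and credentials still need to be confirmed with the user.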