physical-ai-video-data-augmentation
Physical AI Video Data Augmentation Workflow Orchestrator
Default workflow skill for VDA execution on OSMO. It owns flow selection,
preflight, cache readiness, inference-path decisions, submit-time interpolation,
monitoring, and output retrieval. Component skills are consult-only.
Purpose
Run the end-to-end VDA workflow safely and reproducibly from preflight to output
download.
Do NOT use this skill for container-internal tuning-only questions.
Prerequisites
Confirm these before running preflight or any submit. Missing required secrets
surface as `USER_INPUT_REQUIRED:` from `scripts/preflight_credentials.sh`.

| Requirement | How it is satisfied | Used for |
|---|---|---|
| NGC API key (optional) | | Optional for |
| Hugging Face token | | Creates the OSMO |
| OSMO CLI access | | Submitting/monitoring workflows and listing/downloading objects |
| GPU pool | At least one | Scheduling setup + worker tasks |

Optional (only for the strict NGC org/team probe): `NGC_ORG` + `NGC_TEAM`
(or `NGC_CLI_ORG` / `NGC_CLI_TEAM`). External VLM/LLM endpoint keys are validated
separately, not by preflight.

Key handling rule: `nvapi-*` tokens are first-class inputs for `nvcr_io`.
Never reject by token prefix alone; use workflow registry probe results as the
source of truth.
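
As a rough illustration of how a caller might react to the `USER_INPUT_REQUIRED:` marker, the sketch below scans captured preflight output for it. Only the marker string comes from this document; the sample output and the `HF_TOKEN` label after the marker are hypothetical, and the real message format may differ.

```python
MARKER = "USER_INPUT_REQUIRED:"


def missing_inputs(preflight_output: str) -> list[str]:
    """Return the text that follows each USER_INPUT_REQUIRED: marker."""
    found = []
    for line in preflight_output.splitlines():
        if MARKER in line:
            # Keep whatever the script printed after the marker.
            found.append(line.split(MARKER, 1)[1].strip())
    return found


# Hypothetical preflight output; the real wording may differ.
sample = "checking credentials\nUSER_INPUT_REQUIRED: HF_TOKEN\nprobe skipped"
print(missing_inputs(sample))
```

An empty result means preflight did not ask for anything; a non-empty one is the cue to ask a single unblock question and stop.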
Instructions
- Select the workflow (`auto_labeling`, `augmentation_and_al`, `e2e`, `e2e_super_resolution`) from user intent.
- Provide a tentative execution-time overview before starting run actions.
- Run preflight and readiness checks before submit.
- Derive submit-time values from the active dataset backend (never guess `storage_url`).
- Submit the workflow with explicit interpolation values and monitor to completion.
- Retrieve outputs, provide side-by-side comparison evidence for augmented flows, and summarize task outcomes.

Use `run_script(...)` for script execution. Canonical examples:

```python
run_script("bash scripts/preflight_credentials.sh --workflow assets/configs/osmo/augmentation_and_al.yaml")
run_script("python3 scripts/pre_submit_guard.py --workflow assets/configs/osmo/auto_labeling.yaml")
run_script("bash scripts/prepare_demo_assets.sh /srv/sdg/data/vda_inputs")
```

Available Scripts
Use script-level `--help` for exact arguments.

| Script | Role |
|---|---|
| `scripts/preflight_credentials.sh` | Secrets/control-plane preflight and workflow image access checks |
| `scripts/pre_submit_guard.py` | Submit-time interpolation, cache, and dataset safety checks |
| `scripts/prepare_demo_assets.sh` | Demo video pull + flatten for default demo path |
| | Setup-time config and cookbook projection generation |
| | Augmentation worker execution |
| | Original-video auto-labeling worker execution |
| | Augmented-video auto-labeling worker execution |
| | Multi-node barrier synchronization |
| `scripts/stage_run_artifacts.sh` | Local mirror of full run output + input video |
| `scripts/render_side_by_side.sh` | Side-by-side comparison render from local artifacts |

Supported Flows
| Flow | OSMO YAML | Group sequence | Typical use |
|---|---|---|---|
| `augmentation_and_al` | | setup -> augmentation -> auto_labeling_augmented | Augment one or more videos, then auto-label augmented outputs |
| `auto_labeling` | | setup -> auto_labeling | Label original videos only |
| `e2e` | | setup -> (auto_labeling_original + augmentation) -> auto_labeling_augmented | Throughput-first path |
| `e2e_super_resolution` | | setup -> auto_labeling_original -> augmentation -> auto_labeling_augmented | Sequential path with SR gate before augmentation |

Legacy alias `assets/configs/osmo/augmentation_and_pl.yaml` remains for
backwards compatibility.

Pick the right workflow for the user's request
| User intent | Workflow |
|---|---|
| "Label my source videos" / "PL-only" / "no augmentation" | `auto_labeling` |
| "Create augmented videos and label them" | `augmentation_and_al` |
| "Run the full pipeline quickly" | `e2e` |
| "Run full pipeline, but gate on SR-enhanced originals first" | `e2e_super_resolution` |
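
The intent-to-workflow mapping can be sketched as an ordered keyword lookup. This is a minimal illustration, not the skill's actual selection logic: the keyword lists and the `select_flow` helper are hypothetical, and only the four flow names and the `augmentation_and_al` default come from this document.

```python
DEFAULT_FLOW = "augmentation_and_al"

# Hypothetical keyword rules, checked in order; the most specific intent comes first
# so that "no augmentation" is matched before the generic "augment" rule.
RULES = [
    (("super-resolution", "super resolution", "sr-enhanced", "sr gate"), "e2e_super_resolution"),
    (("full pipeline", "end-to-end", "e2e"), "e2e"),
    (("pl-only", "no augmentation", "label my source", "label original"), "auto_labeling"),
    (("augment",), "augmentation_and_al"),
]


def select_flow(request: str) -> str:
    """Pick a flow name from free-form user text, falling back to the default."""
    text = request.lower()
    for keywords, flow in RULES:
        if any(keyword in text for keyword in keywords):
            return flow
    return DEFAULT_FLOW


for phrase in (
    "Label my source videos",
    "Create augmented videos and label them",
    "Run the full pipeline quickly",
    "Run full pipeline, but gate on SR-enhanced originals first",
):
    print(f"{phrase!r} -> {select_flow(phrase)}")
```

Rule order matters here: a request that mentions both the full pipeline and an SR gate should resolve to the gated flow, so that rule is checked first.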
Disambiguation: handle vague requests before committing
Default to autonomy: ask only when missing information blocks execution.
Autonomous defaults (do NOT ask)
- If dataset source is absent, run VDA demo path (`scripts/prepare_demo_assets.sh`) and continue with `dataset=vda-demo`.
- If flow is not explicitly requested, default to `augmentation_and_al`.
- If endpoint mode is unspecified, default to in-cluster persistent NIM reuse and automatic NIM deploy/repair when unhealthy.
- If cache is missing, run `setup_model_cache.yaml`, rerun pre-submit guard, and continue automatically on success.
- After any stage completes successfully, continue to the next stage immediately. Do not pause with "Ready when you are" or equivalent approval prompts.
Triggers that should pause for disambiguation
| Missing input | Why it matters | Ask |
|---|---|---|
| `USER_INPUT_REQUIRED:` in preflight output | Required secret is missing | Ask one concise unblock question for exactly the missing value(s) |
| Storage backend prefix cannot be derived from the active dataset/upload root | Wrong scheme causes runtime storage auth mismatch | "What is the backend-native root prefix for this run?" |
| No ONLINE GPU pool/platform can be selected | Workflow cannot schedule setup/workers | "Which GPU pool/platform should this run target?" |

When NOT to disambiguate
- Do not ask for cookbook unless user explicitly asks to change scene profile.
- Do not offer external endpoints by default.
- Do not ask A/B cache strategy questions; default is automatic cache setup.
- Do not ask to scale down existing NIMs; this is forbidden.
- Do not invent, scrape, or generate random videos when input is missing.
- Do not use non-VDA demo sources (for example Carline adaptation assets) unless the user explicitly requests a different dataset.
Step 0: Select Flow and Gather Inputs
Input video policy (non-negotiable)
- Always preserve user-provided video inputs (dataset URL, local path, or upload folder) as first-class and preferred.
- Never replace an explicit user video with demo assets or any other source.
- If no video input is provided, default to VDA demo assets via `scripts/prepare_demo_assets.sh` (HF dataset flow) without asking extra source-selection questions.
- If the user explicitly mentions an input video or dataset, prefer and use that input instead of demo assets.
- Use only VDA demo assets (`nvidia/video-data-augmentation-demo`) for the default demo path.
- Never propose arbitrary web clip downloads or placeholder videos unless the user explicitly requests that behavior.

Collect only missing values:
- Dataset source (prefer explicit user-provided `dataset_url` or local upload folder; otherwise default to VDA demo assets and proceed).
- Flow (`auto_labeling`, `augmentation_and_al`, `e2e`, `e2e_super_resolution`); default to `augmentation_and_al` when unspecified.
- OSMO `gpu_platform` for all VDA resources (auto-select an ONLINE platform when unambiguous; ask only when no valid option exists).
- Endpoint mode (default in-cluster NIM reuse/deploy unless explicitly overridden).

Do not guess `gpu_platform` (for example `microk8s`). Use the exact current
platform label shown by `osmo pool list --mode free` (for example `gpu`).

Generate run stamp before each submit:

```bash
STAMP=$(cat /proc/sys/kernel/random/uuid | cut -c1-8)
RUN_ID="run-$STAMP"
```

Execution Time Overview (required before run)
Before running any mutating command (`osmo credential set`, NIM install/repair,
cache workflow submit, or target VDA workflow submit), provide a short ETA
overview to the user.

Keep it concise (one short paragraph or 4-6 bullets) and include:
- whether this looks like a cold start (NIM/cache missing) or warm start (NIM/cache already healthy),
- major phases with approximate durations,
- a total expected range for the selected workflow.
Baseline ranges (from observed MicroK8s + OSMO runs):
| Phase | Typical duration |
|---|---|
| Credentials + preflight | ~1-2 min |
| NIM deploy/download/warmup (if needed) | ~10-15 min |
| Demo assets download/upload (if demo path) | ~1-3 min |
| Model cache population (if needed) | ~15-25 min |
| Workflow submit + queue/start | ~1-3 min |
Workflow runtime ranges after submit:
| Flow | Typical runtime |
|---|---|
| | ~6-15 min |
| | ~20-35 min |
| | ~22-40 min |
| | ~25-45 min |
Cold-start end-to-end runs are commonly ~45-80 min; warm-start runs are usually
~20-45 min depending on flow and video length.
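
One way to assemble the total range for the ETA overview is to add the applicable phase ranges to the runtime range of the selected flow. The sketch below is illustrative only: the phase numbers are copied from the baseline table above, while the `eta_range` helper, its parameters, and the example runtime of 20-35 minutes are assumptions chosen for the demonstration.

```python
# Baseline phase ranges in minutes, copied from the baseline table above.
PHASES = {
    "preflight": (1, 2),
    "nim_deploy": (10, 15),
    "demo_assets": (1, 3),
    "model_cache": (15, 25),
    "submit": (1, 3),
}


def eta_range(runtime, nim_ready, cache_ready, demo_path):
    """Sum the phases that apply to this run and add the flow runtime range."""
    included = ["preflight", "submit"]  # always needed
    if not nim_ready:
        included.append("nim_deploy")   # cold start: NIM must be deployed
    if not cache_ready:
        included.append("model_cache")  # cold start: cache must be populated
    if demo_path:
        included.append("demo_assets")
    low = runtime[0] + sum(PHASES[name][0] for name in included)
    high = runtime[1] + sum(PHASES[name][1] for name in included)
    return low, high


example_runtime = (20, 35)  # assumed runtime range for the selected flow
cold = eta_range(example_runtime, nim_ready=False, cache_ready=False, demo_path=True)
warm = eta_range(example_runtime, nim_ready=True, cache_ready=True, demo_path=True)
print(f"cold start: ~{cold[0]}-{cold[1]} min, warm start: ~{warm[0]}-{warm[1]} min")
```

A plain sum treats every phase as strictly sequential, so it lands slightly above the commonly observed ranges quoted above; treat the result as a rough bound rather than a prediction.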

Common Preconditions (all flows)
- Credential and control-plane preflight

  ```bash
  bash scripts/preflight_credentials.sh --workflow assets/configs/osmo/<mode>.yaml
  ```

  Restricted egress:

  ```bash
  bash scripts/preflight_credentials.sh --no-probe --workflow assets/configs/osmo/<mode>.yaml
  ```

  Preflight does not require a workload-local `.env`. Runtime interpolation is driven by submit-time values (`dataset`, `run_id`, `gpu_platform`, `video`, `storage_url`, `skills_dir`) supplied in one `--set-string` list.

  Passing `--workflow` validates pull access for the active workflow image refs (`workflow.groups[].tasks[].image`) using anonymous bearer access with credential fallback when provided. If replacement NGC/HF secrets are provided in env, preflight refreshes existing `nvcr_io`/`hf_token` automatically when present. Use `--refresh` to force overwrite even when no new env secrets were supplied:

  ```bash
  bash scripts/preflight_credentials.sh --workflow assets/configs/osmo/<mode>.yaml --refresh
  ```

  If output contains `USER_INPUT_REQUIRED:`, ask one concise unblock question and stop.

  On workflow image `401/403`, report registry access failure after probe checks on the listed image refs; do not claim a key family (for example `nvapi-*`) is categorically unsupported.

- Storage interpolation policy

  `storage_url` must be derived from the actual dataset/upload backend for the current run.

  ```text
  dataset_url=azure://storiondevxah69/osmo-workflows/datasets/vda-demo
  storage_url=azure://storiondevxah69/osmo-workflows
  dataset=vda-demo
  ```

  Never silently default to stale `s3://` values on non-S3 backends.

- Inference policy (non-negotiable)
  - Reuse healthy in-cluster persistent NIM endpoints by default.
  - If missing/unhealthy, deploy automatically; this is a prerequisite, not a user decision. Do NOT pause to ask; run the install with the VDA allow-list:

    ```bash
    export NIM_SERVICES="qwen3-vl qwen25-14b"
    skills/physical-ai-infrastructure-setup-and-resilient-scaling/components/inference-nim-operator/scripts/install.sh
    ```

  - See `references/nim/README.md` for full endpoint docs and health checks.
  - External endpoints are opt-in only (explicit request or explicit URLs); only then skip the in-cluster deploy.
  - Never infer external mode from credential presence.
  - Never scale down/delete existing NIMs to free GPUs.

- Readiness guard

  ```bash
  osmo pool list --mode free
  osmo config show POD_TEMPLATE
  python3 scripts/pre_submit_guard.py --workflow assets/configs/osmo/<mode>.yaml
  ```

- Cache auto-remediation

  If `pre_submit_guard.py` reports cache failure, default action is to run:

  ```bash
  osmo workflow submit assets/configs/osmo/setup_model_cache.yaml \
    --set-string storage_url=<backend-prefix> path=data
  ```

  Then rerun `pre_submit_guard.py` and submit the target VDA flow only after it passes. Ask user only when backend/prefix is ambiguous or cache setup fails.

- Scheduling policy

  VDA templates schedule setup and workers on `gpu_platform` (no `system` pool dependency for user workloads).
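
The storage interpolation policy above can be illustrated with a small helper that derives `storage_url` and `dataset` from a dataset URL instead of guessing them. This is a sketch under one assumption: that dataset URLs follow the `<storage_url>/datasets/<dataset>` layout seen in the example. The `split_dataset_url` name is hypothetical and not part of the skill's scripts.

```python
def split_dataset_url(dataset_url: str) -> tuple[str, str]:
    """Split '<storage_url>/datasets/<dataset>' into (storage_url, dataset).

    Assumes the '/datasets/' layout shown in the example; any other layout
    is rejected rather than guessed.
    """
    prefix, sep, dataset = dataset_url.rstrip("/").rpartition("/datasets/")
    if not sep or not prefix or not dataset or "/" in dataset:
        raise ValueError(f"cannot derive storage_url from {dataset_url!r}")
    return prefix, dataset


storage_url, dataset = split_dataset_url(
    "azure://storiondevxah69/osmo-workflows/datasets/vda-demo"
)
print(f"storage_url={storage_url}")
print(f"dataset={dataset}")
```

Raising on an unexpected layout matches the policy: when the prefix cannot be derived, the right move is to ask for the backend-native root prefix, not to fall back to a stale scheme.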

Submit (all flows)
Every flow uses the same submit shape; only the workflow YAML changes. Choose the
YAML for the requested flow, then run the command below. Full per-flow walkthroughs
(stage matrix and flow details) live in the linked references.
| Flow | Workflow YAML | Walkthrough |
|---|---|---|
| Augmentation + auto-labeling | | |
| Auto-labeling only | | |
| E2E (parallel) | | |
| E2E (super-resolution gated) | | |

```bash
SKILLS_DIR="$(cd "$(git rev-parse --show-toplevel)/skills/physical-ai-video-data-augmentation" && pwd)"
STAMP=$(cat /proc/sys/kernel/random/uuid | cut -c1-8)
osmo workflow submit assets/configs/osmo/<flow>.yaml \
  --pool <pool> \
  --set-string \
  dataset=<dataset> \
  run_id=run-$STAMP \
  storage_url=<backend-prefix> \
  gpu_platform=<gpu-platform> \
  video=<video-stem> \
  cosmos_model_cache_url=<backend-prefix>/data/models/cosmos_transfer \
  auto_labeling_model_cache_url=<backend-prefix>/data/models/auto_labeling \
  skills_dir="$SKILLS_DIR"
```

Compatibility note:
- Use exactly one `--set-string` flag and pass all key/value pairs after it.
- Do not repeat `--set` / `--set-string` flags in the same command; some OSMO builds only honor the last occurrence.
- Do not mix `--set` and `--set-string` in one submit command.
- Pass explicit `*_model_cache_url` values to avoid nested-template interpolation differences across OSMO environments.
- Do not brute-force permutations of flags. Use this shape directly.

Common optional overrides (append key/value pairs to the same `--set-string` list):

```bash
cookbook=<scene_profile> \
vlm_url=<openai_base_url> \
llm_url=<openai_base_url> \
cosmos_model_cache_url=<url> \
auto_labeling_model_cache_url=<url>
```

The auto-labeling-only flow has no augmentation stage, so it omits
`cosmos_model_cache_url` at runtime; passing it is harmless and keeps one submit
shape across flows.
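
For callers that assemble the submit command programmatically, the single-flag shape can be kept intact by building the argument list in one place. The sketch below only constructs and prints the command; it does not run it. The `build_submit_argv` helper, the pool name `default`, the `gpu` platform label, the video stem `clip_001`, and the `skills_dir` path are all hypothetical placeholders.

```python
import shlex
import uuid


def build_submit_argv(flow_yaml: str, pool: str, values: dict) -> list:
    """Build an `osmo workflow submit` argv with exactly one --set-string flag."""
    if not values:
        raise ValueError("at least one key/value pair is required")
    argv = ["osmo", "workflow", "submit", flow_yaml, "--pool", pool, "--set-string"]
    # All key/value pairs follow the single flag, in insertion order.
    argv += [f"{key}={value}" for key, value in values.items()]
    return argv


prefix = "azure://storiondevxah69/osmo-workflows"
values = {
    "dataset": "vda-demo",
    "run_id": f"run-{uuid.uuid4().hex[:8]}",  # same 8-character stamp as the bash form
    "storage_url": prefix,
    "gpu_platform": "gpu",        # hypothetical label; read it from `osmo pool list`
    "video": "clip_001",          # hypothetical video stem
    "cosmos_model_cache_url": f"{prefix}/data/models/cosmos_transfer",
    "auto_labeling_model_cache_url": f"{prefix}/data/models/auto_labeling",
    "skills_dir": "/workspace/skills/physical-ai-video-data-augmentation",  # hypothetical path
}
argv = build_submit_argv("assets/configs/osmo/augmentation_and_al.yaml", "default", values)
print(shlex.join(argv))
```

Optional overrides such as `cookbook` or `vlm_url` would be added to the same `values` dict, which keeps them behind the one `--set-string` flag.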

OSMO Monitoring

```bash
# Workflow status + task states
osmo workflow query <workflow_id> --format-type json \
  | jq '{status, tasks: [.groups[].tasks[] | {name, status, exit_code}]}'

# Logs for a specific task
osmo workflow logs <workflow_id> --task <task_name> -n 200

# Output retrieval
osmo data list --no-pager <output_url>
osmo data download <output_url> <local_dir>/
```

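
Where `jq` is not available, the same projection can be done on the JSON text in Python. This is a minimal sketch that mirrors the `jq` filter above; it assumes the query output has the `status` and `groups[].tasks[]` fields that filter reads. The sample document, its task names, and its status strings are hypothetical.

```python
import json


def summarize(workflow_json: str) -> dict:
    """Project workflow JSON to {status, tasks: [{name, status, exit_code}]}."""
    doc = json.loads(workflow_json)
    tasks = [
        {"name": t.get("name"), "status": t.get("status"), "exit_code": t.get("exit_code")}
        for group in doc.get("groups", [])
        for t in group.get("tasks", [])
    ]
    return {"status": doc.get("status"), "tasks": tasks}


def failed_tasks(summary: dict) -> list:
    """Names of tasks that reported a non-zero exit code."""
    return [t["name"] for t in summary["tasks"] if t["exit_code"] not in (None, 0)]


# Hypothetical query output; real field values may differ.
sample = json.dumps({
    "status": "RUNNING",
    "groups": [
        {"tasks": [{"name": "setup", "status": "COMPLETED", "exit_code": 0}]},
        {"tasks": [
            {"name": "augmentation", "status": "FAILED", "exit_code": 1},
            {"name": "auto_labeling_augmented", "status": "PENDING"},
        ]},
    ],
})
summary = summarize(sample)
print(summary["status"], failed_tasks(summary))
```

Tasks that have not finished yet simply carry `None` as their exit code, which matches the `null` that `jq` would emit for a missing key.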
For completion artifacts, always mirror the full run output into workspace:

```bash
ROOT="$(git rev-parse --show-toplevel)"
RUN_LOCAL_DIR="$ROOT/media/vda/runs/<run_id>"
mkdir -p "$RUN_LOCAL_DIR"
osmo data download "<storage_url>/datasets/<dataset>-outputs/<run_id>/" "$RUN_LOCAL_DIR/"
```

For runs expected to exceed two minutes, send heartbeat updates at least every
two minutes. For media evidence, emit one standalone `MEDIA:<absolute-path>`
line per message bubble.

Execution continuity requirement:
- Heartbeats must report progress while continuing work; they are status updates, not permission prompts.
- Do not stop between green stages waiting for approval.
- Pause only on blocking failures or explicit user stop/redirect.
- If submit fails on interpolation, rerun once with the same canonical single-flag shape and corrected values; do not loop through ad-hoc flag experiments.
MEDIA formatting is strict:
- Emit exactly one line: `MEDIA:/absolute/path/to/file.mp4`
- Keep `MEDIA:` contiguous on a single line (never split across lines).
- No extra text in the same bubble.
- No code fences, bullets, or quotes around the directive.
- If render fails: retry once from a stable workspace path, then emit PNG fallback.
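
The formatting rules above are easy to enforce with a small guard before a message is sent. The sketch below is one possible check, not part of the skill's scripts: the `media_line` helper and the example path are hypothetical, and it treats "absolute" as a POSIX-style path starting with `/`.

```python
def media_line(path: str) -> str:
    """Return a single-line MEDIA directive for an absolute path, or raise."""
    if not path.startswith("/"):
        raise ValueError(f"MEDIA path must be absolute: {path!r}")
    if "\n" in path or "\r" in path:
        raise ValueError("MEDIA directive must stay on a single line")
    # No quotes, bullets, or fences: the directive is the whole message.
    return f"MEDIA:{path}"


# Hypothetical local render path under the workspace run copy.
print(media_line("/workspace/media/vda/runs/run-1a2b3c4d/side_by_side.mp4"))
```

Returning the bare string makes it straightforward to send it as the only content of a message bubble.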

Post-Run Comparison Evidence (required for augmented flows)
Applies to `augmentation_and_al`, `e2e`, and `e2e_super_resolution` after a
successful run.

Required completion output (do not stop at raw output URLs):
- Stage full outputs + input video into workspace-local path:

  ```bash
  bash scripts/stage_run_artifacts.sh \
    --storage-url <storage_url> --dataset <dataset> --run-id <run_id> --video <video>
  ```

- Render side-by-side from that local run copy:

  ```bash
  bash scripts/render_side_by_side.sh \
    --run-local-dir "<repo>/media/vda/runs/<run_id>" --dataset <dataset> --video <video>
  ```

- Emit MEDIA from the local run copy and include:
  - augmentation summary from `<run_local_dir>/setup_b0/configs/manifest.yaml` (`sampled_vars` for `<video>_aug0`)
  - auto-labeling summary from `<run_local_dir>/outputs/pseudo_labeled_augmented/<video>_aug0`
  - for `e2e`/`e2e_super_resolution`, original-label summary from `<run_local_dir>/outputs/pseudo_labeled/<video>`

If `ffmpeg` is unavailable, emit input and augmented MEDIA from the same local
run copy and still provide augmentation + auto-labeling summaries.

For demo runs (no user video provided), explicitly state that input came from
`nvidia/video-data-augmentation-demo`.

Supporting files
Use these canonical locations:
- Workflows: `assets/configs/osmo/*.yaml`
- Runtime scripts: `scripts/*.sh`, `scripts/*.py`
- Flow walkthroughs: `references/flows/*.md`
- Setup and triage: `references/setup.md`, `references/troubleshooting.md`
- Images and endpoint policy: `references/container-images.md`, `references/nim/README.md`
- Cookbook tuning: `assets/cookbooks/TUNING_GUIDE.md`