holoscan-install-container

Holoscan NGC Container Installation

Purpose

Pull and verify the official Holoscan SDK container from NGC (`nvcr.io/nvidia/clara-holoscan/holoscan`), selecting the right CUDA/arch tag for the host GPU and validating with the bundled Python and C++ examples.

Prerequisites

  • Linux host with an NVIDIA GPU and a working driver (`nvidia-smi`).
  • Docker installed and the user in the `docker` group (or `sudo`).
  • NVIDIA Container Toolkit installed (`docker run --gpus all` works).
  • ~10–20 GB free disk for the image pull.
  • Network access to `nvcr.io` and `docs.nvidia.com`.
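
The checklist above can be probed from a shell before going further. The sketch below is one possible way to do that; `require_cmd` and `in_docker_group` are hypothetical helper names, and the checks only confirm that tools are on `PATH` and that the group membership exists, not that GPU passthrough works (Step 2 tests that).

```bash
#!/usr/bin/env bash
# Sketch of a prerequisite probe. Helper names are illustrative only.

# Succeeds if the named command is on PATH; prints a note to stderr otherwise.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing: $1" >&2
  return 1
}

# Succeeds if the current user belongs to the "docker" group.
in_docker_group() {
  id -nG | tr ' ' '\n' | grep -qx "docker"
}

# Usage on a real host (not run here):
#   require_cmd nvidia-smi && require_cmd docker
#   in_docker_group || echo "not in the docker group; use sudo or add the user"
```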

Limitations

  • Container images cover only the tag matrix below — no Conda/pip env inside.
  • GUI examples require X11 forwarding; this skill runs Holoviz headless to avoid that.
  • Tag suffix must match the host GPU/driver (cuda13 / cuda12-dgpu / cuda12-igpu) — wrong suffix → CUDA init failures.

Instructions

  • Container repo: `nvcr.io/nvidia/clara-holoscan/holoscan`.
  • The doc page at https://docs.nvidia.com/holoscan/sdk-user-guide/sdk_installation.html is canonical — fetch it if anything below disagrees.
  • Work through the steps below in order: pick the tag, verify GPU passthrough and pull, verify with the six examples, then hand off the launch command.

Step 1: Pick the tag

Tag = `<version>-<suffix>`, e.g. `v4.1.0-cuda13`. Get the current SDK version from the doc page above; pick the suffix from `nvidia-smi` (the "CUDA Version" field, top-right of the table header):

| `nvidia-smi` CUDA Version | Suffix |
| --- | --- |
| 13.x+ | `cuda13` |
| 12.x, Ampere/Ada dGPU | `cuda12-dgpu` |
| 12.x, ARM64 iGPU (nvgpu) | `cuda12-igpu` |

The "CUDA Forward Compatibility mode ENABLED" banner is expected — not an error — when the container ships a newer CUDA minor version than the host driver supports. The forward-compat shim lets the container's CUDA runtime work against the older host driver within the same major version.
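
The suffix lookup can be written down as a small function. This is a minimal sketch, not an official tool: `pick_suffix` and `parse_cuda_version` are hypothetical names, the `dgpu`/`igpu` argument is supplied by the caller rather than detected, and the function does not check that a dGPU is actually Ampere or Ada.

```bash
#!/usr/bin/env bash
# Sketch: map a CUDA version and GPU kind to a tag suffix, per the table above.
# Both helper names are hypothetical.

# Extract "X.Y" from the "CUDA Version: X.Y" field of nvidia-smi header text.
parse_cuda_version() {
  sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# pick_suffix <cuda_version> <dgpu|igpu>
pick_suffix() {
  local cuda_version="$1" gpu_kind="$2"
  local major="${cuda_version%%.*}"
  case "$major" in
    ''|*[!0-9]*) echo "cannot parse CUDA version: $cuda_version" >&2; return 1 ;;
  esac
  if [ "$major" -ge 13 ]; then
    echo "cuda13"
  elif [ "$major" -eq 12 ] && [ "$gpu_kind" = "igpu" ]; then
    echo "cuda12-igpu"
  elif [ "$major" -eq 12 ] && [ "$gpu_kind" = "dgpu" ]; then
    echo "cuda12-dgpu"
  else
    echo "no matching suffix for CUDA $cuda_version ($gpu_kind)" >&2
    return 1
  fi
}

# Usage on a real host (not run here):
#   suffix=$(pick_suffix "$(nvidia-smi | parse_cuda_version)" dgpu)
#   TAG="<version>-${suffix}"
```

Anything the table does not cover, such as CUDA 11.x, returns a non-zero status instead of guessing a suffix.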

Step 2: Verify GPU passthrough, then pull

bash
docker run --rm --gpus all ubuntu:22.04 nvidia-smi 2>&1 | tail -5
If Docker is missing → install from https://docs.docker.com/engine/install/. If GPU passthrough fails → install the NVIDIA Container Toolkit per https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html, then retry.
Pull (~10–20 GB — warn the user before starting):
bash
docker pull nvcr.io/nvidia/clara-holoscan/holoscan:<TAG>
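
Because the pull is large, a free-space check beforehand may save a failed download. The sketch below is one way to do it; `has_free_gb` is a hypothetical helper, and `/var/lib/docker` is assumed to be the Docker data root (its default), which may differ on a given host.

```bash
#!/usr/bin/env bash
# Sketch: check free disk space before a large image pull.
# has_free_gb is a hypothetical helper name.

# Succeeds if the filesystem holding <path> has at least <gb> GiB available.
has_free_gb() {
  local path="$1" need_gb="$2"
  local avail_kb
  # df -P gives stable POSIX columns; with -k, column 4 is available KiB.
  avail_kb=$(df -Pk "$path" | awk 'NR==2 {print $4}')
  [ "$avail_kb" -ge $(( need_gb * 1024 * 1024 )) ]
}

# Usage (not run here). Assumption: images live under /var/lib/docker.
#   has_free_gb /var/lib/docker 20 || echo "under 20 GB free; the pull may fail"
```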

Step 3: Verify with six examples

Tests cover: bare Python binding (1a), bare C++ runtime (1b, 2a), Python + Holoviz/Vulkan (2b, 3a), and C++ + Holoviz/Vulkan (3b). Holoviz examples always run headless (inject `headless: true` into the YAML); this works whether or not a display is attached and avoids GUI failure modes over SSH.
bash
IMG=nvcr.io/nvidia/clara-holoscan/holoscan:<TAG>
RUN=(docker run --rm --runtime=nvidia --gpus all --cap-add CAP_SYS_PTRACE --ipc=host --ulimit memlock=-1 --ulimit stack=67108864)
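
Each check in this step names a string to look for in the output. A small wrapper can make that comparison explicit; the sketch below assumes nothing about Holoscan itself, and `expect_output` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch: run a command and confirm its output contains an expected string.
# expect_output is a hypothetical helper, not part of the Holoscan SDK.

expect_output() {
  local expected="$1"; shift
  local out
  # Capture stdout and stderr together, since loggers often write to stderr.
  out=$("$@" 2>&1) || true
  if printf '%s\n' "$out" | grep -qF -- "$expected"; then
    echo "PASS: found \"$expected\""
  else
    echo "FAIL: \"$expected\" not found" >&2
    return 1
  fi
}

# Usage with the RUN array and IMG defined above (not run here):
#   expect_output "Hello World!" "${RUN[@]}" "$IMG" bash -c \
#     "ulimit -s 32768 && python3 /opt/nvidia/holoscan/examples/hello_world/python/hello_world.py"
```

The match is a fixed-string search (`grep -F`), so punctuation in the expected text is taken literally.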

1a. hello_world (Python) — expect "Hello World!"

"${RUN[@]}" "$IMG" bash -c \
  "ulimit -s 32768 && python3 /opt/nvidia/holoscan/examples/hello_world/python/hello_world.py"

1b. hello_world (C++) — expect "Hello World!"

"${RUN[@]}" "$IMG" bash -c \
  "ulimit -s 32768 && /opt/nvidia/holoscan/examples/hello_world/cpp/hello_world"

2a. tensor_interop (C++) — expect tensors doubling each pass, "Graph execution finished."

"${RUN[@]}" "$IMG" bash -c \
  "ulimit -s 32768 && /opt/nvidia/holoscan/examples/tensor_interop/cpp/tensor_interop"

2b. tensor_interop (Python, 10 frames): Holoviz, headless. The YAML has no headless field by default, so inject one under `holoviz:`. Expect "message received (count: 10)".

"${RUN[@]}" "$IMG" bash -c "
  ulimit -s 32768
  sed -e 's/count: 0/count: 10/' \
      -e 's/repeat: true/repeat: false/' \
      -e 's/realtime: true/realtime: false/' \
      -e 's/^holoviz:/holoviz:\n  headless: true/' \
      /opt/nvidia/holoscan/examples/tensor_interop/python/tensor_interop.yaml > /tmp/ti.yaml
  cd /opt/nvidia/holoscan/examples/tensor_interop/python
  python3 tensor_interop.py --config /tmp/ti.yaml
"
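
To see what the sed edits in 2b do without starting a container, they can be applied to a stand-in YAML. This is a sketch only: the sample content is illustrative rather than the shipped `tensor_interop.yaml`, `patch_headless` is a hypothetical name, and GNU sed is assumed because it expands `\n` in the replacement text.

```bash
#!/usr/bin/env bash
# Sketch: the sed edits from 2b applied to a stand-in YAML.
# The sample content is illustrative, not the shipped tensor_interop.yaml.
# Assumes GNU sed, which expands \n in the replacement text.

patch_headless() {
  sed -e 's/count: 0/count: 10/' \
      -e 's/repeat: true/repeat: false/' \
      -e 's/realtime: true/realtime: false/' \
      -e 's/^holoviz:/holoviz:\n  headless: true/'
}

sample='replayer:
  repeat: true
  realtime: true
  count: 0
holoviz:
  tensors: []'

patched=$(printf '%s\n' "$sample" | patch_headless)
printf '%s\n' "$patched"
```

With GNU sed, the `holoviz:` line in the printed result should be followed directly by `  headless: true`, and the three replayer values should be rewritten.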

3a. video_replayer (Python, 10 frames): Holoviz, headless. Inject `headless: true` under `holoviz:` (above `width: 854`). The same sed works for the C++ YAML in 3b, because both files share the same `holoviz:` section shape.

"${RUN[@]}" "$IMG" bash -c "
  ulimit -s 32768
  sed -e 's/count: 0/count: 10/' \
      -e 's/repeat: true/repeat: false/' \
      -e 's/realtime: true/realtime: false/' \
      -e 's/^  width: 854/  headless: true\n  width: 854/' \
      /opt/nvidia/holoscan/examples/video_replayer/python/video_replayer.yaml > /tmp/vr.yaml
  cd /opt/nvidia/holoscan/examples/video_replayer/python
  HOLOSCAN_INPUT_PATH=/opt/nvidia/holoscan/data python3 video_replayer.py --config /tmp/vr.yaml
"

3b. video_replayer (C++, 10 frames): same headless injection as 3a. The C++ YAML hard-codes `directory: "../data/racerx"`, but HOLOSCAN_INPUT_PATH overrides it, so that field does not need patching.

"${RUN[@]}" "$IMG" bash -c "
  ulimit -s 32768
  sed -e 's/count: 0/count: 10/' \
      -e 's/repeat: true/repeat: false/' \
      -e 's/realtime: true/realtime: false/' \
      -e 's/^  width: 854/  headless: true\n  width: 854/' \
      /opt/nvidia/holoscan/examples/video_replayer/cpp/video_replayer.yaml > /tmp/vr_cpp.yaml
  cd /opt/nvidia/holoscan/examples/video_replayer/cpp
  HOLOSCAN_INPUT_PATH=/opt/nvidia/holoscan/data ./video_replayer --config /tmp/vr_cpp.yaml
"
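
sed exits successfully even when a pattern matches nothing, so if a shipped YAML ever changed (for example a different default width), the file would be left unpatched without any error. A quick check on the result can catch that. The sketch below runs on a stand-in file, not the shipped YAML; `yaml_is_patched` is a hypothetical helper, and GNU sed is assumed for `-i` and `\n`.

```bash
#!/usr/bin/env bash
# Sketch: confirm a patched YAML really contains the intended settings.
# yaml_is_patched is a hypothetical helper name.

yaml_is_patched() {
  local file="$1"
  grep -q '^  headless: true' "$file" &&
    grep -q 'count: 10' "$file" &&
    grep -q 'repeat: false' "$file" &&
    grep -q 'realtime: false' "$file"
}

# Self-contained demo on a stand-in file.
demo=$(mktemp)
printf 'replayer:\n  repeat: true\n  realtime: true\n  count: 0\nholoviz:\n  width: 854\n' > "$demo"

yaml_is_patched "$demo" || echo "before patch: not patched"

# Same four edits as in 3a and 3b, applied in place.
sed -i -e 's/count: 0/count: 10/' \
       -e 's/repeat: true/repeat: false/' \
       -e 's/realtime: true/realtime: false/' \
       -e 's/^  width: 854/  headless: true\n  width: 854/' "$demo"

yaml_is_patched "$demo" && echo "after patch: ok"
```

Inside the container, the same four `grep -q` checks could be run against `/tmp/vr.yaml` or `/tmp/vr_cpp.yaml` between the sed step and the example launch.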

Step 4: Launch command

bash
docker run -it --rm \
  --runtime=nvidia --gpus all --cap-add CAP_SYS_PTRACE \
  --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
  nvcr.io/nvidia/clara-holoscan/holoscan:<TAG>

Examples: `/opt/nvidia/holoscan/examples/`

Mount files: `-v /host/path:/container/path`

GUI examples: add `-v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=$DISPLAY`
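
The optional mount and GUI flags can be folded into the launch command with a bash array, which keeps each argument intact when paths contain spaces. This is a sketch under assumptions: `build_launch_cmd`, its argument order, and the `/workspace/data` container path are hypothetical; the flags themselves are the ones listed in this step.

```bash
#!/usr/bin/env bash
# Sketch: assemble the launch command as an array and print it for review.
# build_launch_cmd and its arguments are hypothetical; flags come from Step 4.

# build_launch_cmd <tag> [host:container mount] [yes|no for GUI flags]
build_launch_cmd() {
  local tag="$1" mount="${2:-}" gui="${3:-no}"
  LAUNCH=(docker run -it --rm
          --runtime=nvidia --gpus all --cap-add CAP_SYS_PTRACE
          --ipc=host --ulimit memlock=-1 --ulimit stack=67108864)
  # Optional bind mount, given as /host/path:/container/path.
  if [ -n "$mount" ]; then
    LAUNCH+=(-v "$mount")
  fi
  # Optional X11 passthrough for GUI examples.
  if [ "$gui" = "yes" ]; then
    LAUNCH+=(-v /tmp/.X11-unix:/tmp/.X11-unix -e "DISPLAY=${DISPLAY:-}")
  fi
  LAUNCH+=("nvcr.io/nvidia/clara-holoscan/holoscan:${tag}")
}

build_launch_cmd "<TAG>" "$HOME/data:/workspace/data" yes
printf '%q ' "${LAUNCH[@]}"; echo
# To run it for real: "${LAUNCH[@]}"
```

Printing with `%q` shows the command in a form that can be pasted back into a shell, which is convenient when handing it to the user.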


Next:
- Explore: `ls /opt/nvidia/holoscan/examples/`
- Walk through one: `/holoscan-explain-example`

Troubleshooting

  • `docker: Error response from daemon: could not select device driver "nvidia"`. NVIDIA Container Toolkit is missing or not configured. Install per the link in Step 2 and restart Docker.
  • CUDA init failure inside the container. Tag suffix doesn't match the host. Re-check the `nvidia-smi` CUDA Version and the table in Step 1.
  • Segmentation fault when launching an example. `ulimit -s 32768` wasn't applied inside the container. Use the `bash -c "ulimit -s 32768 && ..."` pattern shown in Step 3.
  • Holoviz example hangs / no window over SSH. YAML wasn't patched to `headless: true`. Use the `sed` injection shown in Step 3.
  • `video_replayer` can't find data. Set `HOLOSCAN_INPUT_PATH=/opt/nvidia/holoscan/data`; it overrides the YAML's hard-coded path.
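
Two of the symptoms above have distinctive text that can be matched in a captured log. The sketch below is illustrative only: `diagnose` is a hypothetical helper, and apart from the quoted Docker error, the match strings are assumptions about how a symptom appears in the output.

```bash
#!/usr/bin/env bash
# Sketch: map a captured error log to the matching fix from the list above.
# diagnose is a hypothetical helper; only the Docker error text is quoted
# from the list, the other match string is an assumption.

diagnose() {
  local log="$1"
  case "$log" in
    *'could not select device driver "nvidia"'*)
      echo "Install and configure the NVIDIA Container Toolkit, then restart Docker." ;;
    *'Segmentation fault'*)
      echo "Run the example via: bash -c \"ulimit -s 32768 && ...\"" ;;
    *)
      echo "No known match; re-check the tag suffix and HOLOSCAN_INPUT_PATH." ;;
  esac
}

diagnose 'docker: Error response from daemon: could not select device driver "nvidia"'
```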