nemotron-speech

Nemotron Speech Skills

Note: "Nemotron Speech" is the public-facing name for what NVIDIA documents today as Riva / Riva NIM. All commands, container images, gRPC APIs, Python imports, and documentation URLs still use "Riva" — the rename is brand-only. Do not rename commands, images, or doc URLs.
Agent: When walking the user through a multi-step workflow, announce each step before presenting it: Step N/M — Step Title (e.g., "Step 1/4 — Deploy the Container").
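The step-announcement format above can be sketched in Python as follows. This is only an illustration: the function name `announce_step` is hypothetical and not part of any Riva tooling, and only the announcement pattern itself comes from the note above.

```python
def announce_step(n: int, total: int, title: str) -> str:
    """Format a step announcement in the 'Step N/M — Step Title' style."""
    # Reject step numbers that fall outside the workflow.
    if not 1 <= n <= total:
        raise ValueError(f"step {n} is outside 1..{total}")
    return f"Step {n}/{total} — {title}"


print(announce_step(1, 4, "Deploy the Container"))
```

Validating the step number up front keeps a miscounted workflow from producing a misleading announcement such as step 5 of 4.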

Purpose

Single entry point for all NVIDIA Nemotron Speech (Riva) NIM workflows: ASR (speech-to-text), TTS (text-to-speech), and NMT (translation). Covers cloud-hosted inference via build.nvidia.com, self-hosted Docker deployment, client-protocol choice for ASR (gRPC, HTTP, WebSocket), custom NeMo model deployment via `riva-build`, ASR pipeline tuning (VAD, diarization, language models), and the prerequisite Docker / NGC / driver setup.

When to Use This Skill

Use this skill for any Nemotron Speech / Riva NIM task — deployment, testing, custom model build, system requirements check, or model selection across ASR / TTS / NMT modalities.

Workflow

Identify the user's task type, then load the corresponding reference file from `references/`. The reference files contain the detailed per-workflow content; this SKILL.md is a routing surface. Load only the reference relevant to the task at hand.

Prerequisites

  • For self-hosted deployment: NVIDIA AI Enterprise (NVAIE) entitlement, then complete the environment setup: NVIDIA drivers, Docker, Container Toolkit, NGC API key, Riva Python client. See `references/setup.md`.
  • For cloud-hosted inference: `pip install -U nvidia-riva-client` and a valid `NVIDIA_API_KEY` from https://build.nvidia.com.
  • Treat `NVIDIA_API_KEY` and `NGC_API_KEY` as secrets: never print, paste, commit, or log real key values. Prefer `--password-stdin` for Docker login and store persistent keys in a credential manager or a `chmod 600` env file rather than world-readable shell startup files.
  • For self-hosted Docker model caching: host directories mounted at `/opt/nim/.cache` must be writable by the container user (the NIM container runs as `nvs:1000` internally), not just the host user. Run `sudo chown 1000:1000 $LOCAL_NIM_CACHE` after creating the directory so the container can write to it. Avoid world-writable modes, which let any local user replace cached model artifacts. Also avoid `-u "$(id -u):$(id -g)"` on the `docker run` command, because `/opt/nim/workspace` inside the container isn't writable to arbitrary UIDs. If you see `I/O error Permission denied (os error 13)` during model download, the host directory ownership is the issue.
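The cache-directory requirements above (owned by UID 1000, not world-writable) could be verified before starting the container with a small check like the one below. This is a minimal sketch for POSIX hosts; the function name `check_cache_dir` and its messages are hypothetical, and the default UID of 1000 is taken from the `nvs:1000` note above.

```python
import os
import stat


def check_cache_dir(path, expected_uid=1000):
    """Return a list of problems found with a NIM cache directory (empty if none)."""
    problems = []
    info = os.stat(path)
    if not stat.S_ISDIR(info.st_mode):
        problems.append(f"{path} is not a directory")
        return problems
    # The container user must own the directory so it can write model files.
    if info.st_uid != expected_uid:
        problems.append(
            f"{path} is owned by UID {info.st_uid}, expected {expected_uid}"
        )
    # World-writable caches let any local user replace cached model artifacts.
    if info.st_mode & stat.S_IWOTH:
        problems.append(f"{path} is world-writable")
    return problems
```

A call such as `check_cache_dir(os.environ["LOCAL_NIM_CACHE"])` returns an empty list when the directory looks usable; any returned message points at the ownership or mode to fix before retrying the model download.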

Instructions

  • Match the user's task to one reference file and load only that file; the references are detailed, so progressive disclosure keeps context tight.
  • Route setup requests for drivers, Docker, Container Toolkit, and NGC to `references/setup.md`.
  • Route GPU compatibility, deployment readiness, and container health checks to `references/deployment-readiness-checks.md`.
  • Route model choice across ASR, TTS, and NMT to `references/model-selection.md`.
  • Route ASR deployment or inference for Parakeet, Canary, Whisper, and Nemotron ASR Streaming to `references/asr.md`.
  • Route custom-trained NeMo ASR deployment (`.nemo` → RMIR → NIM) to `references/asr-custom.md`.
  • Route ASR pipeline configuration for VAD, diarization, language models, and chunk size to `references/pipelines.md`.
  • Route TTS deployment or inference for Magpie to `references/tts.md`.
  • Route NMT deployment or inference for Riva Translate, language pairs, and DNT tags to `references/nmt.md`.
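The routing rules above amount to a lookup from task type to a single reference file. The sketch below shows one way to express that; the task-type labels (`"setup"`, `"readiness"`, and so on) and the `route` function are hypothetical names chosen for illustration, while the file paths are the ones listed above.

```python
# Hypothetical task-type labels mapped to the reference files listed above.
ROUTES = {
    "setup": "references/setup.md",
    "readiness": "references/deployment-readiness-checks.md",
    "model-selection": "references/model-selection.md",
    "asr": "references/asr.md",
    "asr-custom": "references/asr-custom.md",
    "pipelines": "references/pipelines.md",
    "tts": "references/tts.md",
    "nmt": "references/nmt.md",
}


def route(task_type: str) -> str:
    """Return the one reference file to load for a given task type."""
    key = task_type.strip().lower()
    if key not in ROUTES:
        raise KeyError(
            f"unknown task type {task_type!r}; expected one of {sorted(ROUTES)}"
        )
    return ROUTES[key]


print(route("tts"))
```

Returning exactly one path per task reflects the "load only that file" rule: the caller never receives a list to load wholesale.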

Source of truth

For per-release detail — current model catalog, container IDs, function IDs, voice lists, VRAM minimums, per-model feature support — fetch or open the canonical NVIDIA doc rather than relying on text in this SKILL.md or the references. Each reference file includes its own routing table to the relevant doc pages.
Top-level landing pages:

Examples

"Deploy a Parakeet ASR NIM" → load `references/asr.md`, follow Option B (self-hosted), Steps 1–4.
"Synthesize speech with Magpie" → load `references/tts.md`, follow Option A (cloud) or Option B (self-hosted).
"Translate English to German" → load `references/nmt.md`, follow the 4-step flow.
"Convert my fine-tuned `.nemo` to a NIM" → load `references/asr-custom.md` for the 4-phase pipeline and `references/pipelines.md` for build-time config.
"Can my GPU run this?" → load `references/deployment-readiness-checks.md` and run the 6-step system check.
"Which Riva model should I use?" → load `references/model-selection.md`, apply the decision framework, then fetch the support matrix for the specific current model name.

Naming & Terminology

  • Skill brand: Nemotron Speech (public-facing name).
  • Internal naming preserved: commands (`riva-build`, `riva-deploy`, `riva_streaming_asr_client`), Python client (`riva.client`), gRPC namespace (`nvidia.riva.asr.*`), container registry (`nvcr.io/nim/nvidia/*`), and all NVIDIA documentation URLs still use "Riva". Do not rename these in code, commands, or docs.

Troubleshooting

For task-specific runtime or modality issues, use the relevant reference file (`references/<task>.md`). Cross-cutting readiness checks:
  • Container does not become ready: `references/deployment-readiness-checks.md` (system check + health check table)
  • Health check fails: `references/deployment-readiness-checks.md`
  • `docker pull` from `nvcr.io` returns 403: `references/setup.md` (Step 5, Docker login)
  • Wrong base image / model architecture mismatch: `references/asr-custom.md` (Phase 2 base image)
  • VRAM / GPU compatibility: `references/deployment-readiness-checks.md`, then verify on the support matrix

Limitations

  • x86_64 architecture only; WSL2 on Windows requires Podman and supports a subset of NIMs (see `references/setup.md`)
  • Self-hosted deployment requires an NVIDIA AI Enterprise license
  • Cloud-hosted inference requires an active `NVIDIA_API_KEY` and internet access
  • Public skill branding is "Nemotron Speech"; commands, container images, Python imports (`riva.client`), gRPC services (`nvidia.riva.*`), and NVIDIA documentation URLs still use "Riva". Follow official docs and catalogs for naming; do not rename these in commands or code
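Two of the limitations above, the x86_64 requirement and the need for `NVIDIA_API_KEY` in cloud-hosted use, can be checked mechanically before starting a workflow. The sketch below is an assumption-laden illustration: `preflight` is a hypothetical helper, it treats `AMD64` (the value Windows reports) as equivalent to `x86_64`, and it deliberately never prints the key value.

```python
import os
import platform


def preflight(machine=None, env=None):
    """Return a list of limitation violations for the current host (empty if none)."""
    machine = machine or platform.machine()
    env = os.environ if env is None else env
    issues = []
    # Only x86_64 hosts are supported; Windows reports the same CPU as AMD64.
    if machine.lower() not in ("x86_64", "amd64"):
        issues.append(f"unsupported architecture: {machine}")
    # Report only presence or absence of the key, never its value.
    if not env.get("NVIDIA_API_KEY"):
        issues.append("NVIDIA_API_KEY is not set (needed for cloud-hosted inference)")
    return issues
```

Calling `preflight()` with no arguments inspects the current machine and environment; the optional parameters exist so the check can be exercised without touching real credentials.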

Next Steps

  • Verify hardware compatibility: `references/deployment-readiness-checks.md`
  • Set up the environment: `references/setup.md`
  • Pick a model: `references/model-selection.md`
  • Deploy: `references/asr.md`, `references/tts.md`, or `references/nmt.md`