nemotron-speech

Nemotron Speech Skills

Note: "Nemotron Speech" is the public-facing name for what NVIDIA documents today as Riva / Riva NIM. All commands, container images, gRPC APIs, Python imports, and documentation URLs still use "Riva" — the rename is brand-only. Do not rename commands, images, or doc URLs.

Agent: When walking the user through a multi-step workflow, announce each step before presenting it: Step N/M — Step Title (e.g., "Step 1/4 — Deploy the Container").

Purpose

Single entry point for all NVIDIA Nemotron Speech (Riva) NIM workflows: ASR (speech-to-text), TTS (text-to-speech), and NMT (translation). Covers cloud-hosted inference via build.nvidia.com, self-hosted Docker deployment, client-protocol choice for ASR (gRPC, HTTP, WebSocket), custom NeMo model deployment via `riva-build`, ASR pipeline tuning (VAD, diarization, language models), and the prerequisite Docker / NGC / driver setup.

When to Use This Skill

Use this skill for any Nemotron Speech / Riva NIM task — deployment, testing, custom model build, system requirements check, or model selection across ASR / TTS / NMT modalities.

Workflow

Identify the user's task type, then load the corresponding reference file from `references/`. The reference files contain the detailed per-workflow content; this SKILL.md is a routing surface. Load only the reference relevant to the task at hand.

Prerequisites

For self-hosted deployment: NVIDIA AI Enterprise (NVAIE) entitlement, then complete the environment setup: NVIDIA drivers, Docker, Container Toolkit, NGC API key, Riva Python client. See `references/setup.md`.

For cloud-hosted inference: `pip install -U nvidia-riva-client` and a valid `NVIDIA_API_KEY` from https://build.nvidia.com.

Treat `NVIDIA_API_KEY` and `NGC_API_KEY` as secrets: never print, paste, commit, or log real key values. Prefer `--password-stdin` for Docker login and store persistent keys in a credential manager or a `chmod 600` env file rather than world-readable shell startup files.

For self-hosted Docker model caching: host directories mounted at `/opt/nim/.cache` must be writable by the container user (the NIM container runs as `nvs:1000` internally), not just the host user. Run `sudo chown 1000:1000 $LOCAL_NIM_CACHE` after creating the directory so the container can write to it. Avoid world-writable modes, which let any local user replace cached model artifacts. Also avoid `-u "$(id -u):$(id -g)"` on `docker run`, because `/opt/nim/workspace` inside the container isn't writable to arbitrary UIDs. If you see `I/O error Permission denied (os error 13)` during model download, the host directory ownership is the issue.
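
The permission rules above can be checked before starting a container. The following is a minimal sketch using only the Python standard library; the function names `env_file_is_private` and `cache_dir_problems` are hypothetical helpers for illustration and are not part of any NVIDIA tooling. It assumes a POSIX host where file modes and numeric owners are meaningful.

```python
import os
import stat
from typing import List

CONTAINER_UID = 1000  # UID the NIM container runs as, per the note above


def env_file_is_private(path: str) -> bool:
    """Return True if no group/other permission bits are set (e.g. mode 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0


def cache_dir_problems(path: str, expected_uid: int = CONTAINER_UID) -> List[str]:
    """List reasons the host cache directory may fail or be unsafe as a mount."""
    info = os.stat(path)
    problems = []
    if not stat.S_ISDIR(info.st_mode):
        problems.append("not a directory")
    if info.st_uid != expected_uid:
        problems.append(f"owned by uid {info.st_uid}, expected {expected_uid}")
    if info.st_mode & stat.S_IWOTH:
        problems.append("world-writable")
    return problems
```

Calling `cache_dir_problems(os.environ["LOCAL_NIM_CACHE"])` returns an empty list when the directory is owned by UID 1000 and is not world-writable. A non-empty list points at the likely cause of a `Permission denied (os error 13)` during model download.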

Instructions

Match the user's task to one reference file and load only that file; the references are detailed, so progressive disclosure keeps context tight.

Route setup requests for drivers, Docker, Container Toolkit, and NGC to `references/setup.md`.

Route GPU compatibility, deployment readiness, and container health checks to `references/deployment-readiness-checks.md`.

Route model choice across ASR, TTS, and NMT to `references/model-selection.md`.

Route ASR deployment or inference for Parakeet, Canary, Whisper, and Nemotron ASR Streaming to `references/asr.md`.

Route custom-trained NeMo ASR deployment (`.nemo` → RMIR → NIM) to `references/asr-custom.md`.

Route ASR pipeline configuration for VAD, diarization, language models, and chunk size to `references/pipelines.md`.

Route TTS deployment or inference for Magpie to `references/tts.md`.

Route NMT deployment or inference for Riva Translate, language pairs, and DNT tags to `references/nmt.md`.
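
The routing rules above amount to a lookup from task type to a single reference file. As an illustrative sketch only, they could be written as a table; the task labels (`"setup"`, `"readiness"`, and so on) and the function `reference_for` are hypothetical names chosen for this example, while the file paths are the ones listed above.

```python
# Hypothetical task labels mapped to the reference files named in this skill.
ROUTES = {
    "setup": "references/setup.md",
    "readiness": "references/deployment-readiness-checks.md",
    "model-selection": "references/model-selection.md",
    "asr": "references/asr.md",
    "asr-custom": "references/asr-custom.md",
    "pipelines": "references/pipelines.md",
    "tts": "references/tts.md",
    "nmt": "references/nmt.md",
}


def reference_for(task: str) -> str:
    """Return the one reference file to load for a task label."""
    label = task.strip().lower()
    if label not in ROUTES:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(ROUTES)}")
    return ROUTES[label]
```

Returning exactly one path per label mirrors the rule to load only the reference relevant to the task. A request that spans two areas, such as a custom model build that also needs pipeline configuration, would call the function once per label.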

Source of truth

For per-release detail — current model catalog, container IDs, function IDs, voice lists, VRAM minimums, per-model feature support — fetch or open the canonical NVIDIA doc rather than relying on text in this SKILL.md or the references. Each reference file includes its own routing table to the relevant doc pages.

Top-level landing pages:

| Topic | URL |
| --- | --- |
| ASR support matrix | https://docs.nvidia.com/nim/speech/latest/reference/support-matrix/asr.html |
| TTS support matrix | https://docs.nvidia.com/nim/speech/latest/reference/support-matrix/tts.html |
| NMT support matrix | https://docs.nvidia.com/nim/speech/latest/reference/support-matrix/nmt.html |
| Prerequisites (driver / GPU / OS) | https://docs.nvidia.com/nim/speech/latest/get-started/prerequisites.html |
| ASR pipeline configuration | https://docs.nvidia.com/nim/speech/latest/asr/customization/pipeline-configuration.html |
| ASR runtime customization | https://docs.nvidia.com/nim/speech/latest/asr/customization/customization.html |
| Cloud function IDs (per model) | `https://build.nvidia.com/<org>/<model>/api` |
| NGC catalog | https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/models |

Examples

"Deploy a Parakeet ASR NIM" → load `references/asr.md`, follow Option B (self-hosted), Steps 1–4.

"Synthesize speech with Magpie" → load `references/tts.md`, follow Option A (cloud) or Option B (self-hosted).

"Translate English to German" → load `references/nmt.md`, follow the 4-step flow.

"Convert my fine-tuned `.nemo` to a NIM" → load `references/asr-custom.md` for the 4-phase pipeline and `references/pipelines.md` for build-time config.

"Can my GPU run this?" → load `references/deployment-readiness-checks.md` and run the 6-step system check.

"Which Riva model should I use?" → load `references/model-selection.md`, apply the decision framework, then fetch the support matrix for the specific current model name.

Naming & Terminology

Skill brand: Nemotron Speech (public-facing name).

Internal naming preserved: commands (`riva-build`, `riva-deploy`, `riva_streaming_asr_client`), Python client (`riva.client`), gRPC namespace (`nvidia.riva.asr.*`), container registry (`nvcr.io/nim/nvidia/*`), and all NVIDIA documentation URLs still use "Riva". Do not rename these in code, commands, or docs.

Troubleshooting

For task-specific runtime or modality issues, use the relevant reference file (`references/<task>.md`). Cross-cutting readiness checks:

Container does not become ready → `references/deployment-readiness-checks.md` (system check + health check table)

Health check fails → `references/deployment-readiness-checks.md`

`docker pull` from `nvcr.io` returns 403 → `references/setup.md` (Step 5: Docker login)

Wrong base image / model architecture mismatch → `references/asr-custom.md` (Phase 2 base image)

VRAM / GPU compatibility → `references/deployment-readiness-checks.md`, then verify on the support matrix
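
A 403 on `docker pull` from `nvcr.io` usually means the Docker client is not logged in to the registry. The sketch below shows one way to log in while keeping `NGC_API_KEY` off the command line, in line with the `--password-stdin` guidance in Prerequisites. It assumes the Docker CLI is on `PATH` and uses the literal username `$oauthtoken`, which is NGC's general login convention; `references/setup.md` remains the authoritative procedure. The function names are hypothetical.

```python
import os
import subprocess
from typing import List

REGISTRY = "nvcr.io"
NGC_USERNAME = "$oauthtoken"  # literal string, not a shell variable (assumed NGC convention)


def docker_login_argv(registry: str = REGISTRY) -> List[str]:
    """Build the login command; the API key is deliberately absent from argv."""
    return ["docker", "login", registry, "--username", NGC_USERNAME, "--password-stdin"]


def docker_login(registry: str = REGISTRY) -> int:
    """Run docker login, feeding NGC_API_KEY on stdin. Returns docker's exit code."""
    key = os.environ.get("NGC_API_KEY")
    if not key:
        raise RuntimeError("NGC_API_KEY is not set")
    # The key travels over stdin, so it never appears in the process list.
    result = subprocess.run(docker_login_argv(registry), input=key, text=True)
    return result.returncode
```

Because the command is passed as a list rather than through a shell, `$oauthtoken` is sent verbatim and needs no quoting.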

Limitations

x86_64 architecture only. WSL2 on Windows requires Podman and supports a subset of NIMs (see `references/setup.md`).

Self-hosted deployment requires an NVIDIA AI Enterprise license.

Cloud-hosted inference requires an active `NVIDIA_API_KEY` and internet access.

Public skill branding is "Nemotron Speech"; commands, container images, Python imports (`riva.client`), gRPC services (`nvidia.riva.*`), and NVIDIA documentation URLs still use "Riva". Follow official docs and catalogs for naming; do not rename these in commands or code.
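
For the cloud-hosted requirement above, a client can fail fast when `NVIDIA_API_KEY` is missing and avoid ever logging the key. The following sketch builds per-request metadata in plain Python. It rests on several assumptions that should be verified against the model's API page on build.nvidia.com: the metadata keys `function-id` and `authorization`, the `Bearer` prefix, and that the resulting list is what the Riva Python client accepts as `metadata_args`. The helper names are hypothetical, and the function ID is a placeholder to be fetched per model.

```python
import os
from typing import List

API_KEY_ENV = "NVIDIA_API_KEY"


def cloud_metadata(function_id: str) -> List[List[str]]:
    """Build request metadata for a cloud-hosted call (key names are assumptions)."""
    key = os.environ.get(API_KEY_ENV, "").strip()
    if not key:
        raise RuntimeError(f"{API_KEY_ENV} is not set")
    return [
        ["function-id", function_id],
        ["authorization", f"Bearer {key}"],
    ]


def redacted(metadata: List[List[str]]) -> List[List[str]]:
    """Return a copy of the metadata that is safe to print or log."""
    return [[k, "<redacted>" if k == "authorization" else v] for k, v in metadata]
```

Logging `redacted(metadata)` instead of the raw list keeps the key out of logs while still showing which function ID a request targets.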

Next Steps

Verify hardware compatibility: `references/deployment-readiness-checks.md`

Set up the environment: `references/setup.md`

Pick a model: `references/model-selection.md`

Deploy: `references/asr.md`, `references/tts.md`, or `references/nmt.md`
