nemotron-speech
Nemotron Speech Skills
Note: "Nemotron Speech" is the public-facing name for what NVIDIA documents today as Riva / Riva NIM. All commands, container images, gRPC APIs, Python imports, and documentation URLs still use "Riva"; the rename is brand-only. Do not rename commands, images, or doc URLs.
Agent: When walking the user through a multi-step workflow, announce each step before presenting it, in the form Step N/M — Step Title (e.g., "Step 1/4 — Deploy the Container").
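The step-announcement format can be sketched as a small helper. This is only an illustration: the function name `announce_step` is hypothetical and not part of the skill or of any Riva tooling.

```python
def announce_step(index: int, total: int, title: str) -> str:
    """Format a step header as 'Step N/M — Step Title'."""
    if not 1 <= index <= total:
        raise ValueError(f"step {index} is outside 1..{total}")
    return f"Step {index}/{total} \u2014 {title}"


print(announce_step(1, 4, "Deploy the Container"))  # Step 1/4 — Deploy the Container
```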
Purpose
Single entry point for all NVIDIA Nemotron Speech (Riva) NIM workflows: ASR (speech-to-text), TTS (text-to-speech), and NMT (translation). Covers cloud-hosted inference via build.nvidia.com, self-hosted Docker deployment, client-protocol choice for ASR (gRPC, HTTP, WebSocket), custom NeMo model deployment via `riva-build`, ASR pipeline tuning (VAD, diarization, language models), and the prerequisite Docker / NGC / driver setup.
When to Use This Skill
Use this skill for any Nemotron Speech / Riva NIM task — deployment, testing, custom model build, system requirements check, or model selection across ASR / TTS / NMT modalities.
Workflow
Identify the user's task type, then load the corresponding reference file from `references/`. The reference files contain the detailed per-workflow content; this SKILL.md is a routing surface. Load only the reference relevant to the task at hand.
Prerequisites
- For self-hosted deployment: NVIDIA AI Enterprise (NVAIE) entitlement, then complete the environment setup: NVIDIA drivers, Docker, Container Toolkit, NGC API key, Riva Python client. See `references/setup.md`.
- For cloud-hosted inference: `pip install -U nvidia-riva-client` and a valid `NVIDIA_API_KEY` from https://build.nvidia.com.
- Treat `NVIDIA_API_KEY` and `NGC_API_KEY` as secrets: never print, paste, commit, or log real key values. Prefer `--password-stdin` for Docker login and store persistent keys in a credential manager or a `chmod 600` env file rather than world-readable shell startup files.
- For self-hosted Docker model caching: host directories mounted at `/opt/nim/.cache` must be writable by the container user (the NIM container runs as `nvs:1000` internally), not just the host user. Run `sudo chown 1000:1000 $LOCAL_NIM_CACHE` after creating the directory so the container can write to it. Avoid world-writable modes, which let any local user replace cached model artifacts. Also avoid `-u "$(id -u):$(id -g)"` on the docker run command, because `/opt/nim/workspace` inside the container isn't writable to arbitrary UIDs. If you see `I/O error Permission denied (os error 13)` during model download, the host directory ownership is the issue.
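The ownership and permission rules above can be checked before starting a container. The following is a minimal sketch for a POSIX host, not an official NVIDIA check: the helper names `check_cache_dir` and `check_env_file` are hypothetical, and the only facts taken from this document are the container UID (1000), the `LOCAL_NIM_CACHE` variable, and the `chmod 600` recommendation.

```python
import os
import stat

CONTAINER_UID = 1000  # this document states the NIM container runs as nvs:1000


def check_cache_dir(path: str, expected_uid: int = CONTAINER_UID) -> list[str]:
    """Return a list of problems with a host cache directory (empty means OK)."""
    if not os.path.isdir(path):
        return [f"{path} does not exist or is not a directory"]
    info = os.stat(path)
    problems = []
    if info.st_uid != expected_uid:
        problems.append(f"{path} is owned by uid {info.st_uid}, expected {expected_uid}")
    if not info.st_mode & stat.S_IWUSR:
        problems.append(f"{path} is not writable by its owner")
    if info.st_mode & stat.S_IWOTH:
        problems.append(f"{path} is world-writable")
    return problems


def check_env_file(path: str) -> list[str]:
    """Flag a key file that group or other users can access (anything looser than 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        return [f"{path} has mode {mode:o}, expected 600 or stricter"]
    return []


if __name__ == "__main__":
    cache = os.environ.get("LOCAL_NIM_CACHE")
    if cache:
        for problem in check_cache_dir(cache):
            print("cache:", problem)
```

The checks only report problems and never change ownership or modes, so fixing them (for example with the `chown` command above) stays an explicit, visible step.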
Instructions
- Match the user's task to one reference file and load only that file; the references are detailed, so progressive disclosure keeps context tight.
- Route setup requests for drivers, Docker, Container Toolkit, and NGC to `references/setup.md`.
- Route GPU compatibility, deployment readiness, and container health checks to `references/deployment-readiness-checks.md`.
- Route model choice across ASR, TTS, and NMT to `references/model-selection.md`.
- Route ASR deployment or inference for Parakeet, Canary, Whisper, and Nemotron ASR Streaming to `references/asr.md`.
- Route custom-trained NeMo ASR deployment (`.nemo` → RMIR → NIM) to `references/asr-custom.md`.
- Route ASR pipeline configuration for VAD, diarization, language models, and chunk size to `references/pipelines.md`.
- Route TTS deployment or inference for Magpie to `references/tts.md`.
- Route NMT deployment or inference for Riva Translate, language pairs, and DNT tags to `references/nmt.md`.
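The routing rules above amount to a lookup from task type to exactly one reference file. A minimal sketch of that lookup follows; the task labels used as keys and the function name `reference_for` are hypothetical, while the file paths are the ones listed in this section.

```python
# Task labels (keys) are hypothetical; the file paths come from the routing list.
ROUTES: dict[str, str] = {
    "setup": "references/setup.md",
    "readiness": "references/deployment-readiness-checks.md",
    "model-selection": "references/model-selection.md",
    "asr": "references/asr.md",
    "asr-custom": "references/asr-custom.md",
    "pipelines": "references/pipelines.md",
    "tts": "references/tts.md",
    "nmt": "references/nmt.md",
}


def reference_for(task: str) -> str:
    """Return the single reference file to load for a task label."""
    try:
        return ROUTES[task.strip().lower()]
    except KeyError:
        known = ", ".join(sorted(ROUTES))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}") from None


print(reference_for("tts"))  # references/tts.md
```

Returning one path per task, and failing loudly on an unknown label, mirrors the instruction to load only the reference relevant to the task at hand.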
Source of truth
For per-release detail — current model catalog, container IDs, function IDs, voice lists, VRAM minimums, per-model feature support — fetch or open the canonical NVIDIA doc rather than relying on text in this SKILL.md or the references. Each reference file includes its own routing table to the relevant doc pages.
Top-level landing pages:
| Topic | URL |
|---|---|
| ASR support matrix | https://docs.nvidia.com/nim/speech/latest/reference/support-matrix/asr.html |
| TTS support matrix | https://docs.nvidia.com/nim/speech/latest/reference/support-matrix/tts.html |
| NMT support matrix | https://docs.nvidia.com/nim/speech/latest/reference/support-matrix/nmt.html |
| Prerequisites (driver / GPU / OS) | https://docs.nvidia.com/nim/speech/latest/get-started/prerequisites.html |
| ASR pipeline configuration | https://docs.nvidia.com/nim/speech/latest/asr/customization/pipeline-configuration.html |
| ASR runtime customization | https://docs.nvidia.com/nim/speech/latest/asr/customization/customization.html |
| Cloud function IDs (per model) | |
| NGC catalog | https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/models |
Examples
- "Deploy a Parakeet ASR NIM" → load `references/asr.md`, follow Option B (self-hosted), Steps 1–4.
- "Synthesize speech with Magpie" → load `references/tts.md`, follow Option A (cloud) or Option B (self-hosted).
- "Translate English to German" → load `references/nmt.md`, follow the 4-step flow.
- "Convert my fine-tuned `.nemo` to a NIM" → load `references/asr-custom.md` for the 4-phase pipeline and `references/pipelines.md` for build-time config.
- "Can my GPU run this?" → load `references/deployment-readiness-checks.md` and run the 6-step system check.
- "Which Riva model should I use?" → load `references/model-selection.md`, apply the decision framework, then fetch the support matrix for the specific current model name.
Naming & Terminology
- Skill brand: Nemotron Speech (public-facing name).
- Internal naming preserved: commands (`riva-build`, `riva-deploy`, `riva_streaming_asr_client`), Python client (`riva.client`), gRPC namespace (`nvidia.riva.asr.*`), container registry (`nvcr.io/nim/nvidia/*`), and all NVIDIA documentation URLs still use "Riva". Do not rename these in code, commands, or docs.
Troubleshooting
For task-specific runtime or modality issues, use the relevant reference file (`references/<task>.md`). Cross-cutting readiness checks:
- Container does not become ready → `references/deployment-readiness-checks.md` (system check + health check table)
- Health check fails → `references/deployment-readiness-checks.md`
- `docker pull` from `nvcr.io` returns 403 → `references/setup.md` (Step 5 — Docker login)
- Wrong base image / model architecture mismatch → `references/asr-custom.md` (Phase 2 base image)
- VRAM / GPU compatibility → `references/deployment-readiness-checks.md`, then verify on the support matrix
Limitations
- x86_64 architecture only; WSL2 on Windows requires Podman and supports a subset of NIMs (see `references/setup.md`)
- Self-hosted deployment requires an NVIDIA AI Enterprise license
- Cloud-hosted inference requires an active `NVIDIA_API_KEY` and internet access
- Public skill branding is "Nemotron Speech"; commands, container images, Python imports (`riva.client`), gRPC services (`nvidia.riva.*`), and NVIDIA documentation URLs still use "Riva". Follow official docs and catalogs for naming, and do not rename these in commands or code
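Two of these limitations can be checked locally before any deployment work. The sketch below is an assumption-laden illustration rather than a replacement for `references/deployment-readiness-checks.md`: the function name `preflight` is hypothetical, and it only tests the host architecture and whether `NVIDIA_API_KEY` is set. It cannot confirm that a key is active or that a license exists.

```python
import os
import platform
from typing import Mapping, Optional

SUPPORTED_MACHINES = {"x86_64", "amd64"}  # common spellings reported for x86_64 hosts


def preflight(machine: Optional[str] = None,
              env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return human-readable problems; the key value itself is never echoed."""
    machine = platform.machine() if machine is None else machine
    env = os.environ if env is None else env
    problems = []
    if machine.lower() not in SUPPORTED_MACHINES:
        problems.append(f"unsupported architecture: {machine or 'unknown'}")
    if not env.get("NVIDIA_API_KEY", "").strip():
        problems.append("NVIDIA_API_KEY is not set (needed for cloud-hosted inference)")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(problem)
```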
Next Steps
- Verify hardware compatibility: `references/deployment-readiness-checks.md`
- Set up the environment: `references/setup.md`
- Pick a model: `references/model-selection.md`
- Deploy: `references/asr.md`, `references/tts.md`, or `references/nmt.md`