| User Intent | Action |
|---|---|
| Deploy, install, set up, start RAG | Read and follow |
| Configure, enable, change, toggle a feature | Use the Configure section below |
| Troubleshoot, debug, fix, error, unhealthy | Read and follow |
| Stop, shutdown, tear down, clean up | Read and follow |
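
The routing table above can be read as a keyword match on the user's request. A minimal sketch follows, assuming a hypothetical helper `route_intent` and hypothetical labels (`deploy`, `configure`, `troubleshoot`, `shutdown`). Neither comes from the original skill. The match order is also an assumption: troubleshooting keywords win over the rest, so a request like "fix the deploy error" routes to troubleshooting.

```shell
# Hypothetical helper: map a free-form request to one of the four intent rows.
route_intent() {
  # Lowercase the request so keyword matching is case-insensitive.
  request=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$request" in
    *troubleshoot*|*debug*|*fix*|*error*|*unhealthy*) echo "troubleshoot" ;;
    *stop*|*shutdown*|*"tear down"*|*"clean up"*)      echo "shutdown" ;;
    *configure*|*enable*|*change*|*toggle*)            echo "configure" ;;
    *deploy*|*install*|*"set up"*|*start*)             echo "deploy" ;;
    *)                                                 echo "unknown" ;;
  esac
}

route_intent "Enable query rewriting"
```

Plain substring matching is deliberately crude (for example, "restart" contains "start"), so treat the result as a first guess rather than a decision.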

`references/deploy.md`

| Feature Keywords | Reference |
|---|---|
| VLM, VLM embeddings, image captioning | |
| NeMo Guardrails | |
| Query rewriting, decomposition, multi-turn | |
| Ingestion (text-only, audio, Nemotron Parse, OCR, batch CLI, NV-Ingest, volume mount, performance) | |
| Search, retrieval, hybrid search, multi-collection, metadata, filters, reranker, topK, accuracy/performance | |
| LLM/embedding/ranking model changes, vector DB, Milvus/Elasticsearch auth, service keys, model profiles, ports/GPU | |
| Reasoning, self-reflection, prompts, generation params (tokens, temperature, citations), per-request LLM params | |
| Summarization | |
| Observability (tracing, Zipkin, Grafana, Prometheus) | |
| Multimodal query (image + text) | |
| Data catalog (collection/document metadata) | |
| User interface (UI settings) | |
| API reference (endpoints, schemas) | |
| Evaluation (RAGAS metrics) | |
| MCP server & client, agent toolkit | |
| Migration (version upgrades) | |
| Notebooks (setup and catalog) | |

    echo "=== NIM ===" && docker ps --format '{{.Names}}' 2>/dev/null | grep -iE '(nim-llm|nemoretriever-embedding|nemoretriever-ranking|nemo-vlm|nemotron-vlm)' || echo "NO_LOCAL_NIMS"
    echo "=== RAG ===" && docker ps --format '{{.Names}}' 2>/dev/null | grep -iE '(rag-server|ingestor-server|milvus)' || echo "NO_DOCKER_RAG"
    # grep . fails on empty output, so NO_K8S is printed when no pods are listed
    echo "=== K8S ===" && kubectl get pods -n rag 2>/dev/null | head -5 | grep . || echo "NO_K8S"
    echo "=== LIBRARY ===" && ps aux 2>/dev/null | grep -E '(nvidia_rag|uvicorn.*rag)' | grep -v grep || echo "NO_LIBRARY"

| Local NIMs running? | RAG services running? | Deployment Type | Config Location |
|---|---|---|---|
| Yes (Docker) | Any | Self-hosted | |
| No | Yes (Docker) | NVIDIA-hosted | |
| Yes (K8s pods) | Any | Self-hosted | |
| No | Yes (K8s pods) | NVIDIA-hosted | |
| — | Library processes | Library mode | |
| No | No | Not running | Deploy first via |
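
The decision table above can be sketched as a small lookup. This is only an illustration: `classify_deployment` and its argument values (`docker`, `k8s`, `library`, `none`) are hypothetical names, and the precedence (local NIMs are checked before library processes) is an assumption, since the table does not say which row wins when both apply.

```shell
# Hypothetical helper: turn the two probe results into a deployment type.
# $1 = where local NIMs were found:    docker | k8s | none
# $2 = where RAG services were found:  docker | k8s | library | none
classify_deployment() {
  nims=$1
  rag=$2
  case "$nims:$rag" in
    docker:*|k8s:*)       echo "Self-hosted" ;;    # local NIMs, any RAG state
    none:docker|none:k8s) echo "NVIDIA-hosted" ;;  # RAG services without local NIMs
    none:library)         echo "Library mode" ;;
    none:none)            echo "Not running" ;;
    *)                    echo "Unknown" ;;
  esac
}

classify_deployment none docker
```

Keeping the classification separate from the probing commands makes it possible to check the table logic without Docker or kubectl installed.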

`deploy/compose/.env`

    docker exec rag-server env 2>/dev/null | grep -E "<VAR_NAME>"

    kubectl get pod -n rag -l app=rag-server -o jsonpath='{.items[0].spec.containers[0].env}' 2>/dev/null

    nvidia-smi --query-gpu=index,name,memory.total,memory.used --format=csv,noheader 2>/dev/null || echo "NO_GPU"

    source <env-file> && docker compose -f deploy/compose/<compose-file> up -d

| Service | Compose File |
|---|---|
| rag-server | |
| ingestor-server | |
| milvus, etcd, minio | |
| NIM containers (LLM, embedding, ranking, VLM, OCR) | |
| guardrails | |
| observability (Grafana, Prometheus, Zipkin) | |

`values.yaml`

    helm upgrade rag <chart> -n rag -f values.yaml

`notebooks/config.yaml`

    docker ps --format "table {{.Names}}\t{{.Status}}" | head -20
    curl -s "http://localhost:8081/v1/health?check_dependencies=true" 2>/dev/null | head -1

    kubectl get pods -n rag; kubectl rollout status deployment/rag-server -n rag --timeout=120s

    curl -s http://localhost:8081/v1/health 2>/dev/null | head -1

`references/troubleshoot.md`
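
Services can take a while to come up after a restart, so a single health check may fail even when the change itself is fine. One way to handle this is to retry the check for a bounded number of attempts. The sketch below assumes a hypothetical helper `wait_until_healthy`. It is not part of the original skill and relies only on the exit status of whatever command it is given.

```shell
# Hypothetical helper: retry a health probe until it succeeds or attempts run out.
# Usage: wait_until_healthy <max_attempts> <delay_seconds> <command> [args...]
wait_until_healthy() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # The probe's exit status decides success; its output is discarded.
    if "$@" >/dev/null 2>&1; then
      echo "healthy after $attempt attempt(s)"
      return 0
    fi
    attempt=$((attempt + 1))
    # Sleep only if another attempt will follow.
    if [ "$attempt" -le "$max_attempts" ]; then
      sleep "$delay"
    fi
  done
  echo "still unhealthy after $max_attempts attempt(s)"
  return 1
}
```

A possible invocation against the health endpoint used in this section is `wait_until_healthy 12 5 curl -sf "http://localhost:8081/v1/health"`. The `-f` flag is an addition here: it makes curl exit non-zero on HTTP error responses, which the plain `curl -s` form does not do.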

    grep -E "^(export )?(ENABLE_|APP_)" <config-file> 2>/dev/null | sort

`docs/support-matrix.md`

`docs/service-port-gpu-reference.md`

| GPU | Feature Restrictions |
|---|---|
| B200 | No VLM, No Guardrails, No Nemotron Parse. May need multi-GPU LLM ( |
| RTX PRO 6000 | No Nemotron Parse. No Audio on Helm. |
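
The restrictions above can be checked against the GPU name reported by `nvidia-smi` before a feature is enabled. A minimal sketch follows, assuming a hypothetical helper `gpu_restrictions` and assuming the reported name contains the model string (`B200` or `RTX PRO 6000`). The multi-GPU note for B200 is left out because it is incomplete in the table.

```shell
# Hypothetical helper: look up the restrictions listed in the table for a GPU name.
gpu_restrictions() {
  case "$1" in
    *B200*)           echo "No VLM, No Guardrails, No Nemotron Parse" ;;
    *"RTX PRO 6000"*) echo "No Nemotron Parse, No Audio on Helm" ;;
    *)                echo "No restrictions listed" ;;
  esac
}

gpu_restrictions "NVIDIA B200"
```

On a machine with GPUs, the names can be fed in with `nvidia-smi --query-gpu=name --format=csv,noheader | while read -r name; do gpu_restrictions "$name"; done`. The two docs files named above remain the authoritative source for anything beyond these two rows.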