vss-deploy-video-embedding
VSS Video Embedding (RT-Embed)
Use this skill when you need to:
- Deploy the VSS Video Embedding microservice from a Docker Compose file.
- Generate text or video embeddings against the Cosmos-Embed1-448p model.
- Embed an uploaded file, an HTTP/S3/file/data URL, or a live RTSP stream.
- Wire the service into a VSS deployment alongside Redis, Kafka, and OpenTelemetry.
- Triage readiness, model-download, GPU, or stream-reconnection failures.
Trigger phrases: `vss-deploy-video-embedding`, `RT-Embed`, `rtvi-embed`, `video embedding service`, `Cosmos-Embed1`, `embed live stream`, `embed video file`, `generate video embeddings`, `text embedding for video search`.
Service Snapshot
- VSS 3.2 GA skill: `vss-deploy-video-embedding`.
- Legacy 3.1 name: RT-Embed.
- Compose service: `rtvi-embed`.
- Container name: `vss-rtvi-embed`.
- Image: `nvcr.io/nvstaging/vss-core/vss-rt-embed` (override with `RTVI_EMBED_IMAGE`).
- Default tag: `3.2.0-26.05.4` (override with `RTVI_EMBED_TAG`).
- Profile: `bp_developer_search_2d`.
- Container port: `8000` (host-side `${RTVI_EMBED_PORT}`).
- Default model: `cosmos-embed1-448p` from `nvidia/Cosmos-Embed1-448p`.
- Health endpoint: `GET /v1/ready`.
- Healthcheck startup grace: `1200s` (20 minutes) on first boot.
Prerequisites
Before bringing the service up:
- NVIDIA driver + NVIDIA Container Toolkit installed; default runtime set to `nvidia`.
- Docker Engine and Docker Compose plugin recent enough to support `${VAR:+value}` conditional volume substitution.
- `docker login nvcr.io` completed with `$oauthtoken` and a valid NGC API key.
- Host environment provides at minimum: `RTVI_EMBED_PORT`, `VSS_DATA_DIR`, `NGC_API_KEY`, and optionally `HF_TOKEN` to avoid Hugging Face 429 rate-limit errors during the Cosmos-Embed1 weights download.
- Free disk space for persistent caches: `rtvi-hf-cache`, `rtvi-ngc-model-cache`, `rtvi-triton-model-repo` (multi-GB).
See `references/deploy-vss-deploy-video-embedding.md` for the full prerequisite list and `references/environment.md` for the variable matrix.
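The required host variables listed above can be checked before running `docker compose up`. The following is a minimal sketch, not part of the VSS repository; the function name `missing_env` is hypothetical and it relies on bash indirect expansion.

```bash
# Hypothetical helper (not shipped with VSS): report which of the given
# host variables are unset or empty.
missing_env() {
  local name missing=()
  for name in "$@"; do
    # ${!name:-} is bash indirect expansion: the value of the variable
    # whose name is stored in $name, or empty if it is unset.
    if [ -z "${!name:-}" ]; then
      missing+=("$name")
    fi
  done
  # Print one missing name per line; return non-zero if any are missing.
  if [ "${#missing[@]}" -gt 0 ]; then
    printf '%s\n' "${missing[@]}"
    return 1
  fi
  return 0
}
```

A possible call is `missing_env RTVI_EMBED_PORT VSS_DATA_DIR NGC_API_KEY || echo "set the variables listed above first" >&2`. `HF_TOKEN` is left out of the check because it is optional.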
Deploy
For standalone RT-Embed, work from the service directory:
```bash
cd "{{repo_root}}/deploy/docker/services/rtvi/rtvi-embed"
```
Do not use `/vss-deploy-profile` or `scripts/dev-profile.sh` for this standalone deployment.
Set a minimal standalone environment before `docker compose up`:
```bash
export RTVI_EMBED_PORT=8017
export VSS_DATA_DIR="${VSS_DATA_DIR:-$(pwd)/.standalone-data}"
export NGC_API_KEY="<your-ngc-api-key>"
export HOST_IP="$(hostname -I | awk '{print $1}')"
export HF_TOKEN="${HF_TOKEN:-}"   # optional, but recommended to avoid HF 429s
mkdir -p "${VSS_DATA_DIR}/data_log/vst/clip_storage"
export RTVI_EMBED_KAFKA_ENABLED=false
export ENABLE_REDIS_ERROR_MESSAGES=false
```
This avoids mounting `/data_log/vst/clip_storage` from filesystem root when `VSS_DATA_DIR` is unset, and prevents startup stalls from missing Kafka/Redis peers in standalone mode.
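To see why an unset `VSS_DATA_DIR` matters, the sketch below reproduces the two expansions in plain bash. It assumes the Compose file builds the clip-storage path by prefixing `VSS_DATA_DIR`, which is inferred from the note above rather than quoted from the file; both function names are hypothetical.

```bash
# Illustration only: how a clip-storage path expands with and without
# VSS_DATA_DIR. Function names are hypothetical.
clip_storage_source() {
  # Plain expansion: an unset or empty VSS_DATA_DIR leaves a path
  # anchored at the filesystem root.
  printf '%s\n' "${VSS_DATA_DIR:-}/data_log/vst/clip_storage"
}

clip_storage_mount() {
  # ${VAR:+value} expands to value only when VAR is set and non-empty,
  # so the entry becomes empty instead of pointing at /data_log/...
  printf '%s\n' "${VSS_DATA_DIR:+${VSS_DATA_DIR}/data_log/vst/clip_storage}"
}
```

With `VSS_DATA_DIR` unset, `clip_storage_source` prints `/data_log/vst/clip_storage` and `clip_storage_mount` prints an empty line. Exporting a default, as the environment block does, makes both resolve under the chosen data directory.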
```bash
# Bring up the service under the required Compose profile.
docker compose -f rtvi-embed-docker-compose.yml \
  --profile bp_developer_search_2d up -d rtvi-embed

# Watch logs while the model downloads and Triton repo builds.
docker compose -f rtvi-embed-docker-compose.yml logs -f rtvi-embed
```
First-boot startup may take 20 minutes for the Cosmos-Embed1 download and Triton model repository build. Do not shorten the `start_period: 1200s` healthcheck during the first boot or the container will be marked unhealthy while still warming up.
Verify
```bash
BASE_URL="http://localhost:${RTVI_EMBED_PORT}"
curl -fsS "$BASE_URL/v1/ready"                 # 200 when warm.
curl -fsS "$BASE_URL/v1/ready?detailed=true"   # Component-level status.
curl -fsS "$BASE_URL/v1/version"
MODELS_JSON=$(curl -fsS "$BASE_URL/v1/models")
echo "$MODELS_JSON"                            # Confirms cosmos-embed1-448p is loaded.
MODEL_ID="$(echo "$MODELS_JSON" | jq -r '.data[0].id // empty')"
test -n "$MODEL_ID" || { echo "ERROR: /v1/models has no model id; wait until /v1/ready is 200" >&2; exit 1; }
```
The sections below that call the API reuse `$BASE_URL` and `$MODEL_ID` from this block.
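Because first boot can take up to 20 minutes, it may be convenient to poll `/v1/ready` instead of retrying by hand. The following is a minimal sketch; `wait_for_ready` is a hypothetical helper, and its default timeout of 1200 seconds simply mirrors the healthcheck `start_period`.

```bash
# Hypothetical helper: poll a readiness URL until it answers with a
# success status or the deadline passes.
# Arguments: URL, timeout in seconds, poll interval in seconds.
wait_for_ready() {
  local url="$1" timeout="${2:-1200}" interval="${3:-10}"
  local deadline=$(( SECONDS + timeout ))
  while :; do
    # -f makes curl exit non-zero on HTTP errors such as 503.
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    if [ "$SECONDS" -ge "$deadline" ]; then
      echo "timed out after ${timeout}s waiting for $url" >&2
      return 1
    fi
    sleep "$interval"
  done
}
```

A possible call is `wait_for_ready "$BASE_URL/v1/ready" 1200 10`, followed by the `/v1/models` lookup once it returns 0.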
Common Operations
Generate video embeddings from an uploaded file
```bash
FILE_ID=$(curl -fsS -X POST "$BASE_URL/v1/files" \
  -F purpose=vision \
  -F media_type=video \
  -F file=@/path/to/clip.mp4 | jq -r .id)
curl -fsS -X POST "$BASE_URL/v1/generate_video_embeddings" \
  -H "Content-Type: application/json" \
  -d "{
    \"id\": \"$FILE_ID\",
    \"model\": \"$MODEL_ID\",
    \"chunk_duration\": 60,
    \"chunk_overlap_duration\": 10
  }"
```
Generate text embeddings (for text-to-video search)
```bash
curl -fsS -X POST "$BASE_URL/v1/generate_text_embeddings" \
  -H "Content-Type: application/json" \
  -d "{\"text_input\":\"a forklift moving pallets\",\"model\":\"${MODEL_ID}\"}"
```
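Hand-escaped JSON works for a fixed query, but a search string containing quotes or backslashes would break the request body. One option, sketched here with the `jq` tool already used in this document, is to let `jq` build the payload; `text_embedding_payload` is a hypothetical helper name.

```bash
# Hypothetical helper: build the text-embedding request body with jq so
# that quotes and backslashes in the query are escaped correctly.
text_embedding_payload() {
  local text="$1" model="$2"
  # -n: read no input; -c: compact one-line output;
  # --arg binds a shell string as a JSON string.
  jq -cn --arg text "$text" --arg model "$model" \
    '{text_input: $text, model: $model}'
}
```

The result can be passed to the same endpoint with `-d "$(text_embedding_payload 'a "red" forklift' "$MODEL_ID")"`.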
Embed a live RTSP stream
Live streams require `stream: true` and `chunk_duration > 0`. A synchronous call returns `400 BadParameters: "Only streaming output is supported for live-streams"`, and the `chunk_duration: 0` returned by `streams/add` is a placeholder; it must be overridden on the embed request or you get `400 BadParameter: "chunk_duration must be greater than 0"`.
Related endpoints and fields: `POST /v1/streams/add`, `liveStreamUrl`, `stream_id`, `GET /v1/streams/get-stream-info`.
```bash
STREAM_ID=$(curl -fsS -X POST "$BASE_URL/v1/streams/add" \
  -H "Content-Type: application/json" \
  -d '{"streams":[{"liveStreamUrl":"rtsp://host:port/live/video","description":"camera-001"}]}' \
  | jq -r '.results[0].id')
curl -N -X POST "$BASE_URL/v1/generate_video_embeddings" \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d "{
    \"id\": \"$STREAM_ID\",
    \"model\": \"$MODEL_ID\",
    \"stream\": true,
    \"chunk_duration\": 10,
    \"chunk_overlap_duration\": 2
  }"

# List registered live streams (use this to recover stream_ids across sessions).
curl -fsS "$BASE_URL/v1/streams/get-stream-info"

# Stop embedding for the stream when done (terminates SSE with data: [DONE]).
curl -fsS -X DELETE "$BASE_URL/v1/generate_video_embeddings/$STREAM_ID"
```
See `references/rest-api.md` for the full endpoint catalog, SSE streaming, and single-stream control-plane patterns.
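The streaming call above emits server-sent events until the `data: [DONE]` sentinel. The sketch below shows one way to consume that output in a pipeline. It assumes each event arrives as a single `data: ` line, which is an assumption about the wire format (the payload schema is not described here), and `read_sse_data` is a hypothetical helper name.

```bash
# Hypothetical helper: read an SSE stream on stdin, print each data
# payload on its own line, and stop at the [DONE] sentinel.
read_sse_data() {
  local line payload
  while IFS= read -r line; do
    line="${line%$'\r'}"              # tolerate CRLF line endings
    case "$line" in
      "data: [DONE]") return 0 ;;     # sentinel: stream finished cleanly
      "data: "*)
        payload="${line#data: }"
        printf '%s\n' "$payload"
        ;;
      *) ;;                           # skip blank separators and other fields
    esac
  done
  return 1                            # input ended without the sentinel
}
```

Piping the `curl -N` request into `read_sse_data` yields one payload per line, and a non-zero exit status signals that the connection dropped before `[DONE]` arrived.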
Logs, Metrics, And Status
```bash
docker compose -f rtvi-embed-docker-compose.yml ps
docker compose -f rtvi-embed-docker-compose.yml logs -f rtvi-embed
docker stats vss-rtvi-embed
curl -fsS "$BASE_URL/v1/metrics"        # Prometheus.
curl -fsS "$BASE_URL/v1/assets/stats"   # Asset storage counts and TTL.
```
If `RTVI_EMBED_LOG_DIR` is bound to a host directory, log files are also available at `/opt/nvidia/rtvi/log/rtvi/` on the host.
Integration Surface
- Inputs: REST API on `:${RTVI_EMBED_PORT}` (`POST /v1/files`, `POST /v1/generate_text_embeddings`, `POST /v1/generate_video_embeddings`, live-stream control endpoints).
- Outputs: Synchronous REST responses, optional SSE for chunked video embeddings, optional Kafka messages on the topics named by `RTVI_EMBED_KAFKA_TOPIC` (container `KAFKA_TOPIC`) and `RTVI_EMBED_ERROR_MESSAGE_TOPIC` (container `ERROR_MESSAGE_TOPIC`) when Kafka is enabled (host: `RTVI_EMBED_KAFKA_ENABLED=true`, which Compose maps to container `KAFKA_ENABLED`).
- Optional peers: Redis (`ENABLE_REDIS_ERROR_MESSAGES=true`), Kafka (host: `RTVI_EMBED_KAFKA_ENABLED=true` → container `KAFKA_ENABLED`), OpenTelemetry collector (host: `RTVI_EMBED_ENABLE_OTEL_MONITORING=true` → container `ENABLE_OTEL_MONITORING`).
See `references/integrate-vss-deploy-video-embedding.md` for the full integration contract.
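The host-to-container renames in the bullets above are easy to mix up when reading container logs. As a rough aid, the sketch below encodes only the four renames listed here; both function names are hypothetical, and defaults applied by the Compose file for unset variables are not modelled.

```bash
# Hypothetical helper: print the container-side name for a host variable,
# using only the renames listed in this section.
container_var_name() {
  case "$1" in
    RTVI_EMBED_KAFKA_ENABLED)          echo "KAFKA_ENABLED" ;;
    RTVI_EMBED_KAFKA_TOPIC)            echo "KAFKA_TOPIC" ;;
    RTVI_EMBED_ERROR_MESSAGE_TOPIC)    echo "ERROR_MESSAGE_TOPIC" ;;
    RTVI_EMBED_ENABLE_OTEL_MONITORING) echo "ENABLE_OTEL_MONITORING" ;;
    *) return 1 ;;   # not covered here; see references/environment.md
  esac
}

# Print NAME=value pairs under their container-side names for the host
# variables that are currently set and non-empty.
container_env_preview() {
  local host name
  for host in RTVI_EMBED_KAFKA_ENABLED RTVI_EMBED_KAFKA_TOPIC \
              RTVI_EMBED_ERROR_MESSAGE_TOPIC RTVI_EMBED_ENABLE_OTEL_MONITORING; do
    if [ -n "${!host:-}" ]; then
      name="$(container_var_name "$host")"
      printf '%s=%s\n' "$name" "${!host}"
    fi
  done
}
```

Running `container_env_preview` in the shell that will launch Compose gives a quick view of which optional peers are switched on.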
Troubleshooting
For common failure patterns and resolutions, see `references/troubleshooting.md`. Frequent issues:
- `/v1/ready` stuck at 503 → check for missing `NGC_API_KEY`, Hugging Face 429 rate-limit failures during the first-boot model download (set `HF_TOKEN` to avoid), or unreachable Redis/Kafka peers when those flags are enabled.
- Healthcheck flipping unhealthy in the first 20 minutes → restore `start_period: 1200s`.
- Permission errors on bind-mounted cache directories → `chown -R 1001:1001` on the host paths.
Upgrade And Rollback
- Update `RTVI_EMBED_IMAGE` and `RTVI_EMBED_TAG` to the target build.
- `docker compose -f rtvi-embed-docker-compose.yml pull rtvi-embed`.
- `docker compose -f rtvi-embed-docker-compose.yml --profile bp_developer_search_2d up -d rtvi-embed`.
- Watch `/v1/ready` until it returns 200.
- To roll back, re-pin `RTVI_EMBED_TAG` to the previous build and repeat. Named volumes persist across the swap.
Tear Down
```bash
# Preserve caches (named volumes survive).
docker compose -f rtvi-embed-docker-compose.yml down

# WARNING: removes rtvi-hf-cache, rtvi-ngc-model-cache, rtvi-triton-model-repo.
# Next start will re-download the model and rebuild the Triton repo (20+ min).
docker compose -f rtvi-embed-docker-compose.yml down -v
```
References
| File | When to read |
|---|---|
| references/README.md | Table of contents for all reference files. |
| references/deploy-vss-deploy-video-embedding.md | Build Vision Agent deployment reference: image, GPU, storage, startup, prerequisites, known issues. |
| references/integrate-vss-deploy-video-embedding.md | Build Vision Agent integration reference: peers, inputs/outputs, env vars, network, example Compose snippet. |
| references/rest-api.md | Full REST endpoint catalog with worked examples for file upload, video/text embeddings, live streams, and health/metrics. |
| references/environment.md | Complete environment-variable matrix, including host-to-container renames and secret-sensitive variables. |
| references/troubleshooting.md | Operational diagnostics for startup, model/cache, runtime, and observability issues. |