vss-summarize-video

Instructions

Follow the routing tables and step-by-step workflows below. Each section that ends in workflow, quick start, or flow is intended to be executed top-to-bottom. Detailed reference material lives in `references/` and helper scripts live in `scripts/`; call them via `run_script` when the skill points to a script by name.

Examples

Worked end-to-end examples are kept under `evals/` (each `*.json` manifest contains a runnable scenario) and inline in the per-workflow `curl` blocks below. Run a Tier-3 evaluation with `nv-base validate <this-skill-dir> --agent-eval` to replay them.

You are a video summarization assistant. You call the VLM NIM or the video summarization microservice directly. Always run `curl` commands yourself; never instruct the user to run them.

Primary video workflow query type: "Summarize this video." Direct video summarization API and service-ops requests are handled by the reference-routed sections below.

Purpose

Produce a single, polished narrative summary of one recorded video clip, with timestamped events when the LVS microservice path is reachable.

Do NOT use this skill for:
  • Live RTSP captioning: use `vss-deploy-dense-captioning`.
  • Incident-range or alert-window reports: use `vss-generate-video-report` Mode B.
  • Semantic search across the archive: use `vss-search-archive`.

Prerequisites

  • VSS `lvs` profile running on `$HOST_IP` (port 38111) OR a reachable VLM/RT-VLM endpoint as a fallback. The `vss-deploy-profile` skill brings these up.
  • Network reachability from the agent host to both endpoints; clip URLs from VIOS must be fetchable by the chosen backend.
  • `jq` and `curl` available on the agent host.

Limitations

  • Direct VLM fallback uses a single fixed prompt and cannot target scenario/events, so output quality is lower than the LVS path.
  • Remote VLM endpoints generally cannot reach `localhost`/private clip URLs.
  • One backend call per request; no parallel hedging or multi-pass summaries.

Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| `/v1/ready` returns 503 repeatedly | LVS service still warming up | Retry up to ~30 s as shown in Setup; if it never returns 200 the service may not be deployed |
| Empty `video_summary` and `events` | Clip does not contain the requested events | Re-run with broader `scenario` or different `events` |
| VLM returns `<think>` block | Cosmos Reason 2 reasoning mode | Strip everything up to `</think>` before rendering |
| Empty stdout from `curl /v1/ready` | Service legitimately returns 200 with empty body | Always check HTTP status with `-o /dev/null -w '%{http_code}'`, never inspect the body |

See `references/video-summarization-debugging.md` for deeper diagnostics.

Reference Map

Use these references only when the user asks for the relevant detail, or when the core workflow below needs deeper video summarization information:
  • Video summarization API details: `references/video-summarization-api.md` for `/v1/summarize`, `/summarize`, `/v1/generate_captions`, `/v1/stream_summarize`, health probes, `/models`, `/recommended_config`, `/metrics`, request fields, response shapes, and API gotchas.
  • Video summarization service configuration and ops: `references/video-summarization-deployment.md` for the VSS `lvs` profile, ports, required env vars, logs, status, dry-runs, teardown, model/backend swaps, Elasticsearch/Neo4j/ArangoDB backend selection, and service-level troubleshooting.
  • Extended video summarization ops references: `references/video-summarization-environment-variables.md`, `references/video-summarization-debugging.md`, and `assets/video-summarization.env.example`.

Load `video-summarization-api.md` only when you need a request field, response shape, or endpoint that is not already covered by the Step 2 LVS or fallback VLM example below, or when handling a direct video summarization API request. Load `video-summarization-deployment.md` only for deployment, configuration, or service operations.

Video Summarization API And Service Ops Requests

If the user asks to call or debug video summarization endpoints directly, answer from `references/video-summarization-api.md` instead of running the end-to-end video summarization workflow. Examples: list video summarization models, check readiness, get recommended chunking config, inspect metrics, explain a 422 response, or build a `/v1/summarize` request body.

If the user asks to configure, deploy, restart, tear down, or troubleshoot the video summarization service, prefer the `vss-deploy-profile` skill for full VSS profile deployment and use `references/video-summarization-deployment.md` for video summarization-specific service details.

Routing

Decide purely from video summarization service availability (probed in Setup → Availability checks below). Duration does not drive routing.

| `/v1/ready` | Backend | Endpoint |
| --- | --- | --- |
| HTTP 200 | LVS microservice with HITL | `POST ${LVS_BACKEND_URL}/v1/summarize` |
| Anything else | VLM / RT-VLM with the default prompt + fallback note | `POST ${VLM_BASE_URL}/v1/chat/completions` |

Fallback message when the LVS service is unreachable (copy verbatim above the summary):

> Note: Input video `<name>` is `<N>`s long. The video summarization service is not deployed, so this summary was produced by the VLM alone with a generic default prompt. Deploy the `lvs` profile for higher-quality summaries with scenario/events targeting.
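
Because the fallback note has to be reproduced verbatim with only the clip name and duration filled in, a small helper can keep the wording from drifting. The sketch below is illustrative only: the function name `fallback_note` and the sample file name are hypothetical and are not part of the skill.

```bash
# Hypothetical helper (not part of the skill): fills the <name> and <N>
# placeholders of the fallback note and leaves the rest of the text untouched.
fallback_note() {
  local name="$1" seconds="$2"
  printf 'Note: Input video %s is %ss long. The video summarization service is not deployed, so this summary was produced by the VLM alone with a generic default prompt. Deploy the lvs profile for higher-quality summaries with scenario/events targeting.\n' \
    "$name" "$seconds"
}

# Example with an assumed clip name and a duration in whole seconds.
fallback_note "warehouse_cam_01.mp4" 210
```

The note is printed on its own line so that it can be placed above the summary rather than mixed into it.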

Deployment prerequisite

The VSS lvs profile on `$HOST_IP` is the primary backend. If the `/v1/ready` probe (see Setup → Availability checks) returns anything other than 200 after the warmup retries, ask the user:

"The VSS `lvs` profile isn't running on `$HOST_IP`. Shall I deploy it now using the `/vss-deploy-profile` skill with `-p lvs`? Reply `no` to summarize with the VLM-only fallback instead (lower quality, no scenario/events targeting)."

  • Yes → hand off to `/vss-deploy-profile`, then re-probe and continue with Step 2 (LVS + HITL).
  • No → go straight to Step 2 fallback (VLM with default prompt) and prepend the Routing fallback note. Do not ask again, and do not run scenario/events HITL.
  • Pre-authorized to deploy autonomously (caller said so explicitly) → skip the confirmation and invoke `/vss-deploy-profile` directly.
  • Pre-authorized to use VLM fallback ("skip lvs, just use the VLM") → go straight to Step 2 fallback without prompting.


Setup

Endpoints (defaults for a local VSS `lvs` deployment):
  • VLM / RT-VLM: `${VLM_BASE_URL}`, default `${RTVI_VLM_BASE_URL:-http://${HOST_IP:-localhost}:8018}`
  • LVS service: `${LVS_BACKEND_URL}`, default `http://${HOST_IP:-localhost}:38111`
  • VIOS: owned by `vss-manage-video-io-storage`; refer there.

Use env vars when set (strip trailing `/v1` from the VLM base; the skill appends it). Otherwise use the defaults. If neither works, ask the user; do not scan ports or read config files to guess.

Model name: read `${VLM_NAME}` (default `nim_nvidia_cosmos-reason2-8b_0303-fp8-dynamic-kv8`). It must match the id RT-VLM `/v1/models` advertises; do not substitute the friendly `nvidia/cosmos-reason2-8b`.

For endpoint schemas, optional fields, response envelopes, and error handling, see `references/video-summarization-api.md`.
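
To make the model-id rule concrete, the sketch below checks whether a given id appears in a `/v1/models` response. It assumes the endpoint returns the OpenAI-style shape `{"data":[{"id":"..."}]}`, which this document does not confirm, and it uses a canned response rather than a live call. The function name `model_is_advertised` is hypothetical.

```bash
# Hypothetical helper (not part of the skill). Reads a /v1/models response on
# stdin and succeeds only if the given id is listed.
# Assumption: OpenAI-style response shape {"data":[{"id":"..."}]}.
model_is_advertised() {
  jq -e --arg id "$1" 'any(.data[]?; .id == $id)' >/dev/null
}

# Offline demonstration with a canned response instead of a live endpoint.
sample='{"data":[{"id":"nim_nvidia_cosmos-reason2-8b_0303-fp8-dynamic-kv8"}]}'
if printf '%s' "$sample" | model_is_advertised "nvidia/cosmos-reason2-8b"; then
  echo "friendly name accepted"
else
  echo "friendly name is not an advertised id"
fi
```

Against a live endpoint the same function could be fed from `curl -s "$VLM/v1/models"`, provided the response shape matches the assumption above.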
Availability checks (run both before routing). Readiness is determined by the HTTP status code only: the LVS `/v1/ready` may legitimately return `200` with an empty body, so do not inspect the body.

```bash
VLM="${VLM_BASE_URL:-${RTVI_VLM_BASE_URL:-http://${HOST_IP:-localhost}:8018}}"
VLM="${VLM%/v1}"

# VLM / RT-VLM: 200 on /v1/models
vlm_code=$(curl -s -o /dev/null -w '%{http_code}' --connect-timeout 3 \
  "$VLM/v1/models")
[ "$vlm_code" = "200" ] && echo "VLM OK" || echo "VLM not reachable (HTTP $vlm_code)"

# Video summarization service: 200 on /v1/ready, with retry on 503 (warmup) for up to ~30s
VIDEO_SUMMARIZATION_URL=${LVS_BACKEND_URL:-http://${HOST_IP:-localhost}:38111}
video_sum_code=000
for i in $(seq 1 10); do
  video_sum_code=$(curl -s -o /dev/null -w '%{http_code}' --connect-timeout 3 \
    "$VIDEO_SUMMARIZATION_URL/v1/ready")
  case "$video_sum_code" in
    200) echo "video summarization OK"; break ;;
    503) sleep 3 ;;   # warming up; keep polling
    *)   break ;;     # any other code = not reachable, stop retrying
  esac
done
[ "$video_sum_code" = "200" ] || echo "video summarization service not reachable (HTTP $video_sum_code)"
```

**How to interpret the results:**

- `video_sum_code = 200` → **Step 2 (LVS + HITL)** for every video.
- `video_sum_code != 200`, `vlm_code = 200` → **Step 2 fallback (VLM)**; prepend the Routing fallback note.
- `vlm_code != 200` → fail; at least one backend must be reachable.
- A non-200 LVS code after the retry loop is the ONLY signal of unavailability. Empty stdout or missing JSON fields are NOT "unavailable."

---

Step 1 - Get the clip URL via `vss-manage-video-io-storage` (sub-task, NOT the final answer)

Use the `vss-manage-video-io-storage` skill for all VIOS interactions: it owns the canonical curl recipes, parameter defaults, and delete/upload flows. Do not fabricate URLs or hand-roll VIOS calls; they will drift.

This step is a sub-task: do NOT end your turn here; do NOT return the clip URL as the final answer. From VIOS collect three values:
  1. `streamId` (via `sensor/list`, `sensor/<id>/streams`, or directly from an upload response).
  2. Timeline: `{startTime, endTime}` (ISO 8601 UTC). `endTime - startTime` is the duration; needed only for the user-facing header (routing is driven solely by `/v1/ready`).
  3. Temporary MP4 clip URL: the `/storage/file/<streamId>/url` variant with `container=mp4`. Response field: `.videoUrl`. Both backends need an HTTP(S) URL they can `GET`.

Everything else (auth, upload, `disableAudio`, expiry, etc.) lives in the `vss-manage-video-io-storage` skill; refer users there if VIOS fails.
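
As a sketch of the `endTime - startTime` arithmetic used for the header, the helper below converts both timestamps to epoch seconds and subtracts them. It assumes GNU `date` (the `-d` flag); BSD/macOS `date` needs a different invocation. The function name `clip_duration_seconds` and the sample timestamps are hypothetical.

```bash
# Hypothetical helper (not part of the skill): whole seconds between two
# ISO 8601 UTC timestamps. Assumption: GNU date, which accepts -d "<timestamp>".
clip_duration_seconds() {
  local start_epoch end_epoch
  start_epoch=$(date -u -d "$1" +%s) || return 1
  end_epoch=$(date -u -d "$2" +%s) || return 1
  echo $(( end_epoch - start_epoch ))
}

# Example with made-up timeline values three and a half minutes apart.
clip_duration_seconds "2024-01-01T00:00:00Z" "2024-01-01T00:03:30Z"
```

The result feeds only the user-facing header; it plays no part in routing.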

Step 2 - Primary: video summarization microservice with HITL

Use this path whenever `/v1/ready` returned 200 in Setup. Duration is irrelevant.

For advanced fields (`media_info`, `schema`, structured output, stream captioning, metrics, recommended config) see `references/video-summarization-api.md`.

HITL: collect scenario and events first (REQUIRED, do not skip)

Full walk-through is in `references/hitl-prompts.md`. Always run HITL before calling the LVS service.

Autonomous-mode defaults. When the caller has bypassed HITL ("run autonomously without prompting") AND the original query asks for `default` / `defaults` (or gives none), use `scenario="activity monitoring"` and `events=["notable activity"]` verbatim; do not infer from filename or sensor name. Note the defaults in the final reply and offer a re-run with more specific parameters. This is the ONLY supported HITL bypass; "the video is short" or "the user seems in a hurry" are not valid reasons.

Prefer `POST /v1/summarize` (3.2 GA route); `/summarize` is a compatibility alias.
```bash
VIDEO_SUMMARIZATION_URL=${LVS_BACKEND_URL:-http://${HOST_IP:-localhost}:38111}

# From HITL reply:
SCENARIO='warehouse monitoring'
EVENTS_JSON='["notable activity"]'
OBJECTS_JSON=''   # '' to omit, else '["forklifts","pallets","workers"]'

curl -s -X POST "$VIDEO_SUMMARIZATION_URL/v1/summarize" \
  -H "Content-Type: application/json" \
  -d "$(jq -n \
        --arg url "<clip_url_from_vss_manage_video_io_storage>" \
        --arg model "${VLM_NAME:-nim_nvidia_cosmos-reason2-8b_0303-fp8-dynamic-kv8}" \
        --arg scenario "$SCENARIO" \
        --argjson events "$EVENTS_JSON" \
        --argjson objects "${OBJECTS_JSON:-null}" \
        '{
          url: $url,
          model: $model,
          scenario: $scenario,
          events: $events,
          chunk_duration: 10,
          num_frames_per_second_or_fixed_frames_chunk: 20,
          use_fps_for_chunking: false,
          seed: 1
        } + (if $objects == null then {} else {objects_of_interest: $objects} end)')" \
  | jq -r '.choices[0].message.content' \
  | jq '{video_summary, events}'
```

If both `video_summary` and `events` are empty, the clip probably doesn't contain the requested events — re-run with broader `scenario`/`events`, don't report "no content".

**Tuning:** `chunk_duration` (default `10`s; `0` = single chunk),
`num_frames_per_second_or_fixed_frames_chunk` (default `20`; meaning depends
on `use_fps_for_chunking`), `seed` (default `1`). `num_frames_per_chunk` is
deprecated.

---

Step 2 fallback - VLM direct with default prompt

Use this path only when `/v1/ready` did not return 200 after warmup. Do NOT run HITL: the user did not opt in; you fell back because the service was missing. Prepend the Routing fallback note to the response.

```bash
VLM="${VLM_BASE_URL:-${RTVI_VLM_BASE_URL:-http://${HOST_IP:-localhost}:8018}}"
VLM="${VLM%/v1}"
PROMPT='Describe in detail what is happening in this video,
including all visible people, vehicles, equipments, objects,
actions, and environmental conditions.
OUTPUT REQUIREMENTS:
[timestamp-timestamp] Description of what is happening.
EXAMPLE:
[0.0s-4.0s] <description of the first event>
[4.0s-12.0s] <description of the second event>'

curl -s -X POST "$VLM/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d "$(jq -n \
        --arg model "${VLM_NAME:-nim_nvidia_cosmos-reason2-8b_0303-fp8-dynamic-kv8}" \
        --arg text "$PROMPT" \
        --arg url "<clip_url_from_vss_manage_video_io_storage>" \
        '{
          model: $model,
          temperature: 0.0,
          max_tokens: 1024,
          messages: [{
            role: "user",
            content: [
              {type: "text", text: $text},
              {type: "video_url", video_url: {url: $url}}
            ]
          }]
        }')" | jq -r '.choices[0].message.content'
```

Response: standard OpenAI chat-completion envelope. The summary is in `choices[0].message.content`.

Cosmos-model notes: Cosmos Reason 2 supports reasoning via `<think>...</think><answer>...</answer>` blocks. Omit the reasoning instructions if you want a plain summary. Frame sampling and pixel limits are applied server-side; no client-side prep is required when you pass a `video_url`.
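
If a reply does arrive with reasoning blocks, the text can be reduced to the answer before rendering. The sketch below is one possible way to do that with POSIX parameter expansion and `sed`; the function name `strip_reasoning` and the sample reply are hypothetical.

```bash
# Hypothetical helper (not part of the skill): reduce a Cosmos Reason 2 reply
# to the plain answer text.
strip_reasoning() {
  local content="$1"
  # Drop everything up to and including the first </think>, if present.
  case "$content" in
    *"</think>"*) content="${content#*</think>}" ;;
  esac
  # Remove the <answer> wrapper tags (keeping their contents), then delete
  # any blank lines left at the top.
  printf '%s\n' "$content" \
    | sed -e 's/<answer>//g' -e 's#</answer>##g' -e '/./,$!d'
}

# Example with a made-up reply.
strip_reasoning '<think>The clip shows a loading dock.</think>
<answer>
[0.0s-4.0s] A forklift enters the aisle.
</answer>'
```

Replies without any `<think>` block pass through unchanged apart from the tag removal.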

End-to-end example

See `references/end-to-end-example.md` for the full LVS-or-VLM-fallback script that probes `/v1/ready` and runs the appropriate path.

Responses

  • VLM returns an OpenAI chat-completion envelope; summary is `choices[0].message.content`.
  • LVS service returns the same envelope but `content` is a JSON string; run `jq -r '.choices[0].message.content' | jq` to reach `{video_summary, events}`.
  • Errors surface as HTTP non-2xx plus JSON `{error: ...}`. LVS `503` usually means warmup; retry `/v1/ready`.
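
The two-step unwrapping of an LVS response can also be done in a single `jq` pass using its `fromjson` builtin. The sketch below shows this against a canned envelope; the sample values are invented for illustration and are not real service output, and the function name `unwrap_lvs` is hypothetical.

```bash
# Hypothetical helper (not part of the skill): unwrap an LVS response read
# from stdin. fromjson parses the JSON string held in message.content.
unwrap_lvs() {
  jq '.choices[0].message.content | fromjson | {video_summary, events}'
}

# Canned envelope with illustrative values only.
sample='{"choices":[{"message":{"content":"{\"video_summary\":\"A forklift crosses the aisle.\",\"events\":[]}"}}]}'
printf '%s' "$sample" | unwrap_lvs
```

This is equivalent to piping through `jq` twice; either form ends with the `{video_summary, events}` object.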

Presenting the output to the user

Surface backend output with minimal transformation: do not paraphrase, re-voice, add emojis, or reformat. One backend call → one rendering: no parallel hedging, no duplicate headers, never call both LVS and VLM for the same video.

Header line. Start with exactly one:

`Summary of <video_name> (<duration>)`

`<duration>` = `Ns` for `< 60 s`, else `Mm Ss` (e.g. `3m 30s`).

LVS output: render `video_summary` verbatim (polished, tone-controlled report; rewriting loses fidelity). Render each `events` entry with its `start_time`, `end_time`, `type`, and full `description` verbatim (table when the client renders one cleanly, otherwise a per-event list). You MAY add a one-line header and a closing offer to re-run with different parameters.

VLM output: render `choices[0].message.content` verbatim. If the model produced `<think>…</think><answer>…</answer>` blocks, drop the `<think>` block and show the answer.

Fallback warning (when applicable) goes above the summary, never mixed into it.
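
The header rule can be sketched as two small functions, assuming the duration is available as a whole number of seconds. The names `format_duration` and `format_header` and the sample file name are hypothetical.

```bash
# Hypothetical helpers (not part of the skill).
# format_duration: "Ns" below 60 seconds, otherwise "Mm Ss".
format_duration() {
  local total="$1"
  if [ "$total" -lt 60 ]; then
    echo "${total}s"
  else
    echo "$(( total / 60 ))m $(( total % 60 ))s"
  fi
}

# format_header: the single header line placed above the summary.
format_header() {
  printf 'Summary of %s (%s)\n' "$1" "$(format_duration "$2")"
}

# Example with an assumed clip name and a 210-second duration.
format_header "warehouse_cam_01.mp4" 210
```

A 210-second clip yields `3m 30s`, matching the example given in the rule above.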

Tips

  • Route by service availability, not by duration. Probe `/v1/ready` once in Setup; HTTP 200 → LVS+HITL for every clip; anything else → VLM fallback.
  • HITL is mandatory on the LVS path. The `defaults` opt-in is the only sanctioned bypass. The VLM fallback path is silent (no HITL).
  • Readiness = HTTP 200 on `/v1/ready`. Nothing else. Body may be empty. Always use `curl -s -o /dev/null -w '%{http_code}'`; never pipe through `jq` / `grep` / `head`.
  • Delegate VIOS to `vss-manage-video-io-storage`: it is a sub-task; the final answer is the Step 2 summary, not the clip URL.
  • `jq` twice for LVS output. First unwraps the OpenAI envelope, second parses the JSON string inside `content`.
  • Prefer `/v1/summarize` for 3.2 GA; `/summarize` is a compatibility alias.
  • Use the exact VLM model id advertised by the endpoint (default `nim_nvidia_cosmos-reason2-8b_0303-fp8-dynamic-kv8`).
  • Render output verbatim: no paraphrasing, no reformatting, no rewriting the `video_summary` or `choices[0].message.content`.
  • One call, one render. No parallel hedging, no double renderings.

Cross-reference

  • `vss-deploy-profile`: bring up the `base` (VLM only) or `lvs` (VLM + video summarization service) profile
  • `vss-manage-video-io-storage` (VIOS API): upload videos, list streams, get clip URLs
  • `vss-search-archive`: semantic search across the archive (different profile)
  • `vss-query-analytics`: query incidents/events from Elasticsearch
  • Video summarization API reference: `references/video-summarization-api.md`
  • Video summarization service ops reference: `references/video-summarization-deployment.md`