vss-summarize-video
Instructions
Follow the routing tables and step-by-step workflows below. Each section that ends in workflow, quick start, or flow is intended to be executed top-to-bottom. Detailed reference material lives in `references/` and helper scripts live in `scripts/`; call them via `run_script` when the skill points to a script by name.
Examples
Worked end-to-end examples are kept under `evals/*.json` (each manifest contains a runnable scenario) and inline in the per-workflow `curl` blocks below. Run a Tier-3 evaluation with `nv-base validate <this-skill-dir> --agent-eval` to replay them.
You are a video summarization assistant. You call the VLM NIM or the video summarization microservice directly. Always run `curl` commands yourself; never instruct the user to run them.
Primary video workflow query type: "Summarize this video." Direct video summarization API and service-ops requests are handled by the reference-routed sections below.
Purpose
Produce a single, polished narrative summary of one recorded video clip, with
timestamped events when the LVS microservice path is reachable.
Do NOT use this skill for:
- Live RTSP captioning: use `vss-deploy-dense-captioning`.
- Incident-range or alert-window reports: use `vss-generate-video-report` Mode B.
- Semantic search across the archive: use `vss-search-archive`.
Prerequisites
- VSS `lvs` profile running on `$HOST_IP` (port 38111) OR a reachable VLM/RT-VLM endpoint as a fallback. The `vss-deploy-profile` skill brings these up.
- Network reachability from the agent host to both endpoints; clip URLs from VIOS must be fetchable by the chosen backend.
- `jq` and `curl` available on the agent host.
Limitations
- Direct VLM fallback uses a single fixed prompt and cannot target scenario/events, so output quality is lower than the LVS path.
- Remote VLM endpoints generally cannot reach `localhost`/private clip URLs.
- One backend call per request; no parallel hedging or multi-pass summaries.
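The `localhost`/private URL limitation can be checked before a clip URL is handed to a remote backend. The following is a minimal sketch of such a check; `is_private_clip_url` is a hypothetical helper (not one of the skill's scripts), and it only recognises `localhost`, the IPv4 loopback range, and the RFC 1918 private ranges. IPv6 literals and DNS names that resolve to private addresses are not handled.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 when the URL's host looks loopback/private.
# Heuristic only: IPv4 literals and "localhost"; no DNS resolution, no IPv6.
is_private_clip_url() {
  local url="$1" host
  host="${url#*://}"   # drop the scheme, if present
  host="${host%%/*}"   # drop the path
  host="${host##*@}"   # drop userinfo, if present
  host="${host%%:*}"   # drop the port
  case "$host" in
    localhost|127.*|10.*|192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   is_private_clip_url "$CLIP_URL" && echo "remote VLM may not reach this URL"
```

A positive result is only a hint that a remote endpoint may fail to fetch the clip; it does not change the routing rules described later.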
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| `/v1/ready` returns 503 | LVS service still warming up | Retry up to ~30 s as shown in Setup; if it never returns 200 the service may not be deployed |
| Empty `video_summary` and `events` | Clip does not contain the requested events | Re-run with broader `scenario`/`events` |
| VLM returns `<think>` blocks | Cosmos Reason 2 reasoning mode | Strip everything up to and including `</think>` before rendering |
| Empty stdout from `/v1/ready` | Service legitimately returns 200 with empty body | Always check HTTP status with `curl -s -o /dev/null -w '%{http_code}'` |

See `references/video-summarization-debugging.md` for deeper diagnostics.
Reference Map
Use these references only when the user asks for the relevant detail, or when
the core workflow below needs deeper video summarization information:
- video summarization API details: `references/video-summarization-api.md` for `/v1/summarize`, `/summarize`, `/v1/generate_captions`, `/v1/stream_summarize`, health probes, `/models`, `/recommended_config`, `/metrics`, request fields, response shapes, and API gotchas.
- video summarization service configuration and ops: `references/video-summarization-deployment.md` for the VSS `lvs` profile, ports, required env vars, logs, status, dry-runs, teardown, model/backend swaps, Elasticsearch/Neo4j/ArangoDB backend selection, and service-level troubleshooting.
- Extended video summarization ops references: `references/video-summarization-environment-variables.md`, `references/video-summarization-debugging.md`, and `assets/video-summarization.env.example`.

Load `video-summarization-api.md` only when you need a request field, response shape, or endpoint that is not already covered by the Step 2 LVS or fallback VLM example below, or when handling a direct video summarization API request. Load `video-summarization-deployment.md` only for deployment, configuration, or service operations.
Video Summarization API And Service Ops Requests
If the user asks to call or debug video summarization endpoints directly, answer from `references/video-summarization-api.md` instead of running the end-to-end video summarization workflow. Examples: list video summarization models, check readiness, get recommended chunking config, inspect metrics, explain a 422 response, or build a `/v1/summarize` request body.
If the user asks to configure, deploy, restart, tear down, or troubleshoot the video summarization service, prefer the `vss-deploy-profile` skill for full VSS profile deployment and use `references/video-summarization-deployment.md` for video summarization-specific service details.
Routing
Decide purely from video summarization service availability (probed in
Setup → Availability checks below). Duration does not drive routing.
| `/v1/ready` | Backend | Endpoint |
|---|---|---|
| HTTP 200 | LVS microservice with HITL | `POST /v1/summarize` |
| Anything else | VLM / RT-VLM with the default prompt + fallback note | `POST /v1/chat/completions` |

Fallback message when the LVS service is unreachable (copy verbatim above the summary):
⚠ Note: Input video `<name>` is `<N>`s long. The video summarization service is not deployed, so this summary was produced by the VLM alone with a generic default prompt. Deploy the `lvs` profile for higher-quality summaries with scenario/events targeting.
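Because the fallback note must be copied verbatim with only the clip name and duration substituted, it can help to generate it from one template. A minimal sketch, assuming the two placeholders are the clip name and its length in whole seconds; `fallback_note` is a hypothetical helper name, not part of the skill.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the fallback note for a clip.
#   $1 = clip name, $2 = clip length in seconds (assumed to be an integer)
fallback_note() {
  local name="$1" seconds="$2"
  printf '⚠ Note: Input video `%s` is `%s`s long. ' "$name" "$seconds"
  printf 'The video summarization service is not deployed, so this summary '
  printf 'was produced by the VLM alone with a generic default prompt. '
  printf 'Deploy the `lvs` profile for higher-quality summaries with '
  printf 'scenario/events targeting.\n'
}

# Usage:
#   fallback_note "warehouse.mp4" 42
```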
Deployment prerequisite
The VSS lvs profile on `$HOST_IP` is the primary backend. If the `/v1/ready` probe (see Setup → Availability checks) returns anything other than 200 after the warmup retries, ask the user:
"The VSS `lvs` profile isn't running on `$HOST_IP`. Shall I deploy it now using the `/vss-deploy-profile` skill with `-p lvs`? Reply `no` to summarize with the VLM-only fallback instead (lower quality, no scenario/events targeting)."
- Yes → hand off to `/vss-deploy-profile`, then re-probe and continue with Step 2 (LVS + HITL).
- No → go straight to Step 2 fallback (VLM with default prompt) and prepend the Routing fallback note. Do not ask again, and do not run scenario/events HITL.
- Pre-authorized to deploy autonomously (caller said so explicitly) → skip the confirmation and invoke `/vss-deploy-profile` directly.
- Pre-authorized to use VLM fallback ("skip lvs, just use the VLM") → go straight to Step 2 fallback without prompting.
Setup
Endpoints (defaults for a local VSS `lvs` deployment):
- VLM / RT-VLM: `${VLM_BASE_URL}`, default `${RTVI_VLM_BASE_URL:-http://${HOST_IP:-localhost}:8018}`
- LVS service: `${LVS_BACKEND_URL}`, default `http://${HOST_IP:-localhost}:38111`
- VIOS: owned by `vss-manage-video-io-storage`; refer there.

Use env vars when set (strip a trailing `/v1` from the VLM base; the skill appends it). Otherwise use the defaults. If neither works, ask the user; do not scan ports or read config files to guess.
Model name: read `${VLM_NAME}` (default `nim_nvidia_cosmos-reason2-8b_0303-fp8-dynamic-kv8`). It must match the id RT-VLM `/v1/models` advertises; do not substitute the friendly `nvidia/cosmos-reason2-8b`.
For endpoint schemas, optional fields, response envelopes, and error handling, see `references/video-summarization-api.md`.
Availability checks (run both before routing). Readiness is determined by the HTTP status code only: the LVS `/v1/ready` may legitimately return `200` with an empty body, so do not inspect the body.
```bash
VLM="${VLM_BASE_URL:-${RTVI_VLM_BASE_URL:-http://${HOST_IP:-localhost}:8018}}"
VLM="${VLM%/v1}"

# VLM / RT-VLM: 200 on /v1/models
vlm_code=$(curl -s -o /dev/null -w '%{http_code}' --connect-timeout 3 \
  "$VLM/v1/models")
[ "$vlm_code" = "200" ] && echo "VLM OK" || echo "VLM not reachable (HTTP $vlm_code)"

# Video summarization service: 200 on /v1/ready, with retry on 503 (warmup) for up to ~30s
VIDEO_SUMMARIZATION_URL=${LVS_BACKEND_URL:-http://${HOST_IP:-localhost}:38111}
video_sum_code=000
for i in $(seq 1 10); do
  video_sum_code=$(curl -s -o /dev/null -w '%{http_code}' --connect-timeout 3 "$VIDEO_SUMMARIZATION_URL/v1/ready")
  case "$video_sum_code" in
    200) echo "video summarization OK"; break ;;
    503) sleep 3 ;;   # warming up; keep polling
    *)   break ;;     # any other code = not reachable, stop retrying
  esac
done
[ "$video_sum_code" = "200" ] || echo "video summarization service not reachable (HTTP $video_sum_code)"
```
**How to interpret the results:**
- `video_sum_code = 200` → **Step 2 (LVS + HITL)** for every video.
- `video_sum_code != 200`, `vlm_code = 200` → **Step 2 fallback (VLM)**; prepend the Routing fallback note.
- `vlm_code != 200` → fail; at least one backend must be reachable.
- A non-200 LVS code after the retry loop is the ONLY signal of unavailability. Empty stdout or missing JSON fields are NOT "unavailable."
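The interpretation rules above amount to a small decision function over the two status codes. A minimal sketch, assuming the rules are evaluated in the order listed (an LVS 200 wins first, then a VLM 200, otherwise failure); `choose_backend` and its output labels are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps the two probe results to a route label.
#   $1 = HTTP code from the LVS /v1/ready probe (after the retry loop)
#   $2 = HTTP code from the VLM /v1/models probe
choose_backend() {
  local video_sum_code="$1" vlm_code="$2"
  if [ "$video_sum_code" = "200" ]; then
    echo "lvs"            # Step 2 (LVS + HITL)
  elif [ "$vlm_code" = "200" ]; then
    echo "vlm-fallback"   # Step 2 fallback; prepend the fallback note
  else
    echo "none"           # no backend reachable
    return 1
  fi
}

# Usage:
#   route=$(choose_backend "$video_sum_code" "$vlm_code") || echo "no backend reachable"
```

Only the status codes are inspected, which matches the rule that an empty body or missing JSON fields never count as "unavailable".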
---
Step 1 - Get the clip URL via vss-manage-video-io-storage (sub-task, NOT the final answer)
Use the `vss-manage-video-io-storage` skill for all VIOS interactions; it owns the canonical curl recipes, parameter defaults, and delete/upload flows. Do not fabricate URLs or hand-roll VIOS calls; they will drift.
This step is a sub-task: do NOT end your turn here; do NOT return the clip URL as the final answer. From VIOS collect three values:
- `streamId` (via `sensor/list` → `sensor/<id>/streams`, or directly from an upload response).
- Timeline - `{startTime, endTime}` (ISO 8601 UTC). `endTime - startTime` is the duration; needed only for the user-facing header (routing is driven solely by `/v1/ready`).
- Temporary MP4 clip URL: the `/storage/file/<streamId>/url` variant with `container=mp4`. Response field: `.videoUrl`. Both backends need an HTTP(S) URL they can `GET`.

Everything else (auth, upload, `disableAudio`, expiry, etc.) lives in the `vss-manage-video-io-storage` skill; refer users there if VIOS fails.
Step 2 - Primary: video summarization microservice with HITL
Use this path whenever `/v1/ready` returned 200 in Setup. Duration is irrelevant.
For advanced fields (`media_info`, `schema`, structured output, stream captioning, metrics, recommended config) see `references/video-summarization-api.md`.
HITL: collect scenario and events first (REQUIRED, do not skip)
Full walk-through is in `references/hitl-prompts.md`. Always run HITL before calling the LVS service.
Autonomous-mode defaults. When the caller has bypassed HITL ("run autonomously without prompting") AND the original query asks for `default` / `defaults` (or gives none), use `scenario="activity monitoring"` and `events=["notable activity"]` verbatim; do not infer from filename or sensor name. Note the defaults in the final reply and offer a re-run with more specific parameters. This is the ONLY supported HITL bypass; "the video is short" or "the user seems in a hurry" are not valid reasons.
Prefer `POST /v1/summarize` (3.2 GA route); `/summarize` is a compatibility alias.
```bash
VIDEO_SUMMARIZATION_URL=${LVS_BACKEND_URL:-http://${HOST_IP:-localhost}:38111}

# From HITL reply:
SCENARIO='warehouse monitoring'
EVENTS_JSON='["notable activity"]'
OBJECTS_JSON=''   # '' to omit, else '["forklifts","pallets","workers"]'

curl -s -X POST "$VIDEO_SUMMARIZATION_URL/v1/summarize" \
  -H "Content-Type: application/json" \
  -d "$(jq -n \
    --arg url "<clip_url_from_vss_manage_video_io_storage>" \
    --arg model "${VLM_NAME:-nim_nvidia_cosmos-reason2-8b_0303-fp8-dynamic-kv8}" \
    --arg scenario "$SCENARIO" \
    --argjson events "$EVENTS_JSON" \
    --argjson objects "${OBJECTS_JSON:-null}" \
    '{
      url: $url,
      model: $model,
      scenario: $scenario,
      events: $events,
      chunk_duration: 10,
      num_frames_per_second_or_fixed_frames_chunk: 20,
      use_fps_for_chunking: false,
      seed: 1
    } + (if $objects == null then {} else {objects_of_interest: $objects} end)')" \
  | jq -r '.choices[0].message.content' \
  | jq '{video_summary, events}'
```
If both `video_summary` and `events` are empty, the clip probably doesn't contain the requested events — re-run with broader `scenario`/`events`, don't report "no content".
**Tuning:** `chunk_duration` (default `10`s; `0` = single chunk),
`num_frames_per_second_or_fixed_frames_chunk` (default `20`; meaning depends
on `use_fps_for_chunking`), `seed` (default `1`). `num_frames_per_chunk` is
deprecated.
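The autonomous-mode defaults for `scenario` and `events` can be applied mechanically before the request body is built. A minimal sketch, assuming the caller has already bypassed HITL explicitly; `resolve_hitl_params` is a hypothetical helper that sets the same `SCENARIO` and `EVENTS_JSON` variables the request example uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fills in the documented defaults when the caller gave
# no value or asked for "default"/"defaults". Only valid when HITL has been
# explicitly bypassed; it never infers values from file or sensor names.
#   $1 = requested scenario (may be empty)
#   $2 = requested events as a JSON array string (may be empty)
resolve_hitl_params() {
  local scenario="$1" events_json="$2"
  case "$scenario" in
    ''|default|defaults) scenario='activity monitoring' ;;
  esac
  case "$events_json" in
    ''|default|defaults) events_json='["notable activity"]' ;;
  esac
  SCENARIO="$scenario"
  EVENTS_JSON="$events_json"
}

# Usage:
#   resolve_hitl_params "" "defaults"
#   echo "$SCENARIO / $EVENTS_JSON"
```

When the defaults are used, the final reply should still say so and offer a re-run with more specific parameters.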
---
Step 2 fallback - VLM direct with default prompt
Use this path only when `/v1/ready` did not return 200 after warmup. Do NOT run HITL: the user did not opt in; you fell back because the service was missing. Prepend the Routing fallback note to the response.
```bash
VLM="${VLM_BASE_URL:-${RTVI_VLM_BASE_URL:-http://${HOST_IP:-localhost}:8018}}"
VLM="${VLM%/v1}"
PROMPT='Describe in detail what is happening in this video,
including all visible people, vehicles, equipments, objects,
actions, and environmental conditions.
OUTPUT REQUIREMENTS:
[timestamp-timestamp] Description of what is happening.
EXAMPLE:
[0.0s-4.0s] <description of the first event>
[4.0s-12.0s] <description of the second event>'
curl -s -X POST "$VLM/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d "$(jq -n \
    --arg model "${VLM_NAME:-nim_nvidia_cosmos-reason2-8b_0303-fp8-dynamic-kv8}" \
    --arg text "$PROMPT" \
    --arg url "<clip_url_from_vss_manage_video_io_storage>" \
    '{
      model: $model,
      temperature: 0.0,
      max_tokens: 1024,
      messages: [{
        role: "user",
        content: [
          {type: "text", text: $text},
          {type: "video_url", video_url: {url: $url}}
        ]
      }]
    }')" | jq -r '.choices[0].message.content'
```
Response: standard OpenAI chat-completion envelope. The summary is in `choices[0].message.content`.
Cosmos-model notes: Cosmos Reason 2 supports reasoning via `<think>...</think><answer>...</answer>` blocks. Omit the reasoning instructions if you want a plain summary. Frame sampling and pixel limits are applied server-side; no client-side prep is required when you pass a `video_url`.
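When the model does emit reasoning blocks, only the answer portion should reach the user. A minimal sketch of that clean-up in plain bash, assuming the output follows the `<think>...</think><answer>...</answer>` layout; `strip_reasoning` is a hypothetical helper name, and text without those tags passes through unchanged apart from whitespace trimming.

```shell
#!/usr/bin/env bash
# Hypothetical helper: drops a leading <think>...</think> block and unwraps
# <answer>...</answer> if present. Input is passed as a single argument.
strip_reasoning() {
  local text="$1"
  case "$text" in
    *'</think>'*) text="${text#*</think>}" ;;   # remove through first </think>
  esac
  text="${text#*<answer>}"    # no-op when there is no <answer> tag
  text="${text%</answer>*}"   # no-op when there is no </answer> tag
  # Trim leading and trailing whitespace.
  text="${text#"${text%%[![:space:]]*}"}"
  text="${text%"${text##*[![:space:]]}"}"
  printf '%s\n' "$text"
}

# Usage:
#   summary=$(strip_reasoning "$raw_vlm_content")
```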
End-to-end example
See `references/end-to-end-example.md` for the full LVS-or-VLM-fallback script that probes `/v1/ready` and runs the appropriate path.
Responses
- VLM returns an OpenAI chat-completion envelope; summary is `choices[0].message.content`.
- LVS service returns the same envelope but `content` is a JSON string; run `jq -r '.choices[0].message.content' | jq` to reach `{video_summary, events}`.
- Errors surface as HTTP non-2xx plus JSON `{error: ...}`. LVS `503` usually means warmup; retry `/v1/ready`.
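The double decode for LVS responses is easier to see on a concrete payload. The sketch below uses a made-up sample envelope (its field values are illustrative, not real service output) and a hypothetical `decode_lvs_content` helper; it assumes `jq` is installed, as the prerequisites require.

```shell
#!/usr/bin/env bash
# Illustrative envelope only: the inner "content" is itself a JSON string.
sample_response='{"choices":[{"message":{"content":"{\"video_summary\":\"A forklift crosses the aisle.\",\"events\":[]}"}}]}'

# Hypothetical helper: reads an LVS response on stdin.
# First jq unwraps the chat-completion envelope (-r emits the raw string),
# second jq parses that string and keeps the two fields of interest.
decode_lvs_content() {
  jq -r '.choices[0].message.content' | jq -c '{video_summary, events}'
}

# Usage:
#   printf '%s' "$sample_response" | decode_lvs_content
```

A single invocation using jq's `fromjson` filter would be an equivalent alternative; the two-step form mirrors the pipeline shown in the request example.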
Presenting the output to the user
Surface backend output with minimal transformation: do not paraphrase, re-voice, add emojis, or reformat. One backend call → one rendering: no parallel hedging, no duplicate headers, never call both LVS and VLM for the same video.
Header line. Start with exactly one:
`Summary of <video_name> (<duration>)`
`<duration>`: `Ns` when `< 60 s`, otherwise `Mm Ss` (for example `3m 30s`).
LVS output: render `video_summary` verbatim (polished, tone-controlled report; rewriting loses fidelity). Render each `events` entry with its `start_time`, `end_time`, `type`, and full `description` verbatim (table when the client renders one cleanly, otherwise a per-event list). You MAY add a one-line header and a closing offer to re-run with different parameters.
VLM output: render `choices[0].message.content` verbatim. If the model produced `<think>…</think><answer>…</answer>` blocks, drop the `<think>` block and show the answer.
Fallback warning (when applicable) goes above the summary, never mixed into it.
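The header's duration format can be produced from a whole number of seconds. A minimal sketch, assuming the duration has already been computed as an integer and that clips of 60 s or more always use the minutes-and-seconds form; `format_duration` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: formats whole seconds as "Ns" (under a minute)
# or "Mm Ss" (one minute or more).
format_duration() {
  local total="$1"
  if [ "$total" -lt 60 ]; then
    printf '%ss\n' "$total"
  else
    printf '%sm %ss\n' "$((total / 60))" "$((total % 60))"
  fi
}

# Usage:
#   printf 'Summary of %s (%s)\n' "warehouse.mp4" "$(format_duration 210)"
```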
Tips
- Route by service availability, not by duration. Probe `/v1/ready` once in Setup; HTTP 200 → LVS+HITL for every clip; anything else → VLM fallback.
- HITL is mandatory on the LVS path. The `defaults` opt-in is the only sanctioned bypass. The VLM fallback path is silent (no HITL).
- Readiness = HTTP 200 on `/v1/ready`. Nothing else. Body may be empty. Always use `curl -s -o /dev/null -w '%{http_code}'`; never pipe through `jq`/`grep`/`head`.
- Delegate VIOS to `vss-manage-video-io-storage`. It is a sub-task; the final answer is the Step 2 summary, not the clip URL.
- `jq` twice for LVS output. First unwraps the OpenAI envelope, second parses the JSON string inside `content`.
- Prefer `/v1/summarize` for 3.2 GA; `/summarize` is a compatibility alias.
- Use the exact VLM model id advertised by the endpoint (default `nim_nvidia_cosmos-reason2-8b_0303-fp8-dynamic-kv8`).
- Render output verbatim: no paraphrasing, no reformatting, no rewriting the `video_summary` or `choices[0].message.content`.
- One call, one render. No parallel hedging, no double renderings.
Cross-reference
- vss-deploy-profile: bring up the `base` (VLM only) or `lvs` (VLM + video summarization service) profile
- vss-manage-video-io-storage (VIOS API): upload videos, list streams, get clip URLs
- vss-search-archive: semantic search across the archive (different profile)
- vss-query-analytics: query incidents/events from Elasticsearch
- video summarization API reference: `references/video-summarization-api.md`
- video summarization service ops reference: `references/video-summarization-deployment.md`
bump:1