vss-frag


VSS Frag — Video Analysis with Enterprise RAG


Generate video summary reports using the VSS `video_search_frag` extension. This skill adds Enterprise RAG (Milvus) knowledge retrieval and guided human-in-the-loop (HITL) parameter collection on top of the base VSS agent.

Always run `curl` commands yourself; never instruct the user to run them.


Deploying the Frag Extension


The frag extension layers Enterprise RAG and HITL LVS tools on top of the base VSS agent image. Deployment is a two-step Docker build followed by compose up.

Environment variables: All commands use values from the `.env` file at `deployments/developer-workflow/dev-profile-lvs/.env`. Edit it before deploying. Key variables: `HOST_IP`, `VSS_AGENT_PORT` (default `8000`), `NGC_CLI_API_KEY`, `NVIDIA_API_KEY`, `ENTERPRISE_RAG_*`.


Step 1: Configure the .env file

```bash
nano deployments/developer-workflow/dev-profile-lvs/.env
```

Set at minimum:

- `HOST_IP`: your machine's IP (`hostname -I | awk '{print $1}'`)
- `NGC_CLI_API_KEY`: from https://ngc.nvidia.com/
- `NVIDIA_API_KEY`: from https://build.nvidia.com/
- `VSS_AGENT_CONFIG_FILE=./configs/video_search_frag/config.yml`
- `ENTERPRISE_RAG_VDB_ENDPOINT`: your Milvus endpoint (e.g., `tcp://127.0.0.1:19530`)
- `ENTERPRISE_RAG_COLLECTION_NAMES`: your Milvus collection name
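Before building, it can help to confirm that none of the minimum variables were left blank. The following is a minimal sketch rather than part of the VSS tooling: `parse_env` and `missing_keys` are hypothetical helpers, the parser only understands plain `KEY=VALUE` lines, and the script assumes it is run from the repository root.

```python
from pathlib import Path

# Keys the Step 1 list says to set at minimum.
REQUIRED_KEYS = [
    "HOST_IP",
    "NGC_CLI_API_KEY",
    "NVIDIA_API_KEY",
    "VSS_AGENT_CONFIG_FILE",
    "ENTERPRISE_RAG_VDB_ENDPOINT",
    "ENTERPRISE_RAG_COLLECTION_NAMES",
]


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines; comments and blank lines are ignored."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str) -> list:
    """Return required keys that are absent or empty, in list order."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    # Assumed location relative to the repository root.
    env_path = Path("deployments/developer-workflow/dev-profile-lvs/.env")
    if env_path.exists():
        print("Missing or empty:", missing_keys(env_path.read_text()) or "none")
    else:
        print(f"{env_path} not found; run this from the repository root")
```

The check only looks for presence; it does not verify that the Milvus endpoint is reachable or that the API keys are valid.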


Step 2: Log in to NGC registry

```bash
echo "$NGC_CLI_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
```

Step 3: Build the base agent image

```bash
cd agent
docker build -f docker/Dockerfile -t vss-agent-base .
```

Step 4: Build the frag extension image

```bash
docker compose \
  -f app/video_search_frag/docker-compose.yml \
  --env-file ../deployments/developer-workflow/dev-profile-lvs/.env \
  build
```

This produces `vss-agent-frag:latest`, the base agent extended with `video_search_frag` (Enterprise RAG, HITL LVS, PDF report generation).


Step 5: Deploy with docker compose

```bash
docker compose \
  -f app/video_search_frag/docker-compose.yml \
  -f ../deployments/agents/agent_ui/compose.yml \
  --env-file ../deployments/developer-workflow/dev-profile-lvs/.env \
  --profile bp_developer_lvs_2d \
  up -d
```

The two `-f` flags combine two compose files into a single deployment: the frag compose file defines `vss-agent`, and the UI compose file defines `metropolis-vss-ui`.


Step 6: Verify deployment

```bash
# Check containers are running
docker ps --format "table {{.Names}}\t{{.Status}}"

# Health check
curl -sf --max-time 5 "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/health" >/dev/null \
  && echo "VSS frag agent is running" \
  || echo "VSS frag agent is NOT reachable"
```
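The same health probe can be scripted when the result needs to feed later logic instead of being read off the terminal. This is a rough sketch using only the Python standard library; `health_url` and `is_healthy` are hypothetical names, and the `opener` parameter exists only so the function can be exercised without a running agent.

```python
import urllib.error
import urllib.request


def health_url(host_ip: str, port: int = 8000) -> str:
    """Build the /health URL; 8000 mirrors the VSS_AGENT_PORT default."""
    return f"http://{host_ip}:{port}/health"


def is_healthy(url: str, timeout_s: float = 5.0, opener=urllib.request.urlopen) -> bool:
    """Return True when the endpoint answers with a 2xx status in time."""
    try:
        with opener(url, timeout=timeout_s) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

Like `curl -f`, this treats HTTP error statuses as failures, since `urlopen` raises for 4xx and 5xx responses.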

Tear down

```bash
docker compose \
  -f app/video_search_frag/docker-compose.yml \
  -f ../deployments/agents/agent_ui/compose.yml \
  --env-file ../deployments/developer-workflow/dev-profile-lvs/.env \
  --profile bp_developer_lvs_2d \
  down
```

Rebuild after code changes

Always `down`, then rebuild and `up`; never run `up -d` alone after changes.

```bash
docker compose \
  -f app/video_search_frag/docker-compose.yml \
  --env-file ../deployments/developer-workflow/dev-profile-lvs/.env \
  build

docker compose \
  -f app/video_search_frag/docker-compose.yml \
  -f ../deployments/agents/agent_ui/compose.yml \
  --env-file ../deployments/developer-workflow/dev-profile-lvs/.env \
  --profile bp_developer_lvs_2d \
  down

docker compose \
  -f app/video_search_frag/docker-compose.yml \
  -f ../deployments/agents/agent_ui/compose.yml \
  --env-file ../deployments/developer-workflow/dev-profile-lvs/.env \
  --profile bp_developer_lvs_2d \
  up -d
```


When to Use

- User wants to generate a video summary or report using the frag pipeline
- User asks to analyze a video with Enterprise RAG knowledge context
- User mentions "frag", "enterprise RAG", or "knowledge-enhanced report"


When NOT to Use

- Simple video understanding queries (use the `video-understanding` skill)
- Direct LVS summarization without HITL (use the `video-summarization` skill)
- Deployment tasks (use the `deploy` skill)
- Real-time alerts (use the `alerts` skill)


Workflow: Generate an LVS Report with Enterprise RAG


Step 1: List available videos

```bash
curl -sS -X POST "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/v1/chat" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What videos are available?"}]}' | \
  python3 -c "import json,sys; d=json.load(sys.stdin); print(d['choices'][0]['message']['content'])"
```

Show the user the video list and ask which one they want to analyze.
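The inline `python3 -c` one-liner fails with a bare traceback if the agent returns an error body instead of a chat completion. A slightly more defensive version of the same extraction might look like the sketch below; `extract_reply` is a hypothetical helper, and the only response shape it relies on is the `choices[0].message.content` path used in the command above.

```python
import json


def extract_reply(raw: str) -> str:
    """Return choices[0].message.content, or raise ValueError with context."""
    try:
        payload = json.loads(raw)
        return payload["choices"][0]["message"]["content"]
    except (json.JSONDecodeError, KeyError, IndexError, TypeError) as exc:
        # Include the start of the body so the failure is diagnosable.
        raise ValueError(f"unexpected /v1/chat response: {raw[:200]!r}") from exc
```

Surfacing the first part of the raw body makes it easier to tell a down agent from a malformed request.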


Step 2: Collect parameters from the user


Ask the user for these four inputs one at a time:

1. Scenario: What type of scenario is the video about? Example: "warehouse monitoring", "traffic monitoring", "retail store activity"
2. Events: What events should be detected? Comma-separated. Example: "accident, forklift stuck, workers not wearing PPE, person entering restricted area"
3. Objects of Interest: What objects should the analysis focus on? Or "skip" to skip. Example: "forklifts, pallets, workers"
4. Enterprise RAG Query: An optional question to search the enterprise knowledge base for additional context to include in the report. Or "skip" to skip. Example: "What are the principles of STCC?"


Step 3: Start the report (HTTP HITL)

Send a POST to `/v1/chat`. This returns HTTP 202 with an `execution_id` and the first HITL prompt. Replace `VIDEO_NAME` with the chosen video:

```bash
curl -sS -X POST "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/v1/chat" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Generate a report for VIDEO_NAME using long video summarization"}]}'
```

The response contains:

- `execution_id`: save this, used in all subsequent requests
- `interaction_id`: identifies the current prompt
- `prompt.text`: the HITL prompt text
- `response_url`: the URL to POST the response to
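Since every later request depends on these four fields, it can be convenient to pull them into one object as soon as the 202 body arrives. The sketch below assumes that `prompt.text` denotes a nested `{"prompt": {"text": ...}}` object and that the other three fields sit at the top level of the JSON body; `HitlPrompt` and `parse_hitl_prompt` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class HitlPrompt:
    execution_id: str
    interaction_id: str
    text: str
    response_url: str


def parse_hitl_prompt(payload: dict) -> HitlPrompt:
    """Pull the four documented fields out of a decoded response body.

    Assumption: "prompt.text" is nested under a "prompt" object and the
    remaining fields are top-level keys.
    """
    return HitlPrompt(
        execution_id=payload["execution_id"],
        interaction_id=payload["interaction_id"],
        text=payload["prompt"]["text"],
        response_url=payload["response_url"],
    )
```

A missing field raises `KeyError` immediately, which is usually preferable to discovering it several prompts later.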


Step 4: Respond to HITL prompts

For each prompt, POST the user's parameter to the `response_url`. Replace `EXECUTION_ID`, `INTERACTION_ID`, and the text value:

```bash
curl -sS -X POST \
  "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/executions/EXECUTION_ID/interactions/INTERACTION_ID/response" \
  -H "Content-Type: application/json" \
  -d '{"response": {"type": "text", "text": "USER_VALUE_HERE"}}'
```

Then poll for the next prompt:

```bash
curl -sS "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/executions/EXECUTION_ID" | python3 -m json.tool
```

The HITL prompts come in this order:

1. Scenario: respond with the scenario from Step 2
2. Events: respond with the events from Step 2
3. Objects of Interest: respond with the objects from Step 2, or "skip"
4. Enterprise RAG Query: respond with the query from Step 2, or "skip"
5. Confirmation: respond with empty string "" to confirm and start processing

Repeat the POST-then-poll cycle for each prompt.
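The five answers and their request bodies can be prepared up front, which keeps the order and the "skip" defaults in one place. This is an illustrative sketch only: `hitl_payload` and `ordered_answers` are hypothetical helpers, and the body shape is the `{"response": {"type": "text", "text": ...}}` format used in the curl command above.

```python
import json

SKIP = "skip"


def hitl_payload(value: str) -> str:
    """Serialize one answer in the documented HITL response format."""
    return json.dumps({"response": {"type": "text", "text": value}})


def ordered_answers(scenario, events, objects=None, rag_query=None):
    """Return the five answers in prompt order; optional ones default to "skip"."""
    return [
        scenario,
        ", ".join(events),  # events are sent comma-separated
        ", ".join(objects) if objects else SKIP,
        rag_query or SKIP,
        "",  # empty string confirms and starts processing
    ]
```

Each element of `ordered_answers(...)` would be passed through `hitl_payload` and sent as the `-d` body of one POST, followed by one poll.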


Step 5: Wait for completion

After the confirmation prompt, the system processes the video, which takes roughly 3-5 minutes for a ~3.5 minute video. Keep polling until the status changes from "running" to "completed":

```bash
curl -sS "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/executions/EXECUTION_ID" | python3 -m json.tool
```

Tell the user to wait, and poll every 30 seconds.
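The wait can be expressed as a bounded polling loop so that a stuck execution does not poll forever. The sketch below makes several assumptions: `fetch_status` is a hypothetical callable that performs the GET on `/executions/EXECUTION_ID` and returns the decoded JSON, the status lives under a `"status"` key, and the 600-second ceiling is an arbitrary margin above the 3-5 minute estimate.

```python
import time
from typing import Callable


def wait_for_completion(
    fetch_status: Callable[[], dict],
    interval_s: float = 30.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the execution leaves "running" or the timeout elapses.

    Returns the first body whose status is not "running"; the caller should
    still check that it is "completed" rather than some other state.
    """
    waited = 0.0
    while True:
        body = fetch_status()
        if body.get("status") != "running":
            return body
        if waited >= timeout_s:
            raise TimeoutError(f"still running after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the loop testable without real delays; in normal use the default `time.sleep` applies the 30-second interval.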


Step 6: Present the results


When status is "completed", the response contains the full report with:

- Detected events with timestamps
- Narrative analysis summary
- Enterprise RAG context (if queried)
- PDF report download link (if available)

Present the report content to the user in a readable format.


Quick Commands


Health check

```bash
curl -sS "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/health"
```

Simple chat query (non-report)


For simple questions that do NOT involve report generation:

```bash
curl -sS -X POST "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/v1/chat" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "YOUR_QUESTION_HERE"}]}' | \
  python3 -c "import json,sys; d=json.load(sys.stdin); print(d['choices'][0]['message']['content'])"
```


Notes

- LVS reports take 3-5 minutes for a ~3.5 minute video; always tell the user to wait
- Enterprise RAG requires a Milvus vector database with data ingested
- If objects or `rag_query` are not needed, respond with "skip"
- The HITL response format is always: `{"response": {"type": "text", "text": "value"}}`
- `enable_interactive_extensions: true` must be set in the frag config for HTTP HITL to work