vss-frag

VSS Frag — Video Analysis with Enterprise RAG

Generate video summary reports using the VSS `video_search_frag` extension. This skill adds Enterprise RAG (Milvus) knowledge retrieval and guided human-in-the-loop (HITL) parameter collection on top of the base VSS agent.
Always run `curl` commands yourself; never instruct the user to run them.

Deploying the Frag Extension

The frag extension layers Enterprise RAG and HITL LVS (long video summarization) tools on top of the base VSS agent image. Deployment is a two-step Docker build followed by compose up.
Environment variables: All commands use values from the `.env` file at `deployments/developer-workflow/dev-profile-lvs/.env`. Edit it before deploying. Key variables: `HOST_IP`, `VSS_AGENT_PORT` (default `8000`), `NGC_CLI_API_KEY`, `NVIDIA_API_KEY`, `ENTERPRISE_RAG_*`.

Step 1: Configure the .env file

bash
nano deployments/developer-workflow/dev-profile-lvs/.env
Set at minimum:
  • `HOST_IP`: your machine's IP (`hostname -I | awk '{print $1}'`)
  • `NGC_CLI_API_KEY`: from https://ngc.nvidia.com/
  • `NVIDIA_API_KEY`: from https://build.nvidia.com/
  • `VSS_AGENT_CONFIG_FILE=./configs/video_search_frag/config.yml`
  • `ENTERPRISE_RAG_VDB_ENDPOINT`: your Milvus endpoint (e.g., `tcp://127.0.0.1:19530`)
  • `ENTERPRISE_RAG_COLLECTION_NAMES`: your Milvus collection name
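Before building, it can help to confirm that the `.env` file actually sets the variables listed above. The following is a minimal sketch, not part of the VSS tooling: the helper names `parse_env` and `missing_keys` are hypothetical, and it assumes plain `KEY=VALUE` lines (no `export` prefix, no multi-line values).

```python
# Variables this step says must be set (copied from the list above).
REQUIRED_KEYS = [
    "HOST_IP",
    "NGC_CLI_API_KEY",
    "NVIDIA_API_KEY",
    "VSS_AGENT_CONFIG_FILE",
    "ENTERPRISE_RAG_VDB_ENDPOINT",
    "ENTERPRISE_RAG_COLLECTION_NAMES",
]


def parse_env(text):
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes so KEY='abc' and KEY=abc compare equal.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or set to an empty value."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]
```

To use it, read `deployments/developer-workflow/dev-profile-lvs/.env` with `pathlib.Path(...).read_text()` and pass the text to `missing_keys`; an empty list means every required variable has a value.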

Step 2: Log in to NGC registry

bash
echo "$NGC_CLI_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin

Step 3: Build the base agent image

bash
cd agent
docker build -f docker/Dockerfile -t vss-agent-base .

Step 4: Build the frag extension image

bash
docker compose \
  -f app/video_search_frag/docker-compose.yml \
  --env-file ../deployments/developer-workflow/dev-profile-lvs/.env \
  build
This produces `vss-agent-frag:latest`: the base agent extended with `video_search_frag` (Enterprise RAG, HITL LVS, PDF report generation).

Step 5: Deploy with docker compose

bash
docker compose \
  -f app/video_search_frag/docker-compose.yml \
  -f ../deployments/agents/agent_ui/compose.yml \
  --env-file ../deployments/developer-workflow/dev-profile-lvs/.env \
  --profile bp_developer_lvs_2d \
  up -d
The command passes two `-f` flags: the frag compose file defines `vss-agent`, and the UI compose file defines `metropolis-vss-ui`. They merge into a single deployment.

Step 6: Verify deployment

bash
# Check containers are running
docker ps --format "table {{.Names}}\t{{.Status}}"

# Health check
curl -sf --max-time 5 "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/health" >/dev/null \
  && echo "VSS frag agent is running" \
  || echo "VSS frag agent is NOT reachable"

Tear down

bash
docker compose \
  -f app/video_search_frag/docker-compose.yml \
  -f ../deployments/agents/agent_ui/compose.yml \
  --env-file ../deployments/developer-workflow/dev-profile-lvs/.env \
  --profile bp_developer_lvs_2d \
  down

Rebuild after code changes

After code changes, always rebuild the image and run `down` before `up`; never run `up -d` alone.
bash
docker compose \
  -f app/video_search_frag/docker-compose.yml \
  --env-file ../deployments/developer-workflow/dev-profile-lvs/.env \
  build

docker compose \
  -f app/video_search_frag/docker-compose.yml \
  -f ../deployments/agents/agent_ui/compose.yml \
  --env-file ../deployments/developer-workflow/dev-profile-lvs/.env \
  --profile bp_developer_lvs_2d \
  down

docker compose \
  -f app/video_search_frag/docker-compose.yml \
  -f ../deployments/agents/agent_ui/compose.yml \
  --env-file ../deployments/developer-workflow/dev-profile-lvs/.env \
  --profile bp_developer_lvs_2d \
  up -d

When to Use

  • User wants to generate a video summary or report using the frag pipeline
  • User asks to analyze a video with Enterprise RAG knowledge context
  • User mentions "frag", "enterprise RAG", or "knowledge-enhanced report"

When NOT to Use

  • Simple video understanding queries (use `video-understanding` skill)
  • Direct LVS summarization without HITL (use `video-summarization` skill)
  • Deployment tasks (use `deploy` skill)
  • Real-time alerts (use `alerts` skill)

Workflow: Generate an LVS Report with Enterprise RAG

Step 1: List available videos

bash
curl -sS -X POST "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/v1/chat" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What videos are available?"}]}' | \
  python3 -c "import json,sys; d=json.load(sys.stdin); print(d['choices'][0]['message']['content'])"
Show the user the video list and ask which one they want to analyze.
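The one-liner above indexes `d['choices'][0]['message']['content']` directly, so an error body from the agent surfaces as a bare `KeyError` traceback. As a hedged sketch, a small helper (the name `extract_content` is hypothetical) can surface the raw body instead. It assumes only the response shape already used by the command above.

```python
import json


def extract_content(body):
    """Return choices[0].message.content from a /v1/chat response body.

    Raises ValueError that includes the raw body when the expected fields
    are missing, so an error response is visible to the caller.
    """
    try:
        data = json.loads(body)
        return data["choices"][0]["message"]["content"]
    except (json.JSONDecodeError, KeyError, IndexError, TypeError) as exc:
        raise ValueError(f"unexpected /v1/chat response: {body!r}") from exc
```

The same helper applies to any non-report chat query, since those use the same endpoint and response shape.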

Step 2: Collect parameters from the user

Ask the user for these four inputs one at a time:
  1. Scenario — What type of scenario is the video about? Example: "warehouse monitoring", "traffic monitoring", "retail store activity"
  2. Events — What events should be detected? Comma-separated. Example: "accident, forklift stuck, workers not wearing PPE, person entering restricted area"
  3. Objects of Interest — What objects should the analysis focus on? Or "skip" to skip. Example: "forklifts, pallets, workers"
  4. Enterprise RAG Query — An optional question to search the enterprise knowledge base for additional context to include in the report. Or "skip" to skip. Example: "What are the principles of STCC?"
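One way to hold the four inputs while collecting them is a small record that normalizes whitespace and fills in "skip" for the optional fields. This is only an illustrative sketch: the class name `ReportParams` and its field names are hypothetical, and treating Scenario and Events as required is inferred from the fact that only the last two inputs offer "skip".

```python
from dataclasses import dataclass


@dataclass
class ReportParams:
    """The four user inputs collected in this step."""
    scenario: str
    events: str               # comma-separated, e.g. "accident, forklift stuck"
    objects: str = "skip"     # optional; "skip" when the user has none
    rag_query: str = "skip"   # optional; "skip" when the user has none

    def __post_init__(self):
        self.scenario = self.scenario.strip()
        # Normalize "a ,b,, c" to "a, b, c" and drop empty entries.
        parts = [e.strip() for e in self.events.split(",")]
        self.events = ", ".join(p for p in parts if p)
        # Empty optional answers fall back to "skip".
        self.objects = self.objects.strip() or "skip"
        self.rag_query = self.rag_query.strip() or "skip"
        if not self.scenario or not self.events:
            raise ValueError("scenario and events are required")
```

For example, `ReportParams("warehouse monitoring", "accident, forklift stuck")` leaves both optional fields as "skip".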

Step 3: Start the report (HTTP HITL)

Send a POST to `/v1/chat`. This returns HTTP 202 with an `execution_id` and the first HITL prompt. Replace `VIDEO_NAME` with the chosen video:
bash
curl -sS -X POST "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/v1/chat" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Generate a report for VIDEO_NAME using long video summarization"}]}'
The response contains:
  • `execution_id`: save this; it is used in all subsequent requests
  • `interaction_id`: identifies the current prompt
  • `prompt.text`: the HITL prompt text
  • `response_url`: the URL to POST the response to
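The four fields above can be pulled out of the decoded JSON in one place. The sketch below is an assumption about layout, not a documented schema: it guesses that `execution_id`, `interaction_id`, and `response_url` sit at the top level and that the prompt text is nested as `prompt.text`, as the field names suggest. `HitlPrompt` and `parse_hitl_prompt` are hypothetical names.

```python
from typing import NamedTuple


class HitlPrompt(NamedTuple):
    execution_id: str
    interaction_id: str
    prompt_text: str
    response_url: str


def parse_hitl_prompt(data):
    """Extract the four documented fields from a decoded response dict.

    Assumed layout: execution_id, interaction_id and response_url at the
    top level, with the prompt text nested under prompt.text.
    """
    return HitlPrompt(
        execution_id=data["execution_id"],
        interaction_id=data["interaction_id"],
        prompt_text=data["prompt"]["text"],
        response_url=data["response_url"],
    )
```

If the real response nests these fields differently, only the four lookups inside `parse_hitl_prompt` need to change.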

Step 4: Respond to HITL prompts

For each prompt, POST the user's parameter to the response_url. Replace EXECUTION_ID, INTERACTION_ID, and the text value:
bash
curl -sS -X POST \
  "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/executions/EXECUTION_ID/interactions/INTERACTION_ID/response" \
  -H "Content-Type: application/json" \
  -d '{"response": {"type": "text", "text": "USER_VALUE_HERE"}}'
Then poll for the next prompt:
bash
curl -sS "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/executions/EXECUTION_ID" | python3 -m json.tool
The HITL prompts come in this order:
  1. Scenario — respond with the scenario from Step 2
  2. Events — respond with the events from Step 2
  3. Objects of Interest — respond with the objects from Step 2, or "skip"
  4. Enterprise RAG Query — respond with the query from Step 2, or "skip"
  5. Confirmation — respond with empty string "" to confirm and start processing
Repeat the POST-then-poll cycle for each prompt.
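User answers that contain quotes can break a hand-typed single-quoted `-d` string, so it may be safer to build the request body with a JSON encoder. The sketch below does that and also lists the five answers in the prompt order given above. The function names `hitl_request` and `answers_in_order` are hypothetical; the URL pattern and body format are the ones shown in this step.

```python
import json


def hitl_request(host, port, execution_id, interaction_id, text):
    """Return (url, body) for one HITL response POST."""
    url = (
        f"http://{host}:{port}/executions/{execution_id}"
        f"/interactions/{interaction_id}/response"
    )
    # json.dumps escapes quotes and backslashes in the user's text.
    body = json.dumps({"response": {"type": "text", "text": text}})
    return url, body


def answers_in_order(scenario, events, objects="skip", rag_query="skip"):
    """Return the five answers in the order the prompts arrive."""
    # The final empty string answers the confirmation prompt.
    return [scenario, events, objects, rag_query, ""]
```

The returned body can be passed to `curl` with `--data-binary @-` on standard input, or sent with any HTTP client using the `Content-Type: application/json` header.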

Step 5: Wait for completion

After the confirmation prompt, the system processes the video, which takes about 3-5 minutes for a ~3.5 minute video. Keep polling until the status changes from "running" to "completed":
bash
curl -sS "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/executions/EXECUTION_ID" | python3 -m json.tool
Tell the user to wait, and poll every 30 seconds.
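The polling loop can be sketched as a function that takes the status fetch as a parameter, which keeps it independent of how the request is made. Several things here are assumptions: the name `wait_for_completion`, a top-level `status` field in the execution response, and the 600-second timeout (chosen as roughly twice the upper estimate). The source names only the "running" and "completed" states, so any other status is simply polled until the timeout.

```python
import time


def wait_for_completion(fetch_status, interval=30.0, timeout=600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until the execution reports "completed"; return that response.

    fetch_status: zero-argument callable returning the decoded JSON of
    GET /executions/EXECUTION_ID (assumed to carry a top-level "status").
    """
    deadline = clock() + timeout
    while True:
        data = fetch_status()
        if data.get("status") == "completed":
            return data
        if clock() >= deadline:
            raise TimeoutError(
                f"still {data.get('status')!r} after {timeout}s"
            )
        sleep(interval)
```

In practice `fetch_status` would wrap the `curl` call above and decode its output; the `sleep` and `clock` parameters exist so the loop can be exercised without real waiting.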

Step 6: Present the results

When status is "completed", the response contains the full report with:
  • Detected events with timestamps
  • Narrative analysis summary
  • Enterprise RAG context (if queried)
  • PDF report download link (if available)
Present the report content to the user in a readable format.

Quick Commands

Health check

bash
curl -sS "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/health"

Simple chat query (non-report)

For simple questions that do NOT involve report generation:
bash
curl -sS -X POST "http://${HOST_IP}:${VSS_AGENT_PORT:-8000}/v1/chat" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "YOUR_QUESTION_HERE"}]}' | \
  python3 -c "import json,sys; d=json.load(sys.stdin); print(d['choices'][0]['message']['content'])"

Notes

  • LVS reports take 3-5 minutes for a ~3.5 minute video, so always tell the user to wait
  • Enterprise RAG requires a Milvus vector database with data ingested
  • If objects or `rag_query` are not needed, respond with "skip"
  • The HITL response format is always: `{"response": {"type": "text", "text": "value"}}`
  • `enable_interactive_extensions: true` must be set in the frag config for HTTP HITL to work
  • See also: `video-summarization`, `video-understanding`, `report`, `vios`, `deploy`