VideoDB Skill
Perception + memory + actions for video, live streams, and desktop sessions.
When to use
Desktop Perception
- Start/stop a desktop session capturing screen, mic, and system audio
- Stream live context and store episodic session memory
- Run real-time alerts/triggers on what's spoken and what's happening on screen
- Produce session summaries, a searchable timeline, and playable evidence links
Video ingest + stream
- Ingest a file or URL and return a playable web stream link
- Transcode/normalize: codec, bitrate, fps, resolution, aspect ratio
Index + search (timestamps + evidence)
- Build visual, spoken, and keyword indexes
- Search and return exact moments with timestamps and playable evidence
- Auto-create clips from search results
Timeline editing + generation
- Subtitles: generate, translate, burn-in
- Overlays: text/image/branding, motion captions
- Audio: background music, voiceover, dubbing
- Programmatic composition and exports via timeline operations
Live streams (RTSP) + monitoring
- Connect RTSP/live feeds
- Run real-time visual and spoken understanding and emit events/alerts for monitoring workflows
How it works
Common inputs
- Local file path, public URL, or RTSP URL
- Desktop capture request: start / stop / summarize session
- Desired operations: get context for understanding, transcode spec, index spec, search query, clip ranges, timeline edits, alert rules
Common outputs
- Stream URL
- Search results with timestamps and evidence links
- Generated assets: subtitles, audio, images, clips
- Event/alert payloads for live streams
- Desktop session summaries and memory entries
Running Python code
Before running any VideoDB code, change to the project directory and load environment variables:

```python
from dotenv import load_dotenv
load_dotenv(".env")

import videodb
conn = videodb.connect()
```

This reads `VIDEO_DB_API_KEY` from:
- Environment (if already exported)
- Project's `.env` file in current directory

If the key is missing, `videodb.connect()` raises `AuthenticationError` automatically.
Do NOT write a script file when a short inline command works.
When writing inline Python (`python -c "..."`), always use properly formatted code: use semicolons to separate statements and keep it readable. For anything longer than ~3 statements, use a heredoc instead:

```bash
python << 'EOF'
from dotenv import load_dotenv
load_dotenv(".env")

import videodb
conn = videodb.connect()
coll = conn.get_collection()
print(f"Videos: {len(coll.get_videos())}")
EOF
```
Setup
When the user asks to "setup videodb" or similar:
1. Install SDK
```bash
pip install "videodb[capture]" python-dotenv
```

If `videodb[capture]` fails on Linux, install without the capture extra:

```bash
pip install videodb python-dotenv
```
2. Configure API key
The user must set `VIDEO_DB_API_KEY` using either method:
- Export in terminal (before starting Claude): `export VIDEO_DB_API_KEY=your-key`
- Project `.env` file: save `VIDEO_DB_API_KEY=your-key` in the project's `.env` file
Get a free API key at https://console.videodb.io (50 free uploads, no credit card).
Do NOT read, write, or handle the API key yourself. Always let the user set it.
Quick Reference
Upload media
```python
# URL
video = coll.upload(url="https://example.com/video.mp4")

# YouTube
video = coll.upload(url="https://www.youtube.com/watch?v=VIDEO_ID")

# Local file
video = coll.upload(file_path="/path/to/video.mp4")
```
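
Because `upload()` takes either `url=` or `file_path=` in the calls above, a small helper can choose the keyword from a source string. This is only a sketch: `upload_kwargs` is a hypothetical helper, not part of the VideoDB SDK, and it assumes that any source with an `http` or `https` scheme is a remote URL and everything else is a local path.

```python
from urllib.parse import urlparse


def upload_kwargs(source: str) -> dict:
    """Return the keyword argument for coll.upload() that matches `source`.

    Hypothetical helper (not part of the SDK): http(s) sources map to `url`,
    anything else is treated as a local path and maps to `file_path`.
    """
    scheme = urlparse(source).scheme.lower()
    if scheme in ("http", "https"):
        return {"url": source}
    return {"file_path": source}


# Usage, assuming `coll` is an existing collection:
# video = coll.upload(**upload_kwargs("https://example.com/video.mp4"))
```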
Transcript + subtitle
```python
# force=True skips the error if the video is already indexed
video.index_spoken_words(force=True)
text = video.get_transcript_text()
stream_url = video.add_subtitle()
```
Search inside videos
```python
from videodb.exceptions import InvalidRequestError

video.index_spoken_words(force=True)

# search() raises InvalidRequestError when no results are found.
# Always wrap in try/except and treat "No results found" as empty.
try:
    results = video.search("product demo")
    shots = results.get_shots()
    stream_url = results.compile()
except InvalidRequestError as e:
    if "No results found" in str(e):
        shots = []
    else:
        raise
```
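
The try/except pattern above tends to repeat at every search call, so it may be worth wrapping once. The sketch below is one way to do that; `shots_or_empty` is a hypothetical helper, and it takes the exception class as a parameter so the snippet runs without the SDK installed. In real use you would pass `InvalidRequestError`.

```python
def shots_or_empty(run_search, error_type=Exception):
    """Run a search callable and return its shots, or [] when nothing matched.

    `run_search` is any zero-argument callable returning an object with
    get_shots(); `error_type` is the exception class to inspect
    (videodb.exceptions.InvalidRequestError in real use).
    """
    try:
        results = run_search()
    except error_type as exc:
        if "No results found" in str(exc):
            return []
        raise  # any other failure is a real error and should surface
    return results.get_shots()


# Usage, assuming `video` has a spoken-word index:
# from videodb.exceptions import InvalidRequestError
# shots = shots_or_empty(lambda: video.search("product demo"), InvalidRequestError)
```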
Scene search
```python
import re
from videodb import SearchType, IndexType, SceneExtractionType
from videodb.exceptions import InvalidRequestError

# index_scenes() has no force parameter: it raises an error if a scene
# index already exists. Extract the existing index ID from the error.
try:
    scene_index_id = video.index_scenes(
        extraction_type=SceneExtractionType.shot_based,
        prompt="Describe the visual content in this scene.",
    )
except Exception as e:
    match = re.search(r"id\s+([a-f0-9]+)", str(e))
    if match:
        scene_index_id = match.group(1)
    else:
        raise

# Use score_threshold to filter low-relevance noise (recommended: 0.3+)
try:
    results = video.search(
        query="person writing on a whiteboard",
        search_type=SearchType.semantic,
        index_type=IndexType.scene,
        scene_index_id=scene_index_id,
        score_threshold=0.3,
    )
    shots = results.get_shots()
    stream_url = results.compile()
except InvalidRequestError as e:
    if "No results found" in str(e):
        shots = []
    else:
        raise
```
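
The ID-extraction step can be isolated and tested on its own. The sketch below is a hypothetical helper, and the sample message in the usage comment is an assumed wording, not the exact text the API returns. It also adds a word boundary (`\b`) to the pattern, since the unanchored `id\s+` can match the tail of a word such as "Invalid" in an unrelated error.

```python
import re
from typing import Optional

# \b keeps "id" from matching inside words such as "Invalid".
_INDEX_ID = re.compile(r"\bid\s+([a-f0-9]+)")


def extract_scene_index_id(message: str) -> Optional[str]:
    """Return the hex ID that follows the word "id" in `message`, if any."""
    match = _INDEX_ID.search(message)
    return match.group(1) if match else None


# Usage (the message text here is an assumed example):
# extract_scene_index_id("Scene index with id 1a2b3c already exists")
```

If the helper returns `None`, re-raising the original exception keeps unrelated failures visible.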
Timeline editing
Important: Always validate timestamps before building a timeline:
- `start` must be >= 0 (negative values are silently accepted but produce broken output)
- `start` must be < `end`
- `end` must be <= `video.length`

```python
from videodb.timeline import Timeline
from videodb.asset import VideoAsset, TextAsset, TextStyle

timeline = Timeline(conn)
timeline.add_inline(VideoAsset(asset_id=video.id, start=10, end=30))
timeline.add_overlay(0, TextAsset(text="The End", duration=3, style=TextStyle(fontsize=36)))
stream_url = timeline.generate_stream()
```
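
The three timestamp rules above can be checked in one place before any asset is created. This is a minimal sketch: `validate_clip_range` is a hypothetical helper, and it assumes all values are in seconds and that `video.length` can be converted with `float()`.

```python
def validate_clip_range(start: float, end: float, video_length: float) -> None:
    """Raise ValueError unless 0 <= start < end <= video_length (seconds)."""
    if start < 0:
        raise ValueError(f"start must be >= 0, got {start}")
    if start >= end:
        raise ValueError(f"start ({start}) must be < end ({end})")
    if end > video_length:
        raise ValueError(f"end ({end}) must be <= video length ({video_length})")


# Usage, assuming `video` is an uploaded video:
# validate_clip_range(10, 30, float(video.length))
# timeline.add_inline(VideoAsset(asset_id=video.id, start=10, end=30))
```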
Transcode video (resolution / quality change)
```python
from videodb import TranscodeMode, VideoConfig, AudioConfig

# Change resolution, quality, or aspect ratio server-side
job_id = conn.transcode(
    source="https://example.com/video.mp4",
    callback_url="https://example.com/webhook",
    mode=TranscodeMode.economy,
    video_config=VideoConfig(resolution=720, quality=23, aspect_ratio="16:9"),
    audio_config=AudioConfig(mute=False),
)
```
Reframe aspect ratio (for social platforms)
Warning: `reframe()` is a slow server-side operation. For long videos it can take several minutes and may time out. Best practices:
- Always limit to a short segment using `start`/`end` when possible
- For full-length videos, use `callback_url` for async processing
- Trim the video on a `Timeline` first, then reframe the shorter result

```python
from videodb import ReframeMode

# Always prefer reframing a short segment:
reframed = video.reframe(start=0, end=60, target="vertical", mode=ReframeMode.smart)

# Async reframe for full-length videos (returns None, result via webhook):
video.reframe(target="vertical", callback_url="https://example.com/webhook")

# Presets: "vertical" (9:16), "square" (1:1), "landscape" (16:9)
reframed = video.reframe(start=0, end=60, target="square")

# Custom dimensions
reframed = video.reframe(start=0, end=60, target={"width": 1280, "height": 720})
```
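
When a long video has to be reframed in short pieces, the `start`/`end` pairs can be computed up front. The sketch below is a hypothetical helper; the 60-second default simply mirrors the segment length used in the examples above and is not a documented limit.

```python
from typing import List, Tuple


def segment_ranges(length: float, max_len: float = 60.0) -> List[Tuple[float, float]]:
    """Split [0, length] into consecutive (start, end) ranges of at most max_len seconds."""
    if length <= 0 or max_len <= 0:
        raise ValueError("length and max_len must be positive")
    ranges = []
    start = 0.0
    while start < length:
        end = min(start + max_len, length)
        ranges.append((start, end))
        start = end
    return ranges


# Usage, assuming `video` is an uploaded video and float(video.length) is its duration:
# for start, end in segment_ranges(float(video.length)):
#     video.reframe(start=start, end=end, target="vertical")
```

Each range still triggers its own server-side reframe, so for full-length output the `callback_url` route may remain the better fit.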
Generative media
```python
image = coll.generate_image(
    prompt="a sunset over mountains",
    aspect_ratio="16:9",
)
```
Error handling
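```python
import videodb
from videodb.exceptions import AuthenticationError, InvalidRequestError

# A missing or invalid key surfaces when connecting
try:
    conn = videodb.connect()
except AuthenticationError:
    print("Check your VIDEO_DB_API_KEY")

# Upload problems (bad URL, unsupported media) surface as InvalidRequestError
try:
    coll = conn.get_collection()
    video = coll.upload(url="https://example.com/video.mp4")
except InvalidRequestError as e:
    print(f"Upload failed: {e}")
```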
Common pitfalls
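| Scenario | Error message | Solution |
|---|---|---|
| Indexing an already-indexed video | | Use `video.index_spoken_words(force=True)` |
| Scene index already exists | | Extract the existing `scene_index_id` from the error message |
| Search finds no matches | `InvalidRequestError` ("No results found") | Catch the exception and treat as empty results (`shots = []`) |
| Reframe times out | Blocks indefinitely on long videos | Use `start`/`end` to limit the segment, or `callback_url` for async processing |
| Negative timestamps on Timeline | Silently produces broken stream | Always validate `start >= 0` before creating a `VideoAsset` |
| | | Plan-gated features: inform the user about plan limits |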
Examples
Canonical prompts
- "Start desktop capture and alert when a password field appears."
- "Record my session and produce an actionable summary when it ends."
- "Ingest this file and return a playable stream link."
- "Index this folder and find every scene with people, return timestamps."
- "Generate subtitles, burn them in, and add light background music."
- "Connect this RTSP URL and alert when a person enters the zone."
Screen Recording (Desktop Capture)
Use `ws_listener.py` to capture WebSocket events during recording sessions. Desktop capture supports macOS only.
Quick Start
- Choose state dir: `STATE_DIR="${VIDEODB_EVENTS_DIR:-$HOME/.local/state/videodb}"`
- Start listener: `VIDEODB_EVENTS_DIR="$STATE_DIR" python scripts/ws_listener.py --clear "$STATE_DIR" &`
- Get WebSocket ID: `cat "$STATE_DIR/videodb_ws_id"`
- Run capture code (see reference/capture.md for the full workflow)
- Events written to: `$STATE_DIR/videodb_events.jsonl`

Use `--clear` whenever you start a fresh capture run so stale transcript and visual events do not leak into the new session.
Query Events
```python
import json
import os
import time
from pathlib import Path

events_dir = Path(os.environ.get("VIDEODB_EVENTS_DIR", Path.home() / ".local" / "state" / "videodb"))
events_file = events_dir / "videodb_events.jsonl"
events = []

# Read the JSONL log, skipping lines that are not valid JSON
if events_file.exists():
    with events_file.open(encoding="utf-8") as handle:
        for line in handle:
            try:
                events.append(json.loads(line))
            except json.JSONDecodeError:
                continue

# All transcript text captured so far
transcripts = [e["data"]["text"] for e in events if e.get("channel") == "transcript"]

# Visual-index events from the last 5 minutes
cutoff = time.time() - 300
recent_visual = [
    e for e in events
    if e.get("channel") == "visual_index" and e["unix_ts"] > cutoff
]
```
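
The same query can be packaged as two reusable functions. This is a sketch: `iter_events` and `recent_events` are hypothetical names, and the event shape (`channel`, `unix_ts`) is taken from the snippet above rather than from a documented schema. Events without a `unix_ts` are treated as old instead of raising `KeyError`.

```python
import json
import time
from pathlib import Path
from typing import Iterable, Iterator, List, Optional


def iter_events(path: Path) -> Iterator[dict]:
    """Yield parsed events from a JSONL file, skipping blank or malformed lines."""
    if not path.exists():
        return
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                continue
            if isinstance(event, dict):
                yield event


def recent_events(events: Iterable[dict], channel: str, window_s: float,
                  now: Optional[float] = None) -> List[dict]:
    """Return events on `channel` whose unix_ts is within the last `window_s` seconds."""
    cutoff = (time.time() if now is None else now) - window_s
    return [
        e for e in events
        if e.get("channel") == channel and e.get("unix_ts", 0) > cutoff
    ]
```

The optional `now` argument exists only to make the time window easy to test; in normal use it can be left out.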
Additional docs
Reference documentation is in the `reference/` directory adjacent to this SKILL.md file. Use the Glob tool to locate it if needed.
- reference/api-reference.md - Complete VideoDB Python SDK API reference
- reference/search.md - In-depth guide to video search (spoken word and scene-based)
- reference/editor.md - Timeline editing, assets, and composition
- reference/streaming.md - HLS streaming and instant playback
- reference/generative.md - AI-powered media generation (images, video, audio)
- reference/rtstream.md - Live stream ingestion workflow (RTSP/RTMP)
- reference/rtstream-reference.md - RTStream SDK methods and AI pipelines
- reference/capture.md - Desktop capture workflow
- reference/capture-reference.md - Capture SDK and WebSocket events
- reference/use-cases.md - Common video processing patterns and examples
Do not use ffmpeg, moviepy, or local encoding tools when VideoDB supports the operation. The following are all handled server-side by VideoDB — trimming, combining clips, overlaying audio or music, adding subtitles, text/image overlays, transcoding, resolution changes, aspect-ratio conversion, resizing for platform requirements, transcription, and media generation. Only fall back to local tools for operations listed under Limitations in reference/editor.md (transitions, speed changes, crop/zoom, colour grading, volume mixing).
When to use what
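| Problem | VideoDB solution |
|---|---|
| Platform rejects video aspect ratio or resolution | `video.reframe()` or `conn.transcode()` |
| Need to resize video for Twitter/Instagram/TikTok | `video.reframe()` with a preset `target` such as `"vertical"` or `"square"` |
| Need to change resolution (e.g. 1080p → 720p) | `conn.transcode()` with `VideoConfig(resolution=720)` |
| Need to overlay audio/music on video | Audio on a `Timeline` (see reference/editor.md) |
| Need to add subtitles | `video.add_subtitle()` |
| Need to combine/trim clips | `VideoAsset` clips on a `Timeline` |
| Need to generate voiceover, music, or SFX | Generative media calls (see reference/generative.md) |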
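
For the resizing rows above, a platform's required dimensions can be mapped to the nearest `reframe()` preset. The sketch below is a hypothetical helper built only on the three presets listed in the reframe section ("vertical" 9:16, "square" 1:1, "landscape" 16:9). It compares aspect ratios on a log scale so that being too wide and too tall by the same factor count equally.

```python
import math

# Aspect ratios (width / height) of the three reframe presets.
PRESET_RATIOS = {"vertical": 9 / 16, "square": 1.0, "landscape": 16 / 9}


def closest_preset(width: int, height: int) -> str:
    """Return the preset name whose aspect ratio is nearest to width:height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    ratio = width / height
    return min(
        PRESET_RATIOS,
        key=lambda name: abs(math.log(ratio / PRESET_RATIOS[name])),
    )


# Usage, assuming `video` is an uploaded video:
# reframed = video.reframe(start=0, end=60, target=closest_preset(1080, 1920))
```

When the platform needs exact dimensions rather than the nearest preset, the custom `target={"width": ..., "height": ...}` form is the more direct choice.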