Search Results: video-understanding

Found 7 Skills

watch-youtube

Watch and analyze YouTube videos using Gemini's video understanding API. Pass any YouTube URL to get summaries, timestamps, Q&A, or detailed analysis of video content — audio and visual.

🇺🇸|EnglishTranslated

1 scripts/Checked

AI & Machine Learningnvidia/skills

video-understanding

Call the vss agent to run video understanding on video to answer a text question. Use when the user asks about video content, or about visual details that cannot be answered from conversation history, search hits, or metadata alone.

🇺🇸|EnglishTranslated

AI & Machine Learningnvidia/skills

vss-ask-video

Use to ask the VSS agent's video_understanding tool a fresh visual question about a recorded clip. Not for prior tool output, search hits, or metadata-answerable questions.

🇺🇸|EnglishTranslated

AI & Machine Learningjordanrendric/claude-vide...

video-perception

Use when the user mentions a video file (.mp4, .mov, .avi, .mkv, .webm), a YouTube URL, asks to watch/analyze/review a video, or references video content in conversation

🇺🇸|EnglishTranslated

AI & Machine Learningqianwen-ai/qianwen-ai

qianwen-vision

[QianWen] Understand images and videos with Qwen vision models. TRIGGER when: user wants to analyze, describe, or extract information from images or videos, OCR text extraction, chart/table reading, visual reasoning, multi-image comparison, screenshot understanding, video comprehension, or explicitly invokes this skill by name (e.g. use qianwen-vision). DO NOT TRIGGER when: user wants to generate/create images (use qianwen-image-generation), generate videos (use qianwen-video-generation), text-only tasks without visual input, or non-Qwen vision tasks.

🇺🇸|EnglishTranslated

6 scripts/Checked

AI & Machine Learningnvidia/skills

report

Produce video analysis reports by discovering the deployed VSS agent, querying POST /generate for a timestamped captioned summary of the clip, then formatting the agent reply as the standard Video Analysis Report markdown.

🇺🇸|EnglishTranslated

AI & Machine Learningxsir0/xsir-skills

google-gemini-media

Use the Gemini API (Nano Banana image generation, Veo video, Gemini TTS speech and audio understanding) to deliver end-to-end multimodal media workflows and code templates for "generation + understanding".

🇺🇸|EnglishTranslated

19 scripts/Checked