Loading...
Loading...
Found 23 Skills
Process media files (video, audio, images, documents) using Transloadit. Use when asked to encode video to HLS/MP4, generate thumbnails, resize or watermark images, extract audio, concatenate clips, add subtitles, OCR documents, or run any media processing pipeline. Covers 86+ processing robots for file transformation at scale.
Process multimedia files with FFmpeg (video/audio encoding, conversion, streaming, filtering, hardware acceleration) and ImageMagick (image manipulation, format conversion, batch processing, effects, composition). Use when converting media formats, encoding videos with specific codecs (H.264, H.265, VP9), resizing/cropping images, extracting audio from video, applying filters and effects, optimizing file sizes, creating streaming manifests (HLS/DASH), generating thumbnails, batch processing images, creating composite images, or implementing media processing pipelines. Supports 100+ formats, hardware acceleration (NVENC, QSV), and complex filtergraphs.
Media processing utilities for images, audio, and video using FFmpeg and ImageMagick. Use when working with media conversion, optimization, or batch processing tasks.
Video/audio/image processing with FFmpeg and ImageMagick. Tools: FFmpeg (video/audio), ImageMagick (images). Capabilities: format conversion, encoding (H.264/H.265/VP9/AV1), streaming (HLS/DASH), filters, effects, thumbnails, watermarks, batch processing, hardware acceleration (NVENC/QSV). Actions: convert, encode, resize, crop, compress, extract, merge, stream, transcode media. Keywords: FFmpeg, ImageMagick, video encoding, audio extraction, image resize, thumbnail, watermark, HLS, DASH, H.264, H.265, VP9, AV1, codec, bitrate, framerate, resolution, aspect ratio, filter, overlay, concat, trim, fade, batch processing. Use when: converting video/audio formats, encoding with specific codecs, generating thumbnails, creating streaming manifests, extracting audio from video, batch processing images, adding watermarks, optimizing file sizes.
Create aesthetically beautiful interfaces following proven design principles. Use when building UI/UX, analyzing designs from inspiration sites, generating design images with ai-multimodal, implementing visual hierarchy and color theory, adding micro-interactions, or creating design documentation. Includes workflows for capturing and analyzing inspiration screenshots with chrome-devtools and ai-multimodal, iterative design image generation until aesthetic standards are met, and comprehensive design system guidance covering BEAUTIFUL (aesthetic principles), RIGHT (functionality/accessibility), SATISFYING (micro-interactions), and PEAK (storytelling) stages. Integrates with chrome-devtools, ai-multimodal, media-processing, ui-styling, and web-frameworks skills.
Visual design intelligence and UI aesthetics. Integrates: chrome-devtools, ai-multimodal, media-processing. Capabilities: design analysis, visual hierarchy, color theory, typography, micro-interactions, animation, design systems, accessibility. Actions: analyze, design, create, capture, evaluate, implement UI aesthetics. Keywords: Dribbble, Behance, Mobbin, design inspiration, visual hierarchy, color palette, typography, spacing, animation, micro-interaction, design system, style guide, accessibility, WCAG, contrast ratio, golden ratio, whitespace, visual rhythm. Use when: building beautiful UIs, analyzing design inspiration, implementing visual hierarchy, adding animations/micro-interactions, creating design systems, evaluating aesthetic quality, capturing design screenshots.
Generate FFmpeg commands from natural language video editing requests - cut, trim, convert, compress, change aspect ratio, extract audio, and more.
Manage Alibaba Cloud ApsaraVideo for Media Processing (MPS/MTS) resources and workflows via OpenAPI/SDK. Use for media ingest and metadata tasks, transcoding/snapshot jobs, pipeline/template/workflow operations, and MPS job troubleshooting.
Post-process video files and generate optimized GIFs. Converts webm/mp4 to GIF with configurable quality settings.
Process and generate multimedia content using Google Gemini API. Capabilities include analyze audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understand images (captioning, object detection, OCR, visual Q&A, segmentation), process videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extract from documents (PDF tables, forms, charts, diagrams, multi-page), generate images (text-to-image, editing, composition, refinement). Use when working with audio/video files, analyzing images or screenshots, processing PDF documents, extracting structured data from media, creating images from text prompts, or implementing multimodal AI features. Supports multiple models (Gemini 2.5/2.0) with context windows up to 2M tokens.
Upscale and enhance image and video resolution using AI
Manage Alibaba Cloud Intelligent Cloud Editing (ICE) media workflows via OpenAPI/SDK. Use for media processing jobs, template/workflow orchestration, editing and production pipelines, and job status troubleshooting.