Found 43 Skills
MiniMax multimodal skill: generate speech, music, video, and images with MiniMax AI models. Covers TTS (text-to-speech, voice cloning, voice design, multi-segment), music (songs, instrumentals), video (text-to-video, image-to-video, start-end frame, subject reference, templates, long-form multi-scene), image (text-to-image, image-to-image with character reference), and media processing (convert, concat, trim, extract). Use when the user mentions MiniMax, multimodal generation, MiniMax APIs, or FFmpeg workflows alongside MiniMax outputs, or wants speech/music/video/image AI.
Generate HeyGen presenter videos via the v3 Video Agent pipeline — handles Frame Check (aspect ratio correction), prompt engineering, avatar resolution, and voice selection. Required for any HeyGen video generation. Replaces deprecated endpoints with v3. Use when: (1) generating any HeyGen video (via API or otherwise), (2) sending a personalized video message (outreach, update, announcement, pitch, knowledge), (3) creating a HeyGen presenter-led explainer, tutorial, or product demo with a human face, (4) "make a video of me saying...", "send a video to my leads", "record an update for my team", "create a video pitch", "make a loom-style message", "I want to appear in this video", "generate a HeyGen video", "make a talking head video". Accepts avatar_id from heygen-avatar for identity-first HeyGen videos, or uses a stock presenter. Returns video share URL + HeyGen session URL for iteration. Chain signal: when the user wants to create/design an avatar AND make a video in the same request, run heygen-avatar first, then return here. Conjunctions to watch: "and then", "and immediately", "first...then", "X and make a video", "design [presenter] and record" = always CHAIN. If the user provides a photo AND wants a video, route to heygen-avatar first. NOT for: avatar creation or identity setup (use heygen-avatar first), cinematic footage or b-roll without a presenter, translating videos, TTS-only, or streaming avatars.
Plan and orchestrate end-to-end video production pipelines in ComfyUI with validation gates and error recovery. Handles img2vid, txt2vid, vid2vid, and multi-shot video production. Produces pipeline plans with correct step ordering (generate, validate, animate, validate, concat), model selection, retry strategies (seed randomization, parameter adjustment, model fallback), and VRAM-aware resource management. Use when asked to make a video, animate images, create a multi-shot video, set up a video pipeline, or orchestrate video production in ComfyUI. Does NOT cover still image generation, prompt writing, workflow building for non-video tasks, video editing in external tools, model training, installation, or hardware recommendations.
Create modern product launch/pitch videos using Remotion. Use when creating app promo videos, SaaS launch videos, product demos, or startup pitch videos.
Quickly generate 2-3 video script outline plans, including title suggestions, thumbnail design recommendations, and complete structure design. Use this Skill when users mention "video outline", "video script", "video planning", "shooting videos", or "video content".
Quickly generate 2-3 video outline plans, including titles, cover suggestions, and structure designs. Use this when the user mentions 'video outline', 'video structure', 'script outline', or 'video topic selection'.
Colloquial review for video scripts: removes formal tone so the script sounds natural when spoken aloud. Use this when users mention "colloquial", "too formal", "sound like natural speech", or "script review".
Generate podcast clip visualization video prompts for Seedance 2.0 on Higgsfield. Use for podcast clip videos, audio-to-visual content, audiogram alternatives, podcast highlight reels, interview clip visuals, or any video that transforms audio content into engaging visual format. Triggers on podcast, audio clip, audiogram, interview clip, sound bite, audio visual, podcast video, episode highlight, podcast clip.
Generate a 65-second founder-style product video from a product URL + user-supplied imagery. Output is a 16:9 1080p MP4: 4 × 15s Seedance acts of a talking founder + 5s branded end card + background music. The user's actual product screenshots appear on the founder's phone in reveal shots, so on-screen UI is real, not AI-imagined. Triggers: "founder video", "product video", "60s pitch video", "make a video of [founder] for [URL]", "talking founder explainer". Requires Pika MCP. Uses a supplied brand kit folder (`brand.json` or an exported build-a-brand kit with `brand.md`, tokens, logo assets); if no kit exists, run build-a-brand first.
Create Manim CE (Community Edition) animations in the style of 3Blue1Brown math and algorithm videos.
Use when the user wants Luma / 拾光 / 拾光智能体 / 拾光工具 to create a complete viral-remix short-video workflow: research, rewrite, TTS, digital human, PIP materials, subtitles, BGM, and cover.
AI-assisted video editing workflows for cutting, structuring, and augmenting real footage. Covers the full pipeline from raw capture through FFmpeg, Remotion, ElevenLabs, fal.ai, and final polish in Descript or CapCut. Use when the user wants to edit video, cut footage, create vlogs, or build video content.