Found 58 Skills
Generate images, videos, audio, and 3D models via RunningHub API (170+ endpoints) and run any RunningHub AI Application (custom ComfyUI workflow) by webappId. Covers text-to-image, image-to-video, text-to-speech, music generation, 3D modeling, image upscaling, AI apps, and more.
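A minimal sketch of invoking a RunningHub AI Application by webappId. The endpoint path, field names (`apiKey`, `webappId`, `nodeInfoList`), and base URL below are illustrative assumptions, not confirmed by this description — check the RunningHub API docs for the real contract.

```python
import json
import urllib.request

BASE_URL = "https://www.runninghub.ai"  # placeholder base URL (assumption)


def build_app_request(api_key: str, webapp_id: str, prompt: str) -> dict:
    # The JSON body shape is an assumption for illustration only.
    return {
        "apiKey": api_key,
        "webappId": webapp_id,
        "nodeInfoList": [
            # Hypothetical node override: set the positive-prompt text field.
            {"nodeId": "positive_prompt", "fieldName": "text", "fieldValue": prompt},
        ],
    }


def run_app(api_key: str, webapp_id: str, prompt: str) -> dict:
    """Submit the app run and return the parsed JSON response."""
    body = json.dumps(build_app_request(api_key, webapp_id, prompt)).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/task/openapi/ai-app/run",  # hypothetical path
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Splitting the payload builder from the network call keeps the request shape easy to inspect before sending.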
MiniMax multimodal model skill — create speech, music, video, and images with MiniMax Multi-Modal models: TTS (text-to-speech, voice cloning, voice design, multi-segment), music (songs, instrumentals), video (text-to-video, image-to-video, start-end frame, subject reference, templates, long-form multi-scene), image (text-to-image, image-to-image with character reference), and media processing (convert, concat, trim, extract). Use when the user mentions MiniMax, multimodal generation, the MiniMax APIs, FFmpeg workflows alongside MiniMax outputs, or wants speech, music, video, or image AI.
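The TTS capability above can be sketched as a small payload builder plus an HTTP call. The endpoint URL, `GroupId` query parameter, model name, and `voice_setting` fields mirror public MiniMax T2A examples but are assumptions here — verify them against the current MiniMax docs.

```python
import json
import urllib.request

MINIMAX_T2A_URL = "https://api.minimax.chat/v1/t2a_v2"  # assumed endpoint


def build_tts_payload(text: str, voice_id: str = "female-yujie") -> dict:
    # Field names follow public MiniMax examples; treat as illustrative.
    return {
        "model": "speech-01-turbo",  # assumed model id
        "text": text,
        "voice_setting": {"voice_id": voice_id, "speed": 1.0},
    }


def synthesize(api_key: str, group_id: str, text: str) -> bytes:
    """Send a text-to-speech request and return the raw response body."""
    req = urllib.request.Request(
        f"{MINIMAX_T2A_URL}?GroupId={group_id}",
        data=json.dumps(build_tts_payload(text)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```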
Generate images and videos with Kling O3 — Kling's most powerful model family. Text-to-image, text-to-video, image-to-video, and video-to-video editing. Use when the user requests "Kling", "Kling O3", "Best quality video", "Kling image", "Kling video editing".
Write better prompts for Kling 3.0 AI video generation. Use when the user wants to create, write, improve, or refine prompts — text-to-video, image-to-video, keyframes, multi-shot sequences, or dialogue scenes.
Expert guidance for Google Veo 3.1 video generation. Use when the user wants to (1) create text-to-video or image-to-video prompts, (2) optimize for cinematic quality and native audio syncing, (3) maintain character consistency via reference images, (4) structure multi-shot sequences with timestamp prompting, (5) use First/Last Frame interpolation, (6) select between standard and fast generation modes, or (7) troubleshoot physics, motion, or audio issues in generated video.
Use when generating videos with Alibaba Cloud Model Studio PixVerse models (`pixverse/pixverse-v5.6-t2v`, `pixverse/pixverse-v5.6-it2v`, `pixverse/pixverse-v5.6-kf2v`, `pixverse/pixverse-v5.6-r2v`). Use when building non-Wan text-to-video, first-frame image-to-video, keyframe-to-video, or multi-image reference-to-video workflows on Model Studio.
[QwenCloud] Generate videos using Wan models. Supports text-to-video, image-to-video, first+last frame, reference-based role-play, and video editing (VACE). TRIGGER when: the user wants to create, generate, or edit video content, mentions video generation/animation/video clips/Wan models, or explicitly invokes this skill by name (e.g. "use qwencloud-video-generation"). DO NOT TRIGGER when: the user wants to generate images (use qwencloud-image-generation), understand/analyze existing videos (use qwencloud-vision), or perform text-only tasks.
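A minimal sketch of submitting a Wan text-to-video task. Video generation on DashScope/Model Studio is asynchronous (submit, then poll a task id); the endpoint path, model name, and body shape below follow published DashScope examples but should be checked against the current docs before use.

```python
import json
import urllib.request

DASHSCOPE_URL = (
    "https://dashscope.aliyuncs.com/api/v1/services/aigc/"
    "video-generation/video-synthesis"
)  # assumed endpoint path


def build_t2v_payload(prompt: str, model: str = "wanx2.1-t2v-turbo") -> dict:
    # Model id and body shape are illustrative; verify in the docs.
    return {
        "model": model,
        "input": {"prompt": prompt},
        "parameters": {"size": "1280*720"},
    }


def submit_task(api_key: str, prompt: str) -> str:
    """Submit an async text-to-video task; returns the task id to poll."""
    req = urllib.request.Request(
        DASHSCOPE_URL,
        data=json.dumps(build_t2v_payload(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "X-DashScope-Async": "enable",  # video generation is asynchronous
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["output"]["task_id"]
```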
Comprehensive creation via Xiaoyunque's AI capabilities, supporting generation and editing of images and videos. Covered scenarios include: generation (text-to-image, text-to-video, image-to-video, animation creation, "draw xxx", "create an xxx clip"), editing and revision ("replace xxx with yyy", "remove xxx", "add xxx", "change to xxx", "adjust xxx", local modification, lens adjustment), style transfer (style migration, repainting, style change), video continuation, video/TVC/promotional-video replication, short-drama and short-comic-drama generation, music MV creation, product advertisement and demo video production, storyboard design, and educational-video/short-video production. Also trigger this skill when users mention Xiaoyunque, xyq, uploading reference images/videos, or checking generation progress. Key judgment: trigger this skill whenever the user's request involves AI video creation, generation, editing, or revision, regardless of the wording (e.g., "draw a cat", "make a poster", "create a video", "help me revise this video", "help me replicate this video", "make an MV with this song", "generate a short drama from one sentence").
Atlas Cloud API integration skill — quickly call 300+ AI image generation, video generation, and LLM models through a unified API. Use this skill when the user needs to integrate AI image generation (e.g., Flux, Seedream, DALL-E), AI video generation (e.g., Kling, Sora, Seedance), or LLM APIs (OpenAI-compatible format) into their project. Applicable scenarios include: generating images, generating videos, calling large language models, using the Atlas Cloud API, configuring ATLASCLOUD_API_KEY, querying available model lists, searching models by keyword, uploading local images/media files, one-step quick generation, image-to-video, text-to-image, text-to-video, and AI content-creation tool integration. Even if the user doesn't explicitly mention Atlas Cloud, consider this skill whenever the task involves integrating an AI media generation API.
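Since the LLM side is OpenAI-compatible (as the description states), calling it reduces to a standard chat-completions request against a custom base URL. The base URL below is a placeholder (check your Atlas Cloud dashboard); the body shape is the standard OpenAI-compatible format.

```python
import json
import urllib.request

BASE_URL = "https://api.atlascloud.ai/v1"  # placeholder; confirm in your dashboard


def build_chat_payload(model: str, user_msg: str) -> dict:
    # Standard OpenAI-compatible chat-completions body.
    return {"model": model, "messages": [{"role": "user", "content": user_msg}]}


def chat(api_key: str, model: str, user_msg: str) -> str:
    """Send one chat turn and return the assistant's reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_payload(model, user_msg)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",  # ATLASCLOUD_API_KEY
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the format is OpenAI-compatible, an existing OpenAI SDK client pointed at this base URL should also work.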
Use Alibaba Cloud DashScope API and LingMou to generate AI video and speech. Seven capabilities — (1) LivePortrait talking-head (image + audio → video, two-step), (2) EMO talking-head, (3) AA/AnimateAnyone full-body animation (three-step), (4) T2I text-to-image (Wan 2.x, default wan2.2-t2i-flash), (5) I2V image-to-video (Wan 2.x, default wan2.7-i2v-flash, supports T2I→I2V pipeline), (6) Qwen TTS (auto model/voice by scene, default qwen3-tts-vd-realtime-2026-01-15), (7) LingMou digital-human template video with random template, public-template copy, and script confirmation. Trigger when the user needs talking-head, portrait, full-body animation, text-to-image, text-to-video, or speech synthesis.
Generate videos directly using the Runway API via runnable scripts. Supports text-to-video, image-to-video, and video-to-video with seedance2, gen4.5, veo3, and more.
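An image-to-video submission against the Runway API can be sketched as below. The endpoint URL, `promptImage`/`promptText` field names, and the `X-Runway-Version` header value follow Runway's public examples but are assumptions here — the runnable scripts bundled with the skill are the authoritative reference.

```python
import json
import urllib.request

RUNWAY_URL = "https://api.dev.runwayml.com/v1/image_to_video"  # assumed endpoint


def build_i2v_payload(image_url: str, prompt: str, model: str = "gen4.5") -> dict:
    # Field names are illustrative; verify against Runway's API reference.
    return {"model": model, "promptImage": image_url, "promptText": prompt}


def submit(api_key: str, image_url: str, prompt: str) -> dict:
    """Submit an image-to-video task and return the parsed JSON response."""
    req = urllib.request.Request(
        RUNWAY_URL,
        data=json.dumps(build_i2v_payload(image_url, prompt)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "X-Runway-Version": "2024-11-06",  # assumed API version header
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```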
Choose the right fal.ai endpoint for a given task. Modality-organized catalog of production endpoint defaults, text-to-image, image-to-image, text-to-video, image-to-video, and more. Use when the user has not named a specific model, or asks "which model for X", "best endpoint for Y", "what should I use for Z".
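A modality-to-endpoint catalog like the one this skill describes can be modeled as a small dispatch table. The endpoint ids below are illustrative defaults only (fal.ai endpoints change over time — check the model gallery), and the `fal_client.subscribe` call assumes the official `fal-client` package with `FAL_KEY` set in the environment.

```python
# Illustrative default endpoints per modality — NOT an authoritative catalog.
DEFAULT_ENDPOINTS = {
    "text-to-image": "fal-ai/flux/dev",
    "image-to-image": "fal-ai/flux/dev/image-to-image",
    "text-to-video": "fal-ai/kling-video/v1/standard/text-to-video",
    "image-to-video": "fal-ai/kling-video/v1/standard/image-to-video",
}


def pick_endpoint(task: str) -> str:
    """Map a task modality to a default endpoint id."""
    try:
        return DEFAULT_ENDPOINTS[task]
    except KeyError:
        raise ValueError(f"no default endpoint for task: {task}")


def generate(task: str, prompt: str):
    # Requires `pip install fal-client` and FAL_KEY in the environment.
    import fal_client  # lazy import so the table is usable without the package

    return fal_client.subscribe(pick_endpoint(task), arguments={"prompt": prompt})
```

Keeping the table separate from the call means the routing logic can be tested and updated without touching any network code.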