Found 25 Skills
AI image generation with OpenAI, Azure OpenAI, Google, OpenRouter, DashScope, MiniMax, Jimeng, Seedream and Replicate APIs. Supports text-to-image, reference images, aspect ratios, and batch generation from saved prompt files. Sequential by default; use batch parallel generation when the user already has multiple prompts or wants stable multi-image throughput. Use when user asks to generate, create, or draw images.
Audio generation skill — jingles, beds, voiceover, and sound effects. Routes music requests to Suno V5 / Udio / Lyria, speech to MiniMax TTS / FishAudio / ElevenLabs V3, and SFX to ElevenLabs SFX or AudioCraft. Output is one MP3/WAV file saved to the project folder.
Integrate Modellix's unified API for AI image and video generation into applications. Use this skill whenever the user wants to generate images from text, create videos from text or images, edit images, do virtual try-on, or call any Modellix model API. Also trigger when the user mentions Modellix, model-as-a-service for media generation, or needs to work with providers like Qwen, Wan, Seedream, Seedance, Kling, Hailuo, or MiniMax through a unified API.
Turn an article or script into a click-driven 16:9 web presentation that "looks like a video", with optional voiceover audio synthesis. Workflow: Original Article → **One-time Output** of Script + Outline Development Plan → User **One-time Alignment** on 5 Items (Script / Outline / Theme / Assets / Development Mode) → Web Development (Chapter-by-Chapter / Sequential / Parallel) → Optional Audio Synthesis (Default: MiniMax CLI mmx-cli). **The outline plans only rhythm and information density, not animations**: animations are designed on the fly during chapter development, following the PRINCIPLES + ANTI-AI rules. Each click advances one beat of the script, each step occupies the full screen, and the progress bar is hidden by default, appearing only on hover. Application scenarios: making videos from web pages (a dynamic slide deck that does not look like PPT), turning scripts or articles into interactive explanations, creating screen-recorded tutorials for Bilibili / YouTube / Video Channels, and making cinematic product or talk demos. This Skill embodies a design methodology and a collaboration process; it is not bound to any specific styles, fonts, or colors, so it can be reused for any theme and aesthetic.
Review pull requests for the MiniMax Skills repository. Use when reviewing PRs, validating new skill submissions, or checking existing skills for compliance. Run the validation script first for hard checks, then apply quality guidelines for content review. Triggers: PR review, pull request, validate skill, check skill.
Novita AI: LLM, Image Generation & Editing, Video Generation, Audio (TTS/ASR), and GPU Cloud. Use this skill whenever the user wants to call Novita AI APIs: chat with LLMs (DeepSeek, Llama, Qwen), generate images (FLUX, Stable Diffusion, Seedream, Hunyuan Image), edit images (remove background, upscale, inpainting, img2img, outpainting, reimagine, merge face, replace background, remove text), generate videos (Kling, Wan, Hunyuan, MiniMax Hailuo, Vidu, PixVerse, Seedance), do text-to-speech or speech-to-text (MiniMax TTS, GLM TTS, Fish Audio, ASR, voice cloning), run OpenAI-compatible batch jobs, manage GPU cloud instances and serverless endpoints, or check account balance and billing. Also trigger when the user mentions novita.ai, Novita AI, Novita API key, or wants to use any Novita platform service, even if they just say "generate an image" or "run an LLM" and Novita is available as a provider.
Automatically collect trending topics in the AI field, or write AI technical articles on specified topics in the writing style of 'Second Brother'. It focuses on hands-on tests of AI coding tools (Claude Code, Qoder, Cursor, TRAE, etc.), engineering implementation of large models (SpringAI, LangChain, RAG, etc.), AI agents and workflow orchestration, evaluation of domestic large models (GLM, Tongyi Qianwen, DeepSeek, MiniMax, Kimi, etc.), and evaluation of various AI tools and agent tools. Trigger keywords: write an AI article, AI technical article, large model evaluation, AI tool actual test, GLM, Claude Code, Qoder, Cursor, TRAE, SpringAI, RAG, Agent, workflow, domestic large model, collect AI hot topics, AI topic, etc.
Text-to-speech (TTS) and speech-to-text (STT) via Together AI. TTS models include Orpheus, Kokoro, Cartesia Sonic, Rime, and MiniMax, with REST, streaming, and WebSocket support. STT models include Whisper and Voxtral. Use when users need voice synthesis, audio generation, speech recognition, transcription, TTS, STT, or real-time voice applications.
Open, create, read, analyze, edit, or validate Excel/spreadsheet files (.xlsx, .xlsm, .csv, .tsv). Use when the user asks to create, build, modify, analyze, read, validate, or format any Excel spreadsheet, financial model, pivot table, or tabular data file. Covers: creating new xlsx from scratch, reading and analyzing existing files, editing existing xlsx with zero format loss, formula recalculation and validation, and applying professional financial formatting standards. Triggers on 'spreadsheet', 'Excel', '.xlsx', '.csv', 'pivot table', 'financial model', 'formula', or any request to produce tabular data in Excel format.
Use when user wants to generate music, songs, or audio tracks. Triggers on phrases like "generate a song", "make music", "create a track", "写首歌", "生成音乐", "来一首歌", "帮我做首歌", "纯音乐", "cover", "唱一首", or any request involving music creation, song writing, lyrics generation, or audio production. Also triggers when user provides lyrics and wants them turned into a song, or describes a mood/scene and wants background music. Even casual requests like "给我来点音乐" or "I want a chill beat" should trigger this skill. Do NOT use for music playback of existing files, music theory questions, or music recommendation without generation.
Use when user wants their Claude Code pet (/buddy) to sing a song. Triggers on "buddy sings", "let my buddy sing", "buddy sing", "make my pet sing", "宠物唱歌", "让宠物唱首歌", "让buddy唱歌", "buddy来一首", "我的宠物会唱歌吗", "pet sings", "let my companion sing", or any request that combines the concept of their Claude Code buddy/pet/companion with singing or music.
Analyze images using AI with the understand_image tool