Loading...
Loading...
Found 100 Skills
Supports text-to-image and image-to-image. Use when the user needs to create or generate images. Use cases: (1) Generate from text description, (2) Use reference images, (3) Customize model, aspect ratio, resolution. Triggers: generate image, draw, create image, AI art.
Generate AI images using Volcengine Seedream model. Supports text-to-image (T2I), image editing (I2I), multi-image fusion, and web-search-based generation. Use this skill when the user wants to create, generate, or edit images.
Generate and edit images using Google's Nano Banana Pro (Gemini 3 Pro Image) API. Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports both text-to-image generation and image-to-image editing with configurable resolution (1K default, 2K, or 4K for high resolution). DO NOT read the image file first - use this skill directly with the --input-image parameter.
fal.ai AI image generation. Use this skill when you need to use fal, fal.ai, or generate images from text prompts using AI text-to-image models.
Guide for implementing Google Gemini API image generation - create high-quality images from text prompts using gemini-2.5-flash-image model. Use when generating images, creating visual content, or implementing text-to-image features. Supports text-to-image, image editing, multi-image composition, and iterative refinement.
Generate images using Google Gemini and Imagen models via scripts/. Use for AI image generation, text-to-image, creating visuals from prompts, generating multiple images, custom aspect ratios, and high-resolution output up to 4K. Triggers on "generate image", "create image", "imagen", "text to image", "AI art", "nano banana".
Generate and edit images using Google's Gemini image models (Nano Banana 2 default, Nano Banana Pro legacy). Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports text-to-image, image editing with up to 14 reference images, configurable resolution (0.5K-4K), aspect ratio, and adjustable thinking. DO NOT read the image file first - use this skill directly with the --input-image parameter.
Use jimeng-mcp-server for AI image and video generation. Use this skill when users request to generate images from text, synthesize multiple images, create videos from text descriptions, or add animations to static images. Supports four core capabilities: text-to-image, image synthesis, text-to-video, and image-to-video. Requires jimeng-mcp-server to run locally or be accessed via SSE/HTTP.
Use Alibaba Cloud DashScope API and LingMou to generate AI video and speech. Seven capabilities — (1) LivePortrait talking-head (image + audio → video, two-step), (2) EMO talking-head, (3) AA/AnimateAnyone full-body animation (three-step), (4) T2I text-to-image (Wan 2.x, default wan2.2-t2i-flash), (5) I2V image-to-video (Wan 2.x, default wan2.7-i2v-flash, supports T2I→I2V pipeline), (6) Qwen TTS (auto model/voice by scene, default qwen3-tts-vd-realtime-2026-01-15), (7) LingMou digital-human template video with random template, public-template copy, and script confirmation. Trigger when the user needs talking-head, portrait, full-body animation, text-to-image, text-to-video, or speech synthesis.
Generate images with Google Gemini. Text-to-image and style transfer from reference images.
Generates images using official OpenAI and Google APIs via AI SDK. Supports text-to-image, reference images, aspect ratios, and quality presets. Use when user asks to "generate image with API", "use official API for images", "create image with OpenAI/Google", or needs API-based generation instead of browser-based.
AI Image Generation Skill, using the latest ChatGPT image generation model gpt-image-2-all. This skill is applied when users need to generate images, visual infographics, create graphics, or edit/modify/adjust existing images. Based on the image generation service of the latest ChatGPT image generation model gpt-image-2-all from APIYI Platform (https://api.apiyi.com/), no external network access is required. The model is charged per image at $0.03 per piece, supporting text-to-image generation, single image editing, multi-image fusion, and natural language-based image modification, with high text restoration accuracy and friendly Chinese prompts. The size is controlled by prompt description (no explicit size parameter). Key differences from NanoBanana2: no size parameter, need to describe the size at the beginning of the prompt; unified $0.03 per image with no resolution tiering; the conversational endpoint /v1/chat/completions is the recommended one.