Loading...
Loading...
Found 86 Skills
AI Image Generation Skill, using the latest ChatGPT image generation model gpt-image-2-all. This skill is applied when users need to generate images, visual infographics, create graphics, or edit/modify/adjust existing images. Based on the image generation service of the latest ChatGPT image generation model gpt-image-2-all from APIYI Platform (https://api.apiyi.com/), no external network access is required. The model is charged per image at $0.03 per piece, supporting text-to-image generation, single image editing, multi-image fusion, and natural language-based image modification, with high text restoration accuracy and friendly Chinese prompts. The size is controlled by prompt description (no explicit size parameter). Key differences from NanoBanana2: no size parameter, need to describe the size at the beginning of the prompt; unified $0.03 per image with no resolution tiering; the conversational endpoint /v1/chat/completions is the recommended one.
Generate images directly using the Runway API via runnable scripts. Supports text-to-image with optional reference images.
Use jimeng-mcp-server for AI image and video generation. Use this skill when users request to generate images from text, synthesize multiple images, create videos from text descriptions, or add animations to static images. Supports four core capabilities: text-to-image, image synthesis, text-to-video, and image-to-video. Requires jimeng-mcp-server to run locally or be accessed via SSE/HTTP.
Generate and edit images with Alibaba Qwen-Image-2.0 models via inference.sh CLI. Models: Qwen-Image-2.0 (fast), Qwen-Image-2.0-Pro (professional text rendering). Capabilities: text-to-image, multi-image editing, complex text rendering. Triggers: qwen image, qwen-image, alibaba image, dashscope image, qwen image 2, qwen image pro
AI image generation and editing using Google Gemini models (Nano Banana). Use when the user asks to generate an image, create an image, edit an image, or references "nano banana", "nanobanana", or "gemini image". Supports text-to-image, image editing, multi-image references, and 1K/2K/4K resolution.
Generate and edit images using Google's Gemini image models (Nano Banana 2 default, Nano Banana Pro legacy). Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports text-to-image, image editing with up to 14 reference images, configurable resolution (0.5K-4K), aspect ratio, and adjustable thinking. DO NOT read the image file first - use this skill directly with the --input-image parameter.
Generate images with Google Gemini. Text-to-image and style transfer from reference images.
Guide for implementing Google Gemini API image generation - create high-quality images from text prompts using gemini-2.5-flash-image model. Use when generating images, creating visual content, or implementing text-to-image features. Supports text-to-image, image editing, multi-image composition, and iterative refinement.
Generate images with Google Nano Banana 2 (Gemini-family flash-tier text-to-image) on RunComfy — bundled with the model's documented prompting patterns so the skill gets sharper output than naive prompting against the same model. Documents Nano Banana 2's strengths (rapid iteration, in-image typography rendering, predictable framing, optional web-grounded context), the resolution-tier pricing, the safety-tolerance dial, and when to route to Nano Banana Pro / GPT Image 2 / Flux 2 / Seedream instead. Calls `runcomfy run google/nano-banana-2/text-to-image` through the local RunComfy CLI. Triggers on "nano banana", "nano-banana-2", "nano banana 2", "google image gen", "gemini image", or any explicit ask to generate with this model.
AI image generation with OpenAI, Google and DashScope APIs. Supports text-to-image, reference images, aspect ratios. Sequential by default; parallel generation available on request. Use when user asks to generate, create, or draw images.
Create or edit images with Pilio GPT Image 2 through the unified Pilio developer API. Use when the user wants text-to-image generation, prompt-based image editing, restyling, product-photo transformation, or composition from one or more local reference images.
Generate images and videos via Higgsfield AI through 30+ models including Nano Banana 2, Soul V2, Veo 3.1, Kling 3.0, Seedance 2.0, Flux 2, GPT Image 2, plus Marketing Studio for branded ad video/image with curated avatars and imported products. Use when: "generate an image", "make a picture", "create artwork", "make a video", "animate this photo", "image-to-video", "img2vid", "edit this image with AI", "stylize a photo", "remix this image", "produce a clip", "render a scene", "create an ad", "make a UGC video", "generate marketing video", "make a product demo", "create unboxing", "TV spot", "virtual try-on", "product showcase", "brand video", "presenter video for product", "import product from URL", "create avatar for ad". Supports text-to-image, image-to-image, image-to-video, reference-based generation, and Marketing Studio (avatars + products + ad modes). Auto-detects whether passed IDs are uploads or previous jobs. Chain with higgsfield-soul-id when the user wants their face in the output. NOT for: training Soul Character (use higgsfield-soul-id), professional product photoshoots with mode-specific prompt enhancement (use higgsfield-product-photoshoot), text-only / chat / TTS tasks.