Loading...
Loading...
Found 103 Skills
Multimodal image processing skill, supporting text-to-image, image-to-image, image-to-text, long image stitching, marketing material packs, product design images, element disassembly diagrams, and social media image sets. Triggered when the user mentions keywords such as "draw", "generate image", "draw XX", "image processing", "image-to-image", "OCR", "image recognition", "stitch long image", "infographic", "illustration", "product image", "material pack", "marketing material", "detail page", "e-commerce image", "design drawing", "exploded view", "disassembly", "image set", "nine-grid", etc. Note: If the user requests a video (including illustrations + voiceover), use the video-creator skill instead.
Generate images with GPT Image 2 (ChatGPT Images 2.0) inside Claude Code, using your existing ChatGPT Plus or Pro subscription — no separate OpenAI access, no per-image billing. Supports text-to-image, image-to-image editing, style transfer, and multi-reference composition via the local Codex CLI. Triggers on "gpt image 2", "gpt-image-2", "ChatGPT Images 2.0", "image 2", or any explicit ask to generate or edit an image through the user's ChatGPT plan.
Help users integrate Runway image generation APIs (text-to-image with reference images)
Image Generation Skill: Use this skill when users need to generate images, create graphics, or edit/modify/adjust existing images. It supports 10 aspect ratios (1:1, 16:9, 9:16, etc.) and 3 resolutions (1K, 2K, 4K), and supports text-to-image and image-to-image editing.
Generate new images from text prompts using EachLabs AI models. Supports text-to-image with multiple model families including Flux, GPT Image, Gemini, Imagen, Seedream, and more. Use when the user wants to create new images from text. For editing existing images, see eachlabs-image-edit.
fal.ai AI image generation. Use this skill when you need to use fal, fal.ai, or generate images from text prompts using AI text-to-image models.
Generate or edit images using Gemini's native `generateContent` via New-API. Suitable for scenarios requiring text-to-image generation, reference image editing, local PNG output, and those who want to reuse the `.sofunny-image.env` file or current shell environment variables.
Supports text-to-image and image-to-image. Use when the user needs to create or generate images. Use cases: (1) Generate from text description, (2) Use reference images, (3) Customize model, aspect ratio, resolution. Triggers: generate image, draw, create image, AI art.
Generate and edit images using Google Gemini 3 Pro Image (Nano Banana Pro). Supports text-to-image, image editing, various aspect ratios, and high-resolution output (2K/4K).
State-of-the-art text-to-image generation with Stable Diffusion models via HuggingFace Diffusers. Use when generating images from text prompts, performing image-to-image translation, inpainting, or building custom diffusion pipelines.
Write structured VGL (Visual Generation Language) JSON prompts for Bria's FIBO image generation models. Use this skill when creating detailed image descriptions in JSON format for text-to-image generation, image editing, inpainting, outpainting, background generation, or captioning. Triggers include requests to write structured prompts, create VGL JSON, describe images for AI generation, or work with Bria/FIBO's structured_prompt format. Also use when converting natural language image requests into the deterministic JSON schema required by FIBO models.
Generate images using Google Gemini AI with text prompts and reference images. Use when creating game assets, concept art, UI mockups, promotional images, or any visual content. Supports text-to-image, image-to-image with style transfer, and multiple output sizes. Requires GEMINI_API_KEY environment variable. Triggers on requests for AI image generation, concept art, visual assets, or Gemini images.