Loading...
Loading...
Found 86 Skills
Generate brand-quality product images via mode-specific prompt enhancement on Higgsfield's gpt_image_2 model. The single entry point for any professional brand visual involving a product. Use when: "make a product photo", "studio shot", "lifestyle photo", "in use", "Pinterest pin", "hero banner", "website header", "carousel", "Meta ads", "ad creatives", "model wearing", "virtual try-on", "person holding product", "closeup with hands", "levitating product", "floating", "splash shot", "CGI style", "surreal product", "restyle", "Christmas version", "in [aesthetic] style", or any request involving a product, brand, or paid social creative. Modes: product_shot, lifestyle_scene, closeup_product_with_person, pinterest_pin, hero_banner, social_carousel, ad_creative_pack, virtual_model_tryout, conceptual_product, restyle. Backend assembles the final prompt — never write gpt_image_2 prompts freehand. Always go through this skill. NOT for: raw text-to-image with no brand/product (use higgsfield-generate), branded marketing video with avatars (use higgsfield-generate's Marketing Studio), Soul Character training (use higgsfield-soul-id).
Gemini-native Nano Banana image generation and editing across Nano Banana, Nano Banana 2, and Nano Banana Pro. Use when you need text-to-image, image-to-image edits, repeated local references, batch generation, dry-run request inspection, or a custom Gemini-compatible base URL such as a self-hosted gateway.
Generates image prompts for Seedream 5.0/4.0 (Jimeng AI), and can call the API to generate images and automatically download them to the output/ directory. Workflow: describe your idea → the agent outputs a prompt for review → user confirms → the agent runs generate.py. It covers text-to-image, image editing, multi-image fusion, character consistency, knowledge cards, posters, PPT backgrounds, e-commerce images, avatars, and group/storyboard generation. Activate this tool when the user mentions terms like seedream, jimeng, AI image generation, text-to-image, image-to-image, seedream prompt, prompt keyword, one-click image generation, knowledge card, poster design, e-commerce image, character consistency, or image generation.
Generate high-quality images from text prompts using fal.ai's text-to-image models. Supports intelligent model selection, style transfer, and professional-grade outputs.
Generate and edit images using TensorsLab's AI models. Supports text-to-image, image-to-image generation, plus advanced editing: avatar generation, watermark removal, object erasure, face replacement, and general image editing. Features automatic prompt enhancement, progress tracking, and local file saving. Requires browser-based authorization before first use.
Generate and edit images using OpenAI's GPT Image 1.5 model. Use when the user asks to generate, create, edit, modify, change, alter, or update images. Also use when user references an existing image file and asks to modify it in any way (e.g., "modify this image", "change the background", "replace X with Y"). Supports text-to-image generation and image editing with optional mask. DO NOT read the image file first - use this skill directly with the --input-image parameter.
AI image generation with OpenAI, Azure OpenAI, Google, OpenRouter, DashScope, MiniMax, Jimeng, Seedream and Replicate APIs. Supports text-to-image, reference images, aspect ratios, and batch generation from saved prompt files. Sequential by default; use batch parallel generation when the user already has multiple prompts or wants stable multi-image throughput. Use when user asks to generate, create, or draw images.
This skill should be used when generating and editing images using the Gemini API (Nano Banana Pro). It applies when creating images from text prompts, editing existing images, applying style transfers, generating logos with text, creating stickers, product mockups, or any image generation/manipulation task. Supports text-to-image, image editing, multi-turn refinement, and composition from multiple reference images.
Generate new images from text prompts using EachLabs AI models. Supports text-to-image with multiple model families including Flux, GPT Image, Gemini, Imagen, Seedream, and more. Use when the user wants to create new images from text. For editing existing images, see eachlabs-image-edit.
Unified media generation via fal.ai MCP — image, video, and audio. Covers text-to-image (Nano Banana), text/image-to-video (Seedance, Kling, Veo 3), text-to-speech (CSM-1B), and video-to-audio (ThinkSound). Use when the user wants to generate images, videos, or audio with AI.
Use this skill for any image-related AI generation or editing task. Triggers include: GENERATE: "generate image", "create image", "make picture", "draw", "visualize", "image of", "create art", "generate art" EDIT: "edit image", "modify image", "change image", "update image", "fix image", "enhance image" ADD/REMOVE: "add to image", "put in image", "remove from image", "delete from image", "add element" STYLE: "style transfer", "make it look like", "convert style", "apply style", "in the style of" PRODUCT: "product photo", "product placement", "place product", "mockup", "put product on" COMPOSITE: "combine images", "merge images", "blend images", "create composite" Supports text-to-image generation, image editing with references, product placement, style transfer, and multi-image composition using Google Gemini (Nano Banana Pro) or OpenAI DALL-E.
Generate or edit images using Gemini's native `generateContent` via New-API. Suitable for scenarios requiring text-to-image generation, reference image editing, local PNG output, and those who want to reuse the `.sofunny-image.env` file or current shell environment variables.