Loading...
Loading...
Found 23 Skills
Operate OpenWord end-to-end for live adventure sessions. Use when Codex needs to download/install/start OpenWord, guide a human player in the browser, or play autonomously through REST API (create/load game, do_action loop, state/image retrieval), including configuring GEMINI_API_KEY and sharing interesting scenes and choices during play.
Generate AI images using Gemini or GPT APIs directly. Covers model selection (Gemini for scenes, GPT for transparent icons), the 5-part prompting framework, API calling patterns, multi-turn editing, and quality assurance. Produces photorealistic scenes, icons, illustrations, OG images, and product shots. Use when building websites that need images, creating marketing assets, or generating visual content. Triggers: 'generate image', 'ai image', 'create hero image', 'make an icon', 'generate illustration', 'create og image', 'ai art', 'image generation'.
Generate illustration images for articles and documentation using Gemini Nano Banana 2 API. Produces clean, minimal-style diagrams and concept illustrations.
Generate or edit images using Gemini's native `generateContent` via New-API. Suitable for scenarios requiring text-to-image generation, reference image editing, local PNG output, and those who want to reuse the `.sofunny-image.env` file or current shell environment variables.
Expert guidance for writing Python code using the official Google GenAI SDK (google-genai) for Gemini API and Vertex AI. Use for text generation, multimodal inputs, reasoning, tools, and media generation.
Process and generate multimedia content using Google Gemini API for better vision capabilities. Capabilities include analyze audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understand images (better image analysis than Claude models, captioning, reasoning, object detection, design extraction, OCR, visual Q&A, segmentation, handle multiple images), process videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extract from documents (PDF tables, forms, charts, diagrams, multi-page), generate images (text-to-image with Imagen 4, editing, composition, refinement), generate videos (text-to-video with Veo 3, 8-second clips with native audio). Use when working with audio/video files, analyzing images or screenshots (instead of default vision capabilities of Claude, only fallback to Claude's vision capabilities if needed), processing PDF documents, extracting structured data from media, creating images/videos from text prompts, or implementing multimodal AI features. Supports Gemini 3/2.5, Imagen 4, and Veo 3 models with context windows up to 2M tokens.
Enables grounded question answering by automatically executing the Google Search tool within Gemini models. Use when the required information is recent (post knowledge cutoff) or requires verifiable citation.
Use when "nanobanana", "generate image", "create image", "edit image", "AI drawing", "Gemini image", "image generation"
Consult external LLMs (Gemini, OpenAI/Codex, Qwen) for second opinions, alternative plans, independent reviews, or delegated tasks. Use when a user asks for another model's perspective, wants to compare answers, or requests delegating a subtask to Gemini/Codex/Qwen.
Form a high-level investment committee consisting of three virtual experts modeled after legendary investors (Buffett, Wood, Druckenmiller) to conduct independent multi-round adversarial debates. True independent thinking is achieved through physically isolated Gemini API calls, and final resolutions are formed via voting. Use when evaluating investment decisions, reviewing stock research reports, or seeking multi-perspective analysis on public companies.
AI Course Content Generator - Generate complete online courses with Gemini API. Triggers on "create course", "generate lesson", "course content", "ccg", "/ccg".