Found 425 Skills
Validates Stories/Tasks or context via parallel multi-agent review (Codex + Gemini). Merges findings, debates, applies fixes. GO/NO-GO verdict.
Video understanding and transcription with intelligent multi-provider fallback. Use when: (1) Transcribing video or audio content, (2) Understanding video content including visual elements and scenes, (3) Analyzing YouTube videos by URL, (4) Extracting information from local video files, (5) Getting timestamps, summaries, or answering questions about video content. Automatically selects the best available provider based on configured API keys, preferring full video understanding (Gemini/OpenRouter) over ASR-only providers. Supports model selection per provider.
Fetch content from Reddit using Gemini CLI when WebFetch is blocked. Use when accessing Reddit URLs, researching topics on Reddit, or when Reddit returns 403/blocked errors.
App Store screenshot generation skill with two workflows: (A) AI-powered: fetches app metadata via `asc` CLI, analyzes screenshots with Claude vision, writes a ScreenPlan JSON, then generates final marketing screenshots via Gemini (`asc app-shots generate`), and optionally translates them (`asc app-shots translate`). (B) HTML-based (deterministic): writes a CompositionPlan JSON with precise device placement, text overlays, and backgrounds, then runs `asc app-shots html` to produce a self-contained HTML page with real device mockup frames and client-side PNG export, with no AI needed. Use this skill when: (1) User asks to "create App Store screenshots" or "generate screenshot plan", (2) User asks to "make an HTML screenshot page" or "compose screenshots with mockups", (3) User mentions "asc-app-shots", "app-shots html", "composition plan", or screenshot marketing, (4) User wants deterministic, reproducible screenshot layouts with device mockups, (5) User wants AI-generated screenshots via Gemini.
Validates optimization plan via parallel multi-agent review (Codex + Gemini) before execution. GO/NO-GO verdict.
A comprehensive traditional Chinese metaphysics agent ("Yi Jing" expert) that combines Mei Hua Yi Shu (Timing) with Gemini AI for modern interpretation.
Generate or edit images using Gemini's native `generateContent` via New-API. Suitable for text-to-image generation, reference-image editing, and local PNG output, and for users who want to reuse the `.sofunny-image.env` file or the current shell environment variables.
Complete game asset design, creation, implementation, and optimization team. Use when creating visual assets, art direction, sprites, UI elements, icons, textures, animation specs, audio design, or any game visuals. Covers AI image generation (Gemini), asset pipelines, optimization for web/mobile, style guides, and performance tuning. Triggers on requests for game art, icons, backgrounds, character designs, UI assets, promotional materials, or asset troubleshooting.
Generate, edit, and compose images using Gemini Nano Banana models via portable Python scripts. Handles authentication via API Key or Vertex AI environment variables. Available parameters: prompt, model, aspect-ratio, safety-filter-level. Always confirm parameters with the user or explicitly state defaults before running.
Perform autonomous, multi-step research using the Gemini Deep Research Agent (Interactions API). Supports web search, file/directory context, and resilient streaming.
Generate and transcribe speech using Google's Gemini-TTS and Chirp 3 models. Supports Text-to-Speech (Single/Multi-speaker), Instant Custom Voice, and Speech-to-Text (Transcription/Diarization).
AI media generation CLI tool using Google's Imagen 4, Veo 3.1, and Gemini TTS. Use when the user wants to (1) generate images from text prompts, (2) edit existing images with AI, (3) explain image contents, (4) generate videos from text or images, (5) create narration/voice audio with character settings. Triggers on requests like "generate an image of...", "create a video...", "make a voice that says...", "edit this image to...", "describe this image".