Loading...
Loading...
Found 485 Skills
Turn an existing HTML page, landing page, oral script, memo draft, result table, or structured source material into a Xiaohongshu card-style image note. Use this when the user wants page-by-page card planning, cover copy, card text, or a design-ready Xiaohongshu图文 brief based on source material rather than writing a plain note from scratch. This skill is especially for 3:4 Xiaohongshu cards that may mix image-led pages with high-density memo pages, using strong information hierarchy and screenshot-worthy text density rather than generic sparse carousel copy.
Create, repair, validate, preview, and package Codex-compatible animated pet spritesheets from character art, screenshots, generated images, or visual references. Use when a user wants to hatch a Codex pet, create a custom animated pet, or build a built-in pet asset with an 8x9 atlas, transparent unused cells, row-by-row animation prompts, QA contact sheets, preview videos, and pet.json packaging. This skill composes the installed $imagegen system skill for visual generation and uses bundled scripts for deterministic spritesheet assembly.
Automated browser testing with Playwright -- navigate, interact, screenshot, and validate UI
Frontend and visual QA for responsive surfaces, screenshots, text fit, hierarchy, and interaction states.
Build browser automation and web scraping with Playwright on your local machine. Prevents 10 documented errors including CI timeout hangs, extension testing failures, and Ubuntu compatibility issues. Includes stealth mode for anti-bot bypass, authenticated sessions, infinite scroll handling, screenshot/PDF generation, and v1.57 Speedboard performance analysis. Use when: automating browsers, scraping protected sites, testing with real IPs, bypassing bot detection, generating screenshots/PDFs, or troubleshooting "target closed", "page.pause() hangs CI", "permission prompts block tests", or "Ubuntu 25.10 installation" errors.
Create aesthetically beautiful interfaces following proven design principles. Use when building UI/UX, analyzing designs from inspiration sites, generating design images with ai-multimodal, implementing visual hierarchy and color theory, adding micro-interactions, or creating design documentation. Includes workflows for capturing and analyzing inspiration screenshots with chrome-devtools and ai-multimodal, iterative design image generation until aesthetic standards are met, and comprehensive design system guidance covering BEAUTIFUL (aesthetic principles), RIGHT (functionality/accessibility), SATISFYING (micro-interactions), and PEAK (storytelling) stages. Integrates with chrome-devtools, ai-multimodal, media-processing, ui-styling, and web-frameworks skills.
Detect unintended visual changes in UI by comparing screenshots across versions. Use for visual regression, screenshot diff, Percy, Chromatic, UI testing, and visual validation.
Agentic workflow for cloning websites with pixel-perfect fidelity using specialized sub-agents. Use when the user wants to clone/copy/replicate a website, create a landing page based on an existing site, or needs to extract and recreate a website's design. Includes orchestration via slash command, four specialized sub-agents (screenshotter, extractor, cloner, qa-reviewer), and outputs React components with Tailwind CSS and motion animations.
Comprehensive web page validation with authentication, screenshot capture, mobile testing, and enhanced error detection
REQUIRED for ANY changes to Linux desktop, window manager, or system config. Use when editing ~/.config/hypr/, ~/.config/waybar/, ~/.config/walker/, ~/.config/alacritty/, ~/.config/kitty/, ~/.config/ghostty/, ~/.config/mako/, or ~/.config/omarchy/. Triggers: Hyprland, window rules, animations, keybindings, monitors, gaps, borders, blur, opacity, waybar, walker, terminal config, themes, wallpaper, night light, idle, lock screen, screenshots, layer rules, workspace settings, display config, or any omarchy-* commands.
Process and generate multimedia content using Google Gemini API for better vision capabilities. Capabilities include analyze audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis up to 9.5 hours), understand images (better image analysis than Claude models, captioning, reasoning, object detection, design extraction, OCR, visual Q&A, segmentation, handle multiple images), process videos (scene detection, Q&A, temporal analysis, YouTube URLs, up to 6 hours), extract from documents (PDF tables, forms, charts, diagrams, multi-page), generate images (text-to-image with Imagen 4, editing, composition, refinement), generate videos (text-to-video with Veo 3, 8-second clips with native audio). Use when working with audio/video files, analyzing images or screenshots (instead of default vision capabilities of Claude, only fallback to Claude's vision capabilities if needed), processing PDF documents, extracting structured data from media, creating images/videos from text prompts, or implementing multimodal AI features. Supports Gemini 3/2.5, Imagen 4, and Veo 3 models with context windows up to 2M tokens.
Design and implement visual regression testing for UI changes. Defines screenshot coverage, rendering stabilization, baseline management, and CI integration (e.g., Playwright screenshots, Percy/Chromatic). Use when UI/styling/layout changes need protection against regressions, or when adding screenshot-based tests to a web/WASM/desktop UI.