Loading...
Loading...
Found 11 Skills
Visually QA a web application by launching it in Cursor's built-in browser, taking screenshots, checking console errors, and auditing network requests. Use after making UI changes to verify they look correct.
Use vision models to self-review screenshots against design intent. Catches spacing issues, alignment problems, color inconsistencies, responsive bugs, and accessibility gaps. Use when reviewing designs, comparing implementations to mockups, or doing pre-ship QA.
World-class character and art style consistency for AI-generated images and videos - ensures visual coherence across series, maintains character identity, and provides rigorous QA before deliveryUse when "character consistency, art style, same character, consistent character, visual continuity, series, turnaround sheet, character sheet, reference image, character bible, style guide, anime character, consistent look, face consistency, outfit consistency, lora training, ip-adapter, flux kontext, visual qa, art quality, generation review, style drift, character drift, character-consistency, art-style, visual-qa, ai-art, image-generation, video-generation, anime, illustration, lora, ip-adapter, flux, midjourney, stable-diffusion" mentioned.
Large Language and Vision Assistant. Enables visual instruction tuning and image-based conversations. Combines CLIP vision encoder with Vicuna/LLaMA language models. Supports multi-turn image chat, visual question answering, and instruction following. Use for vision-language chatbots or image understanding tasks. Best for conversational image analysis.
Automated visual QA testing using Playwright — navigate web apps like a real user, capture screenshots, find bugs, and fix them.
6-phase PPTX presentation generation with visual QA: Gather, Design, Generate, Convert, QA, Output. Use when user needs a PowerPoint deck, slide presentation, pitch deck, or conference talk slides. Triggers: "create a presentation", "make slides", "pitch deck", "powerpoint", "pptx", "slide deck", "generate presentation". Do NOT use for Google Slides, Keynote, or PDF-only documents.
Vision-language pre-training framework bridging frozen image encoders and LLMs. Use when you need image captioning, visual question answering, image-text retrieval, or multimodal chat with state-of-the-art zero-shot performance.
Persistent browser interaction through a normal Node.js Playwright script for fast iterative web UI debugging.
Designer's eye QA: finds visual inconsistency, spacing issues, hierarchy problems, AI slop patterns, and slow interactions — then fixes them. Iteratively fixes issues in source code, committing each fix atomically and re-verifying with before/after screenshots. For plan-mode design review (before implementation), use /plan-design-review. Use when asked to "audit the design", "visual QA", "check if it looks good", or "design polish". Proactively suggest when the user mentions visual inconsistencies or wants to polish the look of a live site.
Analyze images using Dashscope (Qwen) Vision models for detailed description, OCR text extraction, object recognition, and visual Q&A. Use when the user needs to understand image content via Alibaba Cloud Dashscope API, especially for Chinese-language image analysis and documents.
Analyze images using GPT-4 Vision for detailed description, OCR text extraction, object recognition, and visual Q&A. Use when the user needs to understand image content, extract text from screenshots, identify objects in photos, or ask questions about images via OpenAI GPT-4 Vision API.