Found 3,339 Skills
Harden designs for real-world use by systematically identifying and designing for every condition outside the happy path. Part of the Intent design strategy system. Covers state inventories, error recovery, empty states, loading patterns, first-run experiences, stress testing, internationalization readiness, and latency handling. Trigger on: edge cases, error states, empty states, loading states, first-run experience, onboarding, offline mode, "what happens when", "what if the user", "stress test this", "what could go wrong", "harden this design", "edge case review", "what are the failure modes", zero states, timeout handling, or any question about how a design behaves outside ideal conditions. The happy path is a fantasy — this skill designs for the world your users actually live in.
LLM prompt testing, evaluation, and CI/CD quality gates using Promptfoo. Invoke when setting up prompt evaluation or regression testing; integrating LLM testing into CI/CD pipelines; configuring security testing (red teaming, jailbreaks); comparing prompt or model performance; or building evaluation suites for RAG, factuality, or safety. Keywords: promptfoo, llm evaluation, prompt testing, red team, CI/CD, regression testing
Use when generating PDFs from markdown with Pandoc - covers differences from Python-Markdown, blank line rules, fix scripts for labels/anchors/metadata, and visual testing workflow
Run Microsoft's eval-recipes benchmarks to validate amplihack improvements against baseline agents. Auto-activates when testing improvements, running evals, or benchmarking changes.
Test-Driven Development with Iron Laws enforcement. Use when writing any production code to ensure tests are written first. Includes testing-expert capabilities.
Amazon Bedrock Prompt Management for creating, versioning, and managing prompt templates with variables, multi-variant A/B testing, and flow integration. Use when creating reusable prompt templates, managing prompt versions, implementing A/B testing for prompts, integrating prompts with Bedrock Flows, optimizing prompt engineering, or building production prompt catalogs.
Use when validating system performance under load, identifying bottlenecks through profiling, or optimizing application responsiveness. Covers load testing (k6, Locust), profiling (CPU, memory, I/O), and optimization strategies (caching, query optimization, Core Web Vitals). Also suited to capacity planning, regression detection, and establishing performance SLOs.
Senior QA Automation Engineer with 10+ years E2E testing experience. Use when writing end-to-end tests for web apps with Playwright, mobile apps with Detox, testing critical user flows, cross-browser testing, or visual regression testing.
Web vulnerability testing patterns for SQL injection, XSS, CSRF, LFI, SSTI, and file upload bypasses in CTF challenges. Trigger: When testing web applications, SQL injection, XSS, or file uploads.
Master metrics definition, KPI tracking, dashboarding, A/B testing, and data-driven decision making. Use data to guide product decisions.
pytest Python testing framework with fixtures. Use for Python testing.
Detox end-to-end (E2E) testing for React Native. Use for React Native testing.