Found 90 Skills
Research-driven code review and validation at multiple levels of abstraction. Two modes: (1) Session review — after making changes, review and verify work using parallel reviewers that research-validate every assumption; (2) Full codebase audit — deep end-to-end evaluation using parallel teams of subagent-spawning reviewers. Use when reviewing changes, verifying work quality, auditing a codebase, validating correctness, checking assumptions, finding defects, reducing complexity. NOT for writing new code, explaining code, or benchmarking.
Testing and benchmarking LLM agents, including behavioral testing, capability assessment, reliability metrics, and production monitoring, where even top agents achieve less than 50% on real-world benchmarks. Use when: agent testing, agent evaluation, benchmark agents, agent reliability, test agent.
Performance benchmarking expertise for shell tools, covering benchmark design, statistical analysis (min/max/mean/median/stddev), performance targets (<100ms, >90% hit rate), workspace generation, and comprehensive reporting.
Perform a deep competitive analysis for a solopreneur business. Use when mapping competitors in detail, finding exploitable gaps, understanding competitor strategy, benchmarking your own offering, or deciding how to position against the field. Goes deeper than the broad landscape mapping in market-research: this is a focused dissection of specific competitors. Trigger on "analyze my competitors", "competitive analysis", "who are my competitors", "competitor deep-dive", "how do I beat the competition", "competitive landscape", "benchmark against competitors".
HCCL (Huawei Collective Communication Library) performance testing for Ascend NPU clusters. Use for testing distributed communication bandwidth, verifying HCCL functionality, and benchmarking collective operations like AllReduce, AllGather. Covers MPI installation, multi-node pre-flight checks (SSH/CANN version/NPU health), and production testing workflows.
Social media campaign analysis and performance tracking. Calculates engagement rates, ROI, and benchmarks across platforms. Use for analyzing social media performance, calculating engagement rate, measuring campaign ROI, comparing platform metrics, or benchmarking against industry standards.
Expert-level performance optimization, profiling, benchmarking, and tuning.
Expert in observing, benchmarking, and optimizing AI agents. Specializes in token usage tracking, latency analysis, and quality evaluation metrics. Use when optimizing agent costs, measuring performance, or implementing evals. Triggers include "agent performance", "token usage", "latency optimization", "eval", "agent metrics", "cost optimization", "agent benchmarking".
Generate comprehensive philosophy and standards documents for any domain (UX design, landing pages, email outbound, API design, etc.). Load when user says "create philosophy doc", "generate standards for [domain]", "build best practices guide", or "create benchmarking document". Conducts deep research, synthesizes findings, and produces structured philosophy documents with principles, frameworks, anti-patterns, checklists, case studies, and metrics.
Automates benchmark test creation for C++ projects using Google Benchmark with consistent software testing patterns. Use when creating performance benchmarks, profiling tests, or when the user mentions benchmarking, Google Benchmark, or performance testing.
Set up performance benchmarks and CodSpeed harness for a project. Use this skill whenever the user wants to create benchmarks, add performance tests, set up CodSpeed, configure codspeed.yml, integrate a benchmarking framework (criterion, divan, pytest-benchmark, vitest bench, go test -bench, google benchmark), or when the user says 'add benchmarks', 'set up perf tests', 'create a benchmark', 'benchmark this', or wants to measure performance of their code for the first time. Also trigger when the optimize skill needs benchmarks that don't exist yet.
Orchestrate Xcode build optimization by benchmarking first, running the specialist analysis skills, prioritizing findings, requesting explicit approval, delegating approved fixes to xcode-build-fixer, and re-benchmarking after changes. Use when a developer wants an end-to-end build optimization workflow, asks to speed up Xcode builds, wants a full build audit, or needs a recommend-first optimization pass covering compilation, project settings, and packages.