Loading...
Loading...
Found 147 Skills
Analyze compensation — benchmarking, band placement, and equity modeling. Trigger with "what should we pay a [role]", "is this offer competitive", "model this equity grant", or when uploading comp data to find outliers and retention risks.
Detect performance anti-patterns and apply optimization techniques in Go. Covers allocations, string handling, slice/map preallocation, sync.Pool, benchmarking, and profiling with pprof. Use when checking performance, finding slow code, reducing allocations, profiling, or reviewing hot paths. Trigger examples: "check performance", "find slow code", "reduce allocations", "benchmark this", "profile", "optimize Go code". Do NOT use for concurrency correctness (use go-concurrency-review) or general code style (use go-coding-standards).
How to benchmark and analyze memory usage in Turso using the memory-benchmark crate and dhat heap profiler. Use this skill whenever the user mentions memory usage, memory profiling, allocation tracking, heap analysis, memory regression, memory benchmarking, dhat, or wants to understand where memory is being allocated during SQL workloads. Also use when investigating memory growth in WAL or MVCC mode. IMPORTANT - If you modify the perf/memory crate (add profiles, change CLI flags, change output format, etc.), update this skill document to reflect those changes so it stays accurate for future agents.
Performance benchmarking for a deployed NVIDIA RAG Blueprint server: profiling pass + aiperf load test driven by a single YAML config. Not for accuracy / RAGAS scoring (use rag-eval) or for deploying / repairing services (use rag-blueprint).
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.
Build institutional-grade comparable company analyses with operating metrics, valuation multiples, and statistical benchmarking in Excel/spreadsheet format. **Perfect for:** - Public company valuation (M&A, investment analysis) - Benchmarking performance vs. industry peers - Pricing IPOs or funding rounds - Identifying valuation outliers (over/under-valued) - Supporting investment committee presentations - Creating sector overview reports **Not ideal for:** - Private companies without comparable public peers - Highly diversified conglomerates - Distressed/bankrupt companies - Pre-revenue startups - Companies with unique business models
Evaluates LLMs across 100+ benchmarks from 18+ harnesses (MMLU, HumanEval, GSM8K, safety, VLM) with multi-backend execution. Use when needing scalable evaluation on local Docker, Slurm HPC, or cloud platforms. NVIDIA's enterprise-grade platform with container-first architecture for reproducible benchmarking.
Performance benchmarking expertise for shell tools, covering benchmark design, statistical analysis (min/max/mean/median/stddev), performance targets (<100ms, >90% hit rate), workspace generation, and comprehensive reporting
Analyzes and optimizes SQL queries using EXPLAIN plans, index recommendations, query rewrites, and performance benchmarking. Use for "query optimization", "slow queries", "database performance", or "EXPLAIN analysis".
Run Microsoft's eval-recipes benchmarks to validate amplihack improvements against baseline agents. Auto-activates when testing improvements, running evals, or benchmarking changes.
Apply systematic performance optimization techniques when writing or reviewing code. Use when optimizing hot paths, reducing latency, improving throughput, fixing performance regressions, or when the user mentions performance, optimization, speed, latency, throughput, profiling, or benchmarking.
Use when evaluating LLMs, running benchmarks like MMLU/HumanEval/GSM8K, setting up evaluation pipelines, or asking about "NeMo Evaluator", "LLM benchmarking", "model evaluation", "MMLU", "HumanEval", "GSM8K", "benchmark harnesses"