Search Results: benchmarking

Found 90 Skills

llm-evaluation

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.

Backend Development | pluginagentmarketplace/cu...

java-performance

JVM performance tuning: GC optimization, profiling, memory analysis, and benchmarking.

Code Quality | aj-geddes/useful-ai-promp...

profiling-optimization

Profile application performance, identify bottlenecks, and optimize hot paths using CPU profiling, flame graphs, and benchmarking. Use when investigating performance issues or optimizing critical code paths.

AI & Machine Learning | orq-ai/assistant-plugins

compare-agents

Run cross-framework agent comparisons using evaluatorq from orqkit — compares any combination of agents (orq.ai, LangGraph, CrewAI, OpenAI Agents SDK, Vercel AI SDK) head-to-head on the same dataset with LLM-as-a-judge scoring. Use when comparing agents, benchmarking, or wanting side-by-side evaluation. Do NOT use when comparing only orq.ai configurations with no external agents (use run-experiment instead).

Code Quality | terraphim/terraphim-skill...

rust-performance

High-performance Rust optimization. Profiling, benchmarking, SIMD, memory optimization, and zero-copy techniques. Focuses on measurable improvements with evidence-based optimization.

Code Quality | ynulihao/agentskillos

performance-optimization

Apply systematic performance optimization techniques when writing or reviewing code. Use when optimizing hot paths, reducing latency, improving throughput, fixing performance regressions, or when the user mentions performance, optimization, speed, latency, throughput, profiling, or benchmarking.

Backend Development | eduardo-sl/go-agent-skill...

go-performance-review

Detect performance anti-patterns and apply optimization techniques in Go. Covers allocations, string handling, slice/map preallocation, sync.Pool, benchmarking, and profiling with pprof. Use when checking performance, finding slow code, reducing allocations, profiling, or reviewing hot paths. Trigger examples: "check performance", "find slow code", "reduce allocations", "benchmark this", "profile", "optimize Go code". Do NOT use for concurrency correctness (use go-concurrency-review) or general code style (use go-coding-standards).

AI & Machine Learning | vllm-project/vllm-skills

vllm-prefix-cache-bench

Benchmarks the efficiency of automatic prefix caching in vLLM using fixed prompts, real-world datasets, or synthetic prefix/suffix patterns. Use when the user asks to benchmark prefix caching hit rate, caching efficiency, or repeated-prompt performance in vLLM.

Backend Development | cockroachlabs/cockroachdb...

benchmarking-transaction-patterns

Guides benchmarking and comparing explicit multi-statement transactions versus single-statement CTE transactions in CockroachDB, with fair test methodology, contention analysis, and performance interpretation. Use when comparing transaction formulations, benchmarking CockroachDB workloads under contention, investigating retry pressure, or deciding whether to rewrite multi-step application flows into single SQL statements.

Code Quality | outfitter-dev/agents

performance

Use when profiling code, optimizing bottlenecks, or benchmarking, or when "performance", "profiling", "optimization", or "--perf" is mentioned.

AI & Machine Learning | eyadsibai/ltk

nemo-evaluator

Use when evaluating LLMs, running benchmarks like MMLU/HumanEval/GSM8K, setting up evaluation pipelines, or asking about "NeMo Evaluator", "LLM benchmarking", "model evaluation", "MMLU", "HumanEval", "GSM8K", or "benchmark harnesses".

Data Processing | marketcalls/openalgo-indi...

custom-indicator

Create a custom technical indicator using Numba JIT + NumPy. Generates production-grade, O(n) optimized indicator functions with charting and benchmarking.
