Found 3 Skills
Help users create and run AI evaluations. Use when someone is building evals for LLM products, measuring model quality, creating test cases, designing rubrics, or systematically measuring AI output quality.
Master LLM-as-a-Judge evaluation techniques including direct scoring, pairwise comparison, rubric generation, and bias mitigation. Use when building evaluation systems, comparing model outputs, or establishing quality standards for AI-generated content (a minimal sketch of two of these techniques follows this list).
Use this skill when designing structured interviews, creating rubrics, building coding challenges, or assessing culture fit. Triggers on interview design, rubrics, scoring criteria, coding challenges, behavioral and system design interviews, culture fit assessment, and any task requiring interview process design or evaluation criteria.
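
The second skill above names direct scoring, pairwise comparison, rubric generation, and bias mitigation. The sketch below illustrates two of these: direct scoring against a rubric, and pairwise comparison with a position swap to counter position bias. It assumes a hypothetical `call_model` stand-in for whatever chat-completion client you use; the rubric text, prompts, and 1-5 scale are illustrative assumptions, not part of the skill itself.

```python
# Minimal LLM-as-a-Judge sketch. `call_model` is a hypothetical
# placeholder; swap in your own chat-completion client.
import re

def call_model(prompt: str) -> str:
    """Hypothetical LLM call; replace with your provider's client."""
    raise NotImplementedError("plug in your chat-completion client here")

# Illustrative rubric; real rubrics should define each score level.
RUBRIC = """Score 1-5 for factual accuracy and clarity.
5 = accurate and clear; 1 = wrong or unreadable."""

def direct_score(question: str, answer: str) -> int:
    """Direct scoring: grade one answer against a rubric."""
    prompt = (
        f"Rubric:\n{RUBRIC}\n\nQuestion: {question}\nAnswer: {answer}\n"
        "Reply with a single integer 1-5."
    )
    match = re.search(r"[1-5]", call_model(prompt))
    if not match:
        raise ValueError("judge returned no score")
    return int(match.group())

def pairwise_prefer(question: str, a: str, b: str) -> str:
    """Pairwise comparison with position-swap bias mitigation:
    judge twice with the answers in both orders; a verdict that
    flips with position counts as a tie."""
    def ask(first: str, second: str) -> str:
        prompt = (
            f"Question: {question}\n\nAnswer A: {first}\nAnswer B: {second}\n"
            "Which answer is better? Reply with exactly 'A' or 'B'."
        )
        return call_model(prompt).strip().upper()[:1]

    first_pass = ask(a, b)    # a shown in position A
    second_pass = ask(b, a)   # order swapped: a shown in position B
    if first_pass == "A" and second_pass == "B":
        return "a"            # a preferred in both orders
    if first_pass == "B" and second_pass == "A":
        return "b"            # b preferred in both orders
    return "tie"              # verdict flipped with position
```

Running each comparison in both orders and treating a flipped verdict as a tie is one common mitigation for position bias; sampling the judge several times per item or randomizing order across a dataset are alternatives.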