Found 16 Skills
Evaluate solutions through multi-round debate between independent judges until they reach consensus.
Interactively set up a first Coval AI evaluation. Guides users through installing the CLI, connecting an agent, creating personas, building test cases, selecting metrics, and launching their first eval run. Use when the user says "onboard", "get started", "set up evaluation", "first eval", or "new to coval", or wants help creating their first test run.
Use when auditing Claude skills and commands for quality. Supports Quick Scan (changed skills only) and Full Stocktake modes with sequential subagent batch evaluation.
LLM and AI testing patterns — mock responses, evaluation with DeepEval/RAGAS, structured output validation, and agentic test patterns (generator, healer, planner). Use when testing AI features, validating LLM outputs, or building evaluation pipelines.