Search Results: ai-evaluation

Found 16 Skills

AI & Machine Learning | neolabhq/context-engineer...

judge-with-debate

Evaluate solutions through multi-round debate between independent judges until consensus

AI & Machine Learning | coval-ai/coval-external-s...

onboard

Interactively set up a first Coval AI evaluation. Guides users through installing the CLI, connecting an agent, creating personas, building test cases, selecting metrics, and launching their first eval run. Use when user says "onboard", "get started", "set up evaluation", "first eval", "new to coval", or wants help creating their first test run.

AI & Machine Learning | affaan-m/everything-claud...

skill-stocktake

Use when auditing Claude skills and commands for quality. Supports Quick Scan (changed skills only) and Full Stocktake modes with sequential subagent batch evaluation.

3 scripts | Attention

Testing & QA | yonatangross/orchestkit

testing-llm

LLM and AI testing patterns — mock responses, evaluation with DeepEval/RAGAS, structured output validation, and agentic test patterns (generator, healer, planner). Use when testing AI features, validating LLM outputs, or building evaluation pipelines.
