Search Results: continuous-evaluation

Found 3 Skills

AI & Machine Learning | shipshitdev/library

evaluation

Build evaluation frameworks for agent systems. Use when testing agent performance, validating context engineering choices, or measuring improvements over time.

1 script | Checked

AI & Machine Learning | eyadsibai/ltk

agent-evaluation

Use when evaluating agent performance, building test frameworks, measuring quality, or asking about "agent evaluation", "LLM-as-judge", "agent testing", "quality metrics", "evaluation rubrics", or "agent benchmarks".

AI & Machine Learning | flora131/atomic

evaluation

This skill should be used when the user asks to "evaluate agent performance", "build test framework", "measure agent quality", "create evaluation rubrics", or mentions LLM-as-judge, multi-dimensional evaluation, agent testing, or quality gates for agent pipelines. Part of the context engineering skill suite — also activates when the user mentions "context engineering" or "context-engineering" in the context of measuring agent effectiveness.

1 script | Checked