Build automated evaluation suites for AI agents using golden datasets, rubrics, and regression gates.
npx skill4agent add bagelhole/devops-security-agent-skills agent-evals

# Example eval pipeline steps
make evals-smoke
make evals-regression
make evals-safety
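
The three targets above could be chained into one fail-fast gate, so a failing smoke run blocks the slower regression and safety stages. The following is a minimal sketch, assuming the Makefile defines those three targets as named. The function name `run_eval_stages` and the `RUNNER` variable are hypothetical, introduced only for illustration, and are not part of the skill.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run the eval stages in order, stop at the first failure.
set -u

# RUNNER is an assumption for illustration; it defaults to `make`.
RUNNER="${RUNNER:-make}"

run_eval_stages() {
  local stage
  for stage in evals-smoke evals-regression evals-safety; do
    echo "running ${stage}"
    # A non-zero exit status from any stage fails the whole gate.
    if ! "${RUNNER}" "${stage}"; then
      echo "gate failed at ${stage}" >&2
      return 1
    fi
  done
  echo "all eval stages passed"
}
```

Calling `run_eval_stages` from a CI step would return a non-zero status as soon as one stage fails, which most CI systems treat as a blocked build.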