Guidance for creating, running, fixing, and promoting behavioral evaluations. Use when verifying agent decision logic, debugging failures, debugging prompt steering, or adding workspace regression tests.
npx skill4agent add google-gemini/gemini-cli behavioral-evals

[!NOTE] Single Source of Truth: For core concepts, policies, running tests, and general best practices, always refer to evals/README.md.
appEvalTest, AppRig, evalTest, TestRig, USUALLY_PASSES, ALWAYS_PASSES, files, package.json, rig.setBreakpoint(), rig.readToolLogs()