Search Results: ai-reliability

Found 1 Skills

create-context-tests

Generate a test suite of natural-language → SQL pairs that becomes the quality benchmark for a nao agent, then run it via `nao test`. Use when the user wants to start measuring agent reliability, extend an existing test suite, or add tests for new metrics. Tests are the only honest answer to "is the context working?". Do not use for writing rules (write-context-rules) or diagnosing failures (audit-context).

🇺🇸|EnglishTranslated