Search Results: agent-testing

Found 33 Skills

AI & Machine Learning | google-gemini/gemini-cli

behavioral-evals

Guidance for creating, running, fixing, and promoting behavioral evaluations. Use when verifying agent decision logic, debugging failures, debugging prompt steering, or adding workspace regression tests.

AI & Machine Learning | dawiddutoit/custom-claude

manage-agents

Creates, modifies, and manages Claude Code subagents by writing agent files with YAML frontmatter, system prompts, and tool configurations. Use when you need to "create an agent", "modify an agent", "set up a specialist", "I need an agent for [task]", "agent to handle [domain]", or "configure agent tools". Covers agent file format, YAML frontmatter, system prompts, tool restrictions, MCP integration, model selection, and testing.

Testing & QA | oimiragieo/agent-studio

testing-expert

Deprecated alias for the tdd skill.

AI & Machine Learning | neolabhq/context-engineer...

customaize-agent:test-prompt

Use when creating or editing any prompt (commands, hooks, skills, subagent instructions) to verify that it produces the desired behavior. Applies the RED-GREEN-REFACTOR cycle to prompt engineering, using subagents for isolated testing.

AI & Machine Learning | notque/claude-code-toolki...

testing-agents-with-subagents

RED-GREEN-REFACTOR testing for agents: dispatch subagents with known inputs, capture verbatim outputs, verify against expectations. Use when creating, modifying, or validating agents and skills. Use for "test agent", "validate agent", "verify agent works", or pre-deployment checks. Do NOT use for feature requests, simple prompt edits without behavioral impact, or agents with no structured output to verify.

Testing & QA | incept5/eve-skillpacks

eve-verification-plans

Author agentic verification plans for Eve-compatible apps. Use when building structured test suites that verify app correctness AND Eve platform conformance — CLI parity, manifest conventions, SSO auth, managed migrations, fixture-driven ingestion, and agent efficiency.

AI & Machine Learning | ed3dai/ed3d-plugins

creating-an-agent

Use when creating specialized subagents for Claude Code plugins or the Task tool. Covers description writing for auto-delegation, tool selection, prompt structure, and testing agents.

Testing & QA | veris-ai/veris-skills

agent-integration

Integrate a raw customer agent repo with Veris end to end. Installs or verifies veris-cli, logs in, creates or reuses a Veris environment, analyzes the repo, generates or updates `.veris/veris.yaml`, `.veris/Dockerfile.sandbox`, `.veris/.dockerignore`, configures runtime env vars, and can finish with `veris env push`. Use when a repo has no Veris setup yet, or when an existing `.veris/` integration is stale and needs to be refreshed.

AI & Machine Learning | coval-ai/coval-external-s...

design-persona

Design and create a simulation persona for testing an AI agent. Guides through use case selection, voice and language configuration, behavior prompt crafting, and interruption calibration. Use when user says "create a persona", "design a persona", "set up a test persona", "configure simulation persona", or "build a caller profile".

AI & Machine Learning | cekura-ai/cekura-skills

cekura-eval-design

Use when the user asks to "create an evaluator", "create evals", "create a scenario", "write a test scenario", "design a test case", "test my agent", "build eval coverage", "plan a test suite", "create red team tests", "set up test profiles", "configure conditional actions", "write a conditional action evaluator", "build a deterministic test", "design an IVR test", "IVR navigation test", "write a unit test for a voice agent", "build a regression test", "scripted scenario", "scripted voice test", "structured evaluator", "exact flow test", "sequential conditions", "fixed sequence test", or "run evals". Covers individual evaluator design, suite coverage strategy, test profiles, mock-tool data design, conditional actions (deterministic / unit test / regression / IVR navigation flows), and best practices for workflow / red-team / edge-case / deterministic test types.

AI & Machine Learning | adenhq/hive

hive

Complete workflow for building, implementing, and testing goal-driven agents. Orchestrates hive-* skills. Use when starting a new agent project, unsure which skill to use, or need end-to-end guidance.

Testing & QA | yonatangross/orchestkit

testing-e2e

End-to-end testing patterns with Playwright — page objects, AI agent testing, visual regression, accessibility testing with axe-core, and CI integration. Use when writing E2E tests, setting up Playwright, implementing visual regression, or testing accessibility.
