Found 7 Skills
Create and run orq.ai experiments — compare configurations against datasets using evaluators, analyze results, and generate prioritized action plans. Use when evaluating LLM agents, deployments, conversations, or RAG pipelines end-to-end. Do NOT use without a dataset and evaluators. Do NOT use for cross-framework comparisons with external agents (use compare-agents).
Comprehensive guide for creating Claude Code agents with proper structure, triggering conditions, system prompts, and validation. Combines official Anthropic best practices with proven patterns.
This skill should be used when creating agents, writing agent frontmatter, configuring subagents, or when "create agent", "agent.md", "subagent", or "Task tool" are mentioned.
[Hyper] Test Codex/agent skills for intended triggering and behavior with realistic positive, negative, boundary, and edge-case scenarios. Use when validating a skill folder, SKILL.md, rules/references/scripts/assets, trigger precision, workflow correctness, or regression coverage before shipping skill changes.
Runs the full validator workflow after coding tasks for requests such as "run the validator", "run final verification", "validate before commit", or "run validation". Executes checks and reviews before commit, push, or PR creation.
This skill should be used when the user asks to "create an agent", "make an agent", "write an agent", "build a subagent", "add an agent to a plugin", "design an autonomous agent", "generate an agent file", "write a system prompt for an agent", "what frontmatter does an agent need", "create a specialized agent". Not for skills or commands — use create-skill.
Agent testing methodology: run agents with test inputs, observe outputs, and iterate until outputs are accurate and well-structured.