Found 66 Skills
Generates eval test cases from an eval suite plan (output of /eval-suite-planner) or a plain-English agent description. Supports both single-response and conversation (multi-turn) evaluation modes. Outputs a Copilot Studio test set table, a CSV file for import (single-response only), and a docx report for human review.
Amazon Bedrock AgentCore Evaluations for testing and monitoring AI agent quality. 13 built-in evaluators plus custom LLM-as-Judge patterns. Use when testing agents, monitoring production quality, setting up alerts, or validating agent behavior.
Analyze gaps between implementation plans and actual codebase implementation for the Rust self-learning memory project
A comprehensive verification system for Claude Code sessions.
Help users ship products faster and with higher quality. Use when someone is planning a launch, struggling to release features, dealing with shipping velocity issues, or trying to establish better release practices.
Use when working with TDD workflows and the TDD cycle.
Conduct an architecture health check on a design: either verify that the design is internally consistent (no conflicts between terminology, contracts, and implementation steps) or check that the design aligns with the code (what the design promised is actually implemented). This skill only outputs issue lists and repair suggestions and makes no modifications. It focuses on one target at a time; "checking the other one while you're at it" is not allowed. Trigger scenarios: users say "perform architecture check", "is the design internally consistent?", or "does the plan align with the code?", or want a health check before proceeding to the implement/acceptance phase.
Help teams continuously improve skills, automatically identify skill optimization opportunities (such as missing essential information, format issues, version update needs, etc.), execute secure update processes (backup, modification, testing, restoration), and ensure skill quality keeps improving as the project progresses
Build comprehensive, mobile-compatible Obsidian study vaults from academic course materials with checkpoint-based workflow, error pattern recognition, and quality assurance. Battle-tested patterns from 828KB/37-file projects. Works across all subjects - CS, medicine, business, self-study.
Comprehensive testing and validation strategies for spec-driven development. Learn phase-specific validation techniques, quality gates, and testing approaches to ensure high-quality implementation.
Coordinator workflow for orchestrating dockeragents through a fix-review-iterate-present loop. Use when delegating any task that produces code changes. Ensures agents achieve 10/10 quality before presenting to a human.
Comprehensive autonomous development strategies including milestone planning, incremental implementation, auto-debugging, and continuous quality assurance for full development lifecycle management