Loading...
Loading...
Found 3,730 Skills
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI application quality, or establishing evaluation frameworks.
Claude Code extensibility: agents, skills, output styles. Capabilities: create/update/delete agents and skills, YAML frontmatter, system prompts, tool/model selection, resumable agents, CLI-defined agents. Actions: create, edit, delete, optimize, test extensions. Keywords: agent, skill, output-style, SKILL.md, subagent, Task tool, progressive disclosure. Use when: creating agents/skills, editing extensions, configuring tool access, choosing models, testing activation.
Create diverse synthetic test inputs for LLM pipeline evaluation using dimension-based tuple generation. Use when bootstrapping an eval dataset, when real user data is sparse, or when stress-testing specific failure hypotheses. Do NOT use when you already have 100+ representative real traces (use stratified sampling instead), or when the task is collecting production logs.
TypeScript/JavaScript guardrails, patterns, and best practices for AI-assisted development. Use when working with TypeScript (.ts, .tsx) or JavaScript (.js, .jsx) files, package.json, or tsconfig.json. Provides strict mode conventions, async patterns, testing standards, and module system guidelines.
Identify risky assumptions for a feature idea in an existing product across Value, Usability, Viability, and Feasibility. Uses multi-perspective devil's advocate thinking. Use when stress-testing a feature idea, doing risk assessment, or preparing for assumption mapping.
Ethereum development knowledge for AI agents — from idea to deployed dApp. Fetch real-time docs on gas costs, Solidity patterns, Scaffold-ETH 2, Layer 2s, DeFi composability, security, testing, and production deployment. Use when: (1) building any Ethereum or EVM dApp, (2) writing or reviewing Solidity contracts, (3) deploying to mainnet or L2s, (4) the user asks about gas, tokens, wallets, or smart contracts, (5) any web3/blockchain/onchain development task. NOT for: trading, price checking, or portfolio management — use a trading skill for those.
Full closed-loop QA combining issue discovery and software testing. Scout -> Strategist -> Generator -> Executor -> Analyst with multi-perspective scanning, progressive test layers, GC loops, and quality scoring. Supports discovery, testing, and full QA modes.
Plan resource capacity — workload analysis and utilization forecasting. Use when heading into quarterly planning, the team feels overallocated and you need the numbers, deciding whether to hire or deprioritize, or stress-testing whether upcoming projects fit the people you have.
Apply Fastify best practices when creating servers, plugins, routes, schemas, hooks, configuration, decorators, error handling, testing, and TypeScript integration. Use when writing or reviewing Fastify code, setting up a new Fastify project, or asking "How should I structure my Fastify app?"
This skill should be used when the user asks to "create a GitHub Actions workflow", "set up CI/CD", "configure GitHub Actions", "add automated testing", "deploy with GitHub Actions", or needs guidance on GitHub Actions workflows, syntax, or automation.
Generate mock API servers for testing and development with realistic response data. Use when creating mock APIs for development and testing. Trigger with phrases like "create mock API", "generate API mock", or "setup mock server".
Use when the user asks to create a pull request. Build a complete PR using best-practice structure with rich details on changes, verification, QA evidence, risks, and rollout notes. Include issue linkage and clear testing commands/results in the PR body.