Found 1,332 Skills
Help the user systematically identify and categorize failure modes in an LLM pipeline by reading traces. Use when starting a new eval project, after significant pipeline changes (new features, model switches, prompt rewrites), when production metrics drop, or after incidents.
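As a rough illustration of the tallying step this skill describes, here is a minimal sketch of counting open-coded failure labels across traces; the JSONL path, the `failure_label` field, and the idea that labels were already assigned by hand are all assumptions, not part of the skill itself.

```python
# Sketch of tallying open-coded failure categories across LLM traces.
# The file path and field names are hypothetical; real categories should
# emerge from actually reading the traces, not from a fixed taxonomy.
import json
from collections import Counter

def load_traces(path: str) -> list[dict]:
    """Read one JSON trace per line (assumed JSONL format)."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

def categorize(trace: dict) -> str:
    """Stand-in for a human (or LLM-assisted) judgment per trace."""
    return trace.get("failure_label", "uncategorized")

traces = load_traces("traces.jsonl")
total = len(traces) or 1
counts = Counter(categorize(t) for t in traces)
for label, n in counts.most_common():
    print(f"{label}: {n} ({n / total:.0%})")
```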
Comprehensive quality audit for Claude Code agents, skills, and commands with comparative analysis
Analyze cryptocurrency projects with tokenomics, on-chain metrics, and market analysis. Generate comprehensive crypto research reports.
Write professional investment memorandums for VC, PE, or public market investments. Structure thesis, risks, and recommendations clearly.
Iteratively improve any output until measurable criteria are met. Use when the user wants to refine existing work against specific standards — whether it's code, prose, data, config, or any other artifact. Triggers on phrases like "improve this", "make it better", "iterate", "refine", "keep improving", "not good enough yet", "optimize this", "polish this", "tighten this up", or when the user provides criteria and wants repeated improvement until they're satisfied. Also use when the user gives feedback on output and expects you to keep refining, even if they don't say "improve" explicitly.
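A minimal sketch of the refine-until-criteria-met loop this description outlines; `criteria`, `improve`, and the round budget are hypothetical stand-ins for whatever standards and revision step apply to the artifact at hand.

```python
# Sketch of an improve-until-criteria-met loop. `criteria` is a list of
# (name, predicate) pairs; `improve` is a stand-in for one revision pass.
from typing import Callable

def refine(artifact: str,
           criteria: list[tuple[str, Callable[[str], bool]]],
           improve: Callable[[str, list[str]], str],
           max_rounds: int = 5) -> str:
    for _ in range(max_rounds):
        failing = [name for name, ok in criteria if not ok(artifact)]
        if not failing:                        # every criterion passes
            return artifact
        artifact = improve(artifact, failing)  # revise against failures only
    return artifact                            # best effort after the budget

# Hypothetical usage: tighten text until it fits a length budget.
result = refine(
    "some draft text " * 50,
    criteria=[("under_200_chars", lambda s: len(s) < 200)],
    improve=lambda s, fails: s[: len(s) // 2],
)
print(len(result))
```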
Interact with a local Chrome browser session (only with explicit user approval, after being asked to inspect, debug, or interact with a page open in Chrome)
Generate probability-weighted alternative options that challenge default thinking. Forces unconventional alternatives and exposes hidden assumptions behind the "obvious" choice. For decision-point analysis, NOT full design exploration (use brainstorming for that). Triggers on "대안" (alternatives), "alternatives", "옵션 뽑아" (give me options), "options", "어떤 방법이" (which approach), "아이디어" (ideas), "다른 방법" (another way), "선택지" (choices).
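A small sketch of the probability-weighted options table this skill implies; the options, weights, and assumption strings are illustrative, and the only real constraint shown is that the weights must sum to 1 so the default has to compete.

```python
# Sketch of a probability-weighted options table. The entries are
# illustrative; the point is forcing explicit weights that sum to 1
# and naming the hidden assumption behind each choice.
from dataclasses import dataclass

@dataclass
class Option:
    name: str
    weight: float           # estimated probability this is the best choice
    hidden_assumption: str  # what this choice quietly takes for granted

options = [
    Option("default: add a cache layer", 0.40, "traffic stays read-heavy"),
    Option("precompute nightly", 0.35, "up to 24h staleness is acceptable"),
    Option("drop the feature", 0.25, "the feature earns its upkeep"),
]

assert abs(sum(o.weight for o in options) - 1.0) < 1e-9
for o in sorted(options, key=lambda o: -o.weight):
    print(f"{o.weight:.0%}  {o.name}  (assumes: {o.hidden_assumption})")
```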
Expert methodology for analyzing and summarizing research papers, extracting key contributions, methodological details, and contextualizing findings. Use when reading papers from PDFs, DOIs, or URLs to create structured summaries for researchers.
Parallel 3-reviewer code review orchestration: launch Security, Business-Logic, and Architecture reviewers simultaneously, aggregate findings by severity, and produce a unified BLOCK/FIX/APPROVE verdict. Use when reviewing PRs with 5+ files, security-sensitive changes, new features needing broad coverage, or when user requests "parallel review", "comprehensive review", or "full review". Do NOT use for single-file fixes, documentation-only changes, or when systematic-code-review (sequential) is sufficient.
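A minimal sketch of the fan-out/aggregate flow this skill describes, assuming the three reviewers can be modeled as functions returning (severity, finding) pairs; the reviewer bodies, severity scale, and verdict thresholds are hypothetical.

```python
# Sketch of the parallel three-reviewer flow: run reviewers concurrently,
# pool findings by severity, then map the worst severity to a verdict.
from concurrent.futures import ThreadPoolExecutor

SEVERITY = {"critical": 3, "major": 2, "minor": 1}

def security_review(diff):        # stand-in reviewer
    return [("critical", "possible SQL injection in query builder")]

def business_logic_review(diff):  # stand-in reviewer
    return [("minor", "off-by-one in pagination edge case")]

def architecture_review(diff):    # stand-in reviewer
    return []

def parallel_review(diff):
    reviewers = [security_review, business_logic_review, architecture_review]
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(r, diff) for r in reviewers]
        findings = [f for fut in futures for f in fut.result()]
    findings.sort(key=lambda f: -SEVERITY[f[0]])   # worst first
    worst = SEVERITY[findings[0][0]] if findings else 0
    verdict = "BLOCK" if worst == 3 else "FIX" if worst >= 1 else "APPROVE"
    return verdict, findings

print(parallel_review("<diff>"))
```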
Critiques ML conference papers with reviewer-style feedback. Use when users want to anticipate reviewer concerns, identify weaknesses, check claim-evidence gaps, or find missing citations.
Analyze an open source GitHub repository and generate a structured report. Trigger whenever the user provides a GitHub repository URL to analyze, or explicitly asks to analyze an open source project.
Autonomously optimize an existing AI skill by running it repeatedly against binary evals, mutating one instruction at a time, and keeping only changes that improve pass rate. Based on Karpathy-style autoresearch, but applied to SKILL.md iteration instead of ML training. Use when optimizing a skill, benchmarking prompt quality, building evals for a skill, or running self-improvement loops on reusable agent instructions. Triggers on: skill-autoresearch, optimize this skill, improve this skill, benchmark this skill, eval my skill, run autoresearch on this skill, self-improve skill.
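A minimal sketch of the keep-only-improvements loop this description names: mutate one instruction line, re-run the binary evals, and accept the mutation only if the pass rate rises. `run_evals` and `mutate_one_line` are hypothetical stand-ins for the skill's eval harness and rewrite step.

```python
# Hill-climbing over SKILL.md lines: one mutation per round, kept only
# if the fraction of binary evals passing strictly improves.
import random
from typing import Callable

def optimize_skill(skill_lines: list[str],
                   run_evals: Callable[[list[str]], float],
                   mutate_one_line: Callable[[str], str],
                   rounds: int = 20) -> list[str]:
    best_score = run_evals(skill_lines)
    for _ in range(rounds):
        i = random.randrange(len(skill_lines))      # pick one instruction
        candidate = skill_lines.copy()
        candidate[i] = mutate_one_line(candidate[i])
        score = run_evals(candidate)                # pass rate on the evals
        if score > best_score:                      # keep only improvements
            skill_lines, best_score = candidate, score
    return skill_lines
```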