Found 1,927 Skills
Comprehensive primary skill for agents working with Weights & Biases. Covers both the W&B SDK (training runs, metrics, artifacts, sweeps) and the Weave SDK (GenAI traces, evaluations, scorers). Includes helper libraries, gotcha tables, and data analysis patterns. Use this skill whenever the user asks about W&B runs, Weave traces, evaluations, training metrics, loss curves, model comparisons, or any Weights & Biases data — even if they don't say "W&B" explicitly.
Benchmark any agent skill to measure whether it actually improves performance. Use when the user wants to evaluate, test, or compare a skill against baseline, or when they mention "benchmark", "eval", "skill performance", or "does this skill help". Runs isolated eval sessions with and without the skill, grades outputs via layered grading (deterministic checks + LLM-as-judge), analyzes behavioral signals, and generates a comparison report with a USE / DON'T USE verdict.
Expert in Jungian analytical psychology, depth psychology, shadow work, archetypal analysis, dream interpretation, active imagination, addiction/recovery through a Jungian lens, and the individuation process.
Help identify and evaluate communities to build a minimalist business around. Use when someone is looking for a business idea, trying to find their community, or wondering where to start as an entrepreneur.
Code-first Netra best-practices playbook covering setup, instrumentation, context tracking, custom spans/metrics, integration patterns, evaluation, simulation, and troubleshooting.
Evaluate the text of a README file, score it out of 100, and provide specific, actionable improvement suggestions.
Set up and improve harness engineering (AGENTS.md, docs/, lint rules, eval systems, project-level prompt engineering) for AI-agent-friendly codebases. Triggers on: new/empty project setup for AI agents, AGENTS.md or CLAUDE.md creation, harness engineering questions, making agents work better on a codebase. ALSO triggers when users are frustrated or complaining about agent quality — e.g. 'the agent keeps ignoring conventions', 'it never follows instructions', 'why does it keep doing X', 'the agent is broken' — because poor agent output almost always signals harness gaps, not model problems. Covers: context engineering, architectural constraints, multi-agent coordination, evaluation, long-running agent harness, and diagnosis of agent quality issues.
Writes recommendation letters for graduate school applications (master's, PhD, study abroad) from OfferClaw. Matches recommender voice, highlights student research and achievements, and tailors emphasis to target programs. Use when asked to draft, rewrite, or refine a letter of recommendation for university admission.
Implements the Syncfusion dotnet Calculate Library for parsing and evaluating formulas using CalcEngine, ICalcData, CalcQuickBase, and ExcelLikeComputations. Supports formula computation, custom functions, cross-sheet references, and XlsIO formula evaluation. Trigger when implementing formula calculations, expression parsing, workbook formula evaluation, or custom function registration.
Score, evaluate, and iteratively improve any content or strategy using an auto-assembled panel of domain experts. Handles copy, sequences, landing pages, strategy docs, titles, charts, recruiting evaluations, or anything else that needs a quality gate. Recursively iterates until all scores hit 90+ (max 3 rounds). Use when asked to: "expert panel this", "score this", "rate these variants", "quality check this", "panel review", "which version is better", "expert score", "evaluate this copy/strategy/page", or when another skill needs a quality gate on its output. Also triggers on: "score this landing page", "expert panel these email variants", "rate this headline", "panel these charts".
Apply Cognitive Load Theory to optimize instructional design by managing intrinsic, extraneous, and germane load within working memory limits. Use this skill when the user needs to diagnose why learners are overwhelmed, redesign training or documentation for better comprehension, evaluate UI/UX information architecture for cognitive burden, or when they ask 'why is this tutorial confusing', 'how to simplify complex instructions', or 'what causes information overload'.
Objective task quality evaluation framework using quantitative KPIs. KPIs are automatically calculated by a hook when task files are modified and saved to TASK-XXX--kpi.json. Use when: reading KPI data for task evaluation, understanding quality metrics, deciding whether to iterate or approve based on data.