Search Results: uat

Found 1,927 Skills

wandb-primary

Comprehensive primary skill for agents working with Weights & Biases. Covers both the W&B SDK (training runs, metrics, artifacts, sweeps) and the Weave SDK (GenAI traces, evaluations, scorers). Includes helper libraries, gotcha tables, and data analysis patterns. Use this skill whenever the user asks about W&B runs, Weave traces, evaluations, training metrics, loss curves, model comparisons, or any Weights & Biases data — even if they don't say "W&B" explicitly.

2 scripts/Checked

AI & Machine Learning | workersio/spec

skill-benchmark

Benchmark any agent skill to measure whether it actually improves performance. Use when the user wants to evaluate, test, or compare a skill against baseline, or when they mention "benchmark", "eval", "skill performance", or "does this skill help". Runs isolated eval sessions with and without the skill, grades outputs via layered grading (deterministic checks + LLM-as-judge), analyzes behavioral signals, and generates a comparison report with a USE / DON'T USE verdict.

3 scripts/Attention

Uncategorized | sundial-org/awesome-openc...

jungian-psychologist

Expert in Jungian analytical psychology, depth psychology, shadow work, archetypal analysis, dream interpretation, active imagination, addiction and recovery through a Jungian lens, and the individuation process.

Project Management | slavingia/skills

find-community

Help identify and evaluate communities to build a minimalist business around. Use when someone is looking for a business idea, trying to find their community, or wondering where to start as an entrepreneur.

AI & Machine Learning | keyvaluesoftwaresystems/n...

netra-best-practices

Code-first Netra best-practices playbook covering setup, instrumentation, context tracking, custom spans/metrics, integration patterns, evaluation, simulation, and troubleshooting.

Documentation & Writing | sunny0826/open-source-ski...

readme-grader

Evaluate the text of a README file, score it out of 100, and provide specific, actionable improvement suggestions.

AI & Machine Learning | 10xchengtu/harness-engine...

harness-engineering

Set up and improve harness engineering (AGENTS.md, docs/, lint rules, eval systems, project-level prompt engineering) for AI-agent-friendly codebases. Triggers on: new/empty project setup for AI agents, AGENTS.md or CLAUDE.md creation, harness engineering questions, making agents work better on a codebase. ALSO triggers when users are frustrated or complaining about agent quality — e.g. 'the agent keeps ignoring conventions', 'it never follows instructions', 'why does it keep doing X', 'the agent is broken' — because poor agent output almost always signals harness gaps, not model problems. Covers: context engineering, architectural constraints, multi-agent coordination, evaluation, long-running agent harness, and diagnosis of agent quality issues.

Documentation & Writing | offerclaw/agent-skills

recommendation-letter-writing

Writes recommendation letters for graduate school applications (master's, PhD, study abroad) from OfferClaw. Matches recommender voice, highlights student research and achievements, and tailors emphasis to target programs. Use when asked to draft, rewrite, or refine a letter of recommendation for university admission.

Backend Development | syncfusion/document-sdk-s...

syncfusion-dotnet-calculate

Implements the Syncfusion dotnet Calculate Library for parsing and evaluating formulas using CalcEngine, ICalcData, CalcQuickBase, and ExcelLikeComputations. Supports formula computation, custom functions, cross-sheet references, and XlsIO formula evaluation. Trigger when implementing formula calculations, expression parsing, workbook formula evaluation, or custom function registration.

AI & Machine Learning | ericosiu/ai-marketing-ski...

expert-panel

Score, evaluate, and iteratively improve any content or strategy using an auto-assembled panel of domain experts. Handles copy, sequences, landing pages, strategy docs, titles, charts, recruiting evaluations, or anything else that needs a quality gate. Recursively iterates until all scores hit 90+ (max 3 rounds). Use when asked to: "expert panel this", "score this", "rate these variants", "quality check this", "panel review", "which version is better", "expert score", "evaluate this copy/strategy/page", or when another skill needs a quality gate on its output. Also triggers on: "score this landing page", "expert panel these email variants", "rate this headline", "panel these charts".

5 scripts/Attention

Product & Design | asgard-ai-platform/skills

grad-cognitive-load

Apply Cognitive Load Theory to optimize instructional design by managing intrinsic, extraneous, and germane load within working memory limits. Use this skill when the user needs to diagnose why learners are overwhelmed, redesign training or documentation for better comprehension, evaluate UI/UX information architecture for cognitive burden, or when they ask 'why is this tutorial confusing', 'how to simplify complex instructions', or 'what causes information overload'.

Code Quality | giuseppe-trisciuoglio/dev...

task-quality-kpi

Objective task quality evaluation framework using quantitative KPIs. A hook automatically calculates the KPIs whenever task files are modified and saves them to TASK-XXX--kpi.json. Use when: reading KPI data for task evaluation, understanding quality metrics, deciding whether to iterate or approve based on data.
