Search Results: uat

Found 1,927 Skills

AI & Machine Learning | coval-ai/coval-external-s...

get-results

Retrieve and analyze simulation results from a Coval run. Use when the user wants to review evaluation outcomes or debug agent behavior.

🇺🇸 English | Translated

AI & Machine Learning | nvidia/skills

add-benchmark

Guide for adding a new benchmark or training environment to NeMo-Gym. Use when the user asks to add, create, or integrate a benchmark, evaluation, training environment, or resources server into NeMo-Gym. Also use when wrapping an existing 3rd-party benchmark library. Covers the full workflow: data preparation, resources server implementation, agent wiring, YAML config, testing, and reward profiling (baselining). Triggered by: "add benchmark", "new resources server", "integrate benchmark", "wrap benchmark", "add training environment", "add eval".

🇺🇸 English | Translated

AI & Machine Learning | xiaomoboy/claude-writing-...

score-optimizer

Use when the user wants to iterate on a viral-article scoring system itself, calibrate or improve a scoring prompt against labeled samples, or run batch scoring experiments on a fixed article set. Best for prompt-only scoring research where the evaluator scripts stay fixed and only the scoring rubric/prompt is meant to evolve.

🇨🇳 Chinese | Translated

3 scripts | Checked

Data Processing | anthropics/financial-serv...

model-update

Update financial models with new data — quarterly earnings, management guidance, macro changes, or revised assumptions. Adjusts estimates, recalculates valuation, and flags material changes. Use after earnings, guidance updates, or when assumptions need refreshing. Triggers on "update model", "plug earnings", "refresh estimates", "update numbers for [company]", "new guidance", or "revise estimates".

🇺🇸 English | Translated

Tools & Utilities | yuan1z0825/nature-skills

nature-reviewer

Simulate a Nature-style reviewer assessment from the referee's perspective rather than an author rebuttal. Use when the user wants a pre-submission review, reviewer report, peer-review style critique, novelty/significance/technical soundness assessment, reviewer-style manuscript evaluation, 审稿人视角评估 (reviewer-perspective assessment), 预审稿意见 (pre-review comments), or Nature reviewer report. Returns 3 reviewer reports plus a cross-review synthesis, grounded only in the local Nature reviewer source basis.

🇺🇸 English | Translated

AI & Machine Learning | nvidia/skills

tao-analyze-gaps-vlm-bcq

Extract false-positive and false-negative gaps from VLM binary-classification-question (BCQ, yes/no) predictions. Use after running VLM evaluation when you have a predictions JSON and need to identify failure cases for DEFT root cause analysis on a binary-classification VLM workflow.

🇺🇸 English | Translated

Marketing & Growth | sickn33/antigravity-aweso...

seo-fundamentals

Core principles of SEO including E-E-A-T, Core Web Vitals, technical foundations, content quality, and how modern search engines evaluate pages. This skill explains *why* SEO works, not how to execute specific optimizations.

🇺🇸 English | Translated

1 script | Checked

Data Processing | davila7/claude-code-templ...

polars

Fast DataFrame library built on Apache Arrow. Covers select, filter, group_by, joins, lazy evaluation, CSV/Parquet I/O, and the expression API for high-performance data analysis workflows.

🇺🇸 English | Translated

Security & Compliance | alirezarezvani/claude-ski...

mdr-745-specialist

EU MDR 2017/745 compliance specialist for medical device classification, technical documentation, clinical evidence, and post-market surveillance. Covers Annex VIII classification rules, Annex II/III technical files, Annex XIV clinical evaluation, and EUDAMED integration.

🇺🇸 English | Translated

1 script | Checked

Product & Design | vasilyu1983/ai-agents-pub...

startup-business-models

Use when choosing or evaluating a startup revenue model, pricing/value metric, packaging/tier design, or calculating unit economics (LTV, CAC, payback, gross margin, NRR), including usage-based/credit/AI pricing and variable compute/COGS constraints.

🇺🇸 English | Translated

Backend Development | rysweet/amplihack

computer-scientist-analyst

Analyzes events through a computer science lens, using computational complexity, algorithms, data structures, systems architecture, information theory, and software engineering principles to evaluate feasibility, scalability, and security. Provides insights on algorithmic efficiency, system design, computational limits, data management, and technical trade-offs. Use when: technology evaluation, system architecture, algorithm design, scalability analysis, security assessment. Evaluates: computational complexity, algorithmic efficiency, system architecture, scalability, data integrity, security.

🇺🇸 English | Translated

AI & Machine Learning | yonatangross/orchestkit

langfuse-observability

LLM observability platform for tracing, evaluation, prompt management, and cost tracking. Use when setting up Langfuse, monitoring LLM costs, tracking token usage, or implementing prompt versioning.

🇺🇸 English | Translated

2 scripts | Attention