Found 1,906 Skills
Add and manage evaluation results in Hugging Face model cards. Supports extracting eval tables from README content, importing scores from Artificial Analysis API, and running custom model evaluations with vLLM/lighteval. Works with the model-index metadata format.
Use this skill when users need to evaluate potential co-founders, assess founder compatibility, design equity splits, or navigate co-founder relationships. Activates for "should I work with this person," "co-founder fit," "equity split," or founding team questions.
Evaluates and sharpens content hooks using The Hook Stack™ framework. Use when scoring headlines, refining hooks for video/social/newsletter, or when asked to "evaluate this hook", "run through hook stack", or "score my headline".
Assuming the US dollar or another currency loses its reserve status and gold becomes the sole anchor, derives the "implied gold price that the balance sheet can withstand" by dividing central bank monetary liabilities by gold reserves, and outputs the leverage level, gap, and ranking for each country or currency.
Use when comparing technology stacks, evaluating frameworks/providers, or assessing TCO, security, and ecosystem health for migration decisions.
Use when "evaluating technology", "choosing frameworks", "stack comparison", "technology decisions", or asking about "React vs Vue", "PostgreSQL vs MySQL", "AWS vs GCP", "build vs buy".
Use when evaluating LLMs, running benchmarks like MMLU/HumanEval/GSM8K, setting up evaluation pipelines, or asking about "NeMo Evaluator", "LLM benchmarking", "model evaluation", "MMLU", "HumanEval", "GSM8K", "benchmark harnesses".
Professionally evaluates story outlines, scoring them on market potential, innovation, and content highlights. Suitable for story outline quality assessment, judging IP adaptation potential, and project approval decisions.
Evaluate educational chapters from dual student and teacher perspectives. This skill should be used when analyzing chapter quality, identifying content gaps, or planning chapter improvements. Reads all lessons in a chapter directory and provides structured analysis with ratings, gap identification, and prioritized recommendations.
Create AI evaluation plans with benchmarks, rubrics, and error analysis workflows.
Evaluate and improve code modularization using the Balanced Coupling Model. Analyzes coupling strength, connascence types, and distance to identify refactoring opportunities and architectural improvements. Use when reviewing code architecture, refactoring modules, or designing new systems.
Guides evaluation of RAG pipeline retrieval and generation quality. Use when evaluating a retrieval-augmented generation system, measuring retrieval quality, assessing generation faithfulness or relevance, generating synthetic QA pairs for retrieval testing, or optimizing chunking strategies.