Loading...
Loading...
Found 1,906 Skills
Systematic usability evaluation using established heuristics (Nielsen's 10, Shneiderman's 8, or custom rubrics). Use when reviewing UI designs, screenshots, prototypes, or live products for usability issues. Triggers on "review this design", "what's wrong with this UI", "usability check", "evaluate this interface", or when user shares screenshots/mockups asking for feedback.
Assess business quality, competitive positioning, and sustainability of value creation beyond financial models. Use when the user asks about economic moats, competitive advantages, Porter's Five Forces, management quality, ESG integration, or business model analysis. Also trigger when users mention 'does this company have a moat', 'switching costs', 'network effects', 'brand value', 'management track record', 'capital allocation', 'insider ownership', 'red flags', or ask whether a company's advantage is durable.
This skill should be used when the user asks to "evaluate agent performance", "build test framework", "measure agent quality", "create evaluation rubrics", or mentions LLM-as-judge, multi-dimensional evaluation, agent testing, or quality gates for agent pipelines. Part of the context engineering skill suite — also activates when the user mentions "context engineering" or "context-engineering" in the context of measuring agent effectiveness.
Evaluates LLMs across 100+ benchmarks from 18+ harnesses (MMLU, HumanEval, GSM8K, safety, VLM) with multi-backend execution. Use when needing scalable evaluation on local Docker, Slurm HPC, or cloud platforms. NVIDIA's enterprise-grade platform with container-first architecture for reproducible benchmarking.
Evaluates LLMs across 60+ academic benchmarks (MMLU, HumanEval, GSM8K, TruthfulQA, HellaSwag). Use when benchmarking model quality, comparing models, reporting academic results, or tracking training progress. Industry standard used by EleutherAI, HuggingFace, and major labs. Supports HuggingFace, vLLM, APIs.
Evaluate options for a specific design decision node and recommend one with explicit trade-offs. Use when the design already exposes a concrete choice such as architecture style, state management approach, auth model, storage pattern, sync strategy, multi-agent coordination model, language or runtime, UI framework, data-layer library, or tooling selection. Trigger when the user needs structured comparison and recommendation for a bounded design decision. Do not use for broad design discovery, full-system decomposition, or final readiness review.
You are **Tool Evaluator**, an expert technology assessment specialist who evaluates, tests, and recommends tools, software, and platforms for business use. You optimize team productivity and busin...
Expert technology assessment specialist focused on evaluating, testing, and recommending tools, software, and platforms for business use and productivity optimization
Prepare a research artifact package for conference artifact evaluation, reproducibility review, badges, supplementary material, or post-acceptance artifact release. Use this skill whenever the user needs install instructions, reviewer-facing reproduction commands, Docker or environment checks, data/checkpoint packaging, hardware/runtime estimates, anonymized or public artifact metadata, artifact evaluation forms, or a claim-to-artifact reproducibility audit for ML/AI venues.
Estimate the intrinsic value of a public company using DCF, relative (peer multiple) and sum-of-parts (SOTP) methods, then triangulate to an implied share price with upside/downside versus the current market price. Use this skill whenever the user asks: "what is AAPL worth", "valuation of NVDA", "fair value of TSLA", "intrinsic value", "DCF for MSFT", "build a DCF", "discounted cash flow", "WACC", "terminal value", "implied share price", "upside to fair value", "is X overvalued/undervalued", "relative valuation", "peer comparison valuation", "EV/EBITDA target", "SOTP", "sum of the parts", "how much is [company] worth", "price target from fundamentals", "value this company", or any ticker in the context of computing intrinsic or relative valuation. Default to running ALL three methods (DCF + relative + SOTP-if-applicable) and presenting a blended implied price with a sensitivity table. Do not answer valuation questions from memory — always run the workflow.
Evaluate interface descriptions against 168 research-backed UX/UI principles. Returns structured findings with severity, remediation, and business impact. API key optional — enriched output requires uxuiprinciples.com API Access.
Use when testing, reviewing, pressure-testing, refining, packaging, or validating agent skills for academic research workflows before installing or relying on them.