Search Results: uat

Found 1,927 Skills

AI & Machine Learning | cekura-ai/cekura-skills

cekura-metric-design

Use when the user asks to "create a metric", "write a metric", "design a metric", "build a metric for", "evaluate agent performance", "measure call quality", "track a KPI", "add a workflow metric", "improve my metric", "fix a metric", "debug metric results", "set up quality scoring", or "what metrics do I need". Also relevant when discussing LLM judge prompts, custom code metrics, evaluation triggers, VALID_SKIP patterns, section extraction, or metric best practices for Cekura voice AI agents. Covers both creating new metrics and reviewing, iterating on, or troubleshooting existing ones.

🇺🇸 English | Translated

2 scripts | Checked

AI & Machine Learning | aradotso/ai-agent-skills

datawhale-agent-learning-hub

AI agent learning roadmap and curated resources for building production-ready agents with modern patterns such as Claude Code, OpenClaw, skills, MCP, and evaluation.

🇺🇸 English | Translated

AI & Machine Learning | anastasiyaw/claude-code-c...

harness-design

Design and build multi-agent harness architectures for long-running AI application development. GAN-inspired Generator-Evaluator pattern, Sprint Contract negotiation, context management, quality criteria calibration. Based on Anthropic Engineering patterns. Use when: "build a harness", "multi-agent architecture", "agent orchestration", "generator-evaluator", "long-running app", "harness design", "agent pipeline", "quality evaluation loop", "sprint contract", "build app with agents", "Claude Agent SDK architecture", or when building complex full-stack apps that need planning → generation → evaluation cycles. Also use when discussing context degradation, self-evaluation bias, or assumption testing in AI workflows.

🇺🇸 English | Translated

AI & Machine Learning | farmage/opencode-skills

prompt-engineer

Writes, refactors, and evaluates prompts for LLMs — generating optimized prompt templates, structured output schemas, evaluation rubrics, and test suites. Use when designing prompts for new LLM applications, refactoring existing prompts for better accuracy or token efficiency, implementing chain-of-thought or few-shot learning, creating system prompts with personas and guardrails, building JSON/function-calling schemas, or developing prompt evaluation frameworks to measure and improve model performance.

🇺🇸 English | Translated

Data Processing | vangongwanxiaowan/screen-...

score-analyzer

Analyzes multi-round evaluation score data, computes summary indicators, and assigns rating levels. Suited to analyzing score trends and calculating S/A/B ratings.

🇨🇳 Chinese | Translated
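
The listing does not say how score-analyzer derives its indicators or ratings, so the following is only a minimal sketch of the general idea: summarize scores across rounds, measure the trend, and map the average to an S/A/B level. The function names (`rate`, `analyze`) and the cutoffs (90 for S, 75 for A) are hypothetical and are not taken from the skill itself.

```python
from statistics import mean


def rate(average, s_cutoff=90.0, a_cutoff=75.0):
    """Map an average score to S, A, or B using hypothetical cutoffs."""
    if average >= s_cutoff:
        return "S"
    if average >= a_cutoff:
        return "A"
    return "B"


def analyze(round_scores):
    """Summarize one score per evaluation round, in chronological order."""
    if not round_scores:
        raise ValueError("at least one round score is required")
    average = mean(round_scores)
    return {
        "mean": average,
        "min": min(round_scores),
        "max": max(round_scores),
        # Simple trend: change from the first round to the last round.
        "trend": round_scores[-1] - round_scores[0],
        "rating": rate(average),
    }


if __name__ == "__main__":
    print(analyze([80, 85, 90]))
```

A first-to-last difference is the crudest possible trend measure; a real analysis might fit a slope across all rounds instead.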

AI & Machine Learning | ntcoding/claude-skillz

challenge-that

Force critical evaluation of proposals, requirements, or decisions by analyzing from multiple adversarial perspectives. Triggers on: accepting a proposal without pushback, 'sounds good', 'let's go with', design decisions with unstated tradeoffs, unchallenged assumptions, premature consensus. Invoke with /challenge-that.

🇺🇸 English | Translated

AI & Machine Learning | ohing504/skills

agent-reference

Generate objective reference check reports about the user from real AI collaboration data: session history, git logs, GitHub profile, and memory files. Like a colleague writing a professional reference, but grounded in actual shared work. Use whenever the user asks to be evaluated as a developer or wants a reference letter, a work style analysis, "introduced by my agents" content, interview prep from collaboration history, or blog topics from past discussions. Triggers on: write a reference, analyze my work patterns, what do you think of me, 나에 대한 레퍼런스 써줘 ("write a reference about me"), 내 작업 스타일 분석해줘 ("analyze my work style"). Not for general code review, architecture docs, cover letters, or codebase-only analysis.

🇺🇸 English | Translated

Tools & Utilities | onewave-ai/claude-skills

hiring-scorecard

Creates structured hiring scorecards for any role. Takes job title, requirements, and team context. Generates comprehensive scorecard with weighted scoring rubric, interview questions per competency, evaluation matrix, red/green flags, and reference check questions.

🇺🇸 English | Translated

Tools & Utilities | onewave-ai/claude-skills

pitch-deck-reviewer

Reviews pitch decks and investor presentations. Reads slide content, evaluates narrative flow, problem/solution clarity, market sizing, competitive positioning, financial projections, team credibility, and ask clarity. Generates a scored pitch-review.md with slide-by-slide feedback, overall score, top improvements, investor objection predictions, and comparisons to successful decks. Use when reviewing fundraising materials, investor decks, or pitch presentations.

🇺🇸 English | Translated

Documentation & Writing | mizchi/skills

tech-article-reproducibility

Evaluate the reproducibility of technical articles. Dispatch a subagent to simulate a first-time reader reproducing the work locally and list missing information. Use as the final check on a draft before publication.

🇺🇸 English | Translated

Data Processing | wind-information-co-ltd/w...

peer_comparison_decision_skill

Compares peer candidate companies side by side on business quality, growth, profitability, valuation, and catalysts, and draws conclusions on their relative strengths and weaknesses. Applicable when choosing between two candidate stocks, selecting the best among industry peers, or setting a priority order for tracking.

🇨🇳 Chinese | Translated

Data Processing | anthropics/financial-serv...

returns-analysis

Build quick IRR/MOIC sensitivity tables for PE deal evaluation. Models returns across entry multiple, leverage, exit multiple, growth, and hold period scenarios. Use when sizing up a deal, stress-testing assumptions, or preparing IC returns exhibits. Triggers on "returns analysis", "IRR sensitivity", "MOIC table", "what's the return at", "model the returns", or "back of the envelope".

🇺🇸 English | Translated
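
To make the returns-analysis description concrete, here is a rough back-of-the-envelope sketch of an IRR/MOIC sensitivity grid over exit multiple and hold period. It is not the skill's actual model. It assumes a single equity investment at entry, a single exit, no interim cash flows, and no debt paydown, in which case IRR reduces to MOIC^(1/years) - 1. The names `deal_returns` and `sensitivity_table` are hypothetical.

```python
def deal_returns(ebitda, entry_multiple, leverage, exit_multiple, growth, hold_years):
    """Return (MOIC, IRR) for a simplified buyout.

    Assumptions (illustrative only): debt is leverage x entry EBITDA and is
    not paid down; equity goes in once at entry and comes out once at exit.
    """
    debt = ebitda * leverage
    equity_in = ebitda * entry_multiple - debt
    if equity_in <= 0:
        raise ValueError("entry equity must be positive")
    exit_ebitda = ebitda * (1 + growth) ** hold_years
    equity_out = exit_ebitda * exit_multiple - debt
    moic = equity_out / equity_in
    # With one inflow and one outflow, IRR is the annualized MOIC.
    # Equity wiped out at exit is treated as a -100% return.
    irr = moic ** (1 / hold_years) - 1 if moic > 0 else -1.0
    return moic, irr


def sensitivity_table(ebitda, entry_multiple, leverage, growth, exit_multiples, hold_periods):
    """Map each (exit_multiple, hold_years) pair to its (MOIC, IRR)."""
    return {
        (exit_multiple, years): deal_returns(
            ebitda, entry_multiple, leverage, exit_multiple, growth, years
        )
        for exit_multiple in exit_multiples
        for years in hold_periods
    }


if __name__ == "__main__":
    table = sensitivity_table(100.0, 10.0, 5.0, 0.10, [9.0, 10.0, 11.0], [3, 5, 7])
    for (exit_multiple, years), (moic, irr) in sorted(table.items()):
        print(f"exit {exit_multiple:.1f}x, {years}y: MOIC {moic:.2f}x, IRR {irr:.1%}")
```

A fuller model would add debt amortization, interest, fees, and interim distributions, at which point IRR needs a numerical solver rather than the closed form used here.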