Search Results: uat

Found 1,162 Skills

AI & Machine Learning | fatih-developer/fth-skill...

agent-reviewer

After an agentic task completes, perform a retrospective analysis across 6 dimensions (goal alignment, efficiency, decision quality, error handling, communication, reusability). Score performance, identify inefficiency patterns, evaluate skill usage, and produce actionable improvement recommendations. Triggers on "how did it go", "retrospective", "review performance", "what could be better", or after any long agentic task completes.

🇺🇸 | English | Translated

AI & Machine Learning | orchestra-research/ai-res...

phoenix-observability

Open-source AI observability platform for LLM tracing, evaluation, and monitoring. Use when debugging LLM applications with detailed traces, running evaluations on datasets, or monitoring production AI systems with real-time insights.

🇺🇸 | English | Translated

AI & Machine Learning | orchestra-research/ai-res...

langsmith-observability

LLM observability platform for tracing, evaluation, and monitoring. Use when debugging LLM applications, evaluating model outputs against datasets, monitoring production systems, or building systematic testing pipelines for AI applications.

🇺🇸 | English | Translated

Product & Design | phuryn/pm-skills

ab-test-analysis

Analyze A/B test results with statistical significance, sample size validation, confidence intervals, and ship/extend/stop recommendations. Use when evaluating experiment results, checking if a test reached significance, interpreting split test data, or deciding whether to ship a variant.

🇺🇸 | English | Translated

Security & Compliance | kostja94/marketing-skills

brand-protection

Use when the user faces brand impersonation, fake websites, phishing sites, or trademark infringement. Also use when the user mentions "fake site," "impersonation," "phishing site," "trademark infringement," "domain squatting," or "brand abuse."

🇺🇸 | English | Translated

AI & Machine Learning | google/adk-docs

adk-eval-guide

MUST READ before running any ADK evaluation. ADK evaluation methodology — eval metrics, evalset schema, LLM-as-judge, tool trajectory scoring, and common failure causes. Use when evaluating agent quality, running adk eval, or debugging eval results. Do NOT use for API code patterns (use adk-cheatsheet), deployment (use adk-deploy-guide), or project scaffolding (use adk-scaffold).

🇺🇸 | English | Translated

Data Processing | zuoa/aj-skills

aj-stock-analysis

A value-investing analysis tool for A-shares that provides stock screening, in-depth analysis of individual stocks, industry comparison, and valuation calculation. Grounded in value-investing theory, it uses tushare to fetch public financial data and is suited to ordinary investors who trade infrequently.

🇨🇳 | Chinese | Translated

26 scripts | Checked

Backend Development | anthropics/knowledge-work...

architecture

Create or evaluate an architecture decision record (ADR). Use when choosing between technologies (e.g., Kafka vs SQS), documenting a design decision with trade-offs and consequences, reviewing a system design proposal, or designing a new component from requirements and constraints.

🇺🇸 | English | Translated

Automation | alirezarezvani/claude-ski...

run

Run a single experiment iteration: edit the target file, evaluate the result, then keep or discard the change.

🇺🇸 | English | Translated

Backend Development | bencium/bencium-claude-co...

negentropy-lens

A decision-support framework that evaluates systems, architectures, and strategies through the entropy (decay) vs negentropy (growth) lens, while surfacing tacit knowledge gaps. Use this skill whenever the user is making architecture decisions, evaluating system designs, reviewing technical approaches, choosing between options, auditing existing systems, or planning strategies. Also trigger when the user explicitly asks to "apply the negentropy lens", mentions "entropy", "negentropy", "tacit knowledge", "knowledge engine", or "flip the switch". Nudge activation when you detect the user is at a decision point — even if they haven't asked for this lens — by briefly noting the entropic/negentropic dimension before proceeding.

🇺🇸 | English | Translated

Product & Design | owl-listener/designer-ski...

click-test-plan

Design click/first-click tests to evaluate navigation and information findability.

🇺🇸 | English | Translated

AI & Machine Learning | github/awesome-copilot

eval-driven-dev

Instrument Python LLM apps, build golden datasets, write eval-based tests, run them, and root-cause failures — covering the full eval-driven development cycle. Make sure to use this skill whenever a user is developing, testing, QA-ing, evaluating, or benchmarking a Python project that calls an LLM, even if they don't say "evals" explicitly. Use for making sure an AI app works correctly, catching regressions after prompt changes, debugging why an agent started behaving differently, or validating output quality before shipping.

🇺🇸 | English | Translated