Search Results: uat

Found 1,927 Skills

AI & Machine Learning | lubu-labs/langchain-agent...

langsmith-trace-analyzer

Fetch, organize, and analyze LangSmith traces for debugging and evaluation. Use when you need to: query traces/runs by project, metadata, status, or time window; download traces to JSON; organize outcomes into passed/failed/error buckets; analyze token/message/tool-call patterns; compare passed vs failed behavior; or investigate benchmark and production failures.

AI & Machine Learning | pjt222/development-guides

awareness

AI situational awareness — internal threat detection for hallucination risk, scope creep, and context degradation. Maps Cooper color codes to reasoning states and OODA loop to real-time decisions. Use during any task where reasoning quality matters, when operating in unfamiliar territory, after detecting early warning signs such as an uncertain fact or suspicious tool result, or before high-stakes output like irreversible changes or architectural decisions.

Testing & QA | ibrahim-3d/conductor-orch...

eval-business-logic

Specialized business logic evaluator for the Evaluate-Loop. Use this for evaluating tracks that implement core product logic — pipelines, dependency resolution, state machines, pricing/tier enforcement, packaging. Checks feature correctness against product rules, edge cases, state transitions, data flow, and user journey completeness. Dispatched by loop-execution-evaluator when track type is 'business-logic', 'generator', or 'core-feature'. Triggered by: 'evaluate logic', 'test business rules', 'verify business rules', 'check feature'.

AI & Machine Learning | hamelsmu/evals-skills

write-judge-prompt

Design LLM-as-Judge evaluators for subjective criteria that code-based checks cannot handle. Use when a failure mode requires interpretation (tone, faithfulness, relevance, completeness). Do NOT use when the failure mode can be checked with code (regex, schema validation, execution tests). Do NOT use when you need to validate or calibrate the judge — use validate-evaluator instead.

AI & Machine Learning | hamelsmu/evals-skills

generate-synthetic-data

Create diverse synthetic test inputs for LLM pipeline evaluation using dimension-based tuple generation. Use when bootstrapping an eval dataset, when real user data is sparse, or when stress-testing specific failure hypotheses. Do NOT use when you already have 100+ representative real traces (use stratified sampling instead), or when the task is collecting production logs.
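
As a rough illustration of what dimension-based tuple generation might look like, the sketch below takes the cross product of a few dimensions and yields one tuple per combination. The dimension names and values, and the `generate_tuples` helper, are hypothetical and are not taken from the skill itself.

```python
from itertools import product

# Hypothetical dimensions chosen for illustration only.
dimensions = {
    "persona": ["new user", "power user"],
    "intent": ["refund request", "billing question", "bug report"],
    "tone": ["polite", "frustrated"],
}


def generate_tuples(dims):
    """Return one dict per combination of dimension values (full cross product)."""
    names = list(dims)
    value_lists = [dims[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


tuples = generate_tuples(dimensions)

# Each tuple can then be turned into a natural-language test input,
# for example by handing it to an LLM or a template.
for t in tuples[:3]:
    print(t)
```

With many dimensions the full cross product grows quickly, so in practice one would probably sample a subset of the tuples rather than use all of them.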

AI & Machine Learning | langchain-ai/langsmith-sk...

langsmith-dataset

INVOKE THIS SKILL when creating evaluation datasets, uploading datasets to LangSmith, or managing existing datasets. Covers dataset types (final_response, single_step, trajectory, RAG), CLI management commands, SDK-based creation, and example management. Uses the langsmith CLI tool.

Security & Compliance | kostja94/marketing-skills

brand-protection

Use when the user faces brand impersonation, fake websites, phishing sites, or trademark infringement. Also use when the user mentions "fake site," "impersonation," "phishing site," "trademark infringement," "domain squatting," or "brand abuse."

Documentation & Writing | sugarforever/01coder-agen...

personal-writing-style

Personal writing style preferences. Reference this skill when writing, translating, or editing content to ensure consistent style, punctuation, and formatting.

AI & Machine Learning | mastepanoski/claude-skill...

ai-assessment-scale

Evaluate AI contribution in projects using the AI Assessment Scale (AIAS) 5-level framework. Measure AI involvement from no AI to full AI exploration across development stages.

AI & Machine Learning | aradotso/trending-skills

chrome-cdp-live-browser

Connect AI agents to your live Chrome session via CDP for real-time tab interaction, screenshots, and JS evaluation without logging in again.

Frontend Development | somnio-software/somnio-ai...

react-health-audit

Execute a comprehensive React Project Health Audit. Analyzes tech stack, architecture, state management, testing, code quality, performance, CI/CD, and documentation. Produces a Google Docs-ready report with section scores and weighted overall score. Use when the user asks to audit a React project, run a health check, evaluate frontend quality, or assess technical debt. Triggers on: 'react audit', 'health audit', 'react health', 'frontend audit', 'next.js audit', 'vite audit', 'project quality check'.
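
The description does not say how the weighted overall score is computed. One plausible reading is a weighted average of the section scores, sketched below. The section names, scores, weights, and the `weighted_overall` helper are all assumptions made for illustration, not the skill's actual formula.

```python
def weighted_overall(scores, weights):
    """Weighted average of section scores; weights need not sum to 1."""
    total_weight = sum(weights[section] for section in scores)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(scores[section] * weights[section] for section in scores)
    return weighted_sum / total_weight


# Hypothetical section scores (0-100) and weights.
section_scores = {"testing": 80, "performance": 60, "documentation": 100}
section_weights = {"testing": 2, "performance": 1, "documentation": 1}

overall = weighted_overall(section_scores, section_weights)
print(overall)  # 80.0
```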

Product & Design | reason-healthcare/health-...

health-human-factors

Review healthcare and EHR software interfaces against a comprehensive design style guide grounded in NIST, FDA, IEC 62366, ISO 9241, ISO 14971, WCAG 2.1, ONC SAFER, and HL7 FHIR standards. Produces a report-only assessment without modifying code or designs. Use when an agent needs to evaluate clinical UI screens, data display, forms, alerts, or workflows for patient-safety, usability, accessibility, and data-clarity compliance.
