Found 3,307 Skills
Use when finishing a ticket or pull request and the user asks to validate, demo, or sign off on delivered behavior, including non-user-facing changes. Triggers include "UAT", "verify", "walk me through", "show what changed", "can we merge?", "sign off", "acceptance test", "demo this", "ready to merge", "validate the changes", "show me it works", and similar phrases indicating a need for an acceptance walkthrough or demonstration before merge.
Security Benchmark Runner - Auto-activating skill for Security Advanced. Triggers on: security benchmark runner. Part of the Security Advanced skill category.
Senior AI Product Manager. Expert in Probabilistic Strategy, Rapid Agentic Prototyping, and Hypothesis Generation for 2026.
Calibrate an LLM judge against human labels using data splits, TPR/TNR, and bias correction. Use after writing a judge prompt (write-judge-prompt) when you need to verify alignment before trusting its outputs. Do NOT use for code-based evaluators (those are deterministic; test with standard unit tests).
Audit an LLM eval pipeline and surface problems: missing error analysis, unvalidated judges, vanity metrics, etc. Use when inheriting an eval system, when unsure whether evals are trustworthy, or as a starting point when no eval infrastructure exists. Do NOT use when the goal is to build a new evaluator from scratch (use error-analysis, write-judge-prompt, or validate-evaluator instead).
Debug OpenWork sidecars, config, and audit trail
Optimize landing pages for maximum conversions using proven frameworks from Unbounce and Oli Gardner. Apply the "one goal, one message, one action" principle with data-driven design, copy, and CTA best practices. Use when you need to: **create a new landing page** for a campaign; **optimize an existing landing page** that isn't converting; **review landing page design** before launch; **improve form conversion rates** on lead gen pages; or **A/B test landing page elements** systematically.
Use when the user wants to validate that implemented code matches its specifications, generate integration tests from feature files, or check if code still satisfies existing scenarios. Trigger after implementation completes a feature. Also use when the user asks "does the code do what we specified?" or "generate tests from the feature files".
Comprehensive Python expertise covering language fundamentals, idiomatic patterns, software design principles, and production best practices. Use when writing, reviewing, debugging, or refactoring Python code. Triggers: Python, .py files, pip, uv, pytest, dataclasses, asyncio, type hints, or any Python library.
Core concepts and best practices for `package:test`. Covers `test`, `group`, lifecycle methods (`setUp`, `tearDown`), and configuration (`dart_test.yaml`).
Smoke test for alicloud-data-analytics-dataanalysisgbi. Validate minimal authentication, API reachability, and one read-only query path.
Smoke test for alicloud-ai-text-document-mind. Validate minimal authentication, API reachability, and one read-only query path.