Found 456 Skills
Bootstrap evaluators from production traces: emit SDK code, a framework-agnostic JSON spec, or publish online LLM-judge evaluators directly to Datadog. Use when the user says "bootstrap evaluators", "generate evaluators", "create evals from traces", "eval bootstrap", "write evaluators", "build eval suite", "publish evaluators", or wants to generate BaseEvaluator/LLMJudge code or online judge configs from production LLM trace data. Works with ml_app and an optional RCA report or failure hypothesis.
Interact with Litefuse and access its documentation. Use when needing to (1) query or modify Litefuse data programmatically via the CLI — traces, prompts, datasets, scores, sessions, and any other API resource, (2) look up Litefuse documentation, concepts, integration guides, or SDK usage, or (3) understand how any Litefuse feature works. This skill covers CLI-based API access (via npx) and multiple documentation retrieval methods.
DevOps and Infrastructure expert with comprehensive knowledge of CI/CD pipelines, containerization, orchestration, infrastructure as code, monitoring, security, and performance optimization. Use PROACTIVELY for any DevOps, deployment, infrastructure, or operational issues. If a specialized expert is a better fit, I will recommend switching and stop.
View Langfuse trace details. Use when checking specific trace input/output, debugging LLM calls, or analyzing costs.
Microservices architecture patterns and best practices. Use when designing distributed systems, breaking down monoliths, or implementing service communication.
Module covering AI adoption strategy, Build vs Buy, prioritization, governance/security, and a 6-month scaling roadmap.
Automate Datadog tasks via Rube MCP (Composio): query metrics, search logs, manage monitors/dashboards, create events and downtimes. Always search tools first for current schemas.
Fetch, organize, and analyze LangSmith traces for debugging and evaluation. Use when you need to: query traces/runs by project, metadata, status, or time window; download traces to JSON; organize outcomes into passed/failed/error buckets; analyze token/message/tool-call patterns; compare passed vs failed behavior; or investigate benchmark and production failures.
Generate Loki configs: ingester, querier, compactor, ruler, and S3/GCS/Azure storage backends.
Post-deploy canary monitoring. Watches the live app for console errors, performance regressions, and page failures using the browse daemon. Takes periodic screenshots, compares against pre-deploy baselines, and alerts on anomalies. Use when: "monitor deploy", "canary", "post-deploy check", "watch production", "verify deploy".
Lumigo integration. Manage data, records, and automate workflows. Use when the user wants to interact with Lumigo data.
Logz.io integration. Manage data, records, and automate workflows. Use when the user wants to interact with Logz.io data.