Loading...
Loading...
Found 32 Skills
End-to-end pipeline from unlabeled ml_app traces to a bootstrapped evaluator suite. Runs trace classification → root cause analysis → eval bootstrap in sequence with user checkpoints. Use when user says "run the eval pipeline", "go from traces to evals", "bootstrap evals end to end", "classify then RCA then bootstrap", "build an eval set from scratch", or wants a guided walkthrough from production data to evaluator code.
Integrate Databuddy analytics into applications using the SDK or REST API. Use when implementing analytics tracking, feature flags, custom events, Web Vitals, error tracking, LLM observability, or querying analytics data programmatically.
LLM observability platform for tracing, evaluation, and monitoring. Use when debugging LLM applications, evaluating model outputs against datasets, monitoring production systems, or building systematic testing pipelines for AI applications.
Add Opik tracing to an existing codebase. Detects language (Python/TypeScript), identifies LLM frameworks, adds appropriate decorators and integrations, marks entrypoints, and wires up environment config. Use for "instrument my code", "add opik tracing", "add observability", or "trace my agent".
Generates a self-contained Python experiment client that uses the ddtrace.llmobs SDK. Emits either a runnable .py script or a Jupyter .ipynb notebook matching the canonical DataDog reference notebook style. Use when the user says "generate Python experiment", "write an SDK experiment", "create a ddtrace experiment", "Python notebook experiment", "use the LLM Obs SDK", or has `ddtrace` installed and wants idiomatic SDK code.
Analyze LLM experiment results. Handles single or comparative experiments, exploratory or Q&A modes. Use when user says "analyze experiment", "compare experiments", "analyze against baseline", or provides one or two experiment IDs for analysis.
Every PostHog resource in one CLI — with offline search, agent-native output, and cross-resource analytics no... Trigger phrases: `check my PostHog feature flags`, `query PostHog events`, `show experiment results in PostHog`, `what errors are spiking in PostHog`, `LLM costs in PostHog`, `is it safe to ramp this flag`, `use posthog`.
LLM observability platform for tracing, evaluation, prompt management, and cost tracking. Use when setting up Langfuse, monitoring LLM costs, tracking token usage, or implementing prompt versioning.