Loading...
Loading...
Found 26 Skills
File GitHub issues to the right repository (pup CLI or plugin)
Monitoring and observability strategy, implementation, and troubleshooting. Use for designing metrics/logs/traces systems, setting up Prometheus/Grafana/Loki, creating alerts and dashboards, calculating SLOs and error budgets, analyzing performance issues, and comparing monitoring tools (Datadog, ELK, CloudWatch). Covers the Four Golden Signals, RED/USE methods, OpenTelemetry instrumentation, log aggregation patterns, and distributed tracing.
Create Post Incident Records (PIRs) by analysing incidents discovered from PagerDuty. Orchestrates pagerduty-oncall, datadog-analyser, and traffic-spikes-investigator skills to enrich each incident with observability and traffic data, auto-determines severity, and outputs completed PIR forms. Use when asked to "create a PIR", "write a post incident record", "fill out PIR form", "incident report", "analyse incidents", or after on-call shifts need documentation.
Partition-first log analysis methodology. Use for log searches, error analysis, pattern finding across Datadog, CloudWatch, or Kubernetes logs.
Instrument web applications to send telemetry data to Azure Application Insights for observability and monitoring. USE FOR: instrument app with app insights, add appinsights instrumentation, configure application insights, set up telemetry monitoring, enable app insights auto-instrumentation, add observability to azure web app, instrument webapp to send data to app insights, configure telemetry for app service. DO NOT USE FOR: non-Azure monitoring (use CloudWatch for AWS, Datadog for third-party), log analysis (use azure-kusto), cost monitoring (use azure-cost-optimization), security monitoring (use azure-security).
Set up monitoring, logging, and observability for applications and infrastructure. Use when implementing health checks, metrics collection, log aggregation, or alerting systems. Handles Prometheus, Grafana, ELK Stack, Datadog, and monitoring best practices.
Comprehensive logging and observability patterns for production systems including structured logging, distributed tracing, metrics collection, log aggregation, and alerting. Triggers for this skill - log, logging, logs, trace, tracing, traces, metrics, observability, OpenTelemetry, OTEL, Jaeger, Zipkin, structured logging, log level, debug, info, warn, error, fatal, correlation ID, span, spans, ELK, Elasticsearch, Loki, Datadog, Prometheus, Grafana, distributed tracing, log aggregation, alerting, monitoring, JSON logs, telemetry.
Create a new built-in evlog adapter to send wide events to an external observability platform. Use when adding a new drain adapter (e.g., for Datadog, Sentry, Loki, Elasticsearch, etc.) to the evlog package. Covers source code, build config, package exports, tests, and all documentation.
Guide for implementing HolmesGPT - an AI agent for troubleshooting cloud-native environments. Use when investigating Kubernetes issues, analyzing alerts from Prometheus/AlertManager/PagerDuty, performing root cause analysis, configuring HolmesGPT installations (CLI/Helm/Docker), setting up AI providers (OpenAI/Anthropic/Azure), creating custom toolsets, or integrating with observability platforms (Grafana, Loki, Tempo, DataDog).
Error tracking and monitoring integration. Sentry, Datadog RUM, Bugsnag. Source maps, breadcrumbs, release tracking, performance monitoring, and alerting configuration. USE WHEN: user mentions "Sentry", "error tracking", "Bugsnag", "Datadog RUM", "crash reporting", "source maps", "release tracking", "error monitoring" DO NOT USE FOR: application logging - use logging skills; APM/tracing - use `opentelemetry`; structured error responses - use `error-handling`
Use this skill when implementing logging, metrics, distributed tracing, alerting, or defining SLOs. Triggers on structured logging, Prometheus, Grafana, OpenTelemetry, Datadog, distributed tracing, error tracking, dashboards, alert fatigue, SLIs, SLOs, error budgets, and any task requiring system observability or monitoring setup.
Use this skill when working with SigNoz - open-source observability platform for application monitoring, distributed tracing, log management, metrics, alerts, and dashboards. Triggers on SigNoz setup, OpenTelemetry instrumentation for SigNoz, sending traces/logs/metrics to SigNoz, creating SigNoz dashboards, configuring SigNoz alerts, exception monitoring, and migrating from Datadog/Grafana/New Relic to SigNoz.