Loading...
Loading...
Found 87 Skills
Use this skill when implementing logging, metrics, distributed tracing, alerting, or defining SLOs. Triggers on structured logging, Prometheus, Grafana, OpenTelemetry, Datadog, distributed tracing, error tracking, dashboards, alert fatigue, SLIs, SLOs, error budgets, and any task requiring system observability or monitoring setup.
Build FastAPI services with JWT auth, structlog, and Prometheus metrics. Use when creating or modifying a Python HTTP server, adding authentication, structured logging, or instrumentation to a FastAPI app.
Production server monitoring stack covering Prometheus, Node Exporter, Grafana, Alertmanager, Loki, and Promtail on bare-metal or VM Linux hosts. USE WHEN: - Setting up monitoring for a new production server or VPS - Configuring Prometheus scrape targets for application or system metrics - Creating Grafana dashboards and datasource provisioning - Writing Alertmanager routing rules with email/Slack notifications - Implementing the PLG stack (Promtail + Loki + Grafana) for log aggregation - Performing live system diagnostics with htop, iotop, nethogs, ss, vmstat, iostat - Setting up uptime monitoring with UptimeRobot or healthchecks.io DO NOT USE FOR: - Kubernetes-native observability (use the kubernetes skill instead) - Application-level APM (distributed tracing with Jaeger/Tempo — use observability skill) - Cloud-managed monitoring (CloudWatch, GCP Monitoring, Azure Monitor) - Windows Server monitoring
Set up monitoring, logging, and observability for applications and infrastructure. Use when implementing health checks, metrics collection, log aggregation, or alerting systems. Handles Prometheus, Grafana, ELK Stack, Datadog, and monitoring best practices.
Golang benchmarking, profiling, and performance measurement. Use when writing, running, or comparing Go benchmarks, profiling hot paths with pprof, interpreting CPU/memory/trace profiles, analyzing results with benchstat, setting up CI benchmark regression detection, or investigating production performance with Prometheus runtime metrics. Also use when the developer needs deep analysis on a specific performance indicator - this skill provides the measurement methodology, while golang-performance provides the optimization patterns.
Automatically discover observability and monitoring skills when working with Prometheus, Grafana, distributed tracing, structured logging, metrics, alerting, dashboards, or monitoring. Activates for observability development tasks.
Expert guidance for building production-ready FastAPI applications with modular architecture where each business domain is an independent module with own routes, models, schemas, services, cache, and migrations. Uses UV + pyproject.toml for modern Python dependency management, project name subdirectory for clean workspace organization, structlog (JSON+colored logging), pydantic-settings configuration, auto-discovery module loader, async SQLAlchemy with PostgreSQL, per-module Alembic migrations, Redis/memory cache with module-specific namespaces, central httpx client, OpenTelemetry/Prometheus observability, conversation ID tracking (X-Conversation-ID header+cookie), conditional Keycloak/app-based RBAC authentication, DDD/clean code principles, and automation scripts for rapid module development. Use when user requests FastAPI project setup, modular architecture, independent module development, microservice architecture, async database operations, caching strategies, logging patterns, configuration management, authentication systems, observability implementation, or enterprise Python web services. Supports max 3-4 route nesting depth, cache invalidation patterns, inter-module communication via service layer, and comprehensive error handling workflows.
Observability visualization with Grafana and LGTM stack. Dashboard design, panel configuration, alerting, variables/templating, and data sources. USE WHEN: Creating Grafana dashboards, configuring panels and visualizations, writing LogQL/TraceQL queries, setting up Grafana data sources, configuring dashboard variables and templates, building Grafana alerts. DO NOT USE: For writing PromQL queries (use /prometheus), for alerting rule strategy (use /prometheus), for general observability architecture (use senior-software-engineer with infrastructure focus). TRIGGERS: grafana, dashboard, panel, visualization, logql, traceql, loki, tempo, mimir, data source, annotation, variable, template, row, stat, graph, table, heatmap, gauge, bar chart, pie chart, time series, logs panel, traces panel, LGTM stack.
Grafana Alloy OpenTelemetry collector and telemetry pipeline configuration. Covers the Alloy configuration language (blocks, attributes, expressions), components for collecting metrics/logs/traces/profiles, sending data to Grafana Cloud/Prometheus/Loki/Tempo, clustering, Fleet Management remote config, and building telemetry pipelines. Use when configuring Alloy, writing Alloy config files (.alloy), building data collection pipelines, setting up scraping, or troubleshooting Alloy deployments.
Comprehensive logging and observability patterns for production systems including structured logging, distributed tracing, metrics collection, log aggregation, and alerting. Triggers for this skill - log, logging, logs, trace, tracing, traces, metrics, observability, OpenTelemetry, OTEL, Jaeger, Zipkin, structured logging, log level, debug, info, warn, error, fatal, correlation ID, span, spans, ELK, Elasticsearch, Loki, Datadog, Prometheus, Grafana, distributed tracing, log aggregation, alerting, monitoring, JSON logs, telemetry.
CI/CD pipeline design, containerization, and infrastructure management. Handles Docker, Kubernetes, monitoring setup (Prometheus/Grafana), and infrastructure-as-code (Terraform/Pulumi).
Monitoring, logging, and tracing implementation using OpenTelemetry as the unified standard. Use when building production systems requiring visibility into performance, errors, and behavior. Covers OpenTelemetry (metrics, logs, traces), Prometheus, Grafana, Loki, Jaeger, Tempo, structured logging (structlog, tracing, slog, pino), and alerting.