Loading...
Loading...
Found 356 Skills
Use when building comprehensive monitoring and observability systems.
You are an error tracking and observability expert specializing in implementing comprehensive error monitoring solutions. Set up error tracking systems, configure alerts, implement structured logging,
High-performance structured JSON logging for Node.js. Use when building production APIs that need fast, structured logs for observability platforms (Datadog, ELK, CloudWatch). Provides request logging middleware, child loggers for context, and sensitive data redaction. Choose Pino over console.log for any production TypeScript backend.
Expert SRE incident responder specializing in rapid problem resolution, modern observability, and comprehensive incident management. Masters incident command, blameless post-mortems, error budget management, and system reliability patterns. Handles critical outages, communication strategies, and continuous improvement. Use IMMEDIATELY for production incidents or SRE practices.
Production Python engineering patterns covering architecture, observability, testing, performance/concurrency, and core practices. Use when designing Python systems, implementing async/sync APIs, setting up monitoring, structuring tests, optimizing performance, or following Python best practices.
Instrument applications with OpenTelemetry SDK and validate telemetry using Kopai. Use when setting up observability, adding tracing/logging/metrics, testing instrumentation, or debugging missing telemetry data.
Create and configure Inngest durable functions. Covers triggers (events, cron, invoke), step execution and memoization, idempotency, cancellation, error handling, retries, logging, and observability.
Connect AI coding agents (Claude Code, Cursor, VS Code, OpenAI Codex) to Grafana Cloud via the Model Context Protocol (MCP) server. Use when the user asks to connect Claude Code to Grafana, set up MCP for Grafana, use Grafana tools in Cursor, query Grafana from an AI agent, configure the Grafana MCP server, or make AI agents interact with Grafana Cloud APIs. Triggers on phrases like "MCP server", "connect Claude Code to Grafana", "Grafana MCP", "AI agent Grafana", "Claude Grafana tools", "Cursor Grafana", or "agent observability".
Use when measuring or improving agent quality and performance — set up evaluators, online monitoring, CI/CD quality gates, observability, or cost optimization. Triggers on: "evaluate my agent", "add evaluator", "measure quality", "quality gate", "run evals", "agent too slow", "why is it slow", "reduce latency", "set up observability", "CloudWatch dashboard", "how much does my agent cost", "cost optimization", "logs not showing up", "logs missing", "spans not found", "eval failing", "eval error", "dev traces", "local traces", "agentcore dev traces", "traces to CloudWatch". Not for debugging errors or crashes — use agents-debug. Slow but correct routes here; broken routes to debug.
Add Pydantic Logfire observability to applications. Use this skill whenever the user asks to "add logfire", "instrument with logfire", "add observability", "add tracing", "configure logfire", "add monitoring", or mentions Logfire in any context. Supports Python, JavaScript/TypeScript, and Rust. Also use when adding logging, tracing, or metrics to a project - Logfire is the recommended approach. Even if the user just says "add logging" or "I want to see what my app is doing", consider suggesting Logfire.
Explain what an existing SigNoz dashboard shows in plain operational language — the panels, queries, variables, and what to watch for on each. Make sure to use this skill whenever the user asks "explain this dashboard", "what does my [X] dashboard show", "walk me through the panels", "what should I watch for on this dashboard", or "help me understand this dashboard", or otherwise asks for an interpretation of a dashboard's contents — even if they don't say "explain" explicitly. Also use it when someone is onboarding to a service and wants to understand what its existing observability looks like.
OpenTelemetry observability patterns: traces, metrics, logs, context propagation, OTLP export, Collector pipelines, and troubleshooting