Total 50,510 skills, DevOps & Cloud Services has 3052 skills
Showing 12 of 3052 skills
Design and run a monitoring system for a website or web app. Use this skill when setting up uptime checks, defining SLOs, configuring error tracking, choosing what to alert on, designing on-call rotations, or fixing alert fatigue. Triggers on monitoring, alerts, uptime, SLO, SLA, error rate, on-call, pager, alert fatigue, observability, dashboards, what should we monitor. Also triggers when an incident reveals a gap in monitoring.
Generate, write, or run an ad-hoc query against SigNoz observability data — metrics, logs, traces, or exceptions — without wrapping it in a dashboard panel or alert. Make sure to use this skill whenever the user asks "show me error rates", "query logs for timeout errors", "what's the p99 latency for the cart service", "how many requests hit the payment endpoint", "find slow traces", "errors in the last hour", or otherwise asks an exploratory question that needs live observability data — even if they don't say "query" or "search" explicitly.
Deploy Python applications to Google App Engine Standard/Flexible. Covers app.yaml configuration, Cloud SQL socket connections, Cloud Storage for static files, scaling settings, and environment variables. Use when: deploying to App Engine, configuring app.yaml, connecting Cloud SQL, setting up static file serving, or troubleshooting 502 errors, cold starts, or memory limits.
Check service status, rename services, change service icons, link services, or create services with Docker images. For creating services with local code, prefer railway-new skill. For GitHub repo sources, use railway-new skill to create empty service then railway-environment skill to configure source.
Azure Verified Modules (AVM) requirements and best practices for developing certified Azure Terraform modules. Use when creating or reviewing Azure modules that need AVM certification.
Datadog CLI for searching logs, querying metrics, tracing requests, and managing dashboards. Use this when debugging production issues or working with Datadog observability.
Best practices for writing reliable Pulumi programs. Covers Output handling, resource dependencies, component structure, secrets management, safe refactoring with aliases, and deployment workflows.
Optimize end-to-end application performance with profiling, observability, and backend/frontend tuning. Use when coordinating performance optimization across the stack.
Generates architecture diagrams from Azure Bicep files. Use when user has .bicep files or asks to visualize Bicep infrastructure.
Expert Harbor container registry administrator specializing in registry operations, vulnerability scanning with Trivy, artifact signing with Notary, RBAC, and multi-region replication. Use when managing container registries, implementing security policies, configuring image scanning, or setting up disaster recovery.
Comprehensive Terraform Infrastructure as Code skill covering resources, modules, state management, workspaces, providers, and advanced patterns for cloud-agnostic infrastructure deployment
You are a workflow automation expert specializing in creating efficient CI/CD pipelines, GitHub Actions workflows, and automated development processes. Design automation that reduces manual work, improves consistency, and accelerates delivery while maintaining quality and security.