Total 30,905 skills; DevOps & Cloud Services has 1,906 skills
Sets up uv (a Rust-based Python package manager) in CI/CD pipelines. Use when configuring GitHub Actions workflows, GitLab CI/CD, Docker builds, or matrix testing across Python versions. Includes patterns for cache optimization, frozen lockfiles, multi-stage builds, and PyPI releases via trusted publishing. Covers the GitHub Actions setup-uv action, Docker multi-stage production/development builds, and deployment patterns.
Adds new users to Cloudflare Access authentication by updating ACCESS_ALLOWED_EMAIL in .env and syncing policies to all protected services. Use when you need to grant access to a new user, add someone to the network, share service access, or update allowed emails. Triggers on "add user to access", "grant access to [email]", "add [email] to cloudflare", "share access with", "allow [email] to authenticate", or "update access users". Works with .env, update-access-emails.sh, and Cloudflare Access policies for pihole, jaeger, langfuse, sprinkler, ha, and temet.ai services.
Diagnoses and fixes DNS resolution issues in the network infrastructure, including Pi-hole not resolving locally, router DNS configuration, and DNS record verification. Use when services can't be accessed via domain names (for example, "can't access pihole.temet.ai"), when DNS is not resolving, or when you need to verify Pi-hole DNS records. Triggers on "DNS not working", "can't resolve domain", "domain not found", "Pi-hole DNS issue", "fix DNS", or "troubleshoot DNS resolution". Works with Pi-hole DNS, docker-compose.yml FTLCONF variables, and local network DNS resolution.
Sets up and configures Google Kubernetes Engine (GKE) clusters for production use. Use when creating new GKE clusters, choosing between Autopilot vs Standard modes, configuring networking (VPC-native, private clusters), setting up node pools, or planning cluster architecture for Spring Boot microservices. Includes regional vs zonal decisions, security hardening, and resource provisioning guidance.
Systematically diagnoses and resolves common GKE issues, including pod failures, networking problems, database connection errors, and Pub/Sub issues. Use when pods are stuck in Pending, CrashLoopBackOff, or ImagePullBackOff, or when you are seeing DNS failures, Cloud SQL connection timeouts, or Pub/Sub message processing problems. Provides step-by-step debugging workflows and solution patterns for Spring Boot applications.
Creates and manages Cloudflare Access service tokens for automated infrastructure verification and non-human access. Use when setting up automation, verification scripts, or monitoring systems, or when you need to test services without Google OAuth. Triggers on "create service token", "setup automation access", "verify without OAuth", "automated monitoring", or "service token for testing". Works with Cloudflare Access Service Auth, .env credential storage, and the cf-service-token.sh script for testing and management.
Configures automated infrastructure monitoring with mobile alerts (ntfy.sh and Home Assistant) and implements auto-recovery for common failures. Use when setting up monitoring, configuring mobile notifications, enabling auto-recovery, or troubleshooting alert delivery. Triggers on "setup monitoring", "configure alerts", "mobile notifications", "enable auto-recovery", "monitoring not working", or "not getting alerts". Works with ntfy.sh push notifications, Docker container health checks, Bash monitoring scripts, and optional Home Assistant automation integration.
Discovers, tests, and manages remote SSH infrastructure hosts and Docker services across 5 hosts (infra.local, deus, homeassistant, pi4-motor, armitage). Use when checking infrastructure status, verifying service connectivity, managing Docker containers, troubleshooting remote services, or before using remote resources (MongoDB, Langfuse, OTLP, Neo4j). Triggers on "check infrastructure", "connect to infra/deus/ha", "test MongoDB on infra", "view Docker services", "verify connectivity", "troubleshoot remote service", "what services are running", or when remote connections fail.
Interactively adds a new subdomain to the network infrastructure by gathering service details, configuring domains.toml, and applying changes. Use when you need to add a new service, create a subdomain, expose a new application, or set up a reverse proxy for a service. Triggers on "add subdomain", "new subdomain", "add service to network", "expose service", "create domain for", "set up reverse proxy", or "add [name] to infrastructure". Works with domains.toml, manage-domains.sh, and Cloudflare Tunnel.
Complete ClickHouse operations guide for DevOps and SRE teams managing production deployments. Provides practical guidance on monitoring essential metrics (query latency, throughput, memory, disk), introspecting system tables, performance analysis, scaling strategies (vertical and horizontal), backup/disaster recovery, tuning at query/server/table levels, and troubleshooting common issues. Use when diagnosing ClickHouse problems, optimizing performance, planning capacity, setting up monitoring, implementing backups, or managing production clusters. Includes resource management strategies for disk space, connections, and background operations plus production checklists.
Performs low-level Cloudflare DNS operations via the Cloudflare API, including adding, updating, and deleting DNS records, managing zone settings, and dynamic DNS updates. Use when you need manual DNS record management, dynamic DNS updates, zone settings configuration, or operations outside the domain management system. Triggers on "add DNS record", "update DNS", "delete DNS record", "dynamic DNS", "Cloudflare API", or "manual DNS management". Works with Cloudflare API v4, the cf-dns.sh and cf-settings.sh helper scripts, and direct curl API calls.
Debugs and fixes Terraform errors systematically. Use when encountering Terraform failures, state lock issues, provider errors, syntax problems, or unexpected infrastructure changes. Includes debugging workflows, error categorization, common GCP-specific issues, and recovery procedures.