Search Results: sre

Found 157 Skills

spec-context

Use when you need to locate the current spec pack (FEATURE_DIR) in the Spec process of sdlc-dev, want to avoid reading or writing requirements/*.md in the wrong directory, or encounter issues such as "misreading context", "writing to the wrong file", or "non-compliant branch".

🇨🇳 Chinese | Translated

1 scripts/Checked

DevOps & Cloud Services | sickn33/antigravity-aweso...

observability-engineer

Build production-ready monitoring, logging, and tracing systems. Implements comprehensive observability strategies, SLI/SLO management, and incident response workflows. Use PROACTIVELY for monitoring infrastructure, performance optimization, or production reliability.

🇺🇸 English | Translated

DevOps & Cloud Services | sickn33/antigravity-aweso...

incident-response-incident-response

Use when working with incident response.

🇺🇸 English | Translated

DevOps & Cloud Services | pjt222/development-guides

write-incident-runbook

Create structured incident runbooks with diagnostic steps, resolution procedures, escalation paths, and communication templates for effective incident response. Use when documenting response procedures for recurring alerts, standardizing incident response across an on-call rotation, reducing MTTR with clear diagnostic steps, creating training materials for new team members, or linking alert annotations directly to resolution procedures.

🇺🇸 English | Translated

DevOps & Cloud Services | incidentfox/incidentfox

investigate

Systematic incident investigation methodology. Use when investigating production issues, service degradation, errors, latency spikes, or outages.

🇺🇸 English | Translated

DevOps & Cloud Services | onewave-ai/claude-skills

runbook-generator

Generates comprehensive operational runbooks for any system or process. Reads codebase, infrastructure config, and deployment scripts to produce structured runbook.md files formatted for on-call engineers. Use when you need operations documentation, incident response guides, deployment procedures, or disaster recovery plans.

🇺🇸 English | Translated

DevOps & Cloud Services | alirezarezvani/claude-ski...

incident-commander

Incident Commander Skill

🇺🇸 English | Translated

DevOps & Cloud Services | onewave-ai/claude-skills

incident-responder

Production incident response automation. Reads logs, checks recent deploys, identifies root cause, suggests fixes, drafts incident comms, creates post-mortem templates. Severity classification (SEV1-4), escalation paths, status page updates. Generates incident-report.md with timeline, root cause, impact assessment, remediation steps, and prevention measures.

🇺🇸 English | Translated

DevOps & Cloud Services | famaoai-creator/gemini-sk...

chaos-monkey-orchestrator

Injects managed chaos into environments to test system resilience. Validates that self-healing and monitoring systems work as expected under stress.

🇺🇸 English | Translated

1 scripts/Checked

DevOps & Cloud Services | generaljerel/chalk-skills

create-postmortem

Create a blameless postmortem when the user asks to write a postmortem, document what went wrong, analyze an incident, or run a 5 Whys analysis.

🇺🇸 English | Translated

DevOps & Cloud Services | alirezarezvani/claude-ski...

runbook-generator

Runbook Generator

🇺🇸 English | Translated

1 scripts/Checked

DevOps & Cloud Services | yuniorglez/gemini-elite-c...

debug-master

Senior Site Reliability Engineer & Debug Architect. Expert in AI-assisted observability, distributed tracing, and autonomous incident remediation in 2026.

🇺🇸 English | Translated