Found 157 Skills
Use when you need to locate the current spec pack (FEATURE_DIR) in the Spec process of sdlc-dev, want to avoid reading or writing requirements/*.md in the wrong directory, or run into issues such as misread context, writes to the wrong file, or a non-compliant branch.
Build production-ready monitoring, logging, and tracing systems. Implements comprehensive observability strategies, SLI/SLO management, and incident response workflows. Use PROACTIVELY for monitoring infrastructure, performance optimization, or production reliability.
Use when working with incident response.
Create structured incident runbooks with diagnostic steps, resolution procedures, escalation paths, and communication templates for effective incident response. Use when documenting response procedures for recurring alerts, standardizing incident response across an on-call rotation, reducing MTTR with clear diagnostic steps, creating training materials for new team members, or linking alert annotations directly to resolution procedures.
Systematic incident investigation methodology. Use when investigating production issues, service degradation, errors, latency spikes, or outages.
Generates comprehensive operational runbooks for any system or process. Reads codebase, infrastructure config, and deployment scripts to produce structured runbook.md files formatted for on-call engineers. Use when you need operations documentation, incident response guides, deployment procedures, or disaster recovery plans.
Incident Commander Skill
Production incident response automation. Reads logs, checks recent deploys, identifies root cause, suggests fixes, drafts incident comms, creates post-mortem templates. Severity classification (SEV1-4), escalation paths, status page updates. Generates incident-report.md with timeline, root cause, impact assessment, remediation steps, and prevention measures.
Injects managed chaos into environments to test system resilience. Validates that self-healing and monitoring systems work as expected under stress.
Create a blameless postmortem when the user asks to write a postmortem, document what went wrong, analyze an incident, or run a 5 Whys analysis.
Runbook Generator
Senior Site Reliability Engineer & Debug Architect. Expert in AI-assisted observability, distributed tracing, and autonomous incident remediation in 2026.