SRE Playbooks
Step-by-step response playbooks for on-call: what to check, in what order, and how to confirm the fix.
Rollback or fix forward? An on-call decision playbook
Incident Response
An on-call decision playbook for choosing between rollback and fix-forward during an incident, with safe Kubernetes and ArgoCD rollback commands and verification steps.
Incident commander checklist
Incident Response
An on-call playbook for the incident commander role: declare and classify the incident, assign roles, run the comms cadence, mitigate before chasing root cause, hand off cleanly, and close out with a blameless review.