On-Call

When you're "not working" but might have to start working at any moment because something blew up. Familiar to doctors and devops (a role now performed by all software devs). Part of Incident Response.

PagerDuty is a tool/service for handling on-call rotations and escalations. https://www.pagerduty.com/

Charity Majors on how to make it less painful (over time): start by eliminating 90% of your pages, by moving from symptom-based alerting to SLOs (SLA). Many of those pages should at least be moved to "Lane 2" non-urgent notifications.
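A minimal sketch of what SLO-based routing with a non-urgent lane could look like. It is not taken from Charity Majors' writing. The function names, the "lane2" label, and the thresholds are assumptions for illustration. The 14.4 figure is a commonly cited fast-burn paging threshold, and a real setup would also evaluate multiple time windows.

```python
def burn_rate(errors: int, total: int, slo_target: float) -> float:
    """Return how fast the error budget is being spent.

    1.0 means errors arrive at exactly the rate the SLO allows;
    higher values mean the budget will run out early.
    """
    if total == 0:
        return 0.0  # no traffic, nothing burned
    error_budget = 1.0 - slo_target  # e.g. 0.001 for a 99.9% SLO
    return (errors / total) / error_budget


def route_alert(rate: float,
                page_threshold: float = 14.4,   # assumed fast-burn threshold
                notify_threshold: float = 1.0   # assumed slow-burn threshold
                ) -> str:
    """Decide whether a burn rate wakes someone up or waits for work hours."""
    if rate >= page_threshold:
        return "page"    # urgent: interrupt whoever is on call
    if rate >= notify_threshold:
        return "lane2"   # non-urgent notification, handled later
    return "none"        # within budget, no alert


if __name__ == "__main__":
    # 20 failed requests out of 1000 against a 99.9% SLO burns budget fast.
    print(route_alert(burn_rate(errors=20, total=1000, slo_target=0.999)))
    # 2 failed requests out of 1000 is over budget, but not urgently so.
    print(route_alert(burn_rate(errors=2, total=1000, slo_target=0.999)))
```

In this sketch only sustained budget burn pages a human. Everything else either goes to the non-urgent lane or produces no alert at all.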

