Session
Recovery by Design: A Postmortem Adventure
It is a software engineering team's worst nightmare: production is down, management is unhappy, customers are affected, and it's only Monday. With a single command, production is back up, management is happy, customers are smiling, and it only took 3 minutes. Once the systems, services, and infrastructure are back up and running, the real work begins.
Systems and services are designed for 99% reliability, yet we have a host of engineers on call around the clock. Things will go wrong, and learning from our mistakes is the first step toward closing that last 1%.
Chris Houdeshell
VP of Eng. and Ops | Bit Herder | ☕
State College, Pennsylvania, United States