Session
Don't Just Fix It, Learn From It - The Importance of Incident Management when Something Breaks
Panicked messages say the system is having issues. Your phone buzzes as your alerting system texts you that the system is down. Intuition kicks in and tells you to solve the issue as fast as possible and get back to your day. But while you have solved the issue at hand, you have not set yourself up for future success or kept yourself from doing the same thing next time around.
In this session, we will discuss the importance of not just solving the issue at hand but also learning from it and improving your processes. We'll review topics such as documenting outcomes as the incident unfolds, the importance of playbooks, and leading a successful postmortem to make sure this isn't a fix-and-forget situation. We'll go through a mock incident to see how we can incorporate each of these and other processes throughout, so that we learn from our mistakes and prevent similar scenarios from happening in the future.
While getting your system usable for your end users should be goal number 1, the very next goal is not falling into a similar state in the future. With this process in place, you will have the tools in your belt to prevent this from happening again.
Rick Clymer
Quality and Reliability Lead, RocketReach
Cleveland, Ohio, United States