
Navigating the Storm: Tools and Patterns for Resilient Systems in the Face of Production Chaos

Troubleshooting large, distributed systems during production outages can be a daunting challenge. When downtime is not an option, how do you steer your ship through the storm of technical difficulties?

Drawing on extensive experience keeping production running at one of South Africa's largest banks, this talk delves into resilience engineering. It will equip you to identify and reproduce issues, and to make your systems more resilient in the face of chaos.

We will explore an array of troubleshooting tools and techniques, including Application Performance Monitoring (APM) tools, logging, heap and thread dumps, application metrics, profiling, and load testing, and we will briefly touch on chaos engineering.

Recognizing that prevention is better than cure, we will conclude with patterns and strategies to help you build resilient systems. Topics will include timeouts, connection reuse, circuit breakers, bulkheads, and fallback mechanisms.

Renette Ros

Technical Lead @ Entelect

Pretoria, South Africa
