Session

Automating Resiliency at Scale

Failures are inevitable! With the growing complexity and number of dependencies in the microservices world, failures cannot be avoided entirely, but teams can prepare for them by building resilient systems. Such systems should recover proactively, backed by appropriate monitoring and alerting, and give end users a delightful, uninterrupted experience with fewer outages and fewer disruptions.

In this session, we will share how Intuit, with thousands of services across hundreds of Kubernetes clusters, automated resiliency at scale through a simplified, self-serve experience delivered via a continuous integration pipeline. By leveraging LitmusChaos (an open-source, cloud-native chaos engineering framework) and integrating it with Argo tools (ArgoCD, Argo Workflows, Argo ApplicationSets), we achieved higher developer productivity, enabling thousands of developers across the organization to build and ship reliable products. We will also share what we learned along the way and how this approach paved the way for Intuit-wide game days, so that you can apply the same principles and patterns within your organization and gain the confidence to run ad hoc chaos tests in production.

Deepthi Panthula

Senior Staff Product Manager

San Jose, California, United States
