Session
Ensuring Zero Downtime: Resiliency Testing Strategy for Business-Critical Systems
Business-critical systems in payments that support real-time transaction processing are expected to be available and highly responsive 24/7/365. These systems must be fault-tolerant and highly resilient to any failures that might happen during payment transaction processing. Resiliency testing is the key to ensuring uptime and performance under unpredictable conditions.
With customers expecting continuous availability of business-critical systems, companies must rethink not only how they build reliable systems but also how they test them. They need to go beyond traditional testing and adopt resiliency testing practices as part of their Software Development Lifecycle.
This talk explores real-world strategies for testing the resiliency of business-critical software systems, including failure injection, chaos engineering, and disaster recovery. You will learn how to plan resiliency tests, proactively test for failures, optimize recovery time, and build reliable systems that can handle extreme loads. Together, these practices help prevent costly outages, maintain business continuity, and build failure-resistant software systems.
Key Takeaways:
- Understanding resiliency testing and its importance for business-critical systems
- Tools and techniques for implementing resiliency testing
- How to introduce failures in a controlled environment and observe system behavior
- Simulating real-world failures: latency spikes, network disruptions, process failures, platform service outages
- Disaster Recovery and failover for rapid recovery after outages
- Automated Resiliency Testing in CI/CD pipelines
This talk is inspired by the challenges faced when implementing major changes to existing business-critical systems or when replatforming legacy infrastructure that runs critical workloads. Non-functional testing focuses on the system’s performance SLAs; however, it doesn’t provide evidence that the system has been built to withstand failures under load. Resiliency testing provides confidence that the system can handle failures gracefully and continue to serve customers without any impact.

Aman Sardana
Discover Financial Services, Expert Application Architect
Chicago, Illinois, United States