Speaker

Aman Sardana

Discover Financial Services, Expert Application Architect

Chicago, Illinois, United States

Aman Sardana is a technology professional in the financial services and payments domain. He is a hands-on technology leader, enabling business capabilities by implementing cutting-edge, modernized technology solutions. He is skilled in designing, developing, and implementing innovative financial technology solutions that drive business results and establish best-in-class operations.

Aman holds a Master's in Information Technology from Northwestern University. The program straddles the business and technical sides of information technology, focusing on data mining, information security, enterprise architecture, statistics, innovation, marketing, and finance.

Area of Expertise

  • Finance & Banking
  • Information & Communications Technology

Topics

  • Software Engineering
  • Distributed Software Systems
  • Modern Software Development
  • Software Design
  • Software Craftsmanship
  • Software Architecture
  • Software Development
  • Agile software development
  • cloud-native software architecture
  • DevOps
  • DevOps Transformation
  • DevOps & Automation
  • Cloud & DevOps
  • Continuous Software Development
  • software architecture
  • engineering leadership
  • Platform Engineering
  • engineering management
  • Engineering Culture & Leadership
  • Site Reliability Engineering
  • Backend Engineering
  • Engineering Culture
  • Data Engineering
  • Software Engineering Management

Securing Open Banking APIs platform using Highly Secure FAPI profiles

Open Banking is transforming consent-based sharing of consumer data, and that data needs the highest level of protection. This session covers how to protect Open Banking APIs using Financial-grade API (FAPI) profiles and how to deliver microservices-driven products in the banking and fintech industries.

Navigating Hybrid Cloud Challenges for Business-Critical Systems

Core platform systems support business-critical applications and are expected to be available and highly responsive 24/7/365. For example, financial institutions with legacy infrastructure are increasingly adopting hybrid cloud solutions to stay competitive, balance scalability, drive cost efficiency, and achieve regulatory compliance. However, this shift introduces significant challenges related to system performance, security, and compliance. These challenges must be addressed with strategic mitigation measures.
In this talk, we'll share insights, experiences, and innovative solutions for navigating hybrid cloud challenges in the financial technology domain, covering core backend processing systems, Open Banking, and payment processing. Attendees will gain practical strategies for securing hybrid environments, ensuring compliance with financial regulations, and optimizing performance across distributed cloud ecosystems. These strategies can be applied to any industry vertical that operates critical infrastructure with hybrid cloud deployments.

Key Takeaways:
- The importance of hybrid cloud in modern FinTech operations
- Hybrid cloud architecture and infrastructure best practices
- Data Management and Governance
- Security considerations and risk management in hybrid cloud environments
- Compliance and regulatory challenges (PCI DSS, GDPR, etc.)
- Performance optimization strategies for financial workloads in hybrid cloud deployments
- Business Continuity, Disaster Recovery and Resiliency of critical infrastructure
- Cost management and FinOps
- Vendor Lock-In and Dependency

This talk is inspired by the challenges we faced when architecting and running business-critical workloads spread across legacy and modern infrastructure. Hybrid cloud presents unique challenges for critical systems, but with the right strategies, organizations can overcome security risks, ensure compliance, and optimize performance. A well-planned hybrid cloud architecture balances cost, scalability, and resilience while ensuring business continuity in high-stakes industries such as fintech, telecommunications, and healthcare.

Ensuring Zero Downtime: Resiliency Testing Strategy for Business-Critical Systems

Business-critical systems in payments that support real-time transaction processing are expected to be available and highly responsive 24/7/365. These systems must be fault-tolerant and highly resilient to any failures that might happen during payment transaction processing. Resiliency testing is the key to ensuring uptime and performance under unpredictable conditions.
With customers expecting continuous availability of business-critical systems, companies must rethink not only how they build reliable systems but also how they test them. They need to go beyond traditional testing and adopt resiliency testing practices as part of their software development lifecycle.
This talk explores real-world strategies for testing the resiliency of business-critical software systems, including failure injection, chaos engineering and disaster recovery. You will learn how to plan for resiliency tests, proactively test for failures, optimize recovery time, and build reliable systems that can handle extreme loads. This ultimately helps to prevent costly outages, maintain business continuity, and build failure-resistant software systems.
Key Takeaways:
- Understanding resiliency testing and its importance for business-critical systems
- Tools and techniques for implementing resiliency testing
- How to introduce failures in a controlled environment and observe system behavior
- Simulating real-world failures: latency spikes, network disruption, process failures, platform service outages
- Disaster Recovery and failover for rapid recovery after outages
- Automated Resiliency Testing in CI/CD pipelines
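
As a rough illustration of controlled failure injection, the Python sketch below wraps a dependency call so that a test can add latency and raise errors on demand, then checks how a simple retry helper behaves. The names `inject_faults` and `call_with_retry`, the use of `ConnectionError`, and the parameter values are assumptions made for this example; they are not taken from the talk or from any specific chaos engineering tool.

```python
import random
import time


def inject_faults(func, error_rate=0.0, latency_s=0.0, rng=None, sleep=time.sleep):
    """Wrap func so each call may be delayed and may fail (hypothetical helper).

    error_rate is the probability of raising an injected ConnectionError.
    latency_s is the extra delay added before every call.
    sleep is injectable so tests can record delays instead of waiting.
    """
    rng = rng or random.Random()

    def wrapper(*args, **kwargs):
        if latency_s > 0:
            sleep(latency_s)  # simulate a latency spike
        if rng.random() < error_rate:
            raise ConnectionError("injected fault")  # simulate a dependency outage
        return func(*args, **kwargs)

    return wrapper


def call_with_retry(func, attempts=3):
    """Call func up to `attempts` times, re-raising the last connection error."""
    last_error = None
    for _ in range(attempts):
        try:
            return func()
        except ConnectionError as exc:
            last_error = exc
    raise last_error


if __name__ == "__main__":
    # Inject a failure on every call and observe that retries are exhausted.
    always_down = inject_faults(lambda: "ok", error_rate=1.0)
    try:
        call_with_retry(always_down)
    except ConnectionError as exc:
        print("system under test surfaced:", exc)
```

A check like this can run in a CI/CD pipeline next to functional tests, with the error rate and latency chosen to match the failure scenarios being rehearsed.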

This talk is inspired by the challenges faced when implementing major changes to existing business-critical systems or when replatforming the legacy infrastructure that runs critical workloads. Non-functional testing focuses on the performance SLA of the system; however, it doesn't provide evidence that the system has been built to withstand failures under load. Resiliency testing provides confidence that the system can handle failures gracefully and continue to serve customers without impact.

Distributed Caching in Highly Available Real-Time Payment Systems

Have you ever wondered what it takes to build a highly resilient distributed caching platform for critical real-time payment systems? Join us as we share our journey of building a highly available and fault-tolerant caching solution while leveraging automation to achieve a faster MTTR.

Payment systems that support real-time transaction processing are expected to be available and highly responsive 24/7/365. These systems must be fault-tolerant and highly resilient to any failures that might happen during payment transaction processing.
Building a low-latency payment system that spans multiple geographic locations requires a distributed caching solution for low latency and high-throughput data access. While building a real-time transaction processing system at Discover - one that is continuously available and can process thousands of transactions per second with a sub-second response time - we decided to introduce distributed caching technology into our architecture to minimize the latency and to replicate the data across multiple data centers. This presented numerous challenges:
• How to run Active-Active cache clusters in multiple deployment regions with high-availability
• How to auto-heal cache cluster failures due to network partitioning or disaster
• How to automatically fail over client applications in response to failures
• How to automatically recover failed clusters to reduce MTTR
• How to ensure data consistency after the cluster is recovered
• How to implement a real-time monitoring and alerting solution for client and cluster connectivity

In this talk, we will take you through our journey of how we built a distributed caching solution to solve the challenges we faced.
• Configuration-driven and thread-safe smart client solution that can intelligently detect failures
• Ability for the client application to fail over if the error rate is above a certain threshold
• Automation to recover a failed cluster and fail back client connections after the cluster is recovered
• Workflow automation for data consistency after the cluster is recovered
• Real-time streaming of client connectivity metrics
• Observability solution to simplify operational support
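
The error-rate-based failover idea in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the client built at Discover: the class name `FailoverClient`, the sliding-window size, the threshold value, and the use of plain callables to stand in for cache clusters are all hypothetical.

```python
import threading
from collections import deque


class FailoverClient:
    """Reads from a primary cluster; switches to a secondary when the
    recent error rate on the primary exceeds a threshold (illustrative only)."""

    def __init__(self, primary, secondary, threshold=0.5, window=10):
        self._primary = primary        # callable standing in for the primary cluster
        self._secondary = secondary    # callable standing in for the secondary cluster
        self._threshold = threshold    # error rate above which the client fails over
        self._outcomes = deque(maxlen=window)  # True marks a failed call
        self._lock = threading.Lock()  # guards the window and the active flag
        self.active = "primary"

    def _record(self, failed):
        with self._lock:
            self._outcomes.append(failed)
            window_full = len(self._outcomes) == self._outcomes.maxlen
            rate = sum(self._outcomes) / len(self._outcomes)
            if window_full and rate > self._threshold:
                self.active = "secondary"

    def get(self, key):
        if self.active == "secondary":
            return self._secondary(key)
        try:
            value = self._primary(key)
        except ConnectionError:
            self._record(failed=True)
            return self._secondary(key)  # serve this request from the secondary
        self._record(failed=False)
        return value

    def fail_back(self):
        """Return to the primary once it has been recovered."""
        with self._lock:
            self._outcomes.clear()
            self.active = "primary"
```

The client waits for a full window before failing over so that a single early error does not trigger a switch. In this sketch, failing back is an explicit call that recovery automation would make after the cluster is healthy again.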

Discover the journey of building business-critical systems

Have you ever wondered what it takes to build a highly resilient distributed platform for critical real-time payment systems? Join us as we share our journey of building a highly available and fault-tolerant system while leveraging automation to achieve a faster MTTR.

Payment systems that support real-time transaction processing are expected to be available and highly responsive 24/7/365. These systems must be fault-tolerant and highly resilient to any failures that might happen during payment transaction processing.
Building a low-latency payment system that spans multiple geographic locations requires a distributed caching solution for low latency and high-throughput data access. While building a real-time transaction processing system at Discover - one that is continuously available and can process thousands of transactions per second with a sub-second response time - we decided to introduce distributed caching technology into our architecture to minimize the latency and to replicate the data across multiple data centers. This presented numerous challenges:
• How to run Active-Active cache clusters in multiple deployment regions with high-availability
• How to auto-heal cache cluster failures due to network partitioning or disaster
• How to automatically fail over client applications in response to failures
• How to automatically recover failed clusters to reduce MTTR
• How to ensure data consistency after the cluster is recovered
• How to implement a real-time monitoring and alerting solution for client and cluster connectivity

In this talk, we will take you through our journey of how we built a distributed caching solution to solve the challenges we faced.
• Configuration-driven and thread-safe smart client solution that can intelligently detect failures
• Ability for the client application to fail over if the error rate is above a certain threshold
• Automation to recover a failed cluster and fail back client connections after the cluster is recovered
• Workflow automation for data consistency after the cluster is recovered
• Real-time streaming of client connectivity metrics
• Observability solution to simplify operational support

Architecting Scalable and Resilient Platform Services: Patterns & Best Practices

Have you ever wondered what it takes to create resilient and highly available platform services that support mission-critical software systems? Please join me to find out how you can set the right strategy and foundational architecture for building platform services that businesses can trust for their most critical workloads.

Payment systems that support real-time transaction processing are expected to be highly available and highly responsive 24/7/365. These systems must be fault-tolerant and resilient to any failures that might happen during payment transaction processing.
Mission-critical payment systems with distributed architecture often depend on platform services like distributed caching, messaging, event streaming, databases, etc. that should be independently designed for high availability and fault tolerance.
In this talk, I’ll share the approach we took for architecting and designing platform services within the payments domain that can be applied to any domain that supports business-critical processes. This methodological approach starts with establishing a capability view for platform services and then defining the implementation and physical views. You’ll also gain an understanding of other aspects of platform services like provisioning, security, observability, testing, and automation that are important for creating a well-rounded platform strategy supporting business-critical systems.

In this talk, I’ll share my experience of architecting and building platform services that support critical real-time payment systems. The concepts shared during the talk will help the audience apply learnings when designing systems for high availability and resiliency.

Self-Healing Systems: The Delicate Balance Between Resilience, Availability and Cost

The core platform systems that support mission-critical applications are expected to be highly available and resilient 24/7/365. These systems are built with self-healing capabilities to automatically detect and recover from failures with minimal human intervention. These systems aim to maintain seamless service delivery even when components fail.
In this talk, we’ll share insights, experiences, and innovative solutions for striking a delicate balance between resilience, availability, and cost. Attendees will gain insights into often-overlooked trade-offs involved in building self-healing architectures, from over-provisioning and redundancy to observability and failover strategies.
Key Takeaways:
• Understand the design triangle of self-healing systems for a sustainable balance of availability and cost
• Understand the practical trade-offs between resilience and cost in self-healing system design
• Architecting smarter systems that recover gracefully without burning through your budget
• Building cost-aware healing strategies that degrade gracefully

This talk is inspired by the challenges we faced when architecting and running distributed platforms that demanded self-healing capabilities. Designing and operating such platforms calls for a careful assessment of resilience, availability, and cost, so that the trade-offs and risks are weighed in a balanced way.
