Session

Challenges of Spark Application coexisting with NoSQL databases

Capital One is the first US bank to exit its on-premises data centers and move completely to the cloud. While modernizing our application in Capital One Card Rewards, we built a custom transaction-processing application from the ground up on open source technologies such as Apache Spark, MongoDB, and Apache Cassandra. This application currently processes millions of customer transactions daily, awarding customers millions of miles, cash, and points every day. While building it, we ran into many challenging issues in getting a Spark application to process data from MongoDB and Cassandra backends to serve customers. This talk focuses on a few of those issues, their impact, and how to mitigate them. Specifically, the talk covers the following issues:

How the Cassandra key sequence matters and how it affects querying
How Cassandra batching helps and works well with Spark partitions
The importance of Cassandra data modeling and its implications after MVP/deployment
How to manage Mongo connections (at the JVM level)
Implications of using the MongoSpark connector and its Partitioner
We faced all of these issues in our own application. This talk explains what they are in a Spark/MongoDB/Cassandra application.
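
As a rough illustration of the first item, the sketch below models why key order matters when querying Cassandra. It is a simplified toy check, not Cassandra's actual query planner: it considers only equality predicates and ignores secondary indexes and ALLOW FILTERING. The table layout (account_id as partition key, txn_date and txn_id as clustering columns) and the function name can_serve are hypothetical and are not taken from the talk.

```python
def can_serve(partition_key, clustering_key, equality_columns):
    """Toy check: do equality predicates respect the primary key order?

    Simplified rules (assumptions of this sketch):
    - only primary key columns may be restricted,
    - every partition key column must be restricted,
    - clustering columns may be restricted only as a leading prefix.
    """
    restricted = set(equality_columns)

    # Columns outside the primary key would need an index or filtering.
    if not restricted <= set(partition_key) | set(clustering_key):
        return False

    # The full partition key is needed to locate the partition.
    if not set(partition_key) <= restricted:
        return False

    # Clustering columns must be restricted in declared order, with no gaps.
    gap_seen = False
    for column in clustering_key:
        if column in restricted:
            if gap_seen:
                return False
        else:
            gap_seen = True
    return True


# Hypothetical table: PRIMARY KEY ((account_id), txn_date, txn_id)
PARTITION_KEY = ["account_id"]
CLUSTERING_KEY = ["txn_date", "txn_id"]

by_account = can_serve(PARTITION_KEY, CLUSTERING_KEY, ["account_id"])
by_account_and_date = can_serve(PARTITION_KEY, CLUSTERING_KEY, ["account_id", "txn_date"])
skips_txn_date = can_serve(PARTITION_KEY, CLUSTERING_KEY, ["account_id", "txn_id"])
no_partition_key = can_serve(PARTITION_KEY, CLUSTERING_KEY, ["txn_date"])
```

In this model the first two lookups are servable, while the last two are rejected because they either skip a clustering column or omit the partition key. This is the kind of constraint that makes the key sequence a data modeling decision rather than a detail to fix later.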

Gokul Prabagaren

Lead Software Engineer at Capital One

McLean, Virginia, United States
