Session

Deep dive on Apache Spark design pattern of filtering vs enriching the data

Apache Spark provides many options for joining datasets. This talk compares two approaches: filtering the data and enriching the data. It shows how both approaches produce the same result and highlights how the enriching approach helped us. We at Capital One are heavy users of Spark. The talk details how we evolved from filtering to enriching the data for credit card transactions and what benefits we gained from that change. As a financial institution, we are bound by regulation: we need to be able to backtrace every credit card transaction processed through our engine. The talk explains how the enriching approach met this requirement, and how other financial institutions can use it in their Spark workloads to backtrace all the data they process. It also covers the issues we ran into with the filtering approach in production and why we moved to the enriching approach. This use case is running successfully in production, processing billions of transactions yearly.
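To make the contrast concrete, here is a minimal sketch of the two approaches in plain Python, without Spark. It is not the speaker's implementation. The `Txn` record, the `ELIGIBLE_ACCOUNTS` lookup, the two rules, and the `reject_reason` column are all hypothetical, chosen only for illustration. In Spark, the filtering side would typically map to an inner join or `filter`, and the enriching side to a left outer join plus an added column.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Txn:
    txn_id: str
    account_id: str
    amount: float
    # Enrichment column (hypothetical): None means the row passed every rule.
    reject_reason: Optional[str] = None


# Hypothetical lookup data standing in for a joined reference dataset.
ELIGIBLE_ACCOUNTS = {"A1", "A2"}


def filter_approach(txns: List[Txn]) -> List[Txn]:
    """Drop failing rows at each step; rejected rows leave no trace."""
    eligible = [t for t in txns if t.account_id in ELIGIBLE_ACCOUNTS]
    return [t for t in eligible if t.amount > 0]


def enrich_approach(txns: List[Txn]) -> List[Txn]:
    """Keep every row and record why a row failed, if it did."""
    out = []
    for t in txns:
        if t.account_id not in ELIGIBLE_ACCOUNTS:
            out.append(replace(t, reject_reason="account_not_eligible"))
        elif t.amount <= 0:
            out.append(replace(t, reject_reason="non_positive_amount"))
        else:
            out.append(t)
    return out


txns = [
    Txn("T1", "A1", 25.0),
    Txn("T2", "A9", 40.0),   # account not in the lookup
    Txn("T3", "A2", -5.0),   # fails the amount rule
    Txn("T4", "A2", 10.0),
]

filtered = filter_approach(txns)
enriched = enrich_approach(txns)

# Selecting the rows with no reject reason recovers the filtered result.
accepted = [t for t in enriched if t.reject_reason is None]

# The enriched data also answers "why was this transaction excluded?".
audit = {t.txn_id: t.reject_reason for t in enriched if t.reject_reason}
```

In this sketch both approaches accept the same transactions, but only the enriched output keeps the rejected rows together with a reason, which is the property that supports backtracing.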

Gokul Prabagaren

Lead Software Engineer at Capital One

McLean, Virginia, United States
