Session

Enriching datasets approach using Apache Spark solves granular tracing issue in regulated finance

Apache Spark provides lot of options of joining the data for its data sets. This talk will focus on comparing the approach of Enriching the data (left outer join) versus filtering the data(inner join).How both approaches end up with same result and highlight the merits of Enriching the data approach helped us in CapitalOne. We at CapitalOne are heavy users of Spark from its initial days.This talk will provide more details of how we evolved from filtering to Enriching the data for credit card transactions and highlight what benefits we got by following Enriching the data approach. Being the financial institution, we are bound by regulation.We need to backtrace all credit card transactions processed through our engine. Will be providing the details on how Enriching the data approach solved us this requirement. This talk will provide more context on how financial institutions can use Enriching the data approach for their Spark workloads and backtrace all the data they processed this approach. We have used the filtering approach in Production and what were it issues and why we moved to Enriching the data approach in Production will also be covered in this talk. Attendees will be able to take away the more details on Enriching and filtering options to decide on their use cases.

Indy Cloud Conference 2020

Gokul Prabagaren

Lead Software Engineer at Capital One

McLean, Virginia, United States

Actions

Please note that Sessionize is not responsible for the accuracy or validity of the data provided by speakers. If you suspect this profile to be fake or spam, please let us know.

Jump to top