Speaker

Debasish Das

Sr Manager, Machine Learning, Credit Karma

Debasish Das joined Credit Karma in 2018 and leads ML Platform. Prior to joining Credit Karma, Debasish worked at Verizon, Intel, Synopsys, Magma and Mentor Graphics. He earned his PhD in EECS from Northwestern and his BTech in CS from IIT Kharagpur. His current interests include scalable feature engineering, machine learning algorithms, prediction serving and optimization. His current focus is on developing machine learning workflows for financial recommendations and preventing asset fraud. He has contributed to open source projects such as TensorFlow, Apache Spark and ScalaNLP Breeze.

Vega: Unifying Machine Learning Workflows at Credit Karma using Apache Airflow

At Credit Karma, we enable financial progress for more than 100 million members by recommending personalized financial products to them when they interact with our application. In this talk, we introduce our machine learning platform for building interactive and production model-building workflows that serve relevant financial products to Credit Karma users.

Vega, Credit Karma’s Machine Learning Platform, has three major components: 1) QueryProcessor for feature and training data generation, backed by Google BigQuery; 2) PipelineProcessor for feature transformations, offline scoring and model analysis, backed by Apache Beam; and 3) ModelProcessor for running TensorFlow and scikit-learn models, backed by Google AI Platform, which gives data scientists the flexibility to explore different kinds of machine learning and deep learning models, ranging from gradient boosted trees to neural networks with complex structures.

Vega exposes a unified Python API for Feature Generation, Modeling ETL, Model Training and Model Analysis. Vega supports writing interactive notebooks and Python scripts that run these components in local mode with sampled data and in cloud mode for large-scale distributed computing. Vega provides the ability to chain the processors provided by data scientists through Python code to define the entire workflow. It then automatically generates the execution plan for deploying the workflow on Apache Airflow to run offline model experiments and refreshes. Overall, with the unified Python API and automated Airflow DAG generation, Vega has improved the efficiency of ML Engineering. Using Airflow, we deploy more than 20K features and 100 models daily.
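
The abstract does not show Vega's API, so the following is only a minimal sketch of the general idea of chaining processors in Python and deriving an execution order from the chain. The `Workflow` and `Step` classes, the `add` and `execution_plan` methods, and the step names are all hypothetical; only the three processor names come from the abstract. The ordering uses a plain topological sort from the Python standard library, which is an assumption about how an execution plan could be derived and not a description of how Vega generates Airflow DAGs.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Step:
    """One node in the workflow (hypothetical; not Vega's actual class)."""
    name: str
    processor: str  # e.g. "QueryProcessor"
    upstream: list = field(default_factory=list)


class Workflow:
    """Hypothetical container that records chained steps and their dependencies."""

    def __init__(self, name, mode="local"):
        # "local" and "cloud" mirror the two run modes named in the abstract.
        if mode not in ("local", "cloud"):
            raise ValueError(f"unknown mode: {mode}")
        self.name = name
        self.mode = mode
        self.steps = {}

    def add(self, name, processor, after=()):
        """Register a step that runs after the named upstream steps."""
        for dep in after:
            if dep not in self.steps:
                raise KeyError(f"unknown upstream step: {dep}")
        self.steps[name] = Step(name, processor, list(after))
        return self  # returning self allows calls to be chained

    def execution_plan(self):
        """Return step names in an order that respects all dependencies."""
        graph = {s.name: set(s.upstream) for s in self.steps.values()}
        return list(TopologicalSorter(graph).static_order())


# Illustrative chain: feature query, transformation, training, analysis.
wf = (
    Workflow("demo_refresh", mode="local")
    .add("features", "QueryProcessor")
    .add("transform", "PipelineProcessor", after=["features"])
    .add("train", "ModelProcessor", after=["transform"])
    .add("analysis", "PipelineProcessor", after=["train"])
)
plan = wf.execution_plan()
print(plan)
```

In a real deployment, each entry in such a plan would presumably map to a scheduler task, with the recorded upstream names becoming task dependencies.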

Vega: Scaling MLOps Pipelines at Credit Karma using Apache Beam and Dataflow

At Credit Karma, we enable financial progress for more than 100 million members by recommending personalized financial products to them when they interact with our application. In this talk, we introduce our machine learning platform, which uses Apache Beam and Google Dataflow to build interactive and production MLOps pipelines that serve relevant financial products to Credit Karma users.

Vega, Credit Karma’s Machine Learning Platform, uses BigQuery, Apache Beam, Distributed TensorFlow and Airflow for building MLOps pipelines. Vega uses Apache Beam with the Dataflow Runner for scalable feature transformations, model chaining, batch scoring of TensorFlow and PMML models, model analysis and online model monitoring.

In this session, we will walk you through the scalable Apache Beam jobs that we use for training, deploying, monitoring and refreshing the models for our recommendation system. Overall, our MLOps pipelines built on Apache Beam have improved the efficiency of ML Engineering. Using our pipelines, we deploy more than 500 TensorFlow and tree models to production every week.
