Data-led prediction with Spark and MLFlow

Explore the capabilities of Apache Spark paired with MLFlow, a comprehensive platform for managing the end-to-end machine learning lifecycle. Understand how these two open-source tools combine to implement the data-led machine learning prediction architecture.

Throughout the discussion, we'll examine the different architectures for exposing ML models in production. We'll then focus on the data-led approach and demo how to implement it with Spark and MLFlow.

*Target audience:* data engineers, data scientists, ML engineers, and backend developers who work with machine learning deployment and/or MLOps.

*Technical level:* intermediate. We will discuss architectures for exposing ML models and show implementation code.

*Duration:* flexible, from 30 minutes to 1 hour (the depth will be adapted to the available time).

The session is based on the real-life experience of bringing several ML models from the data scientists' hands to production, using Spark and MLFlow.

I've also shared this experience in a series of three Medium articles:
1 - https://itnext.io/intro-to-mlops-model-life-cycle-from-a-data-engineers-eyes-b9347440fae4?source=friends_link&sk=e313e9855176ba85064408d8251fd50b
2 - https://medium.com/israeli-tech-radar/avoid-the-ml-dependencies-syncing-black-hole-2de061c1870e?source=friends_link&sk=8ad64bb408f9d172c422ddd528ac5a99
3 - https://medium.com/israeli-tech-radar/uncovering-mlflows-spark-udf-e46603971afa?source=friends_link&sk=12ad3474db4d64c789dab8171aa8de74

Yerachmiel Feltzman

Senior Big Data Engineer @ Tikal

Tel Aviv, Israel
