Speaker

Julien Tournay

Data Engineer at Spotify

Julien has been working with Scala and contributing to various open-source projects for the past 10 years. Since 2018, he has been working at Spotify as a Data Engineer in the Data & Insight tribe, building libraries and tooling used in most of Spotify's data pipelines.

Data processing at Spotify - Why portability matters

For the last few years, Spotify has been developing Scio, an open-source Scala framework for developing data pipelines and deploying them on any execution engine. During that time, Spotify has deployed and run thousands of unique Scio jobs in production.

One of the benefits of Scio is its portability. A Scio job can run on Dataflow, Flink, Hadoop or Spark without any code change. This gives our users the ability to adapt their infrastructure to their evolving needs.

In this talk, we will show how one can write a data processing job that handles a large amount of data and run it on Dataflow. Then, in just a few steps, we will show how the runtime can be changed completely so that the very same job runs on Flink.

We will also explain why portability matters to every company, whether it is just starting its data-processing journey or, like Spotify, is already data-driven.
