Session

Build with the end in mind: infrastructure-backed data science with Kubeflow

As data scientists, we usually prototype use cases and try to find the one that can generate business value with the data on hand. We jump straight into the work and, by the end of the PoC, have accidentally wowed the stakeholders so much that they want the solution in production tomorrow. We scramble through our Jupyter notebooks and scripts to put together a pipeline that we think is reliable, only for the infrastructure engineer to turn around and say, "I can't use any of this".

At our company, we use Kubeflow to develop with deployment in mind. From the beginning, infrastructure sits with data science to gather the requirements for production. We set up the Kubeflow pipeline so that our experiments run exactly as they will run in production. From the data scientist's perspective, it's the same as writing notebooks; from the infrastructure engineer's perspective, it's the same as setting up Kubernetes.

In this talk, we will present our data science workflow with Kubeflow from both the DevOps engineer's and the data scientist's standpoints. We will also demonstrate how we have incorporated Kubeflow into our profile image analyser pipeline.

As the topic spans infrastructure and data science, we believe there is something for everyone in this talk. Whether you are a data scientist, machine learning engineer, data engineer, infrastructure engineer or software engineer, you are welcome as long as you are a cloud or data enthusiast!

Merelda Wu

Lead Data Scientist @ Melio Consulting
