
Rob de Wit
Developer advocate at Y42
Utrecht, The Netherlands
After graduating in 2019, Rob soon realized he wanted to learn more about the underlying infrastructure needed for data science and engineering. Since then, he has worked in various data-related roles, from data analyst to data platform engineer.
Nowadays, he works as a developer advocate for Y42, a startup building a turnkey data orchestration platform. His favorite part of the job is meeting other practitioners, learning from them, and sharing what he has learned himself.
Topics
Complete data pipelines with dlt and dbt
I'll discuss how to extend dbt with dlt (data load tool) to create a complete pipeline. I'll show the integration with dbt, and how dlt compares to alternatives like Airbyte and Fivetran.
GitOps for Data: bridging the divide between data and code with Virtual Data Builds
I argue that pipelines can break in essentially two ways: through breaking code changes and through “bad data” extracted from a raw source. In this talk, I will show that both can be remedied with a GitOps approach to data.
We will dive into what GitOps for data should look like and investigate how it would overcome the two root problems for pipelines. We will then investigate a solution that bridges the existing divide between data and code: Virtual Data Builds (VDBs).
Beyond the hype cycle — what you actually need from your data platform
As data engineers, we tend to overcomplicate matters. Nobody really cares whether we use a data warehouse or a data lakehouse. We just need a place to store and process our data into insights. In this talk, I'll propose a minimum viable data stack for the masses based on four fundamental principles: automatability, testability, observability, and reproducibility.
exec(ut)
Host
SNiC CreativIT
Track host
PyCon US
Transforming a Jupyter Notebook into a reproducible pipeline for ML experiments
DataTalks.Club
GitOps for ML: Converting Notebooks to Reproducible Pipelines
PyData Eindhoven
Becoming a Pokémon Master with DVC: reproducible machine learning experiments
Deep Learning World
Becoming a Pokémon Master with DVC: Experiment Pipelines for Deep Learning Projects
Utrecht University
Guest lecture: The Measurable World
Study Association Sticky
Guest talk: Fraud & Data at bol.com