Speaker

Rob de Wit

Developer advocate at Y42

Utrecht, The Netherlands

After graduating in 2019, Rob soon realized he wanted to learn more about the underlying infrastructure and engineering that data science depends on. Since then, he has worked in various data-related roles, from data analyst to data platform engineer.

Nowadays, he works as a developer advocate for Y42, a startup building a turnkey data orchestration platform. His favorite part of the job is meeting other practitioners, learning from them, and sharing what he has learned himself.

Area of Expertise

  • Information & Communications Technology

Topics

  • Data Engineering
  • Data Platform
  • Data Engineering Pipelines
  • Data Engineering with Python
  • Analytics and Big Data
  • Databricks
  • Snowflake
  • BigQuery
  • Terraform
  • SQL
  • Git
  • GitOps
  • Azure Data Platform

Complete data pipelines with dlt and dbt

I'll discuss how to extend dbt with dlt (data load tool) to create a complete pipeline. I'll show the integration with dbt, and how dlt compares to alternatives like Airbyte and Fivetran.

GitOps for Data: bridging the divide between data and code with Virtual Data Builds

In essence, our pipelines can break in two ways: through breaking code changes and through “bad data” extracted from a raw source. In this talk, I will argue that both can be remedied with a GitOps approach to data.

We will dive into what GitOps for data should look like and how it would overcome these two root problems for pipelines. We will then examine a solution that bridges the existing divide between data and code: Virtual Data Builds (VDBs).

Beyond the hype cycle — what you actually need from your data platform

As data engineers, we tend to overcomplicate matters. Nobody really cares whether we use a data warehouse or a data lakehouse. We just need a place to store and process our data into insights. In this talk, I'll propose a minimum viable data stack for the masses based on four fundamental principles: automatability, testability, observability, and reproducibility.

exec(ut)

Host

March 2024 Utrecht, The Netherlands

SNiC CreativIT

Track host

November 2023 Utrecht, The Netherlands

PyCon US

Transforming a Jupyter Notebook into a reproducible pipeline for ML experiments

April 2023 Salt Lake City, Utah, United States

DataTalks.Club

GitOps for ML: Converting Notebooks to Reproducible Pipelines

February 2023

PyData Eindhoven

Becoming a Pokémon Master with DVC: reproducible machine learning experiments

December 2022 Eindhoven, The Netherlands

Deep Learning World

Becoming a Pokémon Master with DVC: Experiment Pipelines for Deep Learning Projects

October 2022 Berlin, Germany

Utrecht University

Guest lecture: The Measurable World

January 2021 Utrecht, The Netherlands

Study Association Sticky

Guest talk: Fraud & Data at bol.com

May 2020 Utrecht, The Netherlands
