Session

Microsoft Fabric - Code first with notebooks and Python

Since notebooks arrived in Microsoft Fabric, we have had more options for solving our data engineering tasks. Alongside Dataflows Gen2, it is now also possible to create data pipelines based on Azure Data Factory. My favorite method for tackling data engineering challenges, however, has been notebooks. I do not use Python and PySpark for data engineering tasks alone; I also use Python to extract data from REST APIs. In this session, I demonstrate how I do that, including how I use secrets stored in an Azure Key Vault. Next, I will showcase how common and not-so-common data cleansing and transformation tasks can be tackled using Python and PySpark.
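To make the REST extraction idea concrete, here is a minimal sketch, not the code shown in the session. It assumes the API accepts a bearer token and returns JSON. The names `fetch_json`, `get_secret`, and `opener` are hypothetical and chosen only for illustration; the secret lookup is passed in as a function so that nothing sensitive is hard-coded in the notebook.

```python
import json
import urllib.request
from typing import Any, Callable


def fetch_json(
    url: str,
    get_secret: Callable[[], str],
    opener: Callable[..., Any] = urllib.request.urlopen,
) -> Any:
    """Call a REST endpoint with a bearer token and return the parsed JSON body.

    Assumptions (not from the session): the API uses bearer-token
    authentication and responds with UTF-8 encoded JSON.
    """
    # Resolve the secret at call time instead of storing it in the notebook.
    token = get_secret()
    request = urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )
    # `opener` is injectable so the function can be exercised without network access.
    with opener(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

In a Fabric notebook, `get_secret` could wrap a Key Vault lookup, for example a call along the lines of `notebookutils.credentials.getSecret(vault_url, secret_name)`; treat that call and both argument names as assumptions to check against the current Fabric documentation.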

This session will introduce Python for data engineering but will also cover advanced techniques such as user-defined functions, method chaining, and package management. All examples will be available for download from a public Git repository.

At its core, this session is an introduction to data engineering using Python, especially PySpark. Some experience with a programming language will help you follow every aspect covered in this session.

Tom Martens

Solution Architect @ Munich Re

Hamburg, Germany
