Session

Maximize Efficiency in Microsoft Fabric: When to Choose Python Notebooks Over Spark

In this presentation, we will explore how Python notebooks in Microsoft Fabric can offer a more efficient and flexible alternative to Spark notebooks, especially for data analysis tasks. We will begin with an overview of Spark, covering its core architecture, distributed computing model, and the use cases where it excels at large-scale data processing. We will then discuss the challenges of using Spark, including its computational overhead and setup complexity, which not every task justifies.

Next, we will introduce Python notebooks in Microsoft Fabric, focusing on their role in streamlining data analysis workflows. By comparing Python notebooks to Spark, we will highlight when Python is the more lightweight and efficient solution, particularly for smaller datasets or tasks that don't require the full power of Spark’s distributed architecture. We will walk through practical examples of Python notebooks used for data exploration and analysis, emphasizing their simplicity, lower resource requirements, and performance advantages in Microsoft Fabric.

By the end of this session, attendees will understand when to leverage Python notebooks over Spark, empowering them to optimize their data analysis workflows and choose the right tool for the right task.

Atte Sukari

Senior Data Engineer at Norrin

Helsinki, Finland
