Mastering Spark Notebooks and Capacity Optimization in Microsoft Fabric
Running Spark notebooks in Microsoft Fabric opens up powerful possibilities, but it also introduces a compute model that can feel unfamiliar, especially to those coming from a traditional SQL Server or data warehouse background. By default, every notebook gets its own dedicated compute session, and while this provides strong isolation, it can quickly lead to unexpected capacity consumption and throttled workloads if not managed thoughtfully.
This session offers a deep dive into how Spark compute works under the hood in Fabric, with a focus on running efficiently without wasting Capacity Units. We'll explore the impact of autoscaling Spark pools, how bursting works in practice, and the new Autoscale Billing model, which charges for actual vCore usage per second rather than for the maximum allocation. You'll learn how to take control of your workloads through techniques like using small or single-node Spark pools, orchestrating notebooks with runMultiple(), and sharing sessions through High Concurrency Mode, both interactively and within Pipelines.
Whether you're building data pipelines, running exploratory work, or managing shared capacity across a team, this session will help you understand how Spark in Fabric behaves, how it’s billed, and how to optimize it for both performance and cost.

Just Blindbæk
Microsoft BI architect, trainer, speaker and MVP | twoday
Århus, Denmark