Session

From SQL to Spark and Back Again - Spark and Python for SQL Users

SQL and SQL Server have been the bedrock of analytics for decades. With the advent of the cloud in general, and lakehouse architectures in particular, Spark and Python have seen a meteoric rise in popularity, thanks to their flexibility and their ability to tackle the parallelism problem that has long been the Achilles heel of classic SQL Server.

Spark is one of the fundamental workloads in Microsoft Fabric and offers a (potentially) low cost/high performance solution on demand for complex workloads. But how does it differ from SQL Server? Can you just convert SQL code to Python and spin up your Spark cluster? Short answer: maybe, but you really shouldn't. Let me show you a better way!

In this session, we will look at the fundamental differences in how code is executed and in the underlying architectures, explore the similarities and differences between Python and SQL, and dive into PySQL: when and why there might (or might not!) be a reason to use it over Python.

You will walk away from this session with a good understanding of how Spark differs from SQL Server, which antipatterns to watch out for, and how to make the most of your shiny Spark solution!

Alexander Arvidsson

Making Data Matter

Linköping, Sweden
