An Apache Spark query's journey through the layers of Databricks

A deep-dive session on Spark internals that explores how queries are executed in Apache Spark and across the layers of Databricks.

We will cover:

* Spark SQL and Catalyst
* A note on Tungsten
* Delta Lake
* Parquet files

These insights will be supported by glimpses into the official Apache Spark source code on GitHub.

You should leave with a better understanding of how queries are executed, along with some tools for troubleshooting and for optimizing for speed or cost.

Christian Henrik Reich

Sr Solution Architect @ Microsoft

Copenhagen, Denmark
