An Apache Spark query's journey through the layers of Databricks
This deep-dive session on Spark internals explores how queries are executed in Apache Spark and across the layers of Databricks.
We will cover:
* Spark SQL and Catalyst
* A note on Tungsten
* Delta Lake
* Parquet files
These insights will be supported by glimpses into the official Apache Spark source code on GitHub.
You should leave with a better understanding of how queries are executed, along with some tools for troubleshooting and for optimizing speed or cost.