Session
Iceberg and Storage Partition Join
This session explores Iceberg optimizations, in particular the pivotal role of Apache Spark Storage-Partition Joins (SPJ) on Iceberg tables. We will take a deep dive into how Spark SPJ can eliminate shuffle operations entirely, which is essential to running our most resource-intensive jobs. We will explain the many Spark SPJ enhancements for Iceberg developed by Apple, covering when and how to enable each one for different use cases. Finally, we will discuss our results and the areas where SPJ could be further improved in the community.
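
For readers new to SPJ, a minimal sketch of what "enabling" it can look like follows. The abstract does not say which settings the talk covers, so the selection here is an assumption: the property names are taken from the Spark 3.4+ and Iceberg documentation, the right values depend on your versions and workload, and the helper `to_submit_args` is a hypothetical convenience for turning the settings into `spark-submit` flags.

```python
# Assumed SPJ-related settings (not taken from the session itself).
# Property names follow the Spark 3.4+ and Iceberg documentation.
SPJ_CONF = {
    # Let Spark use the partitioning reported by a V2 source such as Iceberg.
    "spark.sql.sources.v2.bucketing.enabled": "true",
    # Allow SPJ when the two sides do not share exactly the same partition values.
    "spark.sql.sources.v2.bucketing.pushPartValues.enabled": "true",
    # Do not require the join keys to cover every partition column.
    "spark.sql.requireAllClusterKeysForCoPartition": "false",
    # Ask Iceberg to preserve its data grouping when planning scans.
    "spark.sql.iceberg.planning.preserve-data-grouping": "true",
}


def to_submit_args(conf):
    """Hypothetical helper: render settings as spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(SPJ_CONF)))
```

Both tables in the join also need compatible partitioning (for example, the same bucket transform on the join key) for Spark to plan the join without a shuffle; checking the physical plan for the absence of an Exchange node is the usual way to confirm it.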

Himadri Pal
Principal Software Engineer - Data and AI at Apple
Cupertino, California, United States