Session
Quest to Delta Optimisation
Delta has become a widely used tool for data professionals building effective, reliable Lakehouses in Databricks and MS Fabric.
Yet questions arise regarding its performance with large datasets, its ability to handle skewed data, and its concurrent write management. In this session, we will dive deep into optimization options and methods that will improve your Lakehouse performance.
Delta files are not ordinary data files; they are key to making a Lakehouse efficient, optimal, and scalable. However, optimizing Delta files and tables can be a challenging, even daunting task. Techniques like partitioning and z-ordering can be limited, inflexible, and hard to implement, especially when your data is constantly changing or growing.
This session will introduce you to the latest optimization techniques to enhance your query performance and simplify your optimization process. We will cover liquid clustering, a cutting-edge approach that offers flexibility and adaptability to data layout changes, and v-order, a write-time optimization for the Parquet file format that enables lightning-fast reads.
Furthermore, we will explore various other Delta file optimization techniques, such as data skipping, z-ordering, and vacuuming. These techniques will help you maximize the value of your Delta files while minimizing resource utilization and costs.
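As a taste of the techniques named above, the sketch below assembles the typical Delta maintenance commands (z-ordering via OPTIMIZE, liquid clustering via ALTER TABLE ... CLUSTER BY, and VACUUM) as SQL strings. The table name "sales" and the clustering column "customer_id" are hypothetical examples; in a Databricks or Fabric notebook you would run each command with spark.sql(cmd).

```python
# Hedged sketch of common Delta table maintenance commands.
# Table and column names are illustrative, not from the session.

table = "sales"

# Z-ordering co-locates rows with similar values in the chosen
# columns, improving data skipping for filters on those columns.
zorder_cmd = f"OPTIMIZE {table} ZORDER BY (customer_id)"

# Liquid clustering replaces rigid partitioning/z-ordering with
# clustering keys that can be changed as the data evolves.
cluster_cmd = f"ALTER TABLE {table} CLUSTER BY (customer_id)"

# VACUUM removes data files no longer referenced by the transaction
# log and older than the retention window (168 hours = 7 days).
vacuum_cmd = f"VACUUM {table} RETAIN 168 HOURS"

for cmd in (zorder_cmd, cluster_cmd, vacuum_cmd):
    print(cmd)
    # spark.sql(cmd)  # run inside an active Spark session
```

Note that a table cannot use z-ordering and liquid clustering at the same time; the session contrasts them as alternative layout strategies.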
By the end of this session, you'll have the necessary knowledge and tools to optimize Delta files and tables for your own Lakehouse.
Falek Miah
Principal Consultant at Advancing Analytics
London, United Kingdom