Session

Optimizing Your Apache Iceberg Lakehouse

Join Lester, author of the upcoming O'Reilly book Optimizing Your Apache Iceberg Lakehouse, for a practical, fast-paced session on improving query performance across your data lakehouse. While we focus on Apache Iceberg, the techniques apply broadly to Delta Lake and Apache Hive as well.

We’ll start with optimizations you can apply today as a table consumer: maintaining statistics, using effective filtering and projection, and leveraging caching to reduce latency.

Then we will go under the hood to show how your lakehouse tables should be structured and maintained to improve performance at scale, covering join optimization and file size considerations, as well as compaction, partitioning, bucketing, and file-level sorting.

You’ll learn how to:

- Reduce the amount of scanned data and speed up queries with statistics, filtering, and projection pruning.

- Design tables for scale with partition strategies based on best practices.

- Maintain tables with compaction, metadata rewriting, and expiration.

You will leave with practical guidance you can apply immediately—no replatforming required.


Early releases of the book are available at https://learning.oreilly.com/library/view/optimizing-your-apache/0642572327040/

Lester Martin

Trino Developer Advocate - Starburst

Atlanta, Georgia, United States
