
What's New in Delta Lake - Deep Dive?

The Delta Lake file format is the foundation of the Lakehouse. In the past few years, the Delta Lake project has been one of the most active in the Spark ecosystem, with lots of new features added. But what do those new features mean for your data platform, what opportunities do they open up, and what do you need to do to take advantage of them?

This session starts with a quick overview of Delta Lake to bring all attendees to the same baseline, then dives into the latest features, showing how they work, how to use them, and when they are useful. We’ll cover:
- SQL merge improvements
- Using the Change Data Feed to read a stream of changes to a Delta table
- ‘CREATE TABLE LIKE’ syntax for creating empty tables from the schema of an existing table
- Shallow clones of tables for copying tables without copying data files
- Deletion vectors for better merge/update performance and GDPR compliance
- Table features, the metadata that describes the features a given Delta table supports
- File level statistics on specific columns to help with skipping files on read
- Delta Universal Format (UniForm), which allows a Delta table to be read as if it were an Iceberg table
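
To give a flavour of the features above, here is a rough SQL sketch. The table names are hypothetical, `table_changes` is Databricks SQL, and the exact syntax and table properties may vary with your Delta Lake and Spark versions:

```sql
-- Create an empty table with the schema of an existing one
CREATE TABLE sales_staging LIKE sales;

-- Shallow clone: copies table metadata, not the underlying data files
CREATE TABLE sales_dev SHALLOW CLONE sales;

-- Enable deletion vectors so deletes/updates rewrite fewer files
ALTER TABLE sales
  SET TBLPROPERTIES ('delta.enableDeletionVectors' = 'true');

-- Enable the Change Data Feed, then read the changes between versions
ALTER TABLE sales
  SET TBLPROPERTIES ('delta.enableChangeDataFeed' = 'true');
SELECT * FROM table_changes('sales', 1, 5);

-- Universal Format: also publish the table's metadata as Iceberg
ALTER TABLE sales
  SET TBLPROPERTIES ('delta.universalFormat.enabledFormats' = 'iceberg');
```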

By the end of the session, attendees will have a better understanding of these new features and how they can be used to improve their data platform.

Niall Langley

Data Engineer / Platform Architect

Bristol, United Kingdom


