What's New in Delta Lake - Deep Dive?
The Delta Lake file format is the foundation of the Lakehouse. In the past few years, the Delta Lake project has been one of the most active in the Spark ecosystem, with lots of new features added. But what do those new features mean for your data platform, what opportunities do they open up, and what do you need to do to take advantage of them?
This session starts with a quick overview of Delta Lake to ensure attendees are at the same level, and then dives into the latest features, showing how they work, how to use them, and when they are useful. We’ll cover:
- SQL merge improvements
- Using the Change Data Feed to read a stream of changes to a Delta table
- ‘CREATE TABLE LIKE’ syntax for creating empty tables from the schema of an existing table
- Shallow clones of tables for copying tables without copying data files
- Deletion vectors for better merge/update performance and GDPR compliance
- Table features, the metadata that describes the features a given Delta table supports
- File-level statistics on specific columns to help with skipping files on read
- Delta Universal Format (UniForm), which allows a Delta table to be read as if it were an Iceberg table
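To give a flavour of the features listed above, here is a brief, non-authoritative sketch using Spark/Databricks SQL syntax. The table names (`sales`, `sales_dev`, `sales_clone`) are hypothetical placeholders, and exact property names and syntax can vary between Delta Lake and runtime versions:

```sql
-- 'CREATE TABLE LIKE': create an empty table with the schema of an existing one.
CREATE TABLE sales_dev LIKE sales;

-- Shallow clone: a new table whose metadata references the source's
-- data files, so no data is copied.
CREATE TABLE sales_clone SHALLOW CLONE sales;

-- Change Data Feed: enable it on the table, then read the row-level
-- changes recorded since a given table version.
ALTER TABLE sales SET TBLPROPERTIES ('delta.enableChangeDataFeed' = 'true');
SELECT * FROM table_changes('sales', 2);

-- Deletion vectors: mark rows as deleted without rewriting whole
-- data files, speeding up merge/update/delete operations.
ALTER TABLE sales SET TBLPROPERTIES ('delta.enableDeletionVectors' = 'true');

-- Delta Universal Format (UniForm): generate Iceberg metadata
-- alongside Delta metadata so Iceberg clients can read the table.
ALTER TABLE sales SET TBLPROPERTIES (
  'delta.universalFormat.enabledFormats' = 'iceberg'
);
```

The session itself walks through each feature in more depth, including when and why you would reach for it.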
By the end of the session, attendees will have a better understanding of the great new features available and how they can use them to improve their data platform.
Niall Langley
Data Engineer / Platform Architect
Bristol, United Kingdom