From Apache Spark to Delta Lake: A practical introduction to scalable data engineering

This session will introduce you to the fundamentals of Apache Spark, the powerful distributed computing engine for big data processing. We'll cover how it works, where it's commonly used, and when it's the right choice for your data challenges. From there, we'll explore the lakehouse paradigm using Delta Lake and how it seamlessly integrates with Spark to enhance reliability and performance.

Through real-world examples and live demos, you'll see how Spark and Delta Lake work together to support both streaming and batch workloads in a unified architecture. We'll also dive deeper into the lakehouse approach—a modern architecture that combines the openness and flexibility of data lakes with the reliability and performance of data warehouses.

Thibauld Croonenborghs

Data Architect at AE

Brugge, Belgium
