What's a Data Lake and What Does It Mean For My Open Source Stack?

Data lakes built on open table formats like Iceberg are a popular way to manage large datasets for analytics, data science, and AI. This talk explains how data lakes work and how to adapt open source analytic stacks to use them. First, we'll tour projects like Arrow, Iceberg, and Unity Catalog that make data lakes possible. Next, we'll see how analytic engines like DuckDB, ClickHouse, and Spark are adapting. Finally, we'll survey a few projects that enable applications written in Python, Golang, or Rust to deliver fast queries. You'll have to build the app yourself, but this talk will show you a path to using data lakes and open source successfully.

Robert Hodges

CEO at Altinity

Berkeley, California, United States
