From Assembly Lines to Data Pipelines: A Blueprint for Building a Streaming Lakehouse

In 2022, we embraced a technology-first mindset, transitioning from a traditional automaker to a software-driven company that also builds cars. This shift forced us to rethink how we harness the 20+ petabytes of data generated daily by our global network of dealerships, connected vehicles, and factory IoT sensors.

Handling data at this scale across a global enterprise poses significant challenges: scalable ingestion, fault tolerance, low-latency processing, and support for both operational and analytical workloads. In this session, we'll share the blueprint we developed for building an enterprise-grade lakehouse, grounded in the lessons we learned deploying it in production.

Whether you're starting your own streaming lakehouse journey or scaling an existing one, this talk will provide actionable insights from our experience—what worked, what didn't, and how we ultimately succeeded using tools like Ververica's Flink to build a robust, real-time architecture.

David Kjerrumgaard

Committer on the Apache Pulsar Project | Published Author | International Speaker | Big Data Expert

Las Vegas, Nevada, United States

