Session

An Introduction to Streaming Lakehouse Storage with Apache Pulsar

The "streaming lakehouse" has emerged as an architecture for managing both batch and streaming data in a single, adaptable repository. This session is a primer on streaming lakehouse storage and how it integrates with Apache Pulsar, a platform for real-time messaging and event streaming.

In this talk, you will learn how Apache Pulsar can serve as the foundation of a streaming lakehouse storage solution. The presentation covers several key aspects: ingesting data streams into Pulsar topics, managing storage efficiently with Pulsar's tiered storage, integrating with lakehouse table formats such as Apache Hudi or Delta Lake, and enabling real-time analytics through frameworks like Apache Flink or Apache Spark Streaming.

David Kjerrumgaard

Committer on the Apache Pulsar Project | Published Author | International Speaker | Big Data Expert

Las Vegas, Nevada, United States
