
The Convergence of Streaming and Data Lake Architectures for AI/ML

The rapid growth of data in recent years has increased the need for scalable, real-time data processing architectures that can support AI and machine learning (ML) workloads. This talk explores how streaming and data lake architectures are converging to meet that need. Traditionally, streaming systems such as Apache Kafka and data lakes built on platforms such as Apache Hadoop have been used independently: streaming for real-time data ingestion, and lakes for batch processing and long-term storage. Integrating the two paradigms makes it possible to build a unified data architecture that supports the diverse requirements of AI/ML workflows, including low-latency processing, high throughput, and large-scale storage.

This presentation discusses how recent advances on both sides, such as stream processing frameworks (e.g., Apache Flink) and lakehouse storage layers (e.g., Delta Lake), enable data to flow between real-time streams and batch processing layers. Key topics include the benefits of this hybrid approach for AI/ML, architectural patterns, and implementation strategies. The session also covers use cases in which companies have used this convergence to accelerate model training, strengthen data governance, and improve decision-making. Attendees will leave with practical insights into designing data platforms that combine the strengths of streaming and data lake architectures for AI and ML applications.

Lisa N. Cao

Product Manager at Datastrato
