Comparing Apache Flink and Spark for Modern Stream Data Processing

Real-time data processing is essential for staying competitive in today’s fast-paced business environment, and choosing the right tool is a key decision. Apache Flink and Spark Structured Streaming are two leading stream processing frameworks, each with unique strengths and trade-offs.

This talk takes a look at our journey at Decodable, where we evaluated both tools and ultimately chose Apache Flink over Spark Structured Streaming for our stream data processing needs. By examining key differences between the two systems, we aim to provide a clear, technical comparison that will help you make informed decisions for your streaming data use cases.

Join us for this talk where we will discuss:
1. Design philosophies: Learn about the origins of both systems and some of the fundamental architectural design choices in Flink that make it more attractive for streaming use cases.
2. (Stateful) streaming capabilities: We will compare similar features that Spark and Flink offer across their various APIs, and share some features available only in Flink that make it a much richer streaming library. We will also cover some of the data ecosystem tools and connectors that Flink supports natively, such as Debezium.
3. Production readiness: We will also talk about some recent Flink features that make running Flink at scale easy, such as the Kubernetes operator and its sophisticated autoscaler.

Conference: Flink Forward Berlin 2024
Slides: https://speakerdeck.com/sharonx/comparing-apache-flink-and-spark-for-modern-stream-data-processing

Sharon Xie

Staff PM @ MongoDB
