Session

Beam on Flink: How It Works and Why It Matters

Apache Beam is a data processing framework which conceptually is very similar to Flink. However, Beam does not have a full-blown runtime of its own. Instead, Beam programs run on execution engines such as Apache Flink, Apache Spark, or Google Cloud Dataflow.

Why is this interesting for Flink users?

Firstly, Apache Beam has multi-language support built-in. Want to write your data processing in Python or Go instead of Java? No problem with Beam.

Secondly, you write your Beam job only once with a batch/stream agnostic API. Then you are free to execute it on the execution engine of your choice. Besides Flink, do you have a Spark cluster that you want to run a batch job on, or do you want to try out Google Cloud Dataflow? Beam has you covered with no additional code changes.

You may ask: How does this work? How does a Beam program relate to a Flink program? What are the advantages and disadvantages of having the flexibility to choose the programming language and the execution engine?

In this talk, we will show how Beam and Flink work together, where they shine and where you might be better off with just using pure Flink.

Maximilian Michels

Software Engineer

Berlin, Germany

Actions

Please note that Sessionize is not responsible for the accuracy or validity of the data provided by speakers. If you suspect this profile to be fake or spam, please let us know.

Jump to top