Steffen Hausmann

Steffen Hausmann

Specialist Solutions Architect Analytics at Amazon Web Services

Steffen Hausmann is a Specialist Solutions Architect for Analytics at AWS. He works with customers around the globe to design and build streaming architectures so that they can get value from analyzing their streaming data. He holds a doctorate degree in computer science from the University of Munich and in his free time, he tries to lure his daughters into tech with cute stickers he collects at conferences.

Current sessions

Build a Unified Batch and Streaming Pipeline with Apache Beam on AWS

In this workshop, you explore an end to end example that combines batch and streaming aspects in one uniform Beam pipeline.

You start off with building a Beam pipeline to analyse taxi trip events. You deploy the the pipeline to Amazon Kinesis Data Analytics for Apache Flink to analyse incoming events in a streaming fashion. You also implement archival of the streaming data to Amazon S3 for long term storage. Finally, you extend the Beam pipeline by adding new metrics to the generated output. After applying the changed to the streaming pipeline, you reprocess the archived data in a batch fashion to backfill dashboards with the newly added metrics.

Join us to learn how to leverage Beam’s expressive programming model to unify batch and streaming. You will also understand how to combine different AWS services to create Apache Beam based batch and streaming architectures in a fully managed environment on AWS.

Prerequisites

* You’ll implement the Beam pipeline with Java. Some rudimentary knowledge of Java or a similar language is useful.
* You’ll connect to a Windows instance during the workshop, so please come prepared with a working RDP client (https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/connecting_to_windows_instance.html#rdp-prereqs).
* You do NOT need an AWS account, we will create accounts for you and distribute credentials before the workshop.


Unify Batch and Stream Processing with Apache Beam on AWS

One of the big visions of Apache Beam is to provide a single programming model for both batch and streaming that runs on multiple execution engines.

In this session, we explore an end to end example that shows how you can combine batch and streaming aspects in one uniform Beam pipeline: We start with ingesting taxi trip events into an Amazon Kinesis data stream and use a Beam pipeline to analyze the streaming data in near real time. We then show how to archive the trip data to Amazon S3 and how we can extend and update the Beam pipeline to generate additional metrics from the streaming data moving forward. We subsequently explain how to backfill the added metrics by executing the same Beam pipeline in a batch fashion against the archived data in S3. Along the way we furthermore discuss how to leverage different execution engines, such as, Amazon Kinesis Data Analytics for Java and Amazon Elastic Map Reduce, to run Beam pipelines in a fully managed environment.

So you will not only learn how you can leverage Beam's expressive programming model to unify batch and streaming you will also learn how AWS can help you to effectively build and operate Beam based streaming architectures with low operational overhead.


Build a Unified Batch and Stream Processing Pipeline with Apache Beam on AWS

In this workshop, we explore an end to end example that combines batch and streaming aspects in one uniform Beam pipeline. We start to analyze incoming taxi trip events in near real time with an Apache Beam pipeline. We then show how to archive the trip data to Amazon S3 for long term storage. We subsequently explain how to read the historic data from S3 and backfill new metrics by executing the same Beam pipeline in a batch fashion. Along the way, you also learn how you can deploy and execute the Beam pipeline with Amazon Kinesis Data Analytics in a fully managed environment.

So you will not only learn how you can leverage Beam's expressive programming model to unify batch and streaming you will also learn how AWS can help you to effectively build and operate Beam based streaming architectures with low operational overhead.