Streaming Jobs with Azure Databricks 101

Streaming is one of the buzzwords used when talking about the Lakehouse. It promises to give us real-time analytics by enabling a continual flow of data into our analytics platforms. It's used to power real-time processes as diverse as fraud detection, recommendation engines, stock trading, GPS tracking and social media feeds. However, for data engineers used to working with batch jobs, this can be a big paradigm shift.

In this session we take a look at Spark Structured Streaming:
- When and why should we use it
- How it works
- Aggregating data
- Joining streams
- Late arriving data
- Latency and performance
- Running streaming pipelines

At the end of the session you'll know when and why to use Spark Structured Streaming, and what gotchas to look out for as you start your journey with streaming pipelines.

50 Minute Session

Niall Langley

Data Engineer / Platform Architect

Bristol, United Kingdom
