Robin Moffatt

Principal DevEx Engineer, Decodable

Leeds, United Kingdom

Robin is a Principal DevEx Engineer at Decodable.
He has been speaking at conferences since 2009 including QCon, Devoxx, Strata, Kafka Summit, and Øredev. You can find many of his talks online at https://rmoff.net/talks/, and his blog articles at https://rmoff.net/. Outside of work, Robin enjoys running, drinking good beer, and eating fried breakfasts—although generally not at the same time.

Area of Expertise

Information & Communications Technology

Topics

Apache Kafka
Stream Processing
Analytics
Analytics and Big Data
Streaming Data Analytics
SQL
KSQL
Data Streaming
Data Integration
Apache Flink

🚢 All at Sea with Streams - Using Kafka to Detect Patterns in the Behaviour of Ships

The great thing about streams of real-time events is that they can be used to spot behaviours as they happen and respond to them as needed. Instead of waiting until tomorrow to find out what happened yesterday, we can act on things straight away.

This talk will show a real-life example of one particular pattern that it's useful to detect—ships engaged in potentially suspicious behaviour at sea. Transhipping is often used for legitimate purposes to optimise efficiencies but can also be used for nefarious purposes such as illegal fishing.

By capturing streams of maritime AIS data in real time into Kafka and processing them with ksqlDB, it's possible to detect the kind of characteristics that could indicate behaviour of interest, such as ships moving slowly in close proximity for a length of time.
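
As a rough illustration of that kind of rule (not the ksqlDB code from the talk), the check can be sketched in plain Python. The field names (`lat`, `lon`, `knots`) and the thresholds of 1 km, 2 knots, and three consecutive intervals are assumptions chosen for the example, not values from the talk or the blog.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def is_candidate(report_a, report_b, max_km=1.0, max_knots=2.0):
    """True when two position reports for the same interval are both slow and close together."""
    slow = report_a["knots"] <= max_knots and report_b["knots"] <= max_knots
    close = haversine_km(
        report_a["lat"], report_a["lon"], report_b["lat"], report_b["lon"]
    ) <= max_km
    return slow and close


def sustained(flags, min_consecutive=3):
    """True when at least min_consecutive successive intervals were all flagged."""
    run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        if run >= min_consecutive:
            return True
    return False
```

Here `is_candidate` would be evaluated once per time interval for a pair of ships, and `sustained` applied to the resulting sequence of flags, so that a single close pass does not count as a match.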

I'll demonstrate how the data was ingested from a raw TCP feed, unified with reference data from CSV files, and then processed to spot patterns with the resulting real-time stream of matches written to a new Kafka topic for validation and analysis.

Based on a blog I wrote: https://www.confluent.io/blog/streaming-data-with-confluent-and-ksqldb-for-new-use-cases-with-ais/

Kafka as a Platform: the Ecosystem from the Ground Up

Kafka has become a key data infrastructure technology, and we all have at least a vague sense that it is a messaging system, but what else is it? How can an overgrown message bus be getting this much buzz? Well, because Kafka is merely the center of a rich streaming data platform that invites detailed exploration.

In this talk, we’ll look at the entire streaming platform provided by Apache Kafka and the Confluent community components. Starting with a lonely key-value pair, we’ll build up topics, partitioning, replication, and low-level Producer and Consumer APIs. We’ll group consumers into elastically scalable, fault-tolerant application clusters, then layer on more sophisticated stream processing APIs like Kafka Streams and ksqlDB. We’ll help teams collaborate around data formats with schema management. We’ll integrate with legacy systems without writing custom code. By the time we’re done, the open-source project we thought was Big Data’s answer to message queues will have become an enterprise-grade streaming platform, all in 60 minutes.

🤖 Building a Telegram bot with Apache Kafka and ksqlDB

Imagine you’ve got a stream of data; it’s not “big data,” but it’s certainly a lot. Within the data, you’ve got some bits you’re interested in, and of those bits, you’d like to be able to query information about them at any point. Sounds fun, right? Since I mentioned “querying,” I’d hazard a guess that you’ve got in mind an additional datastore of some sort, whether relational or NoSQL.

But what if I told you...that you didn’t need any datastore other than Kafka itself? What if you could ingest, filter, enrich, aggregate, and query data with just Kafka? With ksqlDB we can do just this, and I want to show you exactly how.

In this hands-on talk we'll walk through an example of building a Telegram bot in which ksqlDB provides the key/value lookups driven by a materialised view on the stream of events in Kafka. We'll take a look at what ksqlDB is and its capabilities for processing data and driving applications, as well as integrating with other systems.

Apache Kafka and ksqlDB in Action: Let's Build a Streaming Data Pipeline!

Have you ever thought that you needed to be a programmer to do stream processing and build streaming data pipelines? Think again! Apache Kafka is a distributed, scalable, and fault-tolerant streaming platform, providing low-latency pub-sub messaging coupled with native storage and stream processing capabilities. Integrating Kafka with RDBMS, NoSQL, and object stores is simple with Kafka Connect, which is part of Apache Kafka. ksqlDB is the source-available SQL streaming engine for Apache Kafka and makes it possible to build stream processing applications at scale, written using a familiar SQL interface.

In this talk, we’ll explain the architectural reasoning for Apache Kafka and the benefits of real-time integration, and we’ll build a streaming data pipeline using nothing but our bare hands, Kafka Connect, and ksqlDB.

Gasp as we filter events in real-time! Be amazed at how we can enrich streams of data with data from RDBMS! Be astonished at the power of streaming aggregates for anomaly detection!

No More Silos: Integrating Databases into Apache Kafka

Companies new and old are all recognising the importance of a low-latency, scalable, fault-tolerant data backbone, in the form of the Apache Kafka streaming platform. With Kafka, developers can integrate multiple sources and systems, which enables low-latency analytics, event-driven architectures and the population of multiple downstream systems.

In this talk, we’ll look at one of the most common integration requirements: connecting databases to Kafka. We’ll consider the concept that all data is a stream of events, including that residing within a database. We’ll look at why we’d want to stream data from a database, including driving applications in Kafka from events upstream. We’ll discuss the different methods for connecting databases to Kafka, and the pros and cons of each. Techniques including Change-Data-Capture (CDC) and Kafka Connect will be covered, as well as an exploration of the power of KSQL for performing transformations such as joins on the inbound data.

Attendees of this talk will learn:

* That all data is event streams; databases are just a materialised view of a stream of events.
* The best ways to integrate databases with Kafka.
* Anti-patterns of which to be aware.
* The power of KSQL for transforming streams of data in Kafka.
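
To make the first point concrete, here is a minimal Python sketch of a table being rebuilt by replaying a stream of change events. The event shape (`op`, `key`, `value`) and the sample records are hypothetical, invented for illustration; they are not the format used by any particular CDC tool.

```python
def materialise(events):
    """Replay change events in order to build the current state of a table."""
    table = {}
    for event in events:
        if event["op"] == "delete":
            table.pop(event["key"], None)
        else:
            # Insert or update: the latest value for a key wins.
            table[event["key"]] = event["value"]
    return table


# Hypothetical change events, in the order they occurred.
changes = [
    {"op": "insert", "key": 1, "value": {"name": "Ann", "city": "Leeds"}},
    {"op": "insert", "key": 2, "value": {"name": "Bob", "city": "Oslo"}},
    {"op": "update", "key": 1, "value": {"name": "Ann", "city": "York"}},
    {"op": "delete", "key": 2, "value": None},
]

print(materialise(changes))  # {1: {'name': 'Ann', 'city': 'York'}}
```

The table holds only the latest state, while the event list keeps the full history, so the table can always be rebuilt from the events but not the other way round.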

Recording and Slides: http://rmoff.dev/ksny19-no-more-silos

The Changing Face of ETL: Event-Driven Architectures for Data Engineers

Data integration in architectures built on static, update-in-place datastores inevitably ends up with pathologically high degrees of coupling and poor scalability. This has been the standard practice for decades, as we attempt to build data pipelines on top of databases that do a poor job modelling the fundamental objects that drive our businesses and systems: events.

Events carry both notification and state, and form a powerful primitive on which to build systems for developers and data engineers alike. Developers benefit from the asynchronous communication that events enable between services, and data engineers benefit from the integration capabilities. Everyone gains from using the standards-based, scalable and resilient streaming platform.

In this talk, we’ll discuss the concepts of events, their relevance to both software engineers and data engineers and their ability to unify architectures in a powerful way. We’ll see how stream processing makes sense in both a microservices and ETL environment, and why analytics, data integration and ETL fit naturally into a streaming world. The talk will conclude with a hands-on demonstration of these concepts in practice using Apache Kafka and commentary on the design choices made.

Join this talk to learn:

* The power of events and unbounded data
* Streaming is not just for real-time applications—it’s for everyone
* Where a streaming platform fits in an analytic architecture
* How event-driven architectures can enable greater scalability and flexibility of systems both now and in the future

Recording and Slides: https://rmoff.dev/oredev19-changing-face-of-etl

From Zero to Hero with Kafka Connect

Integrating Apache Kafka with other systems in a reliable and scalable way is often a key part of a streaming platform. Fortunately, Apache Kafka includes the Connect API that enables streaming integration both in and out of Kafka. Like any technology, understanding its architecture and deployment patterns is key to successful use, as is knowing where to go looking when things aren't working.

This talk will discuss the key design concepts within Kafka Connect and the pros and cons of standalone vs distributed deployment modes. We'll do a live demo of building pipelines with Kafka Connect for streaming data in from databases, and out to targets including Elasticsearch. With some gremlins along the way, we'll go hands-on in methodically diagnosing and resolving common issues encountered with Kafka Connect. The talk will finish off by discussing more advanced topics including Single Message Transforms, and deployment of Kafka Connect in containers.

Recording and Slides: http://rmoff.dev/ksldn19-kafka-connect

🚂 On Track with Apache Kafka: Building a Streaming Platform solution with Rail Data

Want to know what you can REALLY do with Apache Kafka once you get going? This talk will show off lots of integration and stream processing techniques.
What started out as a fun project integrating live streams of updates of UK rail data turned into a full-blown data platform, with integration from ActiveMQ and S3, through ksqlDB and stream processing for joins and decoding, into various targets including analytics on S3, Elasticsearch, PostgreSQL, and graph analysis on Neo4j. It also uses Kafka compacted topics to demonstrate the stream/table theory, storing configuration that drives real-time alerts delivered through Telegram.
This talk will be a curated walk-through of the specifics of how I built the system, with code samples of the salient integration points in ksqlDB and Kafka Connect.
The data may be domain-specific but the challenges of handling batch and stream data to drive both applications and analytics are encountered by many, and this talk will give people lots of concrete examples on how to do it.

Recording and Slides: https://rmoff.dev/oredev19-on-track-with-kafka

NDC Oslo 2020 Sessionize Event

June 2020 Oslo, Norway

NDC Oslo 2019 Sessionize Event

June 2019 Oslo, Norway
