Learnings From Shipping 1000+ Streaming Data Pipelines To Production

Kafka Connect and Kafka Streams are foundational technologies in modern, real-time data architectures. They enable developers to build scalable, robust, real-time data pipelines without having to work with the low-level consumer and producer APIs of Apache Kafka. In this talk, we share our most important, and often surprising, learnings from using Kafka Connect and Kafka Streams to ship more than 1,000 streaming data pipelines to production. The goal of this talk is to enable you to build mature streaming data pipelines without falling into the common pitfalls.
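To make that abstraction concrete, here is a minimal sketch of a Kafka Streams pipeline that reads from one topic, applies a stateless transformation, and writes to another topic, without touching the consumer or producer APIs directly. The application ID, broker address, and topic names ("orders", "orders-uppercased") are placeholders chosen for illustration, not values from the talk.

import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class PipelineSketch {
    public static void main(String[] args) {
        // Basic configuration; values here are illustrative placeholders.
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "pipeline-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // Declare the pipeline: read, transform each value, write.
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input = builder.stream("orders");
        input.mapValues(value -> value.toUpperCase()) // stateless transformation
             .to("orders-uppercased");

        // Kafka Streams handles the underlying consumers, producers,
        // partition assignment, and failure recovery.
        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}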

We walk you through our journey of adopting Apache Kafka, Kafka Connect, and Kafka Streams. We discuss the challenges we faced and how we overcame them. Over the course of the talk, we provide answers to important questions, such as: Which metrics are useful for monitoring streaming data pipelines? How do you deal with resource-leaking connectors that impact the health of a Kafka Connect cluster? Where do you start when troubleshooting the performance of streaming data pipelines? How do you tune Kafka Connect to handle slow data sources or data sinks? And what is still missing in today’s ecosystem for streaming to become a commodity?
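As one illustration of the kind of tuning the talk covers, the sketch below builds a configuration for a sink connector writing to a slow destination. The connector class, topic, and numeric values are hypothetical; the consumer.override.* keys (per-connector client overrides, which require the worker's connector.client.config.override.policy to permit them) and the errors.* retry settings are standard Kafka Connect options.

import java.util.HashMap;
import java.util.Map;

public class SlowSinkConnectorConfig {
    public static Map<String, String> build() {
        Map<String, String> config = new HashMap<>();
        // Placeholder connector class and topic for illustration only.
        config.put("connector.class", "com.example.SlowSinkConnector");
        config.put("topics", "orders");
        config.put("tasks.max", "4"); // parallelize across topic partitions

        // Hand the sink task smaller batches and allow more time per poll loop,
        // so a slow destination does not trigger consumer group rebalances.
        config.put("consumer.override.max.poll.records", "100");
        config.put("consumer.override.max.poll.interval.ms", "600000");

        // Retry transient failures for a bounded time instead of failing the task.
        config.put("errors.retry.timeout", "300000");
        config.put("errors.retry.delay.max.ms", "30000");
        config.put("errors.tolerance", "none");

        return config;
    }
}

This map would typically be submitted to the Kafka Connect REST API when creating or updating the connector; the exact values depend on how slow the sink is and how much end-to-end latency the pipeline can tolerate.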

Hakan Lofcali

co-founder, CTO @ DataCater, previously AWS and ING Analytics

Düsseldorf, Germany
