Session

Scalable monitoring for real time data & ML systems

“Garbage in, garbage out” still holds for data applications in 2021. In fact, this age-old saying describes one of the biggest challenges facing data and ML applications. Debugging, troubleshooting, and monitoring for data-related bugs take up the majority of an engineer's day. In DevOps, software operations have been raised to the level of an art: sophisticated tools enable engineers to quickly identify and resolve issues, continuously improving software stability and robustness. In the data world, operations are still largely a manual process involving Jupyter notebooks and SQL scripts. One of the cornerstones of the DevOps toolchain is logging; traces and metrics are built on top of logs, enabling monitoring and feedback loops. What does logging look like in a real-time data and ML system?

In this talk, we will show you how to enable statistical data logging for a data application. We will discuss how something so simple enables testing, monitoring, and debugging of the entire data pipeline. We will then dive deeper into the key properties of a logging library that can handle terabytes of data in a real-time system with Apache Kafka, and how we can enable monitoring of the modern data stack at scale.

Andy Dang

WhyLabs, Co-Founder and Engineering Lead
