Session

Beyond Dashboards: Building Operational Intelligence for Open Source AI Infrastructure

Modern AI systems expose thousands of metrics, yet operators still rely on manual dashboard inspection during performance incidents, capacity planning exercises, and production outages. As AI infrastructure grows in complexity, raw telemetry alone is no longer sufficient.

This session introduces KVScope, an open source observability and diagnostics framework designed to transform low-level AI infrastructure metrics into actionable operational intelligence. Rather than simply displaying dashboards, KVScope analyzes runtime telemetry and identifies meaningful operational states such as queue pressure, resource saturation, throughput degradation, latency spikes, and recovery patterns.

Using real-world LLM serving workloads as a case study, we will explore how metrics can be normalized, correlated, and converted into timelines, events, and human-readable narratives that explain what is happening, why it is happening, and what operators should investigate next.
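As a rough illustration of the normalize, correlate, and narrate flow described above, the following Python sketch turns a handful of raw samples into state-change events and one-line narratives. It is not KVScope's implementation or API: the metric names (`queue_depth`, `gpu_cache_usage`), the limits, the 0.8 threshold, and every function name are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical metric names and upper limits; not taken from KVScope.
LIMITS = {"queue_depth": 100.0, "gpu_cache_usage": 1.0}


@dataclass
class Event:
    t: int        # index of the sample where the state changed
    state: str    # coarse operational state
    detail: str   # normalized metric values at that moment


def normalize(sample: Dict[str, float]) -> Dict[str, float]:
    """Scale each known metric to [0, 1] against its assumed limit."""
    return {
        name: min(max(sample[name] / limit, 0.0), 1.0)
        for name, limit in LIMITS.items()
        if name in sample
    }


def classify(norm: Dict[str, float], threshold: float = 0.8) -> str:
    """Correlate normalized metrics into one coarse operational state."""
    queue_hot = norm.get("queue_depth", 0.0) >= threshold
    cache_hot = norm.get("gpu_cache_usage", 0.0) >= threshold
    if queue_hot and cache_hot:
        return "resource_saturation"
    if queue_hot:
        return "queue_pressure"
    return "healthy"


def build_timeline(samples: List[Dict[str, float]]) -> List[Event]:
    """Emit an event only when the state changes between samples."""
    events: List[Event] = []
    previous = "healthy"  # assumed starting state
    for t, sample in enumerate(samples):
        norm = normalize(sample)
        state = classify(norm)
        if state != previous:
            detail = ", ".join(f"{k}={v:.0%}" for k, v in sorted(norm.items()))
            events.append(Event(t, state, detail))
            previous = state
    return events


def narrate(events: List[Event]) -> List[str]:
    """Render each event as a short human-readable sentence."""
    return [f"t={e.t}: entered {e.state} ({e.detail})" for e in events]


# Made-up samples for illustration only.
samples = [
    {"queue_depth": 10, "gpu_cache_usage": 0.40},
    {"queue_depth": 90, "gpu_cache_usage": 0.50},
    {"queue_depth": 95, "gpu_cache_usage": 0.95},
    {"queue_depth": 5, "gpu_cache_usage": 0.30},
]
timeline = build_timeline(samples)
for line in narrate(timeline):
    print(line)
```

Emitting events only on state transitions, rather than on every sample, is one simple way to keep a timeline short enough to read during an incident; a real system would likely add hysteresis or time windows so that a metric hovering near the threshold does not produce a burst of events.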

Attendees will learn practical techniques for building intelligent observability systems, improving incident response, and operating modern open source AI infrastructure with greater confidence and efficiency.

Sai Sravan Cherukuri

Open Source Enthusiast and DevSecOps Architect
