The Hidden Killers of P99 Latency in Kubernetes

Most Kubernetes environments are optimized for averages.

Users experience P99 latency.

Clusters that appear healthy at the infrastructure layer can still produce severe long-tail latency due to CPU throttling, autoscaling lag, DNS bottlenecks, queue amplification, garbage collection pauses, retry storms, and noisy neighbors.

This session breaks down the hidden causes of tail latency in Kubernetes-based systems and demonstrates practical techniques for identifying and mitigating them.

We will analyze:

CPU throttling and cgroup behavior
latency amplification across microservices
queueing effects in distributed systems
Kubernetes scheduling side effects
service mesh overhead tradeoffs
retry storms and cascading amplification
observability blind spots
common benchmarking mistakes
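
As a rough illustration of latency amplification across microservices, the sketch below assumes a request fans out to several backends in parallel and that each call independently exceeds its own P99 with probability 0.01. The function name and the independence assumption are illustrative and are not taken from the session itself; real services often have correlated slowdowns.

```python
def prob_slow_request(fanout: int, per_call_slow_prob: float = 0.01) -> float:
    """Probability that at least one of `fanout` independent calls is slow.

    Assumes calls are independent, which is a simplification.
    """
    if fanout < 0:
        raise ValueError("fanout must be non-negative")
    if not 0.0 <= per_call_slow_prob <= 1.0:
        raise ValueError("per_call_slow_prob must be in [0, 1]")
    # P(all calls fast) = (1 - p) ** fanout, so P(any call slow) is the complement.
    return 1.0 - (1.0 - per_call_slow_prob) ** fanout


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(f"fanout={n:>3}: {prob_slow_request(n):.1%} of requests hit a slow call")
```

Under these assumptions, a request that touches 100 backends hits at least one P99-slow call about 63% of the time, which is one way a per-service P99 can become the typical user experience.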

The talk combines performance engineering, observability, and distributed systems analysis to expose why “healthy” clusters can still deliver poor user experience.

Attendees will leave with actionable strategies to identify hidden latency amplifiers, reduce tail-latency variance, improve workload isolation, and build more realistic performance tests for cloud-native systems.

Sravanthi Naga

Senior Engineering Manager - Pega Systems

Hyderābād, India
