Sravanthi Naga

Senior Engineering Manager - Pega Systems

Hyderābād, India

Sravanthi Naga is a seasoned technology leader with a strong background in Performance Engineering, DevSecOps, and cloud-native technologies. Throughout her career, Sravanthi has been instrumental in driving organizational excellence by focusing on high performance, quality, and resilience in applications.

As a passionate advocate for continuous learning, she has honed her skills in empathetic listening and fostering a culture that values both individuals and system efficiency. Sravanthi believes in leading by example and empowering teams to achieve successful outcomes.

In her current role, she leads initiatives that bridge the gap between development, security, and operations, ensuring seamless integration and collaboration across teams. Her expertise in Kubernetes and DevSecOps has been pivotal in optimizing workflows and strengthening application security.

Sravanthi holds a degree in Computer Science and has earned several certifications in cloud computing and security. She is a sought-after speaker at industry conferences, sharing insights on performance optimization, security best practices, and the latest trends in technology.

Beyond her professional endeavors, Sravanthi is committed to mentoring the next generation of tech leaders and actively participates in community initiatives aimed at promoting diversity and inclusion in the tech industry.

Area of Expertise

Information & Communications Technology

Topics

Observability
Monitoring and Observability
App observability
Monitoring & Observability
Observability & Platform Engineering
Kubernetes Security
Kubernetes troubleshooting
ML Observability
Azure Kubernetes Services (AKS)
Performance Testing and Engineering
Observability and Analysis
Revenera
JFrog Xray
SAST
Kubernetes Operators

Latency Forensics: Uncovering Hidden P99 Bottlenecks in Kubernetes

Modern distributed systems rarely fail at average latency. They fail in the long tail, where hidden bottlenecks silently amplify latency under realistic production conditions.

Applications that appear healthy in staging environments often experience severe P99 degradation in production due to CPU throttling, noisy neighbors, burst and queue amplification, retry storms, cache invalidation cascades, thread contention, and distributed coordination delays.

This session explores a “Latency Forensics” approach (also described as “Containerized Time Travel”) for reproducing production-like latency behavior and uncovering hidden performance bottlenecks inside Kubernetes environments before deployment, using replay-driven performance engineering and observability correlation techniques.

We will examine how realistic workload reconstruction helps teams:

reproduce production-like request behavior
expose hidden latency amplifiers
correlate infrastructure and application bottlenecks
benchmark workloads under realistic contention scenarios
uncover distributed-system side effects invisible in synthetic tests
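
To illustrate why averages hide the long tail, here is a minimal Python sketch. The latency values are hypothetical and stand in for measurements from a replayed workload; the nearest-rank percentile helper is an illustrative choice, not a method taken from the session.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical replayed workload: 980 fast requests and 20 slow ones (ms).
latencies_ms = [10.0] * 980 + [500.0] * 20

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms:.1f} ms  p50={p50_ms:.1f} ms  p99={p99_ms:.1f} ms")
```

In this made-up data set only 2% of requests are slow, yet the P99 is 50 times the median, while the mean stays under 20 ms.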

AI-Driven Observability for MCP Workflows: Hunting P99 Bottlenecks in Autonomous Systems

AI agents and MCP-based workflows are introducing an entirely new class of performance challenges.

Unlike traditional microservices, autonomous workflows involve dynamic orchestration, multi-step reasoning chains, tool invocation latency, token amplification, unpredictable execution paths, and cascading retries across distributed systems.

Conventional observability and performance testing approaches struggle to capture these behaviors effectively.

This session explores how AI-driven observability and performance engineering techniques can be applied to MCP workflows to identify hidden latency bottlenecks, execution inefficiencies, and long-tail performance degradation.

Topics include:

tracing autonomous execution paths
token and context amplification effects
tool-call latency analysis
retry and orchestration amplification
performance testing AI-agent workflows
correlating infrastructure and inference latency
benchmarking unpredictable execution patterns
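
As a rough illustration of retry amplification, the sketch below computes a worst-case upper bound on calls reaching the deepest dependency when every layer in a call chain retries independently. The three-layer chain (agent, tool, backend) and the retry counts are assumptions made for the example, not figures from the session.

```python
def worst_case_attempts(max_retries_per_layer, depth):
    """Upper bound on calls that reach the deepest dependency when each of
    `depth` layers makes one initial attempt plus up to
    `max_retries_per_layer` retries, and every attempt fails."""
    if max_retries_per_layer < 0 or depth < 0:
        raise ValueError("retries and depth must be non-negative")
    return (1 + max_retries_per_layer) ** depth


# Hypothetical chain: agent -> tool -> backend, each layer retrying 3 times.
attempts = worst_case_attempts(max_retries_per_layer=3, depth=3)
print(attempts)
```

Under these assumptions a single user request can fan out into 64 backend calls during an outage, which is why retry budgets are usually worth tracing per layer.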

The session also explores how AI-assisted analysis can help engineers surface hidden bottlenecks faster and improve debugging efficiency in highly dynamic systems.

Attendees will leave with practical strategies to improve MCP workflow performance, reduce P99 latency in autonomous systems, and build observability for next-generation AI-native architectures.

100M Logs a Day: Performance Engineering an OpenSearch Observability Platform for Kubernetes

Kubernetes platforms generate massive volumes of logs from microservices, infrastructure, and platform services. At scale, OpenSearch observability pipelines often struggle with shard explosion, indexing bottlenecks, JVM pressure, and slow queries.

This session presents a practical architecture for operating an OpenSearch observability platform handling 100M+ logs per day from Kubernetes environments. We will walk through the end-to-end pipeline—from log collection to ingestion and indexing—and share performance engineering techniques used to maintain cluster stability under heavy workloads.

Topics include shard and index design, JVM and thread-pool tuning, optimizing indexing throughput, and using lifecycle policies and hot-warm architectures to scale efficiently. Attendees will gain actionable strategies for building resilient OpenSearch observability platforms for cloud-native systems.
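
A back-of-the-envelope sketch of the shard arithmetic behind such a design follows. The average log size (1 KiB) and target shard size (30 GiB) are assumptions chosen for illustration, not values from the session; replicas, compression, and index overhead are deliberately ignored.

```python
import math

GIB = 1024 ** 3


def daily_primary_shards(docs_per_day, avg_doc_bytes, target_shard_bytes):
    """Estimate primary shards for one daily index from raw volume alone."""
    if target_shard_bytes <= 0:
        raise ValueError("target_shard_bytes must be positive")
    daily_bytes = docs_per_day * avg_doc_bytes
    return max(1, math.ceil(daily_bytes / target_shard_bytes))


# Assumed inputs: 100M logs/day, 1 KiB per log, 30 GiB target shard size.
daily_gib = 100_000_000 * 1024 / GIB
shards = daily_primary_shards(100_000_000, 1024, 30 * GIB)
print(f"{daily_gib:.1f} GiB/day -> {shards} primary shards")
```

With these inputs the daily volume is roughly 95 GiB, so four primary shards per daily index would keep each shard under the assumed target. Real sizing should be checked against measured index sizes.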

Containerized Time Travel: Replicating Production Performance

One of the significant challenges faced by Kubernetes-based applications is that performance issues often only manifest in production environments, making them difficult to reproduce in development or staging settings. Imagine if we could "time travel" and recreate real-world production conditions within a controlled environment.

This session explores how a leading global bank (200M customers in 150+ countries) successfully recreated production workloads in controlled settings. By leveraging synthetic data generation, trace playback, and workload simulation, we tackled most performance issues without compromising sensitive data. This approach not only enhanced system reliability and reduced downtime but also improved the bank's ROI by decreasing operational costs by 30% and increasing transaction efficiency by 25%, resulting in significant annual savings.

Join us as we unravel the secrets of "time travel" to replicate production performance and resolve issues effectively.
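
One small piece of trace playback can be sketched as follows: turning recorded request timestamps into send offsets for a replay, so that bursts keep their shape. The function name, the timestamps, and the `speedup` parameter are hypothetical and only illustrate the idea; this is not the tooling used in the case study.

```python
def replay_offsets(recorded_timestamps, speedup=1.0):
    """Convert recorded request timestamps (seconds) into offsets from the
    start of a replay. Relative spacing is preserved and compressed by
    `speedup`, so simultaneous requests stay simultaneous."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    if not recorded_timestamps:
        return []
    ordered = sorted(recorded_timestamps)
    start = ordered[0]
    return [(t - start) / speedup for t in ordered]


# Hypothetical trace: two requests arrive together 0.5 s after the first.
offsets = replay_offsets([100.0, 100.5, 100.5, 103.0], speedup=2.0)
print(offsets)
```

A load generator would then sleep until each offset before sending the corresponding synthetic request.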

The Hidden Killers of P99 Latency in Kubernetes

Most Kubernetes environments are optimized for averages.

Users experience P99 latency.

Clusters that appear healthy at the infrastructure layer can still produce severe long-tail latency due to CPU throttling, autoscaling lag, DNS bottlenecks, queue amplification, garbage collection pauses, retry storms, and noisy neighbors.

This session breaks down the hidden causes of tail latency in Kubernetes-based systems and demonstrates practical techniques for identifying and mitigating them.

We will analyze:

CPU throttling and cgroup behavior
latency amplification across microservices
queueing effects in distributed systems
Kubernetes scheduling side effects
service mesh overhead tradeoffs
retry storms and cascading amplification
observability blind spots
common benchmarking mistakes
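
As one concrete way to look at CPU throttling, the sketch below computes the share of CFS enforcement periods in which a container was throttled, using the `nr_periods` and `nr_throttled` counters that the Linux cgroup `cpu.stat` file exposes. The sample text is made up for illustration; in a real container the file would typically be read from the cgroup filesystem, and its path depends on the cgroup version and runtime.

```python
def throttled_ratio(cpu_stat_text):
    """Return nr_throttled / nr_periods from the contents of a cgroup
    cpu.stat file, or 0.0 when no enforcement periods were recorded."""
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0
    return stats.get("nr_throttled", 0) / periods


# Made-up sample in the cgroup v2 cpu.stat layout.
sample = """usage_usec 8200000
user_usec 6100000
system_usec 2100000
nr_periods 1000
nr_throttled 250
throttled_usec 4300000
"""

ratio = throttled_ratio(sample)
print(f"throttled in {ratio:.0%} of periods")
```

A container can show modest average CPU usage and still be throttled in a large share of periods, which tends to surface as tail latency rather than as a change in the mean.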

The talk combines performance engineering, observability, and distributed systems analysis to expose why “healthy” clusters can still deliver poor user experience.

Attendees will leave with actionable strategies to identify hidden latency amplifiers, reduce tail-latency variance, improve workload isolation, and build more realistic performance tests for cloud-native systems.

OpenSearchCon India 2026 Sessionize Event

June 2026 Mumbai, India
