Speaker

Raghavendra Sirigeri

Founder, Questodev

Bengaluru, India

Raghavendra has 8+ years of experience building cloud-native apps and has pioneered the use of cloud-native tech across organisations. He led cloud-native initiatives at a Fortune 50 company in NYC. Raghavendra was a speaker at Web Summit 2022 and has delivered talks on developer tooling for the container ecosystem across India. Currently, he focuses on building Questodev, a platform that enables dev-tools companies to launch developer-friendly sandbox environments with built-in next-gen coding environments.

Area of Expertise

  • Information & Communications Technology
  • Real Estate & Architecture
  • Travel & Tourism

Topics

  • Cloud Native
  • Cloud Architecture
  • Developer Tools
  • Developer Relations
  • DevOps
  • Cloud & DevOps
  • AWS DevOps
  • DevOps Culture
  • DevOps Skills
  • AI
  • LLMs
  • Large Language Models (LLMs)
  • Artificial intelligence

Supercharging AI Inference on K8s: Demystifying KV Cache, LM Cache & Smart Routing

Running LLMs in production on Kubernetes is no longer just about deploying containers—it’s about surviving GPU scarcity, unpredictable workloads, and soaring inference bills. For many teams, scaling LLMs still feels like navigating a maze of latency and memory bottlenecks.

But what if you could unlock dramatically higher efficiency using engineering fundamentals rather than exotic hardware?

In this session, I break down how modern architectures (KV Cache, LM Cache, and cache-aware routing) transform Kubernetes into a highly optimized inference engine. The talk will include live demos, illustrations, and sandbox environments that attendees can try hands-on.
