Session
Past, Present, and Future of Cloud Native Model Serving
As AI adoption accelerates, the need for scalable, flexible, and efficient model serving has become critical. This talk explores the evolution of cloud-native model serving platforms, from the bespoke setups of two to three years ago to today's dynamic, Kubernetes-native solutions. We'll examine current challenges in productionizing large models, such as performance, cost, and portability, and highlight how open source innovation is addressing them. Finally, we'll look ahead at emerging trends, including distributed inference, inference orchestration, disaggregated serving, KV-cache management, autoscaling, and hardware acceleration. Attendees will gain a clear picture of how the model serving landscape is evolving and how to prepare for what's next.

Yuan Tang
Senior Principal Software Engineer at Red Hat; Project Lead at Argo, Kubeflow, and KServe
West Lafayette, Indiana, United States