Inference at Scale: Kubernetes and NVIDIA for AI Workloads

As AI workloads continue to grow, the challenge of deploying inference at scale has become critical. In this session, we’ll explore how Kubernetes and NVIDIA GPUs work in tandem to deliver scalable, efficient, and reliable AI inference services. Discover deployment patterns, best practices, and real-world strategies to optimize GPU utilization and performance, all within the Kubernetes ecosystem. Whether you’re running deep learning models or complex analytics, you’ll gain practical insights to supercharge your inference workloads and meet the demands of modern AI applications.
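As a minimal sketch of the core pattern the session covers, the following hypothetical Kubernetes Deployment requests one NVIDIA GPU per pod through the `nvidia.com/gpu` extended resource. This assumes the NVIDIA device plugin is installed on the cluster; the deployment name, labels, and container image tag are illustrative, not from the session itself:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-server            # illustrative name
spec:
  replicas: 2                       # scale inference pods horizontally
  selector:
    matchLabels:
      app: inference-server
  template:
    metadata:
      labels:
        app: inference-server
    spec:
      containers:
      - name: triton
        # illustrative image/tag; any GPU-enabled inference server works
        image: nvcr.io/nvidia/tritonserver:24.05-py3
        resources:
          limits:
            nvidia.com/gpu: 1       # requires the NVIDIA device plugin
```

Applied with `kubectl apply -f`, the scheduler will only place these pods on nodes that advertise available `nvidia.com/gpu` capacity, which is the basic mechanism behind the GPU utilization strategies discussed in the talk.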

Abhishek Kumar Gupta

Sr. Staff Engineer @ NVIDIA

Santa Clara, California, United States
