Accelerating AI Workloads with GPUs in Kubernetes
As AI and machine learning become ubiquitous, GPU acceleration is essential for model training and inference at scale. However, effectively leveraging GPUs in Kubernetes brings challenges around efficiency, configuration, extensibility, and scalability.
This talk provides a comprehensive overview of Kubernetes and GPU capabilities that address these challenges, enabling seamless support for next-generation AI applications.
The session will cover:
- GPU resource sharing mechanisms such as MPS (Multi-Process Service), Time-Slicing, MIG (Multi-Instance GPU), and vGPU (GPU virtualization) on Kubernetes.
- Flexible accelerator configuration via Device Plugins and Dynamic Resource Allocation (DRA) with ResourceClaims and ResourceClasses in Kubernetes.
- Advanced scheduling and resource management features including gang scheduling, topology-aware scheduling, quota management, and job queues.
- The open-source efforts in Volcano, YuniKorn, and Slurm for supporting GPU and AI workloads in Kubernetes.
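As a rough illustration of the sharing mechanisms above: the NVIDIA device plugin can expose MIG partitions (or time-sliced replicas) as extended resources that a pod requests like any other. A minimal sketch, assuming the device plugin's mixed MIG strategy; the exact resource name (here `nvidia.com/mig-1g.5gb`) and image depend on cluster configuration:

```yaml
# Hypothetical pod requesting one MIG slice (1g.5gb profile) exposed
# by the NVIDIA device plugin; the resource name is determined by the
# cluster's MIG strategy and GPU model.
apiVersion: v1
kind: Pod
metadata:
  name: mig-inference
spec:
  containers:
  - name: inference
    image: nvcr.io/nvidia/pytorch:24.01-py3
    resources:
      limits:
        nvidia.com/mig-1g.5gb: 1
```

With a time-slicing configuration instead, the same request shape applies but against `nvidia.com/gpu` replicas rather than MIG profiles.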
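For the DRA item, the shift from counted device-plugin resources to structured claims can be sketched as follows. This assumes the `resource.k8s.io` DRA API available in recent Kubernetes releases and a GPU device class published by a DRA driver; the class name `gpu.nvidia.com` and claim names are illustrative:

```yaml
# Hypothetical ResourceClaim asking for one GPU from a DRA driver's
# device class, and a pod that consumes that claim.
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
metadata:
  name: single-gpu
spec:
  devices:
    requests:
    - name: gpu
      deviceClassName: gpu.nvidia.com
---
apiVersion: v1
kind: Pod
metadata:
  name: dra-pod
spec:
  containers:
  - name: training
    image: nvcr.io/nvidia/pytorch:24.01-py3
    resources:
      claims:
      - name: gpu-claim
  resourceClaims:
  - name: gpu-claim
    resourceClaimName: single-gpu
```

Unlike an opaque `nvidia.com/gpu: 1` limit, a claim can carry structured selectors and configuration, which is what enables the flexible accelerator setup the session describes.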
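Gang scheduling, mentioned in the scheduling item, means all pods of a distributed job start together or not at all, avoiding deadlock where a half-placed job holds GPUs idle. A minimal sketch using Volcano's Job API; the job name, image, and replica count are placeholders:

```yaml
# Hypothetical gang-scheduled training job: Volcano admits the job only
# when all 4 workers (minAvailable) can be placed, each with one GPU.
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: distributed-training
spec:
  schedulerName: volcano
  minAvailable: 4
  tasks:
  - name: worker
    replicas: 4
    template:
      spec:
        restartPolicy: Never
        containers:
        - name: worker
          image: nvcr.io/nvidia/pytorch:24.01-py3
          resources:
            limits:
              nvidia.com/gpu: 1
```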
Yuan Chen
NVIDIA, Software Engineer, Kubernetes, Scheduling, GPU, AI/ML, Resource Management
San Jose, California, United States