Session
Optimizing AI/ML Workloads on Kubernetes: Cutting Costs Without Compromising Scale
Unleash the power of distributed AI/ML training on Kubernetes without breaking the bank. Discover open-source gems like Kubernetes cluster autoscalers, Dask, and Volcano that enable intelligent scheduling, autoscaling, and resource optimization across your cluster. Explore real-world case studies on maximizing GPU utilization, right-sizing resources, and getting the most out of spot instances.
Cut your compute costs by up to 60% while maintaining peak performance. Gain valuable insights into evaluating and adopting cost-effective distributed training frameworks such as Horovod, TensorFlow Distributed, and PyTorch Lightning, tailored for Kubernetes environments. Leave with actionable strategies to optimize your AI/ML pipelines for both scalability and cost-efficiency on any cloud platform.

Sat Agrawal
Senior Principal Software Engineer @ Discover Financial Services
Jacksonville, Florida, United States