Optimizing AI Workloads in Kubernetes: Pruning for Efficiency and Scale

AI workloads are resource-intensive, which drives up costs. This talk explores model pruning techniques and Kubernetes-native strategies for scalable AI deployments, focusing on resource scheduling, autoscaling, and efficient inference serving in the cloud.

Achyut Sarma Boggaram

Sr. Machine Learning Engineer

Austin, Texas, United States
