How to deploy GenAI applications on Minikube

As the leading platform for managing and scaling containerized applications, Kubernetes can greatly ease the deployment of AI/ML applications. In this talk, the speaker will discuss how to deploy GenAI applications on Minikube. The talk opens with a general introduction to the Hugging Face Whisper web application, a GenAI application that handles speech-to-text (STT) tasks. The speaker will then cover the regular Kubernetes deployment method, illustrating how to containerize the Whisper web application and set up a Kubernetes cluster, and analyzing the Deployment and Service configuration files. He will also introduce an alternative deployment approach using KubeAI and Helm, which aims to make serving ML models in production easy. The concepts of the speech-to-text API and autoscaling will be explained, and the architecture of KubeAI will be illustrated. Finally, a live demo will show the Whisper application deployed on Minikube and exposed through a classic LoadBalancer service.
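
As a rough illustration of the Deployment and Service configuration files mentioned in the abstract, the following minimal sketch builds the two manifests in Python and prints them as JSON, which kubectl accepts alongside YAML. It is not the speaker's actual configuration: the application name `whisper-web`, the image tag `whisper-web:latest`, and the ports are hypothetical placeholders chosen for this example.

```python
import json

# Hypothetical values for illustration only; they are not taken from the talk.
APP_NAME = "whisper-web"
IMAGE = "whisper-web:latest"   # assumed to be a locally built image
CONTAINER_PORT = 8080          # assumed port the web application listens on
SERVICE_PORT = 80              # assumed port exposed by the Service


def build_deployment(name, image, port, replicas=1):
    """Return a Deployment manifest that runs `replicas` pods of `image`."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels below.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # Use a locally available image if one is present.
                            "imagePullPolicy": "IfNotPresent",
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


def build_service(name, port, target_port):
    """Return a LoadBalancer Service that routes to pods labeled app=<name>."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",
            "selector": {"app": name},
            "ports": [{"port": port, "targetPort": target_port}],
        },
    }


def render_manifests():
    """Bundle both manifests into a single List object serialized as JSON."""
    items = [
        build_deployment(APP_NAME, IMAGE, CONTAINER_PORT),
        build_service(APP_NAME, SERVICE_PORT, CONTAINER_PORT),
    ]
    return json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2)


if __name__ == "__main__":
    print(render_manifests())
```

If the script were saved as `manifests.py` (a hypothetical file name), its output could be piped to `kubectl apply -f -`. On Minikube, a LoadBalancer service typically receives an external IP only while `minikube tunnel` is running in a separate terminal.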

Wentao Liu

Manager of omfoss.com
