Session
Multimodal RAGs: Unlocking the Power of Multimodality with PaliGemma
In this session, I’ll dive into the concepts and applications of multimodality, showing how combining multiple forms of data can enhance AI model capabilities. I’ll walk through the process of training PaliGemma on Colab/Kaggle, highlighting each step from data preparation to fine-tuning. After the training phase, I’ll demonstrate how to use Hugging Face Spaces for inference and deployment, providing a practical approach to bringing Vision-Language Models (VLMs) into real-world applications. The session serves as a comprehensive guide for anyone interested in building and deploying multimodal AI solutions, showing how PaliGemma, a state-of-the-art VLM, is trained and put to work. By the end, participants will have a solid understanding of how multimodal RAGs work and how to bring them to production efficiently.
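As a rough illustration of the inference flow the session covers, the sketch below loads a PaliGemma checkpoint through the Hugging Face transformers library and captions a single image. The model id, image path, and prompt here are illustrative placeholders, not the exact code presented in the session, and the actual training and deployment steps shown on Colab/Kaggle and Hugging Face Spaces may differ.

```python
# Minimal PaliGemma inference sketch (assumes a transformers version with PaliGemma support).
# Model id, image path, and prompt are illustrative placeholders.
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma-3b-pt-224"  # example public checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id).eval()

image = Image.open("example.jpg")   # any RGB image
prompt = "caption en"               # PaliGemma task-style prompt
inputs = processor(text=prompt, images=image, return_tensors="pt")

with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=50)

# Strip the prompt tokens before decoding the generated caption.
new_tokens = generated[0][inputs["input_ids"].shape[-1]:]
print(processor.decode(new_tokens, skip_special_tokens=True))
```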
Shubham Agnihotri
Senior Manager - Generative AI - IDFC Bank
Mumbai, India