Intro to Multimodal Retrieval-Augmented Generation (RAG)

RAG typically draws only on text-based external data sources. With Gemini Pro Vision and multimodal embeddings, you can now perform multimodal RAG over both text and images. In this session, you will gain hands-on experience by performing multimodal RAG on a financial document that contains both text and images (charts and diagrams).

Ankur Roy

Solutions Architect at Online Partner AB | Google Developer Expert in Cloud

Stockholm, Sweden
