Session
Multi-modal LLMs for application developers
Multi-modal large language models (LLMs) can understand text, images, and videos, and with their ever-increasing context sizes they open up interesting use cases for application developers. In this talk, we'll take a tour of Gemini, Google's multi-modal LLM, and Gemma, its open-model counterpart, showing what's possible and how to integrate them into your applications. We'll also explore techniques such as RAG, function calling, and grounding to supply LLMs with more up-to-date, relevant data and to minimize hallucinations.
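As a taste of the kind of integration the talk covers, a multi-modal Gemini call might look like the minimal sketch below. It assumes the google-genai Python SDK (pip install google-genai) and an API key in the GEMINI_API_KEY environment variable; the model id and file name are illustrative only, not taken from the session.

# Minimal sketch: send an image plus a text prompt to Gemini.
# Assumes the google-genai SDK and a GEMINI_API_KEY environment variable.
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

# Read a local image to send alongside the text prompt.
with open("invoice.png", "rb") as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.0-flash",  # illustrative; any multi-modal Gemini model works
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/png"),
        "Extract the vendor name and total amount from this invoice.",
    ],
)
print(response.text)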
Mete Atamel
Software Engineer and Developer Advocate at Google
London, United Kingdom