The power of multimodal models

AI is everywhere, and its tasks are growing more complex. At the same time, performance has to keep up, since it plays a significant role in deployed solutions.
When passing prompts, custom data, and completions between models sounds like a disaster, multimodal models come to the rescue. Let's dive into the world of AI, starting with an overview of AI in general and then getting familiar with the terms, the models, and the tasks they can help us with. Then we'll go deeper into the fascinating world of vision language models and learn about their underlying architecture and techniques.
The talk will also highlight practical applications of vision language models in real-world scenarios. At the end, I'll show a demo that uses one of the latest vision language models to detect specific features of an object in a picture.

Veronika Kolesnikova

Principal AI Engineer, Microsoft MVP (AI)

Miami, Florida, United States

