Large Multimodal Models: Understanding the State of the Art in Computer Vision

How do computers interpret and understand visual data? This session explores how computer vision AI models work, as well as the advancements in Large Multimodal Models (LMMs) that integrate computer vision and language understanding. We'll dissect the mechanisms that enable machines to process and interpret images, discuss the current state-of-the-art models, and evaluate whether leveraging existing LMMs or training your own AI models is more effective for specific vision tasks. Through analysis of various use cases, attendees will gain a clear understanding of the capabilities and limitations of current computer vision technologies.

Agata Chudzińska

CTO / AI Solutions Architect at theBlue.ai GmbH

Poznań, Poland
