Session

Pointing and 3D Spatial Understanding with Gemini 2.0

Pointing is an important capability for vision-language models because it lets the model refer to an entity precisely. Gemini 2.0 Flash has improved accuracy on spatial understanding and offers 2D point prediction as an experimental feature. This session shows how pointing can be combined with reasoning.

Arthur Kaza

Head of Tech Support-Mentorship & Data Analytics @Akieni (Yao Corp)

Kinshasa, Democratic Republic of the Congo
