
Semantic hide and seek - a gentle introduction to embeddings with a vector search game

GenAI has thrust embeddings and vector search to prominence, with many new technologies in this space. Embeddings capture some useful semantic - or meaningful - relationships between items. But what does a semantic vector with hundreds of dimensions actually _mean_?

Whether you're building a RAG system or just curious, this talk will help build your intuition for how embeddings work, and when they fail.

We’ll explore embeddings for various applications, different types of data, and network architectures. But we'll focus in particular on a game of semantic hide and seek. Fans of the Wordle gamification universe may be familiar with Semantle. In this game, players seek a hidden word. A little like the game of Marco Polo, players “shout” out guesses, and the game “responds” with a semantic similarity score, from (roughly) 0 to 100. Behind the scenes, embeddings are doing all the work.
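The scoring mechanic can be sketched in a few lines. A minimal illustration, using toy hand-made 3-dimensional vectors (real embedding models produce hundreds of dimensions, and Semantle's exact scoring is an assumption here): compute the cosine similarity between the guess and the hidden word's embedding, then scale it to a roughly 0-to-100 score.

```python
import math

def cosine_similarity(u, v):
    # Dot product divided by the product of the vectors' magnitudes.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def semantle_score(guess_vec, target_vec):
    # Scale cosine similarity to a (roughly) 0-100 score, as in the game.
    return round(cosine_similarity(guess_vec, target_vec) * 100, 2)

# Toy "embeddings" - purely illustrative, not from a real model.
cat = [0.9, 0.8, 0.1]
kitten = [0.85, 0.75, 0.2]
car = [0.1, 0.2, 0.9]

print(semantle_score(cat, kitten))  # high: related meanings
print(semantle_score(cat, car))     # low: unrelated meanings
```

With real embeddings, "shouting" a guess means looking up (or computing) its vector and scoring it against the hidden word's vector in exactly this way.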

We’ll see when Semantle does and doesn’t match our intuition for similar meaning. We’ll also explore how biases may be encoded in embeddings. Because we love a puzzle, we’ll compare human and multiple machine-driven strategies for solving Semantle, and their robustness to divergent semantics (with a live demo).

To wrap up, we’ll discuss what all this means for building solutions with the current generation of LLMs.

A version of this talk was first delivered at the Inaugural GenAI Network Melbourne meetup.

David Colls

Head of Data, Product & Platforms at MYOB


