Vector Space Manipulation in LLMs

A vector space is a mathematical framework where words, phrases, sentences, or even entire documents are represented as numerical vectors. These vectors capture both semantic and syntactic relationships between linguistic units, enabling models to process and generate text effectively.

Words are mapped to high-dimensional vectors within a continuous vector space. In models such as Word2Vec, GloVe, and large language models (LLMs), each word is represented as a dense vector (e.g., 300 dimensions or more). These vectors are learned during training and encode semantic relationships. For example, the vectors for king and queen will be close to each other in the vector space due to their similar contexts. In LLMs like GPT and BERT, word vectors are not static but vary depending on context. This means the same word can have different vector representations based on the surrounding words. For instance, the word bank will have distinct vector representations in river bank versus financial bank.

In this workshop we will explore tactics to manipulate the vector space. These methods include Prompt engineering and poisoning data streams with in them, The method target RAG (Retrieval augment Generation) based LLM applications, LLM Agents and LLM that search the web for accessing information. The methods results in DoS conditions and manipulated data generation in LLM models. An attack scenario is putting a malicious comment in an online product review system, so when the LLM access it its output will be manipulated or its performance will be degraded.

Muhammad Mudassar Yamin

Associate Professor

Actions

View Speaker Profile

Please note that Sessionize is not responsible for the accuracy or validity of the data provided by speakers. If you suspect this profile to be fake or spam, please let us know.

Session

Vector Space Manipulation in LLMs

Muhammad Mudassar Yamin

Links

Actions