Session

LLMs need a good retriever

LLMs are all around you. It is hard to find a conference that does not talk about large language models. In my line of business, I run into LLM applications regularly. Often, these applications use RAG, short for Retrieval Augmented Generation. RAG solutions supply the knowledge that LLMs lack on their own, such as recent or private content. In short, with RAG you create vectors from your content, store them in a vector store, fetch the chunks that best match a question, and pass these as context to a large language model that generates an answer to your question.
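To make that flow concrete, here is a minimal sketch of the retrieval step in plain Java. The class, the hash-based "embedding", and the in-memory store are illustrative assumptions only; they are not the RAG4j API, and a real system would call an embedding model instead.

```java
import java.util.*;

// Toy in-memory retriever illustrating the RAG flow described above:
// embed chunks, store the vectors, and return the best matches for a question.
public class InMemoryRetriever {

    private final Map<String, double[]> store = new HashMap<>();

    // Placeholder embedding: maps text to a small fixed-size vector.
    // A real system would call an embedding model here.
    static double[] embed(String text) {
        double[] vector = new double[64];
        for (String token : text.toLowerCase().split("\\W+")) {
            vector[Math.floorMod(token.hashCode(), 64)] += 1.0;
        }
        return vector;
    }

    // Cosine similarity between two vectors, guarded against zero norms.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB) + 1e-9);
    }

    void addChunk(String chunk) {
        store.put(chunk, embed(chunk));
    }

    // Fetch the chunks that best match the question; these become the LLM context.
    List<String> retrieve(String question, int topK) {
        double[] queryVector = embed(question);
        return store.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, double[]> e) -> -cosine(queryVector, e.getValue())))
                .limit(topK)
                .map(Map.Entry::getKey)
                .toList();
    }

    public static void main(String[] args) {
        InMemoryRetriever retriever = new InMemoryRetriever();
        retriever.addChunk("RAG4j is a small Java framework for retrieval augmented generation.");
        retriever.addChunk("Vector stores hold embeddings and support similarity search.");
        retriever.addChunk("Amsterdam is the capital of the Netherlands.");
        System.out.println(retriever.retrieve("How do vector stores work?", 2));
    }
}
```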
The retrieval aspect of RAG is more nuanced than it may seem. How do you determine the optimal chunk of content for each vector? Is a sentence, a maximum number of tokens, or something else the best unit? Are these chunks really the most suitable context for the LLM? How can you ensure that your results fit the question being asked?
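As a rough illustration of that chunking question, the sketch below contrasts two of the strategies mentioned: splitting on sentences versus splitting on a fixed maximum number of tokens. It is a hypothetical example, not code from the talk or from RAG4j.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Two common chunking strategies for RAG content.
// Which one yields better retrieval context is exactly the trade-off discussed above.
public class Chunkers {

    // One chunk per sentence: chunks stay coherent but vary widely in length.
    static List<String> bySentence(String text) {
        List<String> chunks = new ArrayList<>();
        for (String sentence : text.split("(?<=[.!?])\\s+")) {
            if (!sentence.isBlank()) {
                chunks.add(sentence.trim());
            }
        }
        return chunks;
    }

    // Fixed window of at most maxTokens words: uniform size, but may cut sentences apart.
    static List<String> byMaxTokens(String text, int maxTokens) {
        String[] tokens = text.split("\\s+");
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < tokens.length; start += maxTokens) {
            int end = Math.min(start + maxTokens, tokens.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(tokens, start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        String text = "RAG retrieves context for an LLM. Chunk size matters. "
                + "Too small loses context, too large dilutes relevance.";
        System.out.println(bySentence(text));
        System.out.println(byMaxTokens(text, 8));
    }
}
```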
If these questions sound familiar, or you want to learn more about RAG systems, this talk is for you. For the demos, I use my framework, "RAG4j," to interact with different LLMs and to create embeddings. The retriever is an in-memory store.

Jettro Coenradie

Fellow at Luminis, working as a Search and Data expert

Pijnacker, The Netherlands
