System Design for the GenAI Era
We have all learned the basic building blocks of system design: relational databases, blob storage, API gateways, and distributed logging are familiar concepts. However, the rise of Generative AI applications brings a new set of challenges, along with new tools to solve them.
In this talk, we will explore the new modules essential for developing Generative AI (GenAI) applications, addressing their unique challenges.
We will begin by discussing why Large Language Models (LLMs) alone are not enough to build an application. Next, we'll introduce guardrails, which are crucial for sanitizing both the input and the output of LLMs and diffusion models. We'll then cover prompt compression techniques, which are designed to reduce the cost of generation.
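As a rough illustration of the guardrail idea, the sketch below wraps a model call with an input check and an output filter. It is not taken from the talk: the blocked phrase, the email pattern, and the names `check_input`, `sanitize_output`, `guarded_call`, and `fake_model` are all hypothetical, and the model is a stand-in function rather than a real LLM.

```python
import re
from typing import Callable

# Hypothetical rules, chosen only for illustration.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
BLOCKED_PHRASES = ("ignore previous instructions",)


def check_input(prompt: str) -> str:
    """Input guardrail: reject prompts that contain a blocked phrase."""
    lowered = prompt.lower()
    for phrase in BLOCKED_PHRASES:
        if phrase in lowered:
            raise ValueError(f"blocked phrase in input: {phrase!r}")
    return prompt


def sanitize_output(text: str) -> str:
    """Output guardrail: mask anything that looks like an email address."""
    return EMAIL_RE.sub("[REDACTED]", text)


def guarded_call(model: Callable[[str], str], prompt: str) -> str:
    """Run the model only on checked input and return sanitized output."""
    return sanitize_output(model(check_input(prompt)))


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM call (assumption: a real client would go here)."""
    return f"Contact alice@example.com about: {prompt}"
```

Calling `guarded_call(fake_model, "pricing")` returns the model's answer with the address masked, while a prompt containing the blocked phrase raises `ValueError` before the model is ever called. Production guardrails typically rely on classifiers or dedicated libraries rather than hand-written rules like these.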
Following this, we'll present prompt registries, which enable over-the-air A/B testing and updates of prompts, and discuss the role of observability, highlighting tools such as OpenLLMetry, a new standard based on OpenTelemetry.
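To make the prompt registry idea concrete, here is a minimal in-memory sketch, assuming prompts are stored by name and variant and that each user is assigned a variant by hashing their ID. The class `PromptRegistry` and its methods are hypothetical and do not mirror any specific product; a real registry would live behind a service so that prompts can change without redeploying the application.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class PromptRegistry:
    """Hypothetical in-memory registry: prompt name -> {variant -> template}."""

    prompts: dict[str, dict[str, str]] = field(default_factory=dict)

    def register(self, name: str, variant: str, template: str) -> None:
        """Add a variant, or overwrite it to update the prompt in place."""
        self.prompts.setdefault(name, {})[variant] = template

    def get(self, name: str, user_id: str) -> tuple[str, str]:
        """Return (variant, template), assigning users to variants by hash."""
        variants = sorted(self.prompts[name])
        digest = hashlib.sha256(f"{name}:{user_id}".encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(variants)
        variant = variants[index]
        return variant, self.prompts[name][variant]


registry = PromptRegistry()
registry.register("summarize", "a", "Summarize this text: {text}")
registry.register("summarize", "b", "Give a three-bullet summary of: {text}")
variant, template = registry.get("summarize", user_id="user-123")
prompt = template.format(text="System design for GenAI")
```

Hashing the user ID keeps the assignment stable across requests, so the same user always sees the same variant, and logging `variant` alongside the response is what makes the A/B comparison possible.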
We'll also examine how to extend LLM capabilities with function calls, which allow LLMs to act as agents and communicate with external systems.
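The application side of function calling can be sketched as a small dispatcher: the model emits a structured request naming a function and its arguments, and the application looks it up and runs it. The JSON shape used here (`name` and `arguments` keys), the `tool` decorator, and `get_order_status` are assumptions made for illustration; each LLM provider defines its own format for tool calls.

```python
import json
from typing import Any, Callable

# Registry of functions the model is allowed to call (hypothetical design).
TOOLS: dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function under its own name so it can be dispatched."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def get_order_status(order_id: str) -> str:
    """Stub for a call to an external system."""
    return f"Order {order_id} has shipped"


def dispatch(tool_call_json: str) -> Any:
    """Parse a tool call emitted by the model and run the matching function."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


# Assumed model output: a JSON object naming the tool and its arguments.
model_output = '{"name": "get_order_status", "arguments": {"order_id": "42"}}'
result = dispatch(model_output)
```

In an agent loop, `result` would be sent back to the model as a new message so that it can decide on the next step. Restricting dispatch to an explicit registry keeps the model from invoking arbitrary code.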
During the discussion, we will show code examples in Python.
By the end of this talk, attendees will have a clear understanding of the main modules used in building GenAI applications and gain valuable insights into designing their own.
First Delivered at AI Heroes 2024, Turin, Italy
Emanuele Fabbiani
Head of AI at xtream, Professor at Catholic University of Milan
Milan, Italy