Session
One Brain, Many Skills: How to Serve 100 Models on One Foundation
Why move a mountain when you can just swap a brick? The first instinct for many MLOps enthusiasts is to deploy a full model for every specific task—but that’s an infrastructure nightmare.
The smarter way? Build your ML stack like a Lego set. In this talk, I'll show you how to take one powerful foundation model like Gemma 3 and "snap on" tiny, specialized LoRA adapters to handle 100 different tasks. You get all the intelligence of a massive LLM without the heavy lifting.
What we will explore:
- Training LoRA adapters instead of re-training the whole model.
- Using TGI to hot-swap these specialized skills in milliseconds.
- Orchestrating it all with GCP Inference Gateway to maximize efficiency (a minimal code sketch of the first two pieces follows below).
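To make the idea concrete, here is a minimal sketch of the two moving parts: attaching a LoRA adapter to a frozen base model with Hugging Face PEFT, and asking a multi-LoRA TGI endpoint to apply a named adapter at request time. The model ID, adapter name, and endpoint URL are illustrative placeholders, and the exact request field for adapter selection may vary by TGI version.

```python
# Minimal sketch: train-time LoRA attachment and request-time adapter selection.
# Model IDs, adapter names, and the endpoint URL below are illustrative placeholders.

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
import requests

# 1. Wrap the frozen base model with a small LoRA adapter (PEFT).
base = AutoModelForCausalLM.from_pretrained("google/gemma-3-4b-it")  # placeholder model id
lora_cfg = LoraConfig(
    r=16,                                 # low-rank dimension: only a few million trainable params
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections are a common choice
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()        # confirms the base weights stay frozen

# ... fine-tune `model` on one task, then: model.save_pretrained("adapters/task-042")

# 2. At serving time, a multi-LoRA TGI deployment can route a request to a
#    specific adapter by name (field name assumed; check your TGI version).
resp = requests.post(
    "http://tgi-endpoint:8080/generate",  # placeholder endpoint
    json={
        "inputs": "Summarize this support ticket ...",
        "parameters": {"adapter_id": "task-042", "max_new_tokens": 256},
    },
)
print(resp.json())
```

The point of the pattern is that step 1 produces an adapter measured in megabytes rather than gigabytes, so step 2 can keep one copy of the base model in GPU memory and switch adapters per request instead of per deployment.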
Kateryna Hrytsaienko
Software Engineer Consultant at Valtech | Women Techmakers Ambassador | GDG Kyiv Co-Lead and Founder | Lecturer at Kyiv Polytechnic Institute and Kyiv School of Economics
Kyiv, Ukraine