One Brain, Many Skills: How to Serve 100 Models on One Foundation

Why move a mountain when you can just swap a brick? The first instinct for many MLOps enthusiasts is to deploy a full model for every specific task—but that’s an infrastructure nightmare.

The smarter way? Build your ML stack like a Lego set. In this talk, I’ll show you how to take one powerful foundation model like Gemma 3 and "snap on" tiny, specialized LoRA adapters to handle 100 different tasks. You get all the intelligence of a massive LLM without the heavy lifting.

What we will explore:

- Training LoRA adapters instead of re-training the whole model.
- Using TGI to hot-swap these specialized skills in milliseconds (see the sketch after this list).
- Orchestrating it all with GCP Inference Gateway to maximize efficiency.
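To make the first two points concrete, here is a minimal sketch of both halves, assuming the Hugging Face peft library for the training side and a TGI server launched with multi-LoRA serving enabled; the model id, adapter name, and server URL are placeholders, not details from the talk.

```python
# Training side: wrap a frozen base model with a small LoRA adapter (peft).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("google/gemma-3-1b-it")  # placeholder model id
lora = LoraConfig(
    r=16,                                 # adapter rank: only a few MB of new weights
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attach adapters to the attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)        # base weights stay frozen
model.print_trainable_parameters()        # typically well under 1% of the full model

# Serving side: ask a multi-LoRA TGI server to apply one specific adapter per request
# (assumes the server was started with these adapters preloaded, e.g. via the
# LORA_ADAPTERS setting).
import requests

resp = requests.post(
    "http://localhost:8080/generate",     # placeholder TGI endpoint
    json={
        "inputs": "Summarize this support ticket: ...",
        "parameters": {"adapter_id": "summarizer", "max_new_tokens": 128},
    },
)
print(resp.json()["generated_text"])
```

Because each adapter is only a few megabytes, many of them can sit alongside a single resident base model and be selected per request, which is what makes the "100 tasks on one foundation" setup practical.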

Kateryna Hrytsaienko

Software Engineer Consultant at Valtech | Women Techmakers Ambassador | GDG Kyiv Co-Lead and Founder | Lecturer at Kyiv Polytechnic Institute and Kyiv School of Economics

Kyiv, Ukraine
