
From Monolithic to Mosaic: Collaborative SLM Ecosystems for Cost-Efficient, Edge-Ready Solutions

Large Language Models excel at filtering, summarization, and code generation, but their heavy compute needs drive up costs and limit scalability. In this talk, we propose a lightweight alternative that moves away from monolithic LLMs to a modular ecosystem of open-source Small Language Models (SLMs) managed by a central Master Agent.

The Master Agent dynamically routes each request to specialized Worker Agents, each running an SLM (such as Phi-3 or Orca-mini) fine-tuned for a specific function. Distributing tasks among smaller models lowers resource consumption and cost.
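The master/worker pattern described above can be sketched in a few lines of Python. The worker names, the keyword-based router, and the stub responses below are illustrative assumptions, not the speaker's actual implementation; a production router would call real SLM inference endpoints.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class WorkerAgent:
    """Wraps one specialized SLM (e.g. a fine-tuned Phi-3 or Orca-mini)."""
    name: str
    handle: Callable[[str], str]  # stands in for an SLM inference call


class MasterAgent:
    """Routes each request to the worker whose specialty matches it."""

    def __init__(self, workers: Dict[str, WorkerAgent], fallback: WorkerAgent):
        self.workers = workers      # keyword -> specialized worker
        self.fallback = fallback    # generalist worker, also the failover path

    def route(self, request: str) -> str:
        # A real router might use embeddings or a classifier SLM;
        # keyword matching keeps this sketch self-contained and runnable.
        for keyword, worker in self.workers.items():
            if keyword in request.lower():
                return worker.handle(request)
        return self.fallback.handle(request)


# Stub workers standing in for fine-tuned SLMs.
summarizer = WorkerAgent("summarizer", lambda r: f"[summary] {r}")
coder = WorkerAgent("coder", lambda r: f"[code] {r}")
generalist = WorkerAgent("generalist", lambda r: f"[general] {r}")

master = MasterAgent({"summarize": summarizer, "code": coder}, fallback=generalist)
print(master.route("Please summarize this report"))  # handled by the summarizer
```

Because each worker sits behind the same callable interface, swapping one SLM for another, or redirecting traffic to the fallback when a worker fails, requires no change to the routing logic.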

This approach also meets the growing need for edge-compatible solutions. Compact SLMs can run on IoT devices and mobile apps, enabling low-latency, privacy-preserving, and even offline language processing.

Our implementation, written primarily in Python and supported by open-source frameworks such as LangChain and Hugging Face, demonstrates how modular specialization optimizes resource use, simplifies maintenance, and provides robust failover. Attendees will learn how to integrate this multi-agent framework into their own projects, gaining a flexible, affordable, and future-proof platform for advanced language processing.

Suvrakamal Das

Software Engineer @Mattoboard

San Francisco, California, United States


