
Creating your own LLM from open-source models

From "Simple" Fine-Tuning to your own Mixture of Expert Model Using Open Source Models

Nowadays, training a large language model (LLM) from scratch is a huge effort, even for very large companies. Starting from pre-trained models to create your own custom models is no longer just an option for resource-constrained organizations; it has become a necessary starting point for many.

In this context, various techniques and strategies can help to maximize the potential of pre-trained models:

- LoRA (Low-Rank Adaptation): a technique for efficient fine-tuning that trains small low-rank matrices injected into the model while the original weights stay frozen, so only a tiny fraction of the parameters is updated (see the first sketch after this list).
- Quantization and QLoRA: methods that reduce the memory footprint and compute cost of a model without significantly compromising its quality, making deployment and fine-tuning feasible on modest hardware; QLoRA pairs a 4-bit quantized base model with LoRA adapters (shown in the same sketch).
- Managing Multiple LoRA Adapters: keeping several LoRA adapters on a single base model to equip it with multiple skills, a flexible and modular approach to extending model capabilities (see the second sketch after this list).
- Fine-Grained Embedding Management to Improve RAG (Retrieval-Augmented Generation): careful handling of embeddings for document chunks and queries can significantly improve RAG systems, which combine the strengths of information retrieval and generative models (see the third sketch after this list).
- Mixing Models: Creating Your Own MoE (Mixture of Experts) Model: an advanced technique that combines several fine-tuned models into a Mixture of Experts, where a router sends each input to the most suitable expert, leveraging the strengths of each individual model (see the last sketch after this list).
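
As a concrete starting point, the sketch below shows a typical QLoRA-style setup with Hugging Face transformers, peft, and bitsandbytes: the base model is loaded in 4-bit precision and a LoRA adapter is attached on top. The model name, target modules, and hyperparameters are illustrative assumptions, not recommendations from the session.

```python
# Minimal QLoRA-style sketch: 4-bit quantized base model + LoRA adapter.
# Model id and hyperparameters below are assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = "mistralai/Mistral-7B-v0.1"  # assumed base model; any causal LM works

# 4-bit NF4 quantization keeps the frozen base weights small in GPU memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA trains small low-rank matrices injected into the attention projections;
# the quantized base model stays frozen.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters
```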
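
Once adapters exist, several of them can share one base model. The sketch below assumes two separately trained adapters (the paths and names are placeholders) and switches between them at inference time.

```python
# Minimal sketch of managing multiple LoRA "skills" on one base model.
# Adapter paths and names are placeholders for adapters trained separately.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", device_map="auto"
)

# Load a first adapter under an explicit name.
model = PeftModel.from_pretrained(
    base, "adapters/summarization", adapter_name="summarization"
)
# Attach a second adapter to the same base model.
model.load_adapter("adapters/sql-generation", adapter_name="sql")

# Activate whichever skill the current request needs; only the small adapter
# weights change, the base model stays loaded once.
model.set_adapter("sql")
# ... run generation for SQL tasks ...
model.set_adapter("summarization")
# ... run generation for summarization tasks ...
```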
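
For the RAG side, the sketch below shows the embedding workflow in its simplest form: encode document chunks with a sentence-embedding model, then rank them against a query by cosine similarity. The embedding model and the example chunks are assumptions for illustration.

```python
# Minimal sketch of the embedding side of a RAG pipeline.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

chunks = [
    "LoRA adds trainable low-rank matrices to a frozen model.",
    "QLoRA combines 4-bit quantization with LoRA fine-tuning.",
    "A Mixture of Experts routes inputs to specialized sub-networks.",
]
# Normalized embeddings make the dot product equal to cosine similarity.
chunk_vecs = embedder.encode(chunks, normalize_embeddings=True)

query = "How can I fine-tune a quantized model cheaply?"
query_vec = embedder.encode([query], normalize_embeddings=True)[0]

scores = chunk_vecs @ query_vec
best = np.argsort(-scores)[:2]
for i in best:
    print(f"{scores[i]:.3f}  {chunks[i]}")  # top chunks go into the LLM prompt
```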
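
Finally, a minimal plain-PyTorch sketch of the routing idea behind a Mixture of Experts layer: a small gating network picks the top experts for each token and mixes their outputs. This is only the underlying mechanism, not the session's exact recipe for combining fine-tuned models; dimensions and expert count are arbitrary.

```python
# Minimal sketch of top-k expert routing, the core mechanism of an MoE layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, num_experts=4, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)  # gating network
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
            )
            for _ in range(num_experts)
        ])

    def forward(self, x):                      # x: (batch, seq, d_model)
        gate_logits = self.router(x)           # (batch, seq, num_experts)
        weights, indices = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over chosen experts
        out = torch.zeros_like(x)
        # Note: for clarity every expert runs on every token here; real MoE
        # implementations dispatch only the tokens routed to each expert.
        for slot in range(self.top_k):
            idx = indices[..., slot]                       # chosen expert id
            w = weights[..., slot].unsqueeze(-1)           # its mixing weight
            for e, expert in enumerate(self.experts):
                mask = (idx == e).unsqueeze(-1)
                out = out + mask * w * expert(x)
        return out

moe = SparseMoE()
tokens = torch.randn(2, 8, 512)
print(moe(tokens).shape)  # torch.Size([2, 8, 512])
```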

These strategies provide a robust toolkit for anyone who wants to adapt and enhance pre-trained models to meet specific needs, even without deep expertise in machine learning. By understanding and applying these techniques, organizations can harness the power of modern AI with greater efficiency and effectiveness while cutting costs.

Sebastiano Galazzo

Artificial intelligence researcher and proud dad

Milan, Italy
