Session

Pre-training vs. Fine-tuning in Transformers: What Works Best?

This talk will explore the interplay between **pre-training** large-scale Transformer models on general corpora and subsequently **fine-tuning** them on domain-specific tasks to achieve state-of-the-art performance. We will examine the trade-offs between these two stages, focusing on challenges such as overfitting, computational demands, and maintaining model generalization. Through empirical results from legal text analysis, we will demonstrate how fine-tuning can significantly enhance model performance, particularly when combined with strategies such as data augmentation and retraining on incorrect predictions. Attendees will gain a deeper understanding of how to leverage pre-training and fine-tuning effectively for specialized tasks, optimizing performance while addressing practical challenges in real-world applications.
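
To ground the workflow the abstract describes, here is a minimal sketch of the pre-train-then-fine-tune pattern using Hugging Face Transformers. The checkpoint name, the binary label set, the toy legal-text snippets, and the hyperparameters are all illustrative assumptions, not details from the talk itself:

```python
# Minimal sketch: fine-tune a generally pre-trained Transformer on a
# domain-specific classification task. All specifics below are assumed
# for illustration, not taken from the speaker's experiments.
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Stage 1: load weights pre-trained on a general corpus.
checkpoint = "bert-base-uncased"  # assumed checkpoint; any encoder works
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Stage 2: fine-tune on a small domain-specific dataset (two toy
# legal-text examples with made-up binary labels).
texts = [
    "The lessee shall indemnify the lessor against all claims.",
    "This agreement may be terminated by either party with notice.",
]
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = AdamW(model.parameters(), lr=2e-5)  # small LR helps preserve
                                                # pre-trained generalization

model.train()
for epoch in range(3):
    outputs = model(**batch, labels=labels)  # forward pass returns the loss
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In the same spirit as the abstract's "retraining on incorrect predictions," one would typically collect the examples the fine-tuned model misclassifies and fold them back into the training set for further epochs.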

Sweta Patra

Software Engineer by profession, Technology Enthusiast at heart.

Pune, India
