Multilingual AI: Trans-Tokenization and ZenML for Better Fine-Tuning

Ever wondered why fine-tuning machine learning models for multilingual applications often leads to subpar results? The problem often lies in token representation: a vocabulary and embeddings learned mostly from one language adapt poorly to others. The solution? Trans-tokenization.

In this talk, I will explore how trans-tokenization addresses these challenges by aligning token embeddings between languages, which improves adaptation and narrows performance gaps in low-resource languages. I will demonstrate how this technique, combined with ZenML, strengthens the fine-tuning process and yields more accurate, efficient, and scalable models within the CNCF ecosystem.

Using an English-to-Dutch translation task as a case study, I’ll walk you through the integration of trans-tokenization with ZenML, offering practical steps to push the boundaries of multilingual AI in cloud-native applications. This talk is perfect for those looking to optimize models for global deployment.

Suvrakamal Das

Machine Learning Engineer

Kolkata, India
