T3: Everything you need to know about Multilingual LLMs: Towards fair, performant and reliable models for languages of the world

Barun Patra, Kabir Ahuja, Kalika Bali, Monojit Choudhury, Sunayana Sitaram, Vishrav Chaudhary

Abstract: This tutorial will describe various aspects of scaling up language technologies to many of the world's languages, drawing on the latest research in Massively Multilingual Language Models (MMLMs). We will cover data collection, training and fine-tuning of models, Responsible AI issues such as fairness, bias and toxicity, and linguistic diversity and evaluation in the context of MMLMs, focusing specifically on issues in non-English and low-resource languages. We will also discuss some of the real-world challenges of deploying these models in language communities in the field. With the zero-shot performance of MMLMs improving for many languages, it is becoming feasible to use them to build language technologies for many of the world's languages, and this tutorial will provide the computational linguistics community with unique insights from the latest research in multilingual models.

Time | Event | Hosts
Sunday, 09:00 | T3: Everything you need to know about Multilingual LLMs: Towards fair, performant and reliable models for languages of the world | Barun Patra, Kabir Ahuja, Kalika Bali, Monojit Choudhury, Sunayana Sitaram, Vishrav Chaudhary
Information about the virtual format of this tutorial: