Jörg Tiedemann: Releasing the MAMMOTH - a framework for large-scale modular multilingual NLP models

Abstract: Neural language models have grown in size and importance over the past years. We address two challenging aspects of NLP: support for a wide variety of languages and the runtime efficiency of such models. We focus on encoder-decoder models and modular architectures that balance task-specific components against parameter sharing. In particular, we want to achieve effective cross-lingual transfer learning while keeping language-specific modules that can operate independently. The latter is important for efficient inference, reducing computational costs and energy consumption at runtime, a crucial concern for modern NLP.
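
To make the modular idea concrete, here is a minimal toy sketch (not the MAMMOTH API; all class names, dimensions, and the shared-embedding shortcut are illustrative assumptions): each input is routed through a language-specific encoder, then a shared middle stack where cross-lingual transfer can happen, and finally a language-specific decoder.

```python
# Toy sketch of a modular encoder-decoder: language-specific modules around
# a shared stack. Not the MAMMOTH implementation; names and sizes are assumed.
import torch
import torch.nn as nn


class ModularTranslationModel(nn.Module):
    def __init__(self, langs, vocab_size=32000, d_model=512):
        super().__init__()
        # One encoder stack per source language (task-specific parameters).
        self.encoders = nn.ModuleDict({
            lang: nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True),
                num_layers=2)
            for lang in langs})
        # Shared middle layers: where cross-lingual transfer takes place.
        self.shared = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True),
            num_layers=4)
        # One decoder stack per target language (task-specific parameters).
        self.decoders = nn.ModuleDict({
            lang: nn.TransformerDecoder(
                nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True),
                num_layers=2)
            for lang in langs})
        # A single shared embedding/output layer keeps the sketch small.
        self.embed = nn.Embedding(vocab_size, d_model)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, tgt_ids, src_lang, tgt_lang):
        # Route through the source language's encoder, then the shared stack.
        memory = self.shared(self.encoders[src_lang](self.embed(src_ids)))
        # Decode with the target language's own module.
        return self.out(self.decoders[tgt_lang](self.embed(tgt_ids), memory))


model = ModularTranslationModel(langs=["en", "fi", "de"])
src = torch.randint(0, 32000, (2, 16))
tgt = torch.randint(0, 32000, (2, 12))
print(model(src, tgt, src_lang="en", tgt_lang="fi").shape)  # (2, 12, 32000)
```

Because the encoders and decoders are independent modules, the ones not needed for a given language pair can simply be left unloaded at inference time, which is where the runtime savings come from.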

Special care is taken to optimize the scalability of multi-node training on large HPC clusters such as LUMI. I will report on the current stage of our research, including initial results, our efforts in hyper-parameter tuning, the optimization of modular architectures, scalability benchmarks, and the final goal of training a large-scale multilingual translation model on massively parallel data sets.
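
Purely as an illustration of what multi-node training on such a cluster involves (this is not the MAMMOTH launcher; the standard torchrun conventions are assumed), a minimal PyTorch distributed setup might look as follows:

```python
# Hedged sketch of a standard multi-node PyTorch setup under a SLURM-managed
# cluster such as LUMI, with one process per GPU launched via torchrun.
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def setup_distributed() -> int:
    """Join the process group; torchrun exports RANK/WORLD_SIZE/LOCAL_RANK."""
    dist.init_process_group(backend="nccl")  # maps to RCCL on LUMI's AMD GPUs
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    return local_rank


if __name__ == "__main__":
    local_rank = setup_distributed()
    # Stand-in for the real model; DDP synchronizes gradients across nodes.
    model = torch.nn.Linear(512, 512).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])
    # ... training loop elided; launch with e.g.:
    # torchrun --nnodes=4 --nproc_per_node=8 train.py
    dist.destroy_process_group()
```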

Speaker: Jörg Tiedemann is a professor of language technology at the Department of Digital Humanities at the University of Helsinki. His main research interests are cross-lingual NLP and machine translation.