Tips and tricks for distributed large model training

Discover several distribution strategies and related concepts for data-parallel and model-parallel training. Walk through an example of training a 39-billion-parameter language model on TPUs, and conclude with the challenges and best practices of orchestrating large-scale language model training.
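
The talk itself does not publish code, but the core idea of synchronous data parallelism it covers can be sketched in a few lines of JAX: replicate the parameters on every device, give each device its own slice of the global batch, and average the gradients across devices before applying the update. Everything below (the toy linear model, loss_fn, train_step, the batch shapes, and the learning rate) is a hypothetical illustration written for this description, not code from the session.

import functools

import jax
import jax.numpy as jnp


def loss_fn(params, batch):
    # Toy linear model; a real workload would be a large Transformer.
    preds = batch["x"] @ params["w"] + params["b"]
    return jnp.mean((preds - batch["y"]) ** 2)


@functools.partial(jax.pmap, axis_name="devices")
def train_step(params, batch):
    # Each device computes gradients on its local shard of the global batch.
    grads = jax.grad(loss_fn)(params, batch)
    # All-reduce (mean) the gradients so every replica applies the same update;
    # this collective is what makes the step synchronous data parallelism.
    grads = jax.lax.pmean(grads, axis_name="devices")
    return jax.tree_util.tree_map(lambda p, g: p - 1e-2 * g, params, grads)


n_devices = jax.local_device_count()
features = 8

# Replicate the parameters onto every local device (TPU cores or GPUs).
params = {"w": jnp.zeros((features, 1)), "b": jnp.zeros((1,))}
params = jax.device_put_replicated(params, jax.local_devices())

# Shard the global batch along a leading device axis:
# (n_devices, per_device_batch, ...).
batch = {
    "x": jnp.ones((n_devices, 32, features)),
    "y": jnp.ones((n_devices, 32, 1)),
}
params = train_step(params, batch)

For models that no longer fit on a single device, the same program can be extended with model parallelism (sharding the parameters themselves across devices), which is the second half of the talk.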

Speakers: Nikita Namjoshi, Vaibhav Singh

#GoogleIO
Comments

Thanks! Could you also share a link to a data parallelism implementation?

roy

This video, especially the second half, was a bit difficult to follow given how much knowledge it assumed.

wryltxw