Distributed Deep Neural Network Training using MPI on Python

Arpan Jain, Kawthar Shafie Khorassani

Deep learning models are a subset of machine learning models and algorithms designed to give computers artificial intelligence. The rise of deep learning can be attributed to the availability of large datasets and growing computational power. Deep learning models are used in face recognition, speech recognition, and many other applications. TensorFlow is a popular deep learning framework for Python, used to implement and train Deep Neural Networks (DNNs). Message Passing Interface (MPI) is a message-passing standard, often used in parallel applications, that allows processes to communicate with each other. Horovod provides a Python interface that couples DNNs written in TensorFlow with MPI, so that DNNs can be trained in less time using a distributed training approach. MPI libraries provide optimized communication routines, including point-to-point and collective communication. Point-to-point communication involves a sender process and a receiver process, while collective communication involves a group of processes exchanging messages. In particular, reduction is a collective operation widely used in deep learning to combine values, such as gradients, across a group of processes. In this talk, we intend to demonstrate the challenges and elements to consider for DNN training using MPI in Python.
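To make the difference between point-to-point and collective communication concrete, here is a minimal sketch of a sum-reduction shared with every process (an "allreduce") built from individual send/receive pairs. It is a toy, in-process simulation in plain Python, not real MPI and not the speakers' code: `SimulatedComm`, `allreduce_sum`, and the sample gradient values are all hypothetical names and numbers chosen for illustration. A real program would call the collective routine of an MPI binding or Horovod instead of hand-rolling it.

```python
from collections import deque


class SimulatedComm:
    """Toy in-process stand-in for an MPI communicator (assumption: not real MPI)."""

    def __init__(self, size):
        self.size = size
        # mailboxes[(src, dst)] holds messages in flight from rank src to rank dst.
        self.mailboxes = {
            (src, dst): deque() for src in range(size) for dst in range(size)
        }

    def send(self, src, dst, msg):
        # Point-to-point: exactly one sender and one receiver.
        self.mailboxes[(src, dst)].append(msg)

    def recv(self, src, dst):
        return self.mailboxes[(src, dst)].popleft()


def allreduce_sum(comm, local_values):
    """Sum one vector per rank and hand every rank the result.

    Collective operation built from point-to-point messages:
    reduce everything to rank 0, then broadcast the total back.
    """
    root = 0
    # Reduce phase: every non-root rank sends its vector to the root.
    for rank in range(1, comm.size):
        comm.send(rank, root, local_values[rank])
    total = list(local_values[root])
    for rank in range(1, comm.size):
        incoming = comm.recv(rank, root)
        total = [a + b for a, b in zip(total, incoming)]
    # Broadcast phase: the root sends the total to every other rank.
    for rank in range(1, comm.size):
        comm.send(root, rank, total)
    return [total] + [comm.recv(root, rank) for rank in range(1, comm.size)]


# Hypothetical per-process gradients for a two-parameter model on 4 processes.
local_grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
comm = SimulatedComm(size=4)
summed = allreduce_sum(comm, local_grads)
# Dividing the sum by the number of processes gives the averaged gradient.
averaged = [[g / comm.size for g in vec] for vec in summed]
print(averaged[0])  # [4.0, 5.0]
```

Reducing to a single root and broadcasting back is the simplest way to express the idea; production MPI libraries typically use more scalable schemes, which is one reason to rely on the library's collective rather than a hand-written one.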

Deep Learning (DL) has attracted a lot of attention in recent years, and Python has been the front-runner language for DL frameworks and implementations. Training DL models remains a challenge because it requires a huge amount of time and computational resources. We will discuss distributed training of Deep Neural Networks using MPI across multiple GPUs or CPUs.
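One common way to spread training across several GPUs or CPUs is data parallelism: each worker keeps a replica of the model, computes gradients on its own shard of the data, and the gradients are averaged before every replica applies the same update. The sketch below simulates that loop in plain Python for a one-parameter linear model. It assumes nothing about the approach shown in the talk; `local_gradient`, `train_data_parallel`, the learning rate, and the synthetic data are hypothetical, and the in-process average stands in for the reduction a real MPI-based setup would perform.

```python
def local_gradient(w, shard):
    """Gradient of the mean of 0.5 * (w * x - y) ** 2 over one worker's shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)


def train_data_parallel(shards, lr=0.01, steps=200):
    """Simulate synchronous data-parallel SGD with one model replica per worker."""
    num_workers = len(shards)
    weights = [0.0] * num_workers  # every replica starts from the same value
    for _ in range(steps):
        # Each worker computes a gradient on its own shard only.
        grads = [local_gradient(weights[r], shards[r]) for r in range(num_workers)]
        # Stand-in for an averaging reduction across workers.
        avg_grad = sum(grads) / num_workers
        # Every replica applies the identical update, so replicas stay in sync.
        weights = [w - lr * avg_grad for w in weights]
    return weights


# Synthetic data for y = 3x, split evenly across 4 simulated workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[i::4] for i in range(4)]
weights = train_data_parallel(shards)
print(weights)  # all replicas hold the same value, close to 3.0
```

Because the shards here are equal in size, the averaged gradient equals the full-batch gradient, so the simulated workers follow the same trajectory a single process would, while each one only touches a quarter of the data per step.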

===

A FREE annual conference for anyone interested in Python in and around Ohio, the entire Midwest, maybe even the whole world.

Sat Jul 27 11:15:00 2019 at Hays Cape
Comments

Can I use an Nvidia GeForce GT 720, and can I add a single-node, single-GPU setup for DDL?

kyawthuraoo