Training Distributed Deep Recurrent Neural Networks with Mixed Precision on GPU Clusters

Alexey Svyatkovskiy is a Data Scientist at Microsoft. In this talk, we evaluate the training of deep recurrent neural networks with half-precision floats on Pascal and Volta GPUs. We implement a distributed, data-parallel, synchronous training algorithm that integrates TensorFlow with CUDA-aware MPI, enabling execution across multiple GPU nodes over high-speed interconnects. We also introduce a learning rate schedule that facilitates convergence at up to O(100) workers.
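The description gives no code, so the following is only a minimal sketch of what one synchronous, mixed-precision, data-parallel step could look like. It uses mpi4py with NumPy as a stand-in for the talk's TensorFlow plus CUDA-aware MPI integration; the loss scale, the placeholder gradient function, and the warmup-with-linear-scaling schedule are all illustrative assumptions, not the speaker's implementation.

```python
# Illustrative sketch only: fp16 compute gradients, fp32 master weights,
# static loss scaling, and a synchronous gradient all-reduce over MPI.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
LOSS_SCALE = 1024.0  # shifts small fp16 gradients away from underflow (assumed value)
master_w = np.zeros(1000, dtype=np.float32)  # fp32 master copy of the weights

def local_gradient(w16):
    # Placeholder for the fp16 forward/backward pass on this rank's data shard.
    return np.random.randn(w16.size).astype(np.float16)

def lr_for_step(step, base_lr=0.01, warmup_steps=500):
    # Assumed schedule: linear scaling by worker count with a warmup ramp,
    # a common recipe for synchronous training at O(100) workers.
    scaled = base_lr * comm.Get_size()
    return scaled * min(1.0, (step + 1) / warmup_steps)

def train_step(step):
    w16 = master_w.astype(np.float16)               # fp16 copy used for compute
    g16 = local_gradient(w16) * np.float16(LOSS_SCALE)
    g32 = g16.astype(np.float32)                    # widen before reducing
    g_sum = np.empty_like(g32)
    comm.Allreduce(g32, g_sum, op=MPI.SUM)          # synchronous all-reduce
    g_avg = g_sum / (comm.Get_size() * LOSS_SCALE)  # undo scaling, average ranks
    master_w[:] -= lr_for_step(step) * g_avg        # update kept in fp32

for step in range(10):
    train_step(step)
```

Launched with something like mpiexec -n 4 python train_sketch.py, each rank averages gradients every step; with a CUDA-aware MPI build the all-reduce can operate on GPU buffers directly, avoiding a round trip through host memory.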

About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering, and business.
