Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
In this video from the 2018 Swiss HPC Conference, Torsten Hoefler from ETH Zürich presents: Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis.
"Deep Neural Networks (DNNs) are becoming an important tool in modern computing applications. Accelerating their training is a major challenge and techniques range from distributed algorithms to low-level circuit design. In this talk, we describe the problem from a theoretical perspective, followed by approaches for its parallelization.
Specifically, we present trends in DNN architectures and the resulting implications on parallelization strategies. We discuss the different types of concurrency in DNNs; synchronous and asynchronous stochastic gradient descent; distributed system architectures; communication schemes; and performance modeling. Based on these approaches, we extrapolate potential directions for parallelism in deep learning."
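The abstract contrasts synchronous and asynchronous stochastic gradient descent and the communication schemes behind them. As a rough illustration of the synchronous, data-parallel case, the sketch below simulates per-worker gradient computation and allreduce-style averaging on a single process with NumPy. The worker count, learning rate, and toy regression problem are illustrative assumptions, not details from the talk.

```python
import numpy as np

# Minimal single-process simulation of synchronous data-parallel SGD.
# Each "worker" holds a shard of the batch, computes a local gradient of a
# shared linear model, and the gradients are averaged (the allreduce step)
# before one global parameter update. All names (num_workers, lr, etc.)
# are illustrative.

rng = np.random.default_rng(0)
num_workers = 4
lr = 0.1

# Synthetic linear-regression data: y = X @ w_true + noise
X = rng.normal(size=(256, 8))
w_true = rng.normal(size=8)
y = X @ w_true + 0.01 * rng.normal(size=256)

w = np.zeros(8)  # replicated model parameters (identical on every worker)

for step in range(100):
    # Shard the batch across workers (data parallelism).
    shards = np.array_split(np.arange(X.shape[0]), num_workers)

    # Each worker computes the gradient of the squared loss on its shard.
    local_grads = []
    for idx in shards:
        Xi, yi = X[idx], y[idx]
        resid = Xi @ w - yi
        local_grads.append(Xi.T @ resid / len(idx))

    # Allreduce: average the local gradients. In a real system this is the
    # communication step (e.g. MPI_Allreduce or a NCCL ring-allreduce).
    grad = np.mean(local_grads, axis=0)

    # Synchronous update: every replica applies the same averaged gradient,
    # so the model copies stay consistent across workers.
    w -= lr * grad

print("parameter error:", np.linalg.norm(w - w_true))
```

In an asynchronous scheme, by contrast, each worker would push its local gradient to a parameter server and pull parameters without waiting for the others, trading gradient staleness for reduced synchronization cost.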
"Deep Neural Networks (DNNs) are becoming an important tool in modern computing applications. Accelerating their training is a major challenge and techniques range from distributed algorithms to low-level circuit design. In this talk, we describe the problem from a theoretical perspective, followed by approaches for its parallelization.
Specifically, we present trends in DNN architectures and the resulting implications on parallelization strategies. We discuss the different types of concurrency in DNNs; synchronous and asynchronous stochastic gradient descent; distributed system architectures; communication schemes; and performance modeling. Based on these approaches, we extrapolate potential directions for parallelism in deep learning."
High-Performance Communication Strategies in Parallel and Distributed Deep Learning
Parallel/Distributed Deep Learning and CDSW
A friendly introduction to distributed training (ML Tech Talks)
Parallel Training of Deep Networks with Local Updates
Alpa: Automating Inter- and Intra- Operator Parallelism for Distributed Deep Learning
OSDI '22 - Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning
An Uber Journey in Distributed Deep Learning
Demystifying Machine and Deep Learning for Developers : Build 2018
On the Acceleration of Deep Learning Model Parallelism With Staleness
Distributed Deep Neural Network Training using MPI on Python
A Resource Efficient Distributed Deep Learning Method without Sensitive Data Sharing | MIT
Model vs Data Parallelism in Machine Learning
OSDI '21 - P3: Distributed Deep Graph Learning at Scale
Distributed Deep Learning with Apache Spark and TensorFlow with Jim Dowling (Logical Clocks AB)
Tackling the Communication Bottlenecks of Distributed Deep Learning Training Workloads
Efficient Distributed Deep Learning Using MXNet
Weekly #94: Distributed deep learning
AutoML20: Demystifying NAS in Theory and Practice
[SPCL_Bcast] Distributed Deep Learning with Second Order Information
Neil Gibbons - Demystifying Spark: A Deep Dive into Its Workings - SPS24
Generalized Pipeline Parallelism for DNN Training
[Uber Seattle] Horovod: Distributed Deep Learning on Spark
Distributed deep learning and why you may not need it - Jakub Sanojca, Mikuláš Zelinka