Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
In this video from the 2018 Swiss HPC Conference, Torsten Hoefler from ETH Zürich presents: Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis.
"Deep Neural Networks (DNNs) are becoming an important tool in modern computing applications. Accelerating their training is a major challenge and techniques range from distributed algorithms to low-level circuit design. In this talk, we describe the problem from a theoretical perspective, followed by approaches for its parallelization.
Specifically, we present trends in DNN architectures and the resulting implications on parallelization strategies. We discuss the different types of concurrency in DNNs; synchronous and asynchronous stochastic gradient descent; distributed system architectures; communication schemes; and performance modeling. Based on these approaches, we extrapolate potential directions for parallelism in deep learning."
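The abstract contrasts synchronous and asynchronous stochastic gradient descent and the communication schemes behind them. As a rough illustration of the synchronous, data-parallel case, the sketch below simulates per-worker gradient computation and allreduce-style averaging on a single process with NumPy. The worker count, learning rate, and toy regression problem are illustrative assumptions, not details from the talk.

```python
import numpy as np

# Minimal single-process simulation of synchronous data-parallel SGD.
# Each "worker" holds a shard of the batch, computes a local gradient of a
# shared linear model, and the gradients are averaged (the allreduce step)
# before one global parameter update. All names (num_workers, lr, etc.)
# are illustrative.

rng = np.random.default_rng(0)
num_workers = 4
lr = 0.1

# Synthetic linear-regression data: y = X @ w_true + noise
X = rng.normal(size=(256, 8))
w_true = rng.normal(size=8)
y = X @ w_true + 0.01 * rng.normal(size=256)

w = np.zeros(8)  # replicated model parameters (identical on every worker)

for step in range(100):
    # Shard the batch across workers (data parallelism).
    shards = np.array_split(np.arange(X.shape[0]), num_workers)

    # Each worker computes the gradient of the squared loss on its shard.
    local_grads = []
    for idx in shards:
        Xi, yi = X[idx], y[idx]
        resid = Xi @ w - yi
        local_grads.append(Xi.T @ resid / len(idx))

    # Allreduce: average the local gradients. In a real system this is the
    # communication step (e.g. MPI_Allreduce or a NCCL ring-allreduce).
    grad = np.mean(local_grads, axis=0)

    # Synchronous update: every replica applies the same averaged gradient,
    # so the model copies stay consistent across workers.
    w -= lr * grad

print("parameter error:", np.linalg.norm(w - w_true))
```

In an asynchronous scheme, by contrast, each worker would push its local gradient to a parameter server and pull parameters without waiting for the others, trading gradient staleness for reduced synchronization cost.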
"Deep Neural Networks (DNNs) are becoming an important tool in modern computing applications. Accelerating their training is a major challenge and techniques range from distributed algorithms to low-level circuit design. In this talk, we describe the problem from a theoretical perspective, followed by approaches for its parallelization.
Specifically, we present trends in DNN architectures and the resulting implications on parallelization strategies. We discuss the different types of concurrency in DNNs; synchronous and asynchronous stochastic gradient descent; distributed system architectures; communication schemes; and performance modeling. Based on these approaches, we extrapolate potential directions for parallelism in deep learning."
High-Performance Communication Strategies in Parallel and Distributed Deep Learning
Parallel/Distributed Deep Learning and CDSW
A friendly introduction to distributed training (ML Tech Talks)
Parallel Training of Deep Networks with Local Updates
Alpa: Automating Inter- and Intra- Operator Parallelism for Distributed Deep Learning
OSDI '22 - Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning
An Uber Journey in Distributed Deep Learning
Demystifying Machine and Deep Learning for Developers : Build 2018
On the Acceleration of Deep Learning Model Parallelism With Staleness
Distributed Deep Neural Network Training using MPI on Python
A Resource Efficient Distributed Deep Learning Method without Sensitive Data Sharing | MIT
Model vs Data Parallelism in Machine Learning
OSDI '21 - P3: Distributed Deep Graph Learning at Scale
Distributed Deep Learning with Apache Spark and TensorFlow with Jim Dowling (Logical Clocks AB)
Tackling the Communication Bottlenecks of Distributed Deep Learning Training Workloads
Efficient Distributed Deep Learning Using MXNet
Weekly #94: Distributed deep learning
AutoML20: Demystifying NAS in Theory and Practice
[SPCL_Bcast] Distributed Deep Learning with Second Order Information
Neil Gibbons - Demystifying Spark: A Deep Dive into Its Workings - SPS24
Generalized Pipeline Parallelism for DNN Training
[Uber Seattle] Horovod: Distributed Deep Learning on Spark
Distributed deep learning and why you may not need it - Jakub Sanojca, Mikuláš Zelinka