Sanjeev Arora: Toward Theoretical Understanding of Deep Learning (ICML 2018 tutorial)

Audio starts at 1:46

Abstract:
We survey progress in recent years toward developing a theory of deep learning. Recent work has begun to address issues such as: (a) the effect of architecture choices on the optimization landscape, training speed, and expressiveness; (b) quantifying the true "capacity" of a network, as a step toward understanding why nets with far more parameters than training examples nevertheless do not overfit; (c) understanding the inherent power and limitations of deep generative models, especially (various flavors of) generative adversarial networks (GANs); (d) understanding the properties of simple RNN-style language models and some of their solutions (word embeddings and sentence embeddings).

While these are early results, they help illustrate what kind of theory may ultimately arise for deep learning.
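
As a flavor of topic (d), covered in Part 5 ("deep learning-free text embeddings"), here is a minimal sketch in the spirit of the smooth-inverse-frequency (SIF) sentence embeddings from Arora's group: a frequency-weighted average of word vectors followed by removal of the top principal component. The vocabulary, word vectors, unigram frequencies, and the smoothing constant a below are toy assumptions for illustration, not values from the tutorial.

# Toy sketch of an SIF-style sentence embedding: weight each word vector
# by a / (a + p(w)), average, then remove the top singular direction.
# All data here (vocab, vectors, frequencies, a) are made-up assumptions.
import numpy as np

def sif_embeddings(sentences, word_vecs, word_freq, a=1e-3):
    embs = []
    for sent in sentences:
        words = [w for w in sent.split() if w in word_vecs]
        # Smooth inverse-frequency weights: rare words count more.
        weights = np.array([a / (a + word_freq.get(w, 0.0)) for w in words])
        vecs = np.array([word_vecs[w] for w in words])
        embs.append(weights @ vecs / len(words))  # weighted average
    X = np.array(embs)
    # Remove the common component: projection onto the top right-singular vector.
    u = np.linalg.svd(X, full_matrices=False)[2][0]
    return X - np.outer(X @ u, u)

rng = np.random.default_rng(0)
vocab = ["deep", "learning", "theory", "nets", "generalize"]
word_vecs = {w: rng.normal(size=5) for w in vocab}
word_freq = dict(zip(vocab, [0.010, 0.010, 0.002, 0.005, 0.001]))
print(sif_embeddings(["deep learning theory", "nets generalize"], word_vecs, word_freq))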

Presented by Sanjeev Arora (Princeton University and the Institute for Advanced Study)

Timestamps:
00:06:15 Talk Overview
00:08:20 Part 1: Optimization in deep learning
00:31:19 Part 2: Overparametrization and Generalization theory
01:09:42 Part 3: Role of Depth
01:17:15 Part 4: Theory for Generative Models and Generative Adversarial Nets (GANs)
01:32:49 Part 5: Deep learning-free text embeddings
01:51:51 Q & A
