Equilibrium approaches to deep learning: One (implicit) layer is all you need

Speaker: Zico Kolter, Carnegie Mellon University

Machine Learning Advances and Applications Seminar

Abstract: Does deep learning actually need to be deep? In this talk, I will present some of our recent and ongoing work on Deep Equilibrium (DEQ) Models, an approach that demonstrates we can achieve most of the benefits of modern deep learning systems using very shallow models, but ones which are defined implicitly via finding a fixed point of a nonlinear dynamical system. I will show that these methods can achieve results on par with the state of the art in domains spanning large-scale language modeling, image classification, and semantic segmentation, while requiring less memory and simplifying architectures substantially. I will also highlight some recent work analyzing the theoretical properties of these systems, where we show that certain classes of DEQ models are guaranteed to have a unique fixed point, easily-controlled Lipschitz constants, and efficient algorithms for finding the equilibria. I will conclude by discussing ongoing work and future directions for these classes of models.
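
To make the idea of a layer "defined implicitly via finding a fixed point" concrete, here is a minimal sketch, not the speaker's implementation. It assumes a toy layer f(z, x) = tanh(Wz + Ux + b) and solves z* = f(z*, x) by plain repeated application. The names `layer` and `solve_fixed_point`, the weights, and the input are all hypothetical and chosen only for illustration. The weights are picked so that the map is a contraction, which is one simple (assumed) way to get a unique fixed point; it is not necessarily the condition analyzed in the talk.

```python
import math

def layer(z, x, W, U, b):
    """One application of the toy layer f(z, x) = tanh(W z + U x + b)."""
    out = []
    for i in range(len(b)):
        s = b[i]
        s += sum(W[i][j] * z[j] for j in range(len(z)))
        s += sum(U[i][k] * x[k] for k in range(len(x)))
        out.append(math.tanh(s))
    return out

def solve_fixed_point(x, W, U, b, tol=1e-10, max_iter=1000):
    """Iterate z <- f(z, x) from z = 0 until successive iterates agree within tol."""
    z = [0.0] * len(b)
    for it in range(1, max_iter + 1):
        z_next = layer(z, x, W, U, b)
        residual = max(abs(a - c) for a, c in zip(z_next, z))
        z = z_next
        if residual < tol:
            return z, it
    raise RuntimeError("fixed-point iteration did not converge")

# Hypothetical toy parameters. Each row of W has absolute sum below 1 and
# tanh is 1-Lipschitz, so f is a contraction in the max norm and the
# iteration converges to a unique fixed point.
W = [[0.2, -0.1], [0.05, 0.3]]
U = [[1.0, 0.5], [-0.3, 0.8]]
b = [0.1, -0.2]
x = [0.5, -1.0]

z_star, iters = solve_fixed_point(x, W, U, b)
print("equilibrium:", z_star, "iterations:", iters)
```

The output of this single implicit layer is the equilibrium `z_star` itself, which plays the role of the final hidden state of a very deep weight-tied network. Naive iteration is used here only for clarity; it is an assumption of this sketch, and faster root-finding methods could be substituted without changing the definition of the layer.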