Implicit Geometry of Large Classification Models: Imbalance Trouble
Christos Thrampoulidis, Assistant Professor,
Electrical and Computer Engineering
University of British Columbia
Abstract: What are the unique structural properties of models learned by deep neural network classifiers? Is there an implicit bias towards solutions of a certain geometry and how does this vary across architectures and data? Specifically, how does this geometry change under label imbalances, and is it possible to use this information to design better loss functions for learning with imbalances?
We address these questions by first investigating the implicit geometry of classifiers and last-layer embeddings learnt by deep-nets when trained beyond zero training-error with the cross-entropy (CE) loss. By characterizing the global optima of an unconstrained-features model, we formulate the SELI geometry and argue experimentally that it approximates the implicit geometry learnt by the deep-net well. Importantly, the SELI geometry remains invariant across different imbalance levels while having a simple explicit description, despite the asymmetries imposed by data imbalances on the geometric properties of different classes (minorities/majorities).
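For orientation, an unconstrained-features model treats the last-layer embeddings as free optimization variables alongside the classifier weights. The abstract does not state the exact objective used in the talk; one commonly studied ridge-regularized form, given here only as an assumption, is shown below. The symbols are illustrative: W is the k-by-d classifier with rows w_c, H is the d-by-n matrix of embeddings with columns h_i, y_i is the label of sample i, and λ is a hypothetical regularization strength.

```latex
\min_{W \in \mathbb{R}^{k \times d},\; H \in \mathbb{R}^{d \times n}}
\; \frac{1}{n} \sum_{i=1}^{n}
\log\Bigl( 1 + \sum_{c \neq y_i} e^{(w_c - w_{y_i})^\top h_i} \Bigr)
\; + \; \frac{\lambda}{2} \bigl( \|W\|_F^2 + \|H\|_F^2 \bigr)
```

The first term is the usual CE loss written in terms of logit differences; the network that would normally produce h_i is dropped, which is what makes the features "unconstrained".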
In the second part of the talk, we show how to engineer the training loss in order to mitigate the asymmetries in the learnt geometry between minorities and majorities. We achieve this by introducing a rich family of CE parameterizations and by characterizing their implicit geometry. We then use this information to optimally tune the involved hyperparameters favoring larger margins for the minorities.
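The abstract does not spell out the family of CE parameterizations. As a minimal sketch, assuming one common form in which each class logit receives a multiplicative and an additive adjustment before the softmax, such a loss could look like the following. The function name `parameterized_ce` and the hyperparameter names `delta` and `iota` are hypothetical and not taken from the talk.

```python
import numpy as np

def parameterized_ce(logits, labels, delta, iota):
    """Mean cross-entropy over adjusted logits delta[c] * z_c + iota[c].

    logits: (n, k) array of raw class scores
    labels: (n,) array of integer class indices
    delta:  (k,) per-class multiplicative adjustments (assumed form)
    iota:   (k,) per-class additive adjustments (assumed form)
    """
    adjusted = logits * delta + iota  # broadcasts over the n rows
    # Subtract the row-wise max for numerical stability.
    adjusted = adjusted - adjusted.max(axis=1, keepdims=True)
    log_probs = adjusted - np.log(np.exp(adjusted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# Toy usage: with delta = 1 and iota = 0 this is the standard CE loss.
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 1.5, 0.3]])
labels = np.array([0, 2])
k = logits.shape[1]
plain_ce = parameterized_ce(logits, labels, np.ones(k), np.zeros(k))
```

Setting all entries of `delta` to one and all entries of `iota` to zero recovers the standard CE loss, so the per-class values are the hyperparameters one would tune to treat minority and majority classes differently.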
Throughout the talk, we also motivate further investigations into the impact of class imbalances on the implicit bias of first-order methods and into potential connections between such geometry structures and generalization.
Bio: Dr. Thrampoulidis is an Assistant Professor in the Department of Electrical and Computer Engineering at the University of British Columbia. Previously, he was an Assistant Professor at the University of California, Santa Barbara and a Postdoctoral Researcher at MIT. He received an M.Sc. and a Ph.D. in Electrical Engineering in 2012 and 2016, respectively, both from Caltech, with a minor in Applied and Computational Mathematics. In 2011, he received a Diploma in ECE from the University of Patras, Greece. His research is on machine learning, high-dimensional statistics and optimization.