Holger Rauhut: 'Learning Deep Matrix Factorizations Via Gradient Descent: Implicit Bias Towards ...'
Tensor Methods and Emerging Applications to the Physical and Data Sciences 2021
Workshop IV: Efficient Tensor Representations for Learning and Computational Complexity
"Learning Deep Matrix Factorizations Via Gradient Descent: Implicit Bias Towards Low Rank"
Holger Rauhut - RWTH Aachen University
Abstract: In many deep learning scenarios, more network parameters than training examples are used. In such situations, several networks can often be found that exactly interpolate the data. This means that the learning algorithm used induces an implicit bias on the chosen network. This talk will discuss the nature of such implicit bias for gradient descent algorithms in the simplified setting of linear networks, i.e., deep matrix factorizations. Numerical experiments and first theoretical works suggest that the product of the gradient descent iterates, i.e., the linear network, converges to a matrix of low rank. We present rigorous theoretical results for a further simplified matrix estimation scenario. In particular, we give a precise analysis of the dynamics of the effective rank of the iterates. We discuss a number of open problems and possible extensions to learning low-rank tensor decompositions.
Institute for Pure and Applied Mathematics, UCLA
May 21, 2021
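The setting described in the abstract can be illustrated with a small numerical experiment. The following is a minimal sketch, not the speaker's code: it runs plain gradient descent on the loss 0.5 * ||W_N ... W_1 - Y||_F^2 with all factors initialized to a small multiple of the identity, and records the effective rank of the product at every step. The effective rank is assumed here to be the nuclear norm divided by the spectral norm, which is one common definition; the target matrix Y, the depth, the step size, and the initialization scale are hypothetical choices made only for illustration.

```python
import numpy as np


def chain_product(factors, n):
    """Return W_N @ ... @ W_1 for factors = [W_1, ..., W_N] (identity if empty)."""
    P = np.eye(n)
    for Wj in factors:
        P = Wj @ P
    return P


def effective_rank(W):
    """Assumed definition: nuclear norm divided by spectral norm."""
    s = np.linalg.svd(W, compute_uv=False)
    return s.sum() / s.max()


def train_deep_factorization(Y, depth=3, alpha=0.1, lr=0.05, steps=5000):
    """Gradient descent on 0.5 * ||W_N ... W_1 - Y||_F^2 from W_j = alpha * I."""
    n = Y.shape[0]
    factors = [alpha * np.eye(n) for _ in range(depth)]
    history = []
    for _ in range(steps):
        W = chain_product(factors, n)
        history.append(effective_rank(W))
        R = W - Y  # residual of the end-to-end matrix
        grads = []
        for j in range(depth):
            left = chain_product(factors[j + 1:], n)  # W_N ... W_{j+1}
            right = chain_product(factors[:j], n)     # W_{j-1} ... W_1
            grads.append(left.T @ R @ right.T)        # gradient w.r.t. W_j
        factors = [Wj - lr * G for Wj, G in zip(factors, grads)]
    return chain_product(factors, n), history


# Hypothetical target: symmetric PSD matrix with eigenvalues 2, 1, 0.5, 0.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
Y = Q @ np.diag([2.0, 1.0, 0.5, 0.0]) @ Q.T

W_final, history = train_deep_factorization(Y)
print("final error:", np.linalg.norm(W_final - Y))
print("effective rank: start", history[0], "min", min(history), "end", history[-1])
```

In this toy run the product starts as a multiple of the identity, so its effective rank equals the dimension. The component along the largest eigenvalue of Y is learned first, which pushes the effective rank close to 1 before the smaller eigenvalues are picked up. This is only meant to make the phrase "dynamics of the effective rank" concrete; it does not reproduce the analysis presented in the talk.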