Invariance and Stability to Deformations of Deep Convolutional Representations

The success of deep convolutional architectures is often attributed in part to their ability to learn multiscale and invariant representations of natural signals. However, a precise study of these properties and how they affect learning guarantees is still missing. In this talk, we consider deep convolutional representations of signals; we study their invariance to translations and to more general groups of transformations, their stability to the action of diffeomorphisms, and their ability to preserve signal information. This analysis is carried out by introducing a multilayer kernel based on convolutional kernel networks and by studying the geometry induced by the kernel mapping. We then characterize the corresponding reproducing kernel Hilbert space (RKHS), showing that it contains a large class of convolutional neural networks with smooth activation functions. This analysis allows us to separate data representation from learning, and to provide a canonical measure of model complexity, the RKHS norm, which controls both stability and generalization of any learned model. This theory also leads to new practical regularization strategies for deep learning that are effective when learning on small datasets or when seeking adversarially robust models.
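
The claim that the RKHS norm controls stability can be illustrated with a one-line argument. This is only a sketch under assumed notation that the abstract does not spell out: $\Phi$ stands for the kernel mapping into the RKHS $\mathcal{H}$, $f \in \mathcal{H}$ for a learned model, and $x, x'$ for two signals (for instance a signal and a deformed copy of it). By the reproducing property and the Cauchy-Schwarz inequality:

```latex
% Assumed notation (not in the abstract): \Phi is the kernel mapping,
% \mathcal{H} the RKHS, f \in \mathcal{H} a learned model, x and x' two signals.
\begin{align*}
  f(x) &= \langle f, \Phi(x) \rangle_{\mathcal{H}}
    && \text{(reproducing property)} \\
  |f(x) - f(x')| &= \bigl|\langle f, \Phi(x) - \Phi(x') \rangle_{\mathcal{H}}\bigr| \\
    &\le \|f\|_{\mathcal{H}} \, \|\Phi(x) - \Phi(x')\|_{\mathcal{H}}
    && \text{(Cauchy-Schwarz)}
\end{align*}
```

Read this way, any stability bound on the representation $\Phi$ carries over to every model $f$ in the RKHS, scaled by $\|f\|_{\mathcal{H}}$, which is consistent with using that norm as a regularizer.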

Comments

Thank you so much for this lecture. From the point of view of digital signal processing, a small downsampling ratio implies a large sampling rate, which preserves more frequency information (in particular high-frequency information); a small filter window size (patch size) implies fewer frequency choices for the filter (lower frequency resolution), which means the filter is "smoother" in the corresponding continuous domain and therefore reduces instability. As for translation invariance, fine-grained filters (via more layers) work not only for global but also for local translation invariance.
Another way I understand the use of small patch sizes in CNNs: although a larger filter window size is preferred in the signal processing field, CNN tasks are mainly classification rather than signal reconstruction, and therefore stability is more important.

AdrianYang