Piotr Bojanowski: Unsupervised methods for deep learning and applications

Abstract:
Modern machine learning algorithms require large amounts of supervised data to train. Successful computer vision models such as Mask R-CNN are trained on the 80k images of MS-COCO, but strongly rely on pre-training networks on the 1.2M images of ImageNet.
Obtaining reliable annotations is very costly. As long as the label set corresponds to objective properties, such as facial landmark locations, the annotation procedure is relatively easy. For more conceptual labels, such as human actions in video, the process is ill-defined and very time-consuming. Finally, annotations are even more demanding for expert domains such as law or medicine, which require highly skilled professionals in the loop.
In the absence of large supervised datasets like ImageNet, pre-training models using unsupervised learning algorithms can be beneficial. Such algorithms allow us to encode priors about the data, making the training of complex models easier. Unsupervised pre-training of neural networks received quite a lot of interest in the mid 2000s, helping to overcome the data scarcity of that time.
In this talk, I will go over some of our contributions in this context. In the first part of this talk, I will discuss unsupervised learning methods for natural language processing. This includes character-level language modeling and character-based word vector representations. In the second part, I will present unsupervised models for computer vision that we designed to train convolutional neural networks without using labels.
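The character-based word representations mentioned above build a word's vector from its character n-grams, so that morphologically related words share parameters. A minimal sketch of that idea is below; the dimension, bucket count, and hashing scheme are illustrative toy choices, not the exact setup from the talk.

```python
import numpy as np

DIM = 8          # embedding dimension (toy size, illustrative only)
BUCKETS = 1000   # number of hash buckets for character n-grams

rng = np.random.default_rng(0)
# Randomly initialised n-gram vectors; in practice these would be learned.
ngram_table = rng.normal(size=(BUCKETS, DIM))

def char_ngrams(word, n_min=3, n_max=6):
    """Extract character n-grams of a word wrapped in boundary markers."""
    w = f"<{word}>"
    return [w[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

def word_vector(word):
    """Word vector = sum of the (hashed) vectors of its character n-grams."""
    grams = char_ngrams(word)
    idx = [hash(g) % BUCKETS for g in grams]
    return ngram_table[idx].sum(axis=0)

print(char_ngrams("cat"))          # ['<ca', 'cat', 'at>', '<cat', 'cat>', '<cat>']
print(word_vector("learning").shape)
```

Because unseen words still decompose into known n-grams, this scheme can produce vectors for out-of-vocabulary words, which is one motivation for going below the word level.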