Visual Self-supervised Learning and World Models - Dumitru Erhan, Google | GHOST Day: AMLC 2022


About the speaker:
Dumitru Erhan is a Staff Research Scientist and Tech Lead Manager on the Google Brain team in San Francisco. He received his PhD from the University of Montreal (MILA) in 2011 under Yoshua Bengio, where he worked on understanding deep networks. Since then, he has done research at the intersection of computer vision and deep learning, notably object detection (SSD), object recognition (GoogLeNet), image captioning (Show & Tell), visual question answering, unsupervised domain adaptation (PixelDA), active perception, and more. His recent work has focused on video prediction and generation, as well as their applicability to model-based reinforcement learning. He aims to build and understand agents that learn as much as possible through self-supervised interaction with the environment, with applications in robotics and self-driving cars. Dumitru divides his free time between family, cooking, and cycling through the Bay Area!

Abstract:
To build intelligent agents that quickly adapt to new scenes, conditions, and tasks, we need to develop techniques, algorithms, and models that can operate on little data or that can generalize from training data dissimilar to the test data. World models have long been hypothesized to be a key piece of the solution to this problem. But world models are only one potential route: there is a universe of ways to use unsupervised or weakly supervised data to learn better, reusable representations for our problems. In this talk, I will describe a number of recent advances in modeling and generating image and video observations. These approaches can help build agents that interact with the environment and mitigate the sample-complexity problems of reinforcement learning, and they also make supervised learning easier when few labeled examples are available. Such approaches can likewise enable agents that generalize more quickly to new scenarios, tasks, objects, and situations, and are thus more robust to environment changes. Finally, I will offer some speculative thoughts on compositional generalization and why I believe it is the natural next big challenge in machine learning.