DeepMind x UCL | Deep Learning Lectures | 7/12 | Deep Learning for Natural Language Processing

This lecture, by DeepMind Research Scientist Felix Hill, first discusses the motivation for modelling language with ANNs: language is highly contextual, typically non-compositional and relies on reconciling many competing sources of information. This section also covers Elman's Finding Structure in Time and simple recurrent networks, the importance of context and transformers. In the second part, he explores unsupervised and representation learning for language from Word2Vec to BERT. Finally, Felix discusses situated language understanding, grounding and embodied language learning.

Download the slides here:

Find out more about how DeepMind increases access to science here:

Speaker Bio:

Felix Hill is a Research Scientist working on grounded language understanding, and has been at DeepMind for almost 4 years. He studied pure maths as an undergrad, then got very interested in linguistics and psychology after reading the PDP books by McClelland and Rumelhart, so started graduate school at the University of Cambridge, and ended up in the NLP group. To satisfy his interest in artificial neural networks, he visited Yoshua Bengio's lab in 2013 and started a series of collaborations with Kyunghyun Cho and Yoshua applying neural nets to text processing. This led to some of the first work on transfer learning with sentence representations (and a neural crossword solver). He also interned at FAIR in NYC with Jason Weston. At DeepMind, he's worked on developing agents that can understand language in the context of interactive 3D worlds, together with problems relating to mathematical and analogical reasoning.

About the lecture series:

The Deep Learning Lecture Series is a collaboration between DeepMind and the UCL Centre for Artificial Intelligence. Over the past decade, Deep Learning has evolved as the leading artificial intelligence paradigm providing us with the ability to learn complex functions from raw data at unprecedented accuracy and scale. Deep Learning has been applied to problems in object recognition, speech recognition, speech synthesis, forecasting, scientific computing, control and many more. The resulting applications are touching all of our lives in areas such as healthcare and medical research, human-computer interaction, communication, transport, conservation, manufacturing and many other fields of human endeavour. In recognition of this huge impact, the 2018 Turing Award (announced in 2019), the highest honour in computing, was awarded to pioneers of Deep Learning.

In this lecture series, research scientists from the leading AI research lab DeepMind deliver 12 lectures on an exciting selection of topics in Deep Learning, ranging from the fundamentals of training neural networks via advanced ideas around memory, attention, and generative modelling to the important topic of responsible innovation.
Comments

RIP Felix, you truly were one of the best!

viggipedia

*DeepMind x UCL | Deep Learning Lectures | 7/12 | Deep Learning for Natural Language Processing*
*My takeaways:*
*1. Plan for this lecture* 0:23
*2. Background: Deep learning and language* 3:03
2.1 Language applications use deep learning to very different extents 4:12
2.2 Why deep learning is such an effective tool for language processing 7:08
2.3 Understanding language: this is important for building language models 7:50
*3. The Transformer* 22:14
3.1 Distributed representation of words 23:40
3.2 Self-attention over word input embeddings 32:13
3.3 Multi-head self-attention 38:55
3.4 Feedforward layer 41:57
3.5 A complete Transformer block 42:23
3.6 Skip connections 42:38
3.7 Position encoding of words 46:02
3.8 Summary 50:58
*4. Unsupervised and transfer learning with BERT* 54:45
4.1 Problems in language 55:39
4.2 BERT 59:42
-Unsupervised learning
--Masked language model pre-training 1:02:05
--Next sentence prediction pre-training 1:05:55
-BERT fine-tuning 1:09:55
-BERT supercharges transfer learning 1:12:05
*5. Extracting language-related knowledge from the environment* 1:13:55
-Grounded language learning at DeepMind: towards language understanding in a situated agent
*6. To conclude* 1:27:18

leixun

This is, hands down, the best explanation of Transformers!

prakhyatshankesi

Thank you very much for taking the time to prepare this incredible lecture series! #respectfrombrazil 🇧🇷

martinho

Thank you! This is a great series of lectures!

seremetvlad

Is the picture at 37:12 correct? If the next layer receives a small amount of the value of each of the other words plus the value of the word "beetle", then it seems to me that the v term for the word "the" should be connected to lambda1, not the v term for the word "beetle". The same logic should apply to the other words and their lambdas.

lukn
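
For readers following the question above, here is a minimal sketch of standard scaled dot-product self-attention for a single query word. It is not taken from the slides: the sentence, the 2-dimensional vectors, and the names `attend`, `keys`, `values` and `query_beetle` are all invented for illustration. In this standard formulation, the weight lambda_j computed for word j multiplies that same word's value vector v_j, so the first lambda scales the v term for "the". Whether the slide at 37:12 draws it that way cannot be checked from this page.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Return (weights, output) for one query attending over all positions."""
    scale = math.sqrt(len(query))
    # One score per word: scaled dot product of the query with that word's key.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)  # these are the lambdas
    dim = len(values[0])
    # Output = sum over words j of lambda_j * v_j (each lambda pairs with its own word's value).
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, output


# Hypothetical toy sentence and vectors (assumptions, not from the lecture).
words = ["the", "beetle", "drove", "off"]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [0.0, 3.0]]
query_beetle = [0.0, 2.0]  # query vector for the word "beetle"

weights, output = attend(query_beetle, keys, values)
for word, w in zip(words, weights):
    print(f"lambda for {word!r}: {w:.3f}")
print("new representation of 'beetle':", [round(x, 3) for x in output])
```

The new representation of "beetle" is a weighted mix of every word's value vector, including its own, and the weights always sum to 1.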

Thank you so much for the very informative lecture!

khadijakhaldi

I got Covid from 15:28 lol
Great lectures btw, huge thanks to DeepMind and UCL!

abdurrezzakefe

One of the best lectures in the series.

luksdoc

Thanks Felix! You're a great teacher. That's it.

iamjameswong

Great lecture and big thanks to DeepMind for sharing this great content.

lukn

It's really informative, thank you. There is only one noticeable mistake: the insect in the picture is not a fruit fly :)

kirillazhitsky

An impressive amount of effort went into preparing this lecture. Thanks for sharing the knowledge and research.

markusbuchholz

Looks like Linus Sebastian is taking the lecture :D

gaurav

Thank you for the amazing lecture. Why are there only feedforward, but not feedback, mechanisms in language models? Would that make a difference? We process language both bottom-up and top-down. Our expectations of the world and our beliefs about people's intentions can influence how we process a sequence of sounds, just as top-down processes make us hallucinate certain aspects of vision. The skip connections allow lower-level information to feed upwards, but do not allow higher-level representations to influence representations lower down, at least not at inference time. Would it be possible to have such a structure in Transformers? Would it help?

YeTianlinguist

Amazing explanation of the Transformer, thanks so much

fgh

I'm completely lost. Is this a graduate level course?

ながれる季節

Can anybody post the paper cited at the end, where it says McClelland et al. 2019?

cuenta

1:27:57 "We've reached the end of the lecture, because I urgently need to go now…"

Not easy to follow the exact steps with the visualization and explanation provided. I think more detail would be helpful.

orjihvy