Stanford CS25: V1 I Transformers United: DL Models that have revolutionized NLP, CV, RL

Since their introduction in 2017, transformers have revolutionized natural language processing (NLP). Now, transformers are finding applications all over deep learning, be it computer vision (CV), reinforcement learning (RL), generative adversarial networks (GANs), speech, or even biology. Among other things, transformers have enabled the creation of powerful language models like GPT-3 and were instrumental in DeepMind's recent AlphaFold2, which tackles protein folding.

In this speaker series, we examine the details of how transformers work and dive deep into the different kinds of transformers and how they're applied in different fields. We do this by inviting people at the forefront of transformer research across different domains to give guest lectures.

0:00 Introduction
2:43 Overview of Transformers
6:03 Attention mechanisms
7:53 Self-attention
11:38 Other necessary ingredients
13:32 Encoder-Decoder Architecture
16:02 Advantages & Disadvantages
18:04 Applications of Transformers

Comments

Thanks for sharing this lecture. I’m looking forward to the other videos!

dailygrowth

Great lecture! Please post the other lectures as well.

rishabhahuja

I am very much looking forward to these amazing lectures on transformers!

zihanwu

Thank you so much for sharing! I hope to learn how transformers can be used in climate modeling. Also, it might be useful to quickly define 'token' for newcomers to the field.

bjornlutjens

With all due respect, this seems more like an end-of-semester project presentation. Teaching a topic is not piling up materials; it is moving from the high-level concept down to the details of the model. You might want to look at Justin Johnson's slides on self-attention for a reference on how to teach these concepts.

SuperAlijoon

Thank you for your work. I hope to build it myself, without Hugging Face and the transformers library. I love low-level work.
I am sure the world will love you guys' work 🌎🌍🌏

jonathansum

The worst 20-minute transformer introduction I have ever seen, lol. But thanks for organizing the seminar; I look forward to the following sessions from the speakers.

stevehan

I was expecting much greater depth and the reasons behind the various constructs of the transformer.

mananshah

What does "attend to a token" mean at 17:20?

andrea-mjce

Is there any chance of getting the different links from the slideshow in the video description?

qilex

You mention several times that a self-attention layer performs only linear operations, which is why the FFNN block is needed. But why is it linear if it contains a softmax on the attention weights, which are themselves a function of the inputs?

arefaref
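
For anyone puzzling over the same question, here is a minimal single-head self-attention sketch in NumPy (the weight names Wq, Wk, Wv and the unmasked, single-head setup are illustrative assumptions, not code from the lecture). The softmax does make the attention weights a nonlinear function of the input; the point presumably being made in the lecture is that, once the weights are computed, each output is just a weighted sum, i.e. a linear combination, of the value vectors, with no elementwise nonlinearity applied to them, which is why the position-wise FFNN block is added.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head, unmasked self-attention over a sequence X of shape (n, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Attention weights: nonlinear in X because of the softmax.
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))
    # Output: each row is a convex, hence linear, combination of the rows of V.
    return A @ V

# Tiny usage example with random data: 4 tokens, model dimension 8.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # -> (4, 8)
```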

This is awesome! I'm wondering, will there be any assignments published?

dilyarbuzan

Will you go into applying transformers to non-NLP time-series data?

simonberglund

Do you have practice exercises for this course? Without practice, lectures are of limited use; it seems like I understand, but when someone asks a question or I try to apply it in a project, I struggle.

ppujari

Hello, will the slides be publicly available? Thank you for the very nice content.

dimitrisproios

The lecture was excellent overall. However, it would be even more effective if a single instructor led the entire session for continuity and cohesion.

manoranjansahu

Is it required to have knowledge of LSTMs or attention mechanisms to understand this course series?

andrea-mjce

Usually, when I watch courses taught by professors, they speak slowly and don't assume the viewers know everything.

MohanRadhakrishnan

They should've named "Attention is all you need" as "You just want attention" :D
A missed opportunity.

achyuthakrishnakoneti

Release more of these videos on YouTube.

chyldstudios