Lecture 12.1 Self-attention

ERRATA:
- In slide 23, the indices are incorrect. The index of the key and value should match (j) and the index of the query should be different (i); see the corrected formulas sketched after this list.
- In slide 25, the diagram illustrating how multi-head self-attention is computed is a slight departure from how it's usually done (the implementation in the subsequent slide is correct, but the two are not quite functionally equivalent). See the slides PDF below for an updated diagram.
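
For reference, here is a sketch of the corrected indexing from the first erratum, written in the usual query/key/value notation rather than copied from the slide: each output vector y_i is a weighted sum over the values v_j, with weights computed from query q_i and key k_j, so the key and value share the index j while the query carries the index i.

```latex
% Raw attention weight between query position i and key position j
w'_{ij} = \mathbf{q}_i^{\top} \mathbf{k}_j
% Normalise over j, the positions being attended to
w_{ij} = \frac{\exp w'_{ij}}{\sum_{j'} \exp w'_{ij'}}
% Output at position i: a weighted sum over the values
\mathbf{y}_i = \sum_j w_{ij}\, \mathbf{v}_j
```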

In this video, we discuss the self-attention mechanism: a very simple and powerful sequence-to-sequence layer that is at the heart of transformer architectures.
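
As a rough, self-contained illustration of the mechanism discussed in the video, here is a minimal NumPy sketch of single-head self-attention with the standard scaled dot-product weights; the function name self_attention and the projection matrices Wq, Wk, Wv are illustrative, not the course's reference implementation.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over a sequence X of shape (t, d)."""
    Q = X @ Wq                                     # queries, one per position i
    K = X @ Wk                                     # keys, one per position j
    V = X @ Wv                                     # values, one per position j
    scores = Q @ K.T / np.sqrt(X.shape[1])         # raw weights w'_ij = q_i . k_j / sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)    # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over j
    return weights @ V                             # y_i = sum_j w_ij v_j

# Toy usage: a sequence of 4 vectors of dimension 8.
rng = np.random.default_rng(0)
t, d = 4, 8
X = rng.normal(size=(t, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Y = self_attention(X, Wq, Wk, Wv)
print(Y.shape)  # (4, 8): one output vector per input position
```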

Lecturer: Peter Bloem
Comments

Finally an actual _explanation_ of self-attention, particularly of the key, value and query that was bugging me a lot. Thanks so much!

derkontrolleur

Google should rank videos according to likes and the number of previously viewed videos on the same topic: this should go straight to the top for Attention/Transformer searches. I have seen and read plenty, and this is the first time the QKV-as-dictionary vs. RDBMS comparison made sense; that confusion was so bad it literally stopped my thinking every time I had to consider Q, K, or V, and so kept me from grokking the big idea. I now want to watch/read everything by you.

MrOntologue

This is the best explanation of self-attention I have ever seen! Thank you VERY MUCH!

ArashKhoeini

Wow - only 700 views for probably the best explanation of Transformers I came across so far! Really nice work! Keep it up!!! (FYI: I also read the blog post)

constantinfry

A very clear and broken down explanation of self-attention. Definitely deserves much more recognition.

sohaibzahid

Best explanation out there, highly recommended. Thank you!

dhruvjain

Saved lots of hours with this simple but awesome explanation of self-attention, thanks a lot!

tizianofranza

This is a really excellent video. I was finding this a very confusing topic but I found it really clarifying.

Ariel-pxhz

This is the kind of content that deserves the like, subscribe and share promotion. Thank you for your efforts, keep up!

szilike_

Literally the BEST explanation of attention and transformers EVER!! Agree with everyone else about why this is not ranked higher :(
I'm just glad I found it!

workstuff

The best ever video showing how self-attention works.

thcheung

holy shit, been trying to wrap my head around self-attention for a while, but it all finally clicked together with this video.
very well explained, very good video :)

HiHi-iugf

This is the best explanation of multi-head self attention I've seen.

josemariabraga

I think one of the best videos describing self-attention. Thank you for sharing.

farzinhaddadpour

Best explanation I found for self-attention and multi-head attention on the internet, thank you sir

AlirezaAroundItaly

I have gone through 10+ videos on this, but this is the best ...hats off

sathyanarayanankulasekaran

Read the blog post and then found this presentation, what a gift!

maxcrous

I had to leave a comment: the best explanation of Query, Key, Value I have seen!

free_guac

This is the best explanation I have ever heard

davidadewoyin

Finally I have an intuitive view of self-attention. Thank you 😇

Mars.