5.1 Attention

Dive into Deep Learning

5. Attention and Transformers
5.1 Attention Mechanism
5.2 Transformers
5.3 BERT and friends
Comments

Great video and I really like the d2l.ai project :)
My right ear enjoyed it the most (maybe switch to mono audio in the future)

floriandonhauser

Is there a mistake at 5:50? On that slide, alpha takes both k and v as input, while in the computation of o it takes only k as input. On the following slide the v are suddenly dropped.

Apart from that, a great and simple explanation!

sebastianhofer
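The computation the comment above asks about can be sketched in a few lines: the attention weight alpha is a function of the query q and the keys k only, and the values v enter solely in the weighted sum that produces the output o. This is a minimal illustrative sketch, assuming a dot-product scoring function (the particular score used on the slide may differ):

```python
import math

def attention(q, keys, values):
    """Attention pooling: o = sum_i alpha(q, k_i) * v_i.

    alpha depends only on the query q and the keys; the values
    are combined using those weights. The dot-product score is
    an assumption for illustration.
    """
    # Score each key against the query: s_i = <q, k_i>
    scores = [sum(qj * kj for qj, kj in zip(q, k)) for k in keys]
    # Softmax over the scores gives the weights alpha(q, k_i)
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]  # non-negative, sums to 1
    # The values enter only here, in the weighted combination
    dim = len(values[0])
    return [sum(a * v[j] for a, v in zip(alphas, values))
            for j in range(dim)]
```

So while a slide may list (k, v) pairs as the input to the attention layer as a whole, the weight function itself only sees (q, k); dropping v from the alpha on the next slide is consistent with that.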

= 30 letters, and we can extend these words.

gren