5.1 Attention

Dive into Deep Learning

5. Attention and Transformers
5.1 Attention Mechanism
5.2 Transformers
5.3 BERT and friends
Comments

Great video and I really like the d2l.ai project :)
My right ear enjoyed it the most (maybe switch to mono audio in the future)

floriandonhauser

Is there a mistake at 5:50? On that slide, alpha takes both k and v as input, while in the computation of o it takes only k as input. On the following slide the v are suddenly dropped.

Apart from that, a great and simple explanation!

sebastianhofer
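The computation the comment above asks about can be sketched in a few lines: the attention weight alpha is a function of the query q and the keys k only, and the values v enter solely in the weighted sum that produces the output o. This is a minimal illustrative sketch, assuming a dot-product scoring function (the particular score used on the slide may differ):

```python
import math

def attention(q, keys, values):
    """Attention pooling: o = sum_i alpha(q, k_i) * v_i.

    alpha depends only on the query q and the keys; the values
    are combined using those weights. The dot-product score is
    an assumption for illustration.
    """
    # Score each key against the query: s_i = <q, k_i>
    scores = [sum(qj * kj for qj, kj in zip(q, k)) for k in keys]
    # Softmax over the scores gives the weights alpha(q, k_i)
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]  # non-negative, sums to 1
    # The values enter only here, in the weighted combination
    dim = len(values[0])
    return [sum(a * v[j] for a, v in zip(alphas, values))
            for j in range(dim)]
```

So while a slide may list (k, v) pairs as the input to the attention layer as a whole, the weight function itself only sees (q, k); dropping v from the alpha on the next slide is consistent with that.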

= 30 letters, and we can extend these words.

gren