Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

#nlproc #transformers #attention #longcontext

Hello everyone! This is my first video explaining a research paper in the field of Natural Language Processing. I was inspired by @YannicKilcher and @aicoffeebreak to start explaining research papers.

Prerequisites:

Reference:
Dai, Z., Yang, Z., Yang, Y., Carbonell, J. G., Le, Q., & Salakhutdinov, R. (2019, July). Transformer-XL: Attentive Language Models beyond a Fixed-Length Context. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 2978-2988).

Check out my profile on other social media platforms -
Comments

Nice explanation. I am sure it will be helpful for a lot of folks out here!

avinabsaha

In the standard Transformer we have Q * transpose(K); could you explain in more detail why the authors of Transformer-XL wrote the score as transpose(x_i) * transpose(W_q) * W_k * x_j (i.e., transposing the query instead of the key)?

huskyhusky
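
A quick sketch of the algebra behind the question above, in the paper's notation (E_{x_i} is the token embedding, U_i the absolute position embedding): the two forms are the same dot product, with the transpose pushed onto the query side via (A B)^T = B^T A^T:

q_i^T k_j = (W_q E_{x_i})^T (W_k E_{x_j}) = E_{x_i}^T W_q^T W_k E_{x_j}

Writing the score in this expanded form is what lets the authors split the absolute-position score

(E_{x_i} + U_i)^T W_q^T W_k (E_{x_j} + U_j) = E_{x_i}^T W_q^T W_k E_{x_j} + E_{x_i}^T W_q^T W_k U_j + U_i^T W_q^T W_k E_{x_j} + U_i^T W_q^T W_k U_j

into four terms, which are then reparameterized with relative position embeddings R_{i-j}.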

Do you have any open research paper reading groups?

formerkid