REALM: Retrieval-Augmented Language Model Pre-Training (Research Paper Walkthrough)

#languagemodel #realm #nlproc
⏩ Abstract: Language model pre-training has been shown to capture a surprising amount of world knowledge, crucial for NLP tasks such as question answering. However, this knowledge is stored implicitly in the parameters of a neural network, requiring ever-larger networks to cover more facts. To capture knowledge in a more modular and interpretable way, we augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia, used during pre-training, fine-tuning and inference. For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner, using masked language modeling as the learning signal and backpropagating through a retrieval step that considers millions of documents. We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA). We compare against state-of-the-art models for both explicit and implicit knowledge storage on three popular Open-QA benchmarks, and find that we outperform all previous methods by a significant margin (4-16% absolute accuracy), while also providing qualitative benefits such as interpretability and modularity.
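To make the retrieve-then-predict idea concrete: REALM scores each document z with an inner product f(x, z) between BERT-style embeddings of the input and the document, turns those scores into a retrieval distribution p(z|x) via a softmax, and then marginalizes the encoder's prediction over retrieved documents, p(y|x) = Σ_z p(y|z, x) p(z|x). Below is a minimal NumPy sketch of that computation; the function names and array shapes are my own illustrative assumptions, not the paper's code.

```python
import numpy as np

def retrieval_distribution(x_emb, doc_embs):
    """p(z|x): softmax over relevance scores f(x, z) = Embed_input(x) . Embed_doc(z).

    x_emb:    query embedding of the masked input x, shape (d,)
    doc_embs: document embeddings, shape (num_docs, d)
    """
    scores = doc_embs @ x_emb          # inner-product relevance for every document
    scores -= scores.max()             # stabilize the softmax numerically
    probs = np.exp(scores)
    return probs / probs.sum()

def realm_marginal(p_y_given_zx, p_z_given_x):
    """p(y|x) = sum_z p(y|z, x) * p(z|x).

    During pre-training, y is a masked token; gradients of this marginal
    flow through p(z|x), which is how the retriever is trained without
    any labeled query-document pairs.
    """
    return float(np.dot(p_y_given_zx, p_z_given_x))
```

Because p(z|x) sits inside the likelihood, documents that help predict the masked token get their retrieval probability pushed up, and unhelpful ones get pushed down, which is the unsupervised training signal the abstract refers to.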

Please feel free to share the content and subscribe to my channel :)

⏩ OUTLINE:
0:00 - Background and Overview of REALM
04:07 - REALM's generative process
04:55 - Knowledge Retriever
07:02 - Knowledge-Augmented Encoder
09:07 - Understanding Pre-training and Fine-tuning Pictorially
10:44 - Training Challenges
12:03 - How Maximum Inner Product Search (MIPS) works (see the sketch after this outline)
14:07 - What does the retriever learn?
15:38 - Salient Span Masking
16:34 - Null Document
17:01 - Prohibiting trivial retrievals
17:48 - Initialization and Inverse Cloze Task
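The 12:03 segment covers the retrieval bottleneck: with millions of documents, computing the softmax for p(z|x) over the full corpus is infeasible, so REALM restricts it to the top-k documents found by Maximum Inner Product Search over a pre-computed, asynchronously refreshed index. The sketch below shows the exact brute-force version of that search; the mips_topk name is my placeholder, and REALM's actual approximate MIPS index is not reproduced here.

```python
import numpy as np

def mips_topk(query_emb, doc_embs, k=5):
    """Exact Maximum Inner Product Search: the k documents maximizing query . doc_z.

    REALM replaces this brute-force scan with an approximate MIPS index
    that is rebuilt asynchronously during pre-training; the objective
    being approximated is the same top-k inner-product search.
    """
    scores = doc_embs @ query_emb       # inner product with every document
    topk = np.argsort(-scores)[:k]      # indices of the k largest scores
    return topk, scores[topk]

# Toy usage: 10k documents with 128-dim embeddings.
rng = np.random.default_rng(0)
docs = rng.normal(size=(10_000, 128))
query = rng.normal(size=128)
print(mips_topk(query, docs, k=5))
```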

⏩ Paper Title: REALM: Retrieval-Augmented Language Model Pre-Training
⏩ Authors: Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-Wei Chang
⏩ Organisation: Google Research

*********************************************
If you want to support me financially, which is totally optional and voluntary ❤️

*********************************************

Tools I use for making videos :)

#techviz #datascienceguy #naturallanguageprocessing #opendomain_qa #researchpaper #arxiv
About Me:
I am Prakhar Mishra and this channel is my passion project. I am currently pursuing my MS (by research) in Data Science. I have three years of industry work experience in Data Science and Machine Learning, with a particular focus on Natural Language Processing (NLP).
Comments

The initial two minutes of intuition are very helpful for getting an idea of this paper.
Thanks

DarshanTank-co

Keep posting the videos! I wish you all the luck with the channel!

I wish you all the best with your channel!

gugapilar

Can REALM be used for fact extraction from documents?

swayattadaw