Improving Language Models by Retrieving from Trillions of Tokens | NLP Journal Club

Abstract: We enhance auto-regressive language models by conditioning on document chunks retrieved from a large corpus, based on local similarity with preceding tokens. With a 2-trillion-token database, our Retrieval-Enhanced Transformer (RETRO) obtains performance comparable to GPT-3 and Jurassic-1 on the Pile, despite using 25× fewer parameters. After fine-tuning, RETRO's performance translates to downstream knowledge-intensive tasks such as question answering. RETRO combines a frozen BERT retriever, a differentiable encoder, and a chunked cross-attention mechanism to predict tokens based on an order of magnitude more data than is typically consumed during training. We typically train RETRO from scratch, yet can also rapidly RETROfit pre-trained transformers with retrieval and still achieve good performance. Our work opens up new avenues for improving language models through explicit memory at unprecedented scale.
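The key mechanism in the abstract is chunked cross-attention: the input sequence is split into fixed-size chunks, and each chunk attends only to the BERT-encoded neighbours retrieved for it from the database. Below is a minimal PyTorch sketch of that idea; the class name, shapes, and hyperparameters are illustrative assumptions, and it omits details of the paper's actual layer such as the causal offset that prevents a chunk from attending to neighbours retrieved using future tokens.

```python
import torch
import torch.nn as nn

class ChunkedCrossAttention(nn.Module):
    """Hypothetical sketch of RETRO-style chunked cross-attention:
    each chunk of the decoder sequence attends to the encoded
    neighbours retrieved for that chunk (not the paper's exact layer)."""

    def __init__(self, d_model: int, n_heads: int, chunk_size: int):
        super().__init__()
        self.chunk_size = chunk_size
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, hidden: torch.Tensor, neighbours: torch.Tensor) -> torch.Tensor:
        # hidden:     (batch, n_chunks * chunk_size, d_model)
        # neighbours: (batch, n_chunks, neighbour_tokens, d_model)
        b, seq_len, d = hidden.shape
        assert seq_len % self.chunk_size == 0, "sequence must split into whole chunks"
        n_chunks = seq_len // self.chunk_size
        # Fold chunks into the batch dimension so every chunk attends
        # only to its own retrieved neighbours.
        q = hidden.reshape(b * n_chunks, self.chunk_size, d)
        kv = neighbours.reshape(b * n_chunks, -1, d)
        out, _ = self.attn(q, kv, kv)
        return out.reshape(b, seq_len, d)

# Usage: 2 sequences of 4 chunks × 8 tokens, 20 neighbour tokens per chunk.
cca = ChunkedCrossAttention(d_model=64, n_heads=4, chunk_size=8)
h = torch.randn(2, 32, 64)
nb = torch.randn(2, 4, 20, 64)
print(cca(h, nb).shape)  # torch.Size([2, 32, 64])
```

Folding chunks into the batch dimension keeps attention cost linear in the number of chunks, which is what lets the retrieved context scale far beyond the decoder's own context window.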
RETRO: Improving language models by retrieving from trillions of tokens
RETRO: Improving Language Models by Retrieving from Trillions of Tokens
The Illustrated Retrieval Transformer
Stanford CS25: V3 I Retrieval Augmented Language Models
How Large Language Models Work
[Paper Review] Improving Language Models by Retrieving from Trillions of Tokens
Experience Grounds Language: Improving language models beyond the world of text
Building with Small Language Models (SLMs)
PR-379: Improving language models by retrieving from trillions of tokens
New Prompt Achieves 🚀 900% Logic & Reasoning Improvement (GPT-4)
Ofir Press | Complementing Scale: Novel Guidance Methods for Improving Language Models
Learning to Retrieve In-Context Examples for Large Language Models
Feed Your OWN Documents to a Local Large Language Model!
GPT-1 | Paper Explained & PyTorch Implementation
#100 Dr. PATRICK LEWIS - Retrieval Augmented Generation
[QA] Scaling Retrieval-Based Language Models with a Trillion-Token Datastore
Are Bigger Language Models Better? | DeepMind Gopher and RETRO
How to Answer Any Question on a Test
Retrieval-Augmented Generation (RAG) | Improve the performance of large language models (LLMs)
ColPali: Bringing Vision Language Models to Document Retrieval
Building Better Large Language Models - Key Concepts for Prompting and Fine Tuning
[1hr Talk] Intro to Large Language Models
Andrew Ng's Secret to Mastering Machine Learning - Part 1 #shorts