Extending Context Window of Large Language Models via Positional Interpolation Explained

Comments

Man, I just love you for sharing this, and for such an easy-to-understand explanation.

pengbo

Interesting paper. I am having a similar problem: I trained on sequences of length 300, need to extend to 1000, and I am using RoPE. Do you know if this interpolation can be used with RoPE, or should I look into something like ALiBi? I recall reading that ALiBi also has some issues and its accuracy is worse. There is also LongRoPE.

mateuszk
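
On the RoPE question above: the interpolation in this paper is defined for RoPE specifically (the LLaMA models in the paper use rotary embeddings), and it amounts to rescaling the position index before the rotary angles are computed. Below is a minimal PyTorch sketch, not taken from the video; the helper names are made up for illustration, and the 300-to-1000 lengths just mirror the numbers in the comment.

import torch

def interpolated_rope_tables(seq_len, dim, trained_len, base=10000.0):
    # Standard RoPE inverse frequencies, one per pair of channels.
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    # Positional interpolation: shrink positions by trained_len / seq_len
    # so the largest position never exceeds the range seen in training.
    scale = min(1.0, trained_len / seq_len)
    positions = torch.arange(seq_len).float() * scale
    angles = torch.outer(positions, inv_freq)        # (seq_len, dim // 2)
    return angles.cos(), angles.sin()

def apply_rope(x, cos, sin):
    # Rotate consecutive channel pairs of x (..., seq_len, dim) by the angles.
    x1, x2 = x[..., 0::2], x[..., 1::2]
    rotated = torch.stack((x1 * cos - x2 * sin, x1 * sin + x2 * cos), dim=-1)
    return rotated.flatten(-2)

# Example with the numbers from the comment: trained on 300, extended to 1000.
cos, sin = interpolated_rope_tables(seq_len=1000, dim=64, trained_len=300)
q = torch.randn(1, 1000, 64)                         # dummy query tensor
q_rot = apply_rope(q, cos, sin)

The only change relative to plain RoPE is the scale factor on the positions; everything else is the standard rotary computation, which is why the paper can keep the pretrained weights and only fine-tune briefly.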

Thank you for another great video! 🙏
Does this also work for ALiBi?

Skinishh

Amazing explanation; I am guessing you are doing a PhD. Do you have any idea how we can implement this method in code to fine-tune Llama 2? Any resource is appreciated.

kibrutemesgen
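
On implementing this for Llama 2: one practical route is the RoPE scaling option exposed in recent Hugging Face transformers releases, which applies this kind of linear position interpolation through the model config. Treat the snippet below as a hedged sketch rather than an official recipe; the config fields (rope_scaling, max_position_embeddings) and the checkpoint id reflect current versions of the library and may differ in yours.

from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"   # assumes access to this checkpoint

config = AutoConfig.from_pretrained(model_name)
# Linear positional interpolation by 2x: 4096 trained positions -> 8192 usable.
config.rope_scaling = {"type": "linear", "factor": 2.0}
config.max_position_embeddings = 8192

model = AutoModelForCausalLM.from_pretrained(model_name, config=config)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# From here, fine-tune on long sequences with your usual training loop or the
# Trainer API; the paper reports that only a modest number of fine-tuning
# steps on longer sequences is needed for the extended window to work well.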

Question: the model can still take only 2048 tokens, so we still have to chunk a 4096-token input into two blocks, right? PI only deals with modifying the positional embedding; it cannot help with the fact that attention is still over a window of 2048 tokens.

HarisJabbar
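
On the last question: LLaMA's self-attention is full causal attention over whatever sequence is fed in, so there is no hard 2048-token attention window; the 2048 limit comes from the position range seen during pretraining. Positional interpolation rescales the position indices of a longer input back into that trained range, so after a short fine-tune the model attends across the whole 4096 tokens and no chunking into two blocks is needed. A small numeric sketch of the remapping:

L_train, L_extended = 2048, 4096
scale = L_train / L_extended                  # 0.5

for m in (0, 1, 2047, 2048, 4095):
    print(f"token index {m:4d} -> interpolated position {m * scale:6.1f}")
# token index    0 -> interpolated position    0.0
# token index    1 -> interpolated position    0.5
# token index 2047 -> interpolated position 1023.5
# token index 2048 -> interpolated position 1024.0
# token index 4095 -> interpolated position 2047.5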