Difference between RAG and Fine-tuning LLMs explained

#artificialintelligence #datascience #machinelearning #llm #python #langchain
Comments

How does RAG require less computational cost if, at inference time, the retrieved context has to be included in the prompt tokens? Since attention is quadratic in the number of tokens, the larger the context, the higher the computation cost should be, or am I wrong?

meehai_
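
To make the scaling argument in the comment concrete, here is a minimal back-of-the-envelope sketch in Python. The hidden dimension of 4096 and the FLOP constants are illustrative assumptions, not measurements of any particular model; the point is only that the attention-score term grows quadratically as retrieved context lengthens the prompt.

```python
# Back-of-the-envelope sketch (hypothetical numbers): self-attention cost
# scales quadratically with the number of tokens, so stuffing retrieved
# context into the prompt is not free at inference time.

def attention_flops(num_tokens: int, hidden_dim: int = 4096) -> float:
    """Rough FLOP estimate for one self-attention layer.

    The QK^T score matrix and the attention-weighted sum over values each
    cost about 2 * n^2 * d multiply-adds; the Q, K, V, and output
    projections add an O(n * d^2) term that is linear in sequence length.
    """
    quadratic_term = 2 * 2 * num_tokens**2 * hidden_dim   # scores + weighted sum
    linear_term = 4 * 2 * num_tokens * hidden_dim**2      # Q, K, V, output projections
    return quadratic_term + linear_term

# Doubling the context roughly quadruples the attention-score cost.
for n in (1_000, 2_000, 4_000, 8_000):
    print(f"{n:>6} tokens -> ~{attention_flops(n):.2e} FLOPs per layer")
```

Under these assumptions, doubling the prompt length roughly quadruples the quadratic term, which is the cost the comment is asking about; claims that RAG is cheaper usually compare it against the one-time training cost of fine-tuning, not against per-query inference cost.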