RAG vs. CAG: Solving Knowledge Gaps in AI Models



What if your AI can't answer who won the Oscars last year? 🎥 Martin Keen explains how RAG (Retrieval-Augmented Generation) and CAG (Cache-Augmented Generation) address knowledge gaps in AI. 🚀 Discover their strengths in real-time retrieval, scalability, and efficient workflows for smarter AI systems. 💻

#retrievalaugmentedgeneration #aiworkflow #machinelearning
Comments

Great job asking questions at the end. Validating what we learned from the video (especially anything we didn't know before) makes the videos all the more fulfilling.

vishalmishra

When speaking about accuracy, I think it's also important to mention that larger context windows usually decrease model accuracy, because the model tends to remember mainly the beginning and the end of the context.
So with CAG, growing your knowledge DB will negatively impact the LLM's accuracy, while with RAG it remains constant.

There is also a price that grows with the context window...

In my opinion, the only good reason to go with CAG is simplicity of implementation. Building a good RAG system can be quite complex, but CAG is very simple and straightforward. For an MVP or a simple product, CAG might do the job.
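To illustrate that simplicity, here is a minimal sketch (assuming plain-text documents; `call_llm` is a hypothetical stand-in for whatever chat-completion API you use):

```python
# A minimal CAG sketch: preload the whole knowledge base into the prompt.
# `call_llm` is a hypothetical stand-in for any chat-completion API.
from pathlib import Path

def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire up your LLM provider here")

def cag_answer(question: str, kb_dir: str = "docs") -> str:
    # The "cache": every document, stuffed into the context window up front.
    knowledge = "\n\n".join(p.read_text() for p in Path(kb_dir).glob("*.txt"))
    prompt = f"Answer using only this knowledge:\n{knowledge}\n\nQuestion: {question}"
    return call_llm(prompt)
```

No chunking, no embeddings, no vector store; that is the entire pipeline.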

wantstofly

A downside of leaning on a massive context window is that transformer architectures have a thing called quadratic complexity: every time you double the tokens, the resources (like GPUs) can go up 4x. Plus, long context windows tend to forget the middle of their context. So, use the right tool for the right job.

One more gotcha: the data that is pulled in for a specific conversation is only germane to THAT conversation. So other users, and often other conversations by that first user, do not benefit from either RAG or CAG, at least not out of the box. The notion of fine-tuning/training the model on the newer/needed data could be an option. BUT you then change the core model(s), and you may have a library of models and their versioning to manage => more LLMOps.
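To put that quadratic scaling in numbers, a toy cost model of vanilla self-attention (real serving stacks add optimizations like attention kernels and caching, so treat this as an upper-bound intuition):

```python
# Toy model: vanilla self-attention does work proportional to the square of
# the sequence length, so doubling the tokens roughly quadruples the compute.
def attention_cost(n_tokens: int) -> int:
    return n_tokens ** 2  # pairwise token-to-token interactions

base = attention_cost(4_000)
for n in (4_000, 8_000, 16_000):
    print(f"{n:>6} tokens -> ~{attention_cost(n) / base:.0f}x the compute")
# 4000 -> ~1x, 8000 -> ~4x, 16000 -> ~16x
```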

scottybb

You guys always explain things very clearly!

SimonBransfieldGarth

Absolutely love the imagery for describing these flows. Wonderfully broken down into simplified explanations. Love your work. Keep it up!

farhanprine

I was always wondering why all these technical board presentation videos I've seen seem to feature a left-handed presenter who can also write in reverse, from right to left. Then I realised today that they must have filmed a right-handed person who just writes normally and flipped the video horizontally...

raistlinmajere

CAG will have higher API costs per LLM query, right? Since the token count is always higher.
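Roughly, yes, under plain per-token input pricing (provider-side prompt caching of the repeated prefix can soften this). A toy comparison, with made-up prices and sizes:

```python
# Toy input-token cost comparison; all prices and sizes here are made up.
PRICE_PER_1K_INPUT_TOKENS = 0.01  # hypothetical $ per 1k input tokens

def query_cost(context_tokens: int, question_tokens: int = 200) -> float:
    return (context_tokens + question_tokens) / 1_000 * PRICE_PER_1K_INPUT_TOKENS

cag = query_cost(context_tokens=100_000)  # whole knowledge base, every query
rag = query_cost(context_tokens=3_000)    # just the top-k retrieved chunks
print(f"CAG ~${cag:.2f}/query vs RAG ~${rag:.2f}/query")  # ~$1.00 vs ~$0.03
```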

rickharold

By far the best and most down-to-earth explanation of these complex concepts

ivozlatanov

Great presentation. I knew the concepts from experience with LLMs but had no words for them or the exact details. So this was helpful for putting things into perspective.

NeoStarImpact

Dumping huge amounts of data on LLMs just because they have a huge context window may not be a good approach. I've heard that LLMs work effectively only when we provide smaller, highly relevant data, even though they offer a large context window.

QPT

When the data is TOO BIG for a context window is exactly when you need RAG, and apparently exactly when CAG fails, since the data is TOO BIG to "put all the data into the context window".

ThePresentFuture

CAG + RAG seems like a WINNING combo.
WINNERS!!!

jeffg

Do we really need a new architecture term for one-shotting an LLM with a large context window? I feel like RAG was just an answer to the shortfalls of defaulting to CAG before the term came out.

Mr.Andrew.

I love this style of explanation via writing on glass. I'll guess that this video is technically flipped 180 degrees on its Y axis, which would allow the presenter to write normally on the glass (instead of having to write backwards).

CarletonTorpin

You're a great teacher. Thank you for this 🙏🏽

emcquesten

Wonderfully explained, Sir! Thanks a bunch!

mamunahmed

❤ Martin - Thank you for the continual demystification of the constantly changing capabilities of AI. BTW, I thought yelling at the screen was a normal activity!

robert.murray

one of the very few videos whose like counter increases while you watch!!!

imMavenGuy

Redis can be used everywhere in these stacks, from semantic cache to short-term memory.

It's the fastest vector DB on the market.
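As one illustration, a hedged sketch of a semantic cache on top of redis-py's vector search (assumes a Redis Stack server and a 384-dimension embedding model; the index name, key prefix, and distance threshold are made up):

```python
import numpy as np
import redis
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.query import Query

r = redis.Redis()

# One-time setup: an index over hashes holding an answer and its embedding.
r.ft("semcache").create_index(
    [TextField("answer"),
     VectorField("emb", "FLAT",
                 {"TYPE": "FLOAT32", "DIM": 384, "DISTANCE_METRIC": "COSINE"})],
    definition=IndexDefinition(prefix=["q:"], index_type=IndexType.HASH),
)

def cache_put(key: str, answer: str, emb: np.ndarray) -> None:
    # Store an answered question under its embedding.
    r.hset(f"q:{key}", mapping={"answer": answer,
                                "emb": emb.astype(np.float32).tobytes()})

def cache_get(emb: np.ndarray, max_dist: float = 0.1):
    # Find the nearest cached question; reuse its answer if close enough.
    q = (Query("*=>[KNN 1 @emb $v AS dist]")
         .return_fields("answer", "dist")
         .dialect(2))
    res = r.ft("semcache").search(q, {"v": emb.astype(np.float32).tobytes()})
    if res.docs and float(res.docs[0].dist) <= max_dist:
        return res.docs[0].answer
    return None
```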

gacerioni

This is actually used in LLM chats like GPT: CAG + RAG.
RAG (Retrieval-Augmented Generation) fetches relevant information from external sources (documents, databases, the internet) when needed.
CAG (Cache-Augmented Generation) stores previously retrieved information in the current conversation context, so follow-up questions can use it without retrieving it again.
I use it daily but never understood it!
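A minimal sketch of that loop (`retrieve` and `call_llm` are hypothetical stand-ins for a search backend and a chat-completion API):

```python
# Sketch of the RAG + CAG loop described above: retrieve once, then keep the
# fetched passages in the running conversation context for follow-ups.
def retrieve(query: str) -> list[str]:
    raise NotImplementedError("vector / keyword search goes here")

def call_llm(prompt: str) -> str:
    raise NotImplementedError("LLM provider call goes here")

class Conversation:
    def __init__(self) -> None:
        self.cached_passages: list[str] = []  # the in-context "cache"

    def answer(self, question: str) -> str:
        if not self.cached_passages:                  # RAG: fetch on first use
            self.cached_passages = retrieve(question)
        context = "\n".join(self.cached_passages)     # CAG: reuse on follow-ups
        return call_llm(f"Context:\n{context}\n\nQ: {question}")
```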

hiimunranked