RAG vs. CAG: Solving Knowledge Gaps in AI Models



What if your AI can't answer who won the Oscars last year? 🎥 Martin Keen explains how RAG (Retrieval-Augmented Generation) and CAG (Cache-Augmented Generation) address knowledge gaps in AI. 🚀 Discover their strengths in real-time retrieval, scalability, and efficient workflows for smarter AI systems. 💻

#retrievalaugmentedgeneration #aiworkflow #machinelearning
Comments

Great job asking questions at the end. Validating what we learned from the video (especially anything we didn't know before) makes the videos all the more fulfilling.

vishalmishra

When speaking about accuracy, I think it's also important to mention that larger context windows usually decrease model accuracy, because the model tends to remember mainly the beginning and the end of the context.
So with CAG, growing your knowledge DB will negatively impact the LLM's accuracy, while with RAG it remains constant.

There is also a price that grows with the context window...

In my opinion, the only good reason to go with CAG is simplicity of implementation. Building a good RAG system can be quite complex, but CAG is very simple and straightforward. For an MVP or a simple product, CAG might do the job.
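To illustrate that simplicity, here is a minimal sketch (assuming plain-text documents; `call_llm` is a hypothetical stand-in for whatever chat-completion API you use):

```python
# A minimal CAG sketch: preload the whole knowledge base into the prompt.
# `call_llm` is a hypothetical stand-in for any chat-completion API.
from pathlib import Path

def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire up your LLM provider here")

def cag_answer(question: str, kb_dir: str = "docs") -> str:
    # The "cache": every document, stuffed into the context window up front.
    knowledge = "\n\n".join(p.read_text() for p in Path(kb_dir).glob("*.txt"))
    prompt = f"Answer using only this knowledge:\n{knowledge}\n\nQuestion: {question}"
    return call_llm(prompt)
```

No chunking, no embeddings, no vector store; that is the entire pipeline.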

wantstofly

A downside of leaning on a massive context window is that transformer architectures have a thing called quadratic complexity: every time you double the tokens, the resources (like GPUs) can go up 4x. Plus, long context windows tend to forget the middle of their context. So, use the right tool for the right job.

One more gotcha: the data that is pulled in for a specific conversation is only germane to THAT conversation. So other users, and often other conversations by that first user, do not benefit from either RAG or CAG, at least not out of the box. The notion of fine-tuning/training the model on the newer/needed data could be an option. BUT you then change the core model(s), and you may have a library of models and their versioning to manage => more LLMOps.
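To put that quadratic scaling in numbers, a toy cost model of vanilla self-attention (real serving stacks add optimizations like attention kernels and caching, so treat this as an upper-bound intuition):

```python
# Toy model: vanilla self-attention does work proportional to the square of
# the sequence length, so doubling the tokens roughly quadruples the compute.
def attention_cost(n_tokens: int) -> int:
    return n_tokens ** 2  # pairwise token-to-token interactions

base = attention_cost(4_000)
for n in (4_000, 8_000, 16_000):
    print(f"{n:>6} tokens -> ~{attention_cost(n) / base:.0f}x the compute")
# 4000 -> ~1x, 8000 -> ~4x, 16000 -> ~16x
```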

scottybb

You guys always explain things very clearly!

SimonBransfieldGarth

Absolutely love the imagery for describing these flows. Wonderfully broken down into simplified explanations. Love your work. Keep it up!

farhanprine

I was always wondering why all these technical board presentation videos I've seen seem to feature a left-handed presenter who can also write in reverse, from right to left. Then I realised today that they must have filmed a right-handed person who just writes normally and flipped the video horizontally...

raistlinmajere

CAG will have higher API costs per LLM query, right? Since the token count is always higher.
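Roughly, yes, under plain per-token input pricing (provider-side prompt caching of the repeated prefix can soften this). A toy comparison, with made-up prices and sizes:

```python
# Toy input-token cost comparison; all prices and sizes here are made up.
PRICE_PER_1K_INPUT_TOKENS = 0.01  # hypothetical $ per 1k input tokens

def query_cost(context_tokens: int, question_tokens: int = 200) -> float:
    return (context_tokens + question_tokens) / 1_000 * PRICE_PER_1K_INPUT_TOKENS

cag = query_cost(context_tokens=100_000)  # whole knowledge base, every query
rag = query_cost(context_tokens=3_000)    # just the top-k retrieved chunks
print(f"CAG ~${cag:.2f}/query vs RAG ~${rag:.2f}/query")  # ~$1.00 vs ~$0.03
```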

rickharold

By far the best and most down-to-earth explanation of these complex concepts

ivozlatanov

Great presentation. I knew the concepts from experience with LLMs but had no words for them or the exact details. So this was helpful for putting things into perspective.

NeoStarImpact

Dumping huge amounts of data on LLMs just because they have a huge context window may not be a good approach. I've heard that LLMs work effectively only when we provide smaller, highly relevant data, even though they offer a large context window.

QPT

When the data is TOO BIG for a context window is exactly when you need RAG, and apparently exactly when CAG fails, since the data is TOO BIG to "put all the data into the context window".

ThePresentFuture

CAG + RAG seems like a WINNING combo.
WINNERS!!!

jeffg

Do we really need a new architecture term for one-shotting an LLM with a large context window? I feel like RAG was just an answer to the shortfalls of defaulting to CAG before the term came out.

Mr.Andrew.

I love this style of explanation via writing on glass. I'll guess that this video is technically flipped 180 degrees on its Y axis, which would allow the presenter to write normally on the glass (instead of having to write backwards).

CarletonTorpin

You're a great teacher. Thank you for this 🙏🏽

emcquesten

Wonderfully explained, Sir! Thanks a bunch!

mamunahmed

❤ Martin - Thank you for the continual demystification of the constantly changing capabilities of AI. BTW, I thought yelling at the screen was a normal activity!

robert.murray

one of the very few videos whose like counter increases while you watch!!!

imMavenGuy

Redis can be used everywhere in these stacks, from semantic cache to short-term memory.

It's the fastest vector DB on the market.
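As one illustration, a hedged sketch of a semantic cache on top of redis-py's vector search (assumes a Redis Stack server and a 384-dimension embedding model; the index name, key prefix, and distance threshold are made up):

```python
import numpy as np
import redis
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.query import Query

r = redis.Redis()

# One-time setup: an index over hashes holding an answer and its embedding.
r.ft("semcache").create_index(
    [TextField("answer"),
     VectorField("emb", "FLAT",
                 {"TYPE": "FLOAT32", "DIM": 384, "DISTANCE_METRIC": "COSINE"})],
    definition=IndexDefinition(prefix=["q:"], index_type=IndexType.HASH),
)

def cache_put(key: str, answer: str, emb: np.ndarray) -> None:
    # Store an answered question under its embedding.
    r.hset(f"q:{key}", mapping={"answer": answer,
                                "emb": emb.astype(np.float32).tobytes()})

def cache_get(emb: np.ndarray, max_dist: float = 0.1):
    # Find the nearest cached question; reuse its answer if close enough.
    q = (Query("*=>[KNN 1 @emb $v AS dist]")
         .return_fields("answer", "dist")
         .dialect(2))
    res = r.ft("semcache").search(q, {"v": emb.astype(np.float32).tobytes()})
    if res.docs and float(res.docs[0].dist) <= max_dist:
        return res.docs[0].answer
    return None
```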

gacerioni

This is actually used in LLM chats like GPT: CAG + RAG.
RAG (Retrieval-Augmented Generation) fetches relevant information from external sources (documents, databases, the internet) when needed.
CAG (Cache-Augmented Generation) stores previously retrieved information in the current conversation context, so follow-up questions can use it without retrieving it again.
I use it daily but never understood it!
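A minimal sketch of that loop (`retrieve` and `call_llm` are hypothetical stand-ins for a search backend and a chat-completion API):

```python
# Sketch of the RAG + CAG loop described above: retrieve once, then keep the
# fetched passages in the running conversation context for follow-ups.
def retrieve(query: str) -> list[str]:
    raise NotImplementedError("vector / keyword search goes here")

def call_llm(prompt: str) -> str:
    raise NotImplementedError("LLM provider call goes here")

class Conversation:
    def __init__(self) -> None:
        self.cached_passages: list[str] = []  # the in-context "cache"

    def answer(self, question: str) -> str:
        if not self.cached_passages:                  # RAG: fetch on first use
            self.cached_passages = retrieve(question)
        context = "\n".join(self.cached_passages)     # CAG: reuse on follow-ups
        return call_llm(f"Context:\n{context}\n\nQ: {question}")
```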

hiimunranked