Reliable, fully local RAG agents with LLaMA3.2-3b

Meta's LLaMA3.2 release includes a set of compact models designed for on-device use cases, such as locally running assistants. Here, we show how LangGraph can enable this kind of local assistant by building a multi-step RAG agent that combines ideas from three advanced RAG papers (Adaptive RAG, Corrective RAG, and Self-RAG) into a single control flow. Along the way, we show that LangGraph makes it possible to run a complex agent entirely locally.
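The control flow described above can be sketched in dependency-free Python. All function bodies below are illustrative stubs (assumptions standing in for the local LLaMA3.2 graders, the vectorstore retriever, and web search); the real version wires these steps into a LangGraph `StateGraph`.

```python
def route_question(q):
    # Adaptive RAG: route the question to the vectorstore or to web search
    return "vectorstore" if "agent" in q else "web_search"

def retrieve(q):
    return ["doc about agent memory"]  # stub vectorstore retriever

def web_search(q):
    return ["web result for: " + q]    # stub web search tool

def grade_documents(q, docs):
    # Corrective RAG: keep only documents relevant to the question
    return [d for d in docs if any(w in d for w in q.lower().split())]

def generate(q, docs):
    return "answer based on: " + "; ".join(docs)

def grounded(answer, docs):
    # Self-RAG: accept the answer only if it is grounded in the documents
    return any(d in answer for d in docs)

def run(question, max_retries=3):
    docs = (retrieve(question) if route_question(question) == "vectorstore"
            else web_search(question))
    answer = "I don't know."
    for _ in range(max_retries):
        good = grade_documents(question, docs)
        if not good:
            docs = web_search(question)  # corrective fallback to the web
            continue
        answer = generate(question, good)
        if grounded(answer, good):
            break
    return answer

print(run("what is agent memory"))
```

Each `if`/`continue`/`break` here corresponds to a conditional edge in the graph; LangGraph makes the same routing explicit as nodes and edges rather than Python control flow.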

Code:

Llama3.2:

Full course on LangGraph:
Comments

Dude. You're the man. I've gone through most of your LangChain course and lots of the YT content. You're ... you have a knack for teaching.

christopherhartline

Awesome stuff. LangGraph is a nice framework. Stoked to build with it; working through the course now!

ytaccount

Great explanation. It would be great to do one more tutorial on multimodal local RAG, handling different chunk types like tables, text, and images, using unstructured, Chroma, and MultiVectorRetriever completely locally.

homeandr

Amazing session, with the content explained very nicely in just 30 minutes. Thanks so much!

ravivarman

The tutorial was "fully local" up until the moment you introduced Tavily 😜😉.
Excellent tutorial Lance 👍

leonvanzyl

You are amazing, as always. Thank you for sharing.

joxxen

Why did you use llama3.2:3b-instruct-fp16 instead of llama3.2:3b?

becavas

How do you do this in .py files? Since we are working in Jupyter, we can re-run the graph and re-invoke it. But when I do this in a .py file, the whole graph gets recreated, recompiled, and invoked on every invocation.

asitnayak
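The recompilation concern in the question above is usually solved by compiling once at module level and reusing the compiled app. A minimal sketch, where `build_app` is a hypothetical stand-in for the expensive `StateGraph(...).compile()` step:

```python
import functools

@functools.lru_cache(maxsize=1)
def build_app():
    # Expensive one-time work: define nodes/edges and compile the graph.
    # (Stub: a callable standing in for the compiled LangGraph app.)
    return lambda state: {"answer": state["question"].upper()}

def handle(question):
    app = build_app()  # cached after the first call, never recompiled
    return app({"question": question})

print(handle("hello")["answer"])  # HELLO
```

The same effect can be had by assigning `app = graph.compile()` at module scope so imports pay the cost exactly once.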

Great tutorial, so first of all: thank you :-) But I am not happy with the results. Here is one example:
On a PDF containing information about a filling machine, I asked the question "How is the machine emptied?"
The generated answer (simplified for this example) was "Step 1: do this, Step 2: do that, Step 4: do that."
And the answer grader decided that the answer was good. I expected the answer grader to flag that Step 3 is missing :-(


Is it possible to make an agent that, when provided with a few hundred links, extracts the info from all of them and stores it?

developer-he

Question: you have operator.add on the loop_step, but then increment the loop_step in the state too… am I wrong in thinking that would be incorrect?

beowes

If different tools require different keyword arguments, how can these be passed in for the agent to access?

sidnath

I'm a med student interested in experimenting with the following: I'd like to have several PDFs (entire medical books) from which I can ask a question and receive a factually accurate, contextually appropriate answer—thereby avoiding online searches. I understand this could potentially work using your method (omitting web searches), but am I correct in thinking this would require a resource-intensive, repeated search process?

For example, if I ask a question about heart failure, the model would need to sift through each book and chapter until it finds the relevant content. This would likely be time-consuming initially. However, if I then ask a different question, say on treating systemic infections, the model would go through the entire set of books and chapters again, rather than narrowing down based on previous findings.

Is there a way for the system to 'learn' where to locate information after several searches? Ideally, after numerous queries, it would be able to access the most relevant information efficiently without needing to reprocess the entire dataset each time—while maintaining factual accuracy and avoiding hallucinations.

marcogarciavanbijsterveld

Is there an elegant way to handle recursion errors?

hari

You make the LLM do all the hard work of candidate filtering.

hensonk

That's a great tutorial that shows the power of LangGraph. It's impressive you can now do this locally with decent results. Thank you!

SavvasMohito

Thanks, it is indeed very cool. Last time you used 32 GB; do you think this will run with 16 GB of memory?

davesabra

Thanks for the video and sample putting all these parts together. What did you use to draw the diagram at the beginning of the video? Was it generated by a DSL/config?

AlexEllis

Great video. What tool did you use to illustrate the nodes and edges in your notebook?

Togowalla

Interesting: you basically use an old-school workflow to orchestrate the steps of LLM-based atomic tasks. But what about letting the LLM execute the workflow and also perform all the required atomic tasks? That would be more like an agentic approach...

arekkusub