Fully local RAG agents with Llama 3.1

With the release of Llama 3.1, it's increasingly possible to build agents that run reliably and locally (e.g., on your laptop). Here, we show how to build reliable local agents from scratch using LangGraph and Llama 3.1-8b. We build a simple corrective RAG agent with Llama 3.1-8b and compare its performance to larger models, Llama 3-70b and GPT-4o. We test our Llama 3.1-8b agent on a corrective RAG challenge and report performance and latency versus a few competing models. On our small, toy challenge, Llama 3.1-8b performs on par with much larger models, with only slightly increased latency. Overall, Llama 3.1-8b is a strong option for local execution and pairs well with LangGraph for implementing agentic workflows.

Blog post:

Ollama:

Code:
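
For orientation, here is a minimal sketch of the corrective RAG loop described above. The node bodies are stubs and the node names are illustrative, not the exact code from the video; it assumes the langgraph package:

from typing import List, TypedDict
from langgraph.graph import END, StateGraph

class GraphState(TypedDict):
    question: str
    documents: List[str]
    generation: str
    web_search_needed: bool

def retrieve(state):
    # Fetch candidate chunks from the local vector store.
    return {"documents": ["...retrieved chunks..."]}

def grade_documents(state):
    # Ask the local LLM to grade relevance; flag a web search if nothing passes.
    return {"web_search_needed": len(state["documents"]) == 0}

def web_search(state):
    # Fall back to a web search tool and append the results to the context.
    return {"documents": state["documents"] + ["...web results..."]}

def generate(state):
    # Generate the final answer from the graded context.
    return {"generation": "..."}

def decide(state):
    return "web_search" if state["web_search_needed"] else "generate"

graph = StateGraph(GraphState)
graph.add_node("retrieve", retrieve)
graph.add_node("grade_documents", grade_documents)
graph.add_node("web_search", web_search)
graph.add_node("generate", generate)
graph.set_entry_point("retrieve")
graph.add_edge("retrieve", "grade_documents")
graph.add_conditional_edges("grade_documents", decide,
                            {"web_search": "web_search", "generate": "generate"})
graph.add_edge("web_search", "generate")
graph.add_edge("generate", END)
app = graph.compile()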
Comments

The link to code on GitHub is broken. Can you please fix it?

MansA-nl

I think there is a missing step in the RAG flow. If the user knows he is "talking" to some documents, he might prompt "What is the summary?". In this case, the grader will always answer "no", and the web search will be useless. There has to be an additional step to evaluate the question: if it is similar to the one I just mentioned, then you would simply fetch a block of text from those documents and send it to the LLM to summarize.

I have a BERT classifier built specifically for this on HF:
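
Edit: as a sketch, that routing step could sit in front of the grader as a LangGraph conditional edge. Something like this, assuming langchain-ollama; the prompt is just illustrative, and a fine-tuned classifier like mine could replace the LLM call:

from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3.1", temperature=0)

def route_question(state):
    # Route "whole-document" questions straight to a summarize node;
    # everything else takes the normal retrieve -> grade path.
    verdict = llm.invoke(
        "Does this question ask about the documents as a whole "
        "(e.g. 'What is the summary?') rather than a specific fact? "
        "Answer yes or no.\n\nQuestion: " + state["question"]
    )
    return "summarize" if "yes" in verdict.content.lower() else "retrieve"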

cnmoro

Llama will likely make its way into online AI products (it already has). But until someone builds a one-click Llama download and install, the general public will likely never run a local AI. And they will certainly never dive deep into coding just to build simple agents; it is just way over the heads of most general computer users. And if a one-click install is not done soon, people will gravitate toward the online subscription proprietary AI offerings (OpenAI, Claude, Gemini, etc.) and never look back.
I think that is what really killed Linux's chances of competing in the OS space. Mac and Windows, even in the '90s, were basically a one-click installation process, whereas Linux used the command line, bin this, bash that, and installation was cumbersome and not easy... I know because I did it in the mid '90s. People will go for what is easy (Mac, Windows, an online AI model), and once they are hooked on a particular AI model, it will be darn hard to get them to change.
It's a real shame, because Llama is a pretty terrific LLM, but local installation is just a nightmare for the majority of the general computing public.

dbreardon

Image embeddings are not working; text is fine. Please cover multimodal RAG with LLaVA!

antonpictures

A few things have to change:

1. Use:
from langchain_ollama import OllamaEmbeddings  # instead of: from langchain_nomic.embeddings import NomicEmbeddings
...
embedding=OllamaEmbeddings(model="nomic-embed-text"),
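
For context, wiring the swapped-in embeddings into a local vector store might look like this. Chroma is just an example store, doc_splits is assumed to be your chunked documents, and the embedding model has to be pulled first (ollama pull nomic-embed-text):

from langchain_ollama import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

vectorstore = Chroma.from_documents(
    documents=doc_splits,  # doc_splits: your chunked documents (assumed defined earlier)
    embedding=OllamaEmbeddings(model="nomic-embed-text"),
)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})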

MohammedAlshayeb-rm

Is it just me, or is the code not accessible?

qactus

Can I run a 405B LLM with 8 GB of RAM? 🤣

farnsworth

I really like the LangSmith test section in the package. Great job!

nachoeigu

I got an error, "ValueError: Node `retrieve` is not reachable", while running the example code.
Can anyone help me figure out what happened?
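
Edit: from what I can tell, LangGraph validates the graph at compile time and raises this error when a node has no path from the entry point. Adding one of these before compile() should fix it, assuming your StateGraph object is named graph:

graph.set_entry_point("retrieve")  # make "retrieve" the entry node, or:
# graph.add_edge("some_node", "retrieve")  # ...give it an incoming edge instead
app = graph.compile()  # compile() re-runs the reachability check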

kaneyxx

Why always OpenAI embeddings? Why not use FAISS and an open-source one instead? 😊

DhirajPatra

Thanks so much for this open information ❤

chukwuinnocent

Thanks for your great video. Would you also recommend Llama 3.1 for RAG over documents in German? Whenever we tried this, the results were much worse than with LLMs like GPT-4o or GPT-4o-mini. And could you explain why you are using the OpenAI embeddings? If I want to use this demo as a RAG app for asking questions about local documents, do I only have to replace the WebBaseLoader with a DocumentLoader?
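
Edit: for local files, I assume the only required change is the loader; something like this, where PyPDFLoader and the file path are just examples:

from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load a local document instead of fetching web pages with WebBaseLoader.
docs = PyPDFLoader("my_document.pdf").load()

# Chunk it the same way the web content was chunked.
doc_splits = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200
).split_documents(docs)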

uwegenosdude

Thank you for sharing this informative video and its hands-on code. I've run into this connection error: "Error running target function: [WinError 10061] No connection could be made because the target machine actively refused it". Can anyone please guide me on how to fix it?
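
Edit: WinError 10061 seems to just mean nothing is listening on the port, i.e. the Ollama server is not running. Start it with "ollama serve"; a quick probe of the endpoint tells you whether the server is up (the URL assumes Ollama's default port):

import urllib.request

try:
    with urllib.request.urlopen("http://localhost:11434") as resp:
        print(resp.read().decode())  # prints "Ollama is running" when up
except OSError:
    print("Ollama is not reachable - start it with: ollama serve")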

ElaheKhatibi-qj

Any chance you might compare the hosted-API Llama 3.1 405b and Mistral Large 2 123b on the same evals?

aaagaming

I find the fascination with parameter counts boring. What I would like is a way to measure how much data a model can *hold*. Is there anything like that out there?

malikrumi

I would like to see an example of how to use this to develop long texts that follow a structure or script, without losing coherence, by re-evaluating progress as it goes.

alitomix

Oohh nice! I had some issues with llama3-groq-tool-use; even after pulling it with Ollama, it kept returning an empty list instead of the actual tool calls. Just tested this code, though, and it works great! Love it! Thanks!!! Love the videos from the channel!

automatalearninglab

thanks for this! Now get off the toilet and go put some clothes on

Ronaldograxa

Nice. I need to test this as well on a more complicated agent setup. I had a case where some models would not complete: they ran into loops and hit too many errors trying to call tools... I have to give 3.1 a go at it.

jwickerszh

I really like the way you explain it; it makes the concepts easy to learn.

aaagaming