Local Agentic RAG with LLaMa 3.1 - Use LangGraph to perform private RAG

In this video, you'll learn how to use Agentic RAG with a locally running, open-source model. We'll use Meta's latest model, Llama 3.1, in combination with Ollama. We'll write the agent using LangGraph and structure the code so that we can switch from Llama to OpenAI models simply by changing the value of an environment variable.
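As a rough illustration of that switch, a single factory function can pick the backend. This is a minimal sketch rather than the video's exact code; the `LLM_PROVIDER` variable name and the model tags are assumptions:

```python
import os

from langchain_ollama import ChatOllama
from langchain_openai import ChatOpenAI


def get_llm():
    """Return a chat model chosen via the (hypothetical) LLM_PROVIDER env var."""
    if os.getenv("LLM_PROVIDER", "ollama") == "openai":
        return ChatOpenAI(model="gpt-4o-mini", temperature=0)
    # Assumes `ollama pull llama3.1` has been run locally.
    return ChatOllama(model="llama3.1", temperature=0)


llm = get_llm()
```

Because both classes implement the same chat-model interface, the rest of the agent code does not need to know which backend is active.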

Timestamps:
0:00 Introduction
0:35 Download & Install Ollama
2:43 Agent walkthrough
3:48 Create functions for Nodes & Edges
13:20 Create Agent with StateGraph
15:30 Llama 3.1 vs GPT-4o-mini
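For orientation before watching, the "Create Agent with StateGraph" step (13:20) wires node functions into a graph roughly like the sketch below. This is a hedged outline, not the video's code; the state fields and the placeholder node bodies are assumptions:

```python
from typing import List, TypedDict

from langgraph.graph import END, START, StateGraph


class AgentState(TypedDict):
    question: str
    documents: List[str]
    generation: str


def retrieve(state: AgentState) -> dict:
    # Placeholder: a real node would query a vector store here.
    return {"documents": [f"doc about {state['question']}"]}


def generate(state: AgentState) -> dict:
    # Placeholder: a real node would call the LLM with the retrieved docs.
    return {"generation": f"answer based on {len(state['documents'])} docs"}


graph = StateGraph(AgentState)
graph.add_node("retrieve", retrieve)
graph.add_node("generate", generate)
graph.add_edge(START, "retrieve")
graph.add_edge("retrieve", "generate")
graph.add_edge("generate", END)
agent = graph.compile()

print(agent.invoke({"question": "What is agentic RAG?"}))
```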

#langgraph #llama3
Comments

I cannot tell you how thankful I am for your videos

SpenceDuke

Thanks for the video. Very educational! Would love to see you cover this topic in larger-scale applications with an unstructured KB. By the way, your Udemy course was awesome too!

sylap

Markus, you are a legend!!! This was very helpful, JUST like your other videos!!! Allow me to suggest topics for more videos. LangGraph is obviously great, so is LangChain, and so is LangSmith. But LangSmith isn't free, and alternatives like Langfuse and Arize are emerging. They use callbacks and, I guess, the tracers module in langchain_core, but that part is rarely documented. If you could make a video on callbacks or tracers, just like you did with the Runnable interface, it would be very helpful. And if you want to go beyond the syllabus, you could integrate it with the ELK stack or Datadog or similar and replicate the logging of all the chain internals. Anyway, I love your videos! Waiting for the next one!

AbhishekSingh-pjoo
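For context on the mechanism mentioned above: observability tools such as Langfuse hook into chains through handlers registered with langchain_core's callback system. Below is a minimal, hypothetical handler for illustration, not any vendor's actual integration:

```python
from langchain_core.callbacks import BaseCallbackHandler


class PrintTracer(BaseCallbackHandler):
    """Hypothetical handler that logs a few chain/LLM lifecycle events."""

    def on_llm_start(self, serialized, prompts, **kwargs):
        print("LLM started with prompts:", prompts)

    def on_chain_end(self, outputs, **kwargs):
        print("Chain finished with outputs:", outputs)


# Handlers are passed per call via the config dict, e.g.:
# chain.invoke(inputs, config={"callbacks": [PrintTracer()]})
```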

Hi there! Markus, will you cover Meta's new llama-agentic-system too? 🙏 Please, you teach like no other 🙃😊
Llama 3.1 8B Q8 function calls seem to work absolutely flawlessly. Thanks for all your hard work!

FredericusRex-fp

As usual, a great video ;) Question: wouldn't it make more sense to check the initial query for rewriting instead of doing it post-retrieval? That's my preference for two reasons: it allows users to be less specific with their initial query, and by rewriting the first query, the first retrieval will already be good. Using a fast, smaller model like Command-R with Groq, the added latency is negligible.

awakenwithoutcoffee
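The wiring the commenter suggests, with the rewrite step placed before retrieval, would look roughly like this self-contained sketch; the node names and placeholder bodies are assumptions, not the video's code:

```python
from typing import List, TypedDict

from langgraph.graph import END, START, StateGraph


class State(TypedDict):
    question: str
    documents: List[str]


def rewrite_query(state: State) -> dict:
    # A real node would ask a small, fast model to sharpen the query.
    return {"question": f"{state['question']} (rewritten for retrieval)"}


def retrieve(state: State) -> dict:
    # Placeholder for the vector-store lookup.
    return {"documents": [f"doc for {state['question']}"]}


g = StateGraph(State)
g.add_node("rewrite_query", rewrite_query)
g.add_node("retrieve", retrieve)
g.add_edge(START, "rewrite_query")  # rewrite happens before retrieval
g.add_edge("rewrite_query", "retrieve")
g.add_edge("retrieve", END)
app = g.compile()
```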

Would you provide a simple UI for this code?

artur

Great video, Markus!
RAG with Llama 3.1 8B worked totally fine for me! But the 128k token window is a joke: using more than 2,000 characters (!) leads to unusable results, no different from Llama 3.0.
One question: why do you always return the entire state from the graph nodes? I always do the exact opposite and return only the changed values. What is the benefit? Thanks

ki-werkstatt
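On the state-return question above: LangGraph merges whatever dict a node returns into the graph state, so for plain (non-reducer) fields, returning only the changed keys behaves the same as returning the full state. A small illustration with hypothetical nodes:

```python
from typing import TypedDict


class State(TypedDict):
    question: str
    generation: str


def generate_full(state: State) -> State:
    # Returns the entire state with one field updated (the full-state style).
    return {**state, "generation": f"answer to {state['question']}"}


def generate_partial(state: State) -> dict:
    # Returns only the changed key; LangGraph merges it into the state.
    return {"generation": f"answer to {state['question']}"}
```

Note that for fields with a reducer (e.g. an appending message list), returning the full state would re-apply unchanged values, so partial updates are the safer habit there.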