Local UNLIMITED Memory Ai Agent | Ollama RAG Crash Course

Learn how to create powerful AI agents with Python in this easy-to-follow crash course on Ollama RAG. In this video we build a RAG agent that stores every conversation in a PostgreSQL database, converts the SQL data into a vector embedding database on program startup (using Ollama to run the open-source Nomic embedding model), and then logically retrieves multiple needles from the haystack of context before generating a response with the open-source LLM of your choice.
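
A minimal sketch of that pipeline, assuming chromadb as the vector store and psycopg for PostgreSQL; the connection parameters, collection name, model choice, and helper names are illustrative, not necessarily the video's exact code:

import ollama
import chromadb
import psycopg
from psycopg.rows import dict_row

DB_PARAMS = {'dbname': 'memory_agent', 'user': 'postgres'}  # illustrative credentials

client = chromadb.Client()
vector_db = client.get_or_create_collection(name='conversations')

def fetch_conversations():
    # Pull every stored prompt/response pair out of PostgreSQL.
    with psycopg.connect(**DB_PARAMS) as conn:
        with conn.cursor(row_factory=dict_row) as cursor:
            cursor.execute('SELECT * FROM conversations ORDER BY id')
            return cursor.fetchall()

def create_vector_db():
    # On startup: embed each past conversation with the Nomic model and index it.
    for convo in fetch_conversations():
        serialized = f"prompt: {convo['prompt']} response: {convo['response']}"
        embedding = ollama.embeddings(model='nomic-embed-text', prompt=serialized)['embedding']
        vector_db.add(ids=[str(convo['id'])], embeddings=[embedding], documents=[serialized])

def retrieve_context(prompt, n_results=3):
    # Embed the new prompt and pull the closest stored conversations (the "needles").
    query_embedding = ollama.embeddings(model='nomic-embed-text', prompt=prompt)['embedding']
    results = vector_db.query(query_embeddings=[query_embedding], n_results=n_results)
    return results['documents'][0]

def respond(prompt):
    # Hand the retrieved memories to a local LLM of your choice.
    memories = retrieve_context(prompt)
    messages = [
        {'role': 'system', 'content': f'Relevant past conversations: {memories}'},
        {'role': 'user', 'content': prompt},
    ]
    return ollama.chat(model='llama3', messages=messages)['message']['content']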

#ai #aiagents #programming
Comments

I'm making an AI system named E.C.H.O. using a ton of ideas from your videos. Keep it up!!

snaxsammy

Absolutely awesome. The pace of presentation is perfect.

gsanjeevkumar

I've never actually coded in Python, and I was able to get this working perfectly (had to hit up Claude 3.5 a few times in Cursor), but it's a pretty cool feeling to complete this without any bugs. Great pacing, great code, and great visual instructions! Now to actually learn the code...

FightFlixTv

You have no idea the hell I went through vectorizing my data a year ago... Thank you for invalidating a week of my life. 😅

UKnowIfUKnow

I absolutely love how your explanation is so perfect.

BuPhoonBaba

I'll try to build a GUI and streamline the installation as much as possible, and will share the code later if it's successful :) If not... I'll keep using the CMD. Great job and thank you for sharing.

actepukc

Local LLMs would totally transform accessibility. Instead of beating down web developers to make their sites accessible, you can have an agent that can see the screen.

robfielding

You are a master and maestro! Finally got to install and get PostgreSQL going. I previously avoided it, as most YouTubers recommended using Docker and a complicated terminal installation on Linux. Managed to type it all out, and it works like a charm.
Just one note: the stream_response() function always calls store_conversations(), so I commented out the scripts related to '/memorize'. It felt redundant to me (see the sketch below).
Issuing /recall as a prefix to your prompt, or leaving it off, makes this flexible for the times I want to chat without using memory (psql).
Big thanks to you once again, Ai Austin.

ginisksam
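
A minimal sketch of that prefix dispatch; recall() and stream_response() here are stand-in stubs for the video's retrieval and generation functions, and the command name follows the comment above:

def recall(prompt):
    print('(fetching relevant memories from the vector store)')  # stub for the video's retrieval step

def stream_response(prompt):
    print(f'(generating and storing a response to: {prompt})')  # stub; the real one always calls store_conversations()

while True:
    prompt = input('USER: \n')
    if prompt.startswith('/recall'):
        prompt = prompt.removeprefix('/recall').strip()
        recall(prompt)       # inject memories only when asked
    stream_response(prompt)  # every exchange is stored either way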

All I can say is thank you for your time.

zwelimdlalose

I just found your channel, great content.

OnigoroshiZero

I really liked your vid. I work with this stuff a bunch, and this sparks all kinds of ideas. Thank you for sharing! P.S. Man, you deliver this stuff fluff-free :o awesome. 👍

michaelandersen

Wow, I've done some similar stuff to what you are showing here, but with different technologies. I must say this looks way more advanced.

Michael-nooe

A saint of the arcane, you have saved me many hours of research

j.h.oldman

Excellent work, maestro, too good to be true. Thank you very much.

FredyGonzales

@Ai Austin > at about 18:00 you are setting the 'system prompt'... Don't Ollama models have their own? Doesn't that mean there will be 2 system prompts passed in the query?

yngeneer
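
For what it's worth: a model's Modelfile can define a default SYSTEM message, but as I understand Ollama's prompt templating, a system-role message supplied in the request fills the template's system slot in place of that default, so the model should still see a single system prompt. A minimal sketch (model name illustrative, and the precedence claim is worth verifying against your model):

import ollama

# The system message passed here should be used in place of any Modelfile
# SYSTEM default when the prompt template is rendered (assumption based on
# Ollama's templating behavior).
response = ollama.chat(
    model='llama3',  # any locally pulled model
    messages=[
        {'role': 'system', 'content': 'You are an assistant with memory of prior conversations.'},
        {'role': 'user', 'content': 'What did we discuss yesterday?'},
    ],
)
print(response['message']['content'])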

/memorize thanks man! It's working perfectly on my side

JoanApita

Bruh. Subscribed. Incredible tutorial. Thank you.

JR-kwsd

Have you heard of Docker? A simple docker-compose would have saved you a lot of pain.

codeman-dev
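
A minimal sketch of that docker-compose.yml, covering only the PostgreSQL side; the database name and credentials are illustrative:

# docker-compose.yml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_DB: memory_agent      # illustrative database name
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres    # change for anything beyond local testing
    ports:
      - '5432:5432'                  # expose to the Python agent on localhost
    volumes:
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata: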

This is so valuable. I'm considering your subscription based on the quality of this video! I have a dumb beginner question, but could this RAG module plug into LangChain if I wanted to architect a more complex agentic workflow? I'll be parsing a lot of engineering requirements into various parameters that will make their way to various mathematical and geometrical outputs, so I'm envisioning your example as a great way to manage the main assistant, and then different flavors of this as expert agents in various disciplines within my LangChain graph. Not looking to do any fine-tuning of models, as we are a small team and I anticipate LLM capabilities to continue to grow, so I'd like to future-proof my design with a lot of multi-shot learning near my system's final outputs. -I2C_Jason

ic_jason
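
In principle, yes: the retrieval step is independent of how the agent is orchestrated, so it can be exposed to LangChain as a tool. A hedged sketch, reusing the retrieve_context() helper sketched under the video description above (the tool name and docstring are illustrative):

from langchain_core.tools import tool

@tool
def recall_memories(query: str) -> str:
    """Return past conversations relevant to the query."""
    # retrieve_context() is the chromadb lookup sketched earlier; any
    # LangChain agent given this tool can now query the memory store.
    return '\n'.join(retrieve_context(query))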

Hi, this tutorial is amazing. I have one question though: in the video you made a function to set up the chromadb database for the embeddings, but you did not make a function to update that database and sync it with the PostgreSQL updates. That means that whatever conversations the user has before ending the session will NOT be used in recalling. As I said, the chromadb database is only set up before the loop and is not updated during the conversation (even though the SQL server is being updated). I made an update function just in case someone wants to do that:

def update_vector_db():
    conn = connect_db()  # the video's PostgreSQL connection helper (assumed to return dict rows)
    vector_db = client.get_or_create_collection(name='conversations')  # assumed: the chromadb collection created earlier in the script
    with conn.cursor() as cursor:
        # Find the highest id already embedded, then fetch only newer rows.
        chroma_ids = vector_db.get()['ids']
        max_id = max(int(id) for id in chroma_ids) if chroma_ids else 0
        cursor.execute('SELECT * FROM conversations WHERE id > %s ORDER BY id', (max_id,))
        new_conversations = cursor.fetchall()
    conn.close()
    for convo in new_conversations:
        serialized_convo = f"prompt: {convo['prompt']} response: {convo['response']}"
        response = ollama.embeddings(model='nomic-embed-text', prompt=serialized_convo)
        embedding = response['embedding']
        vector_db.add(
            ids=[str(convo['id'])],
            embeddings=[embedding],
            documents=[serialized_convo]
        )
    print(f"Added {len(new_conversations)} new conversations to the vector database.")

omarfargally