Fully Local RAG for Your PDF Docs (Private ChatGPT Tutorial with LangChain, Ollama, Chroma)


Teach your local Ollama new tricks with your own data in less than 10 minutes, using RAG with LangChain and Chroma. Completely private and FREE! Upload multiple PDFs into your vector store and create embeddings so you can query the database and give the LLM the context it needs. This is a beginner tutorial with multiple examples of using PDFs in your RAG pipeline, featuring PyPDFLoader, RecursiveCharacterTextSplitter, Chroma, OllamaEmbeddings, and ChatOllama. Models used: mxbai-embed-large and llama3.2, along with OpenAI's text-embedding-3-large and gpt-4o.
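The ingestion-then-chat flow described above can be sketched roughly like this. This is a minimal outline, not the video's exact code: it assumes the langchain-community, langchain-text-splitters, langchain-ollama, and langchain-chroma packages plus a local Ollama serving both models, and the function names and prompt wording are my own.

```python
import sys

def build_prompt(context: str, question: str) -> str:
    # Stuff the retrieved chunks into the prompt (wording is mine, not the video's).
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

def ingest(pdf_paths, persist_dir="./chroma_db"):
    # Imports kept inside the function so the pure helper above
    # works even without the LangChain packages installed.
    from langchain_community.document_loaders import PyPDFLoader
    from langchain_text_splitters import RecursiveCharacterTextSplitter
    from langchain_ollama import OllamaEmbeddings
    from langchain_chroma import Chroma

    docs = []
    for path in pdf_paths:
        docs.extend(PyPDFLoader(path).load())

    # Overlapping chunks keep each embedding focused on one passage.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    chunks = splitter.split_documents(docs)

    # Embed locally with mxbai-embed-large and persist the store to disk.
    embeddings = OllamaEmbeddings(model="mxbai-embed-large")
    return Chroma.from_documents(chunks, embeddings, persist_directory=persist_dir)

def ask(store, question: str, k: int = 4) -> str:
    from langchain_ollama import ChatOllama

    # Retrieve the k most similar chunks, then hand them to the chat model.
    hits = store.similarity_search(question, k=k)
    context = "\n\n".join(d.page_content for d in hits)
    llm = ChatOllama(model="llama3.2")
    return llm.invoke(build_prompt(context, question)).content

if __name__ == "__main__" and len(sys.argv) > 1:
    store = ingest(sys.argv[1:])
    print(ask(store, "What are these documents about?"))
```

Swapping OllamaEmbeddings/ChatOllama for OpenAI's text-embedding-3-large and gpt-4o only changes the two model constructors; the rest of the pipeline stays the same.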

Included is a bonus web scraper Python script that uses pyppeteer to save any web page as a local PDF. Once it's added to the vector store, you can start chatting with your entire website!
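A minimal version of such a scraper might look like the following. This is a sketch, not the bundled script: it assumes pyppeteer is installed (it downloads its own Chromium build on first launch), and the filename helper is my own addition.

```python
import asyncio
import re

def pdf_name_for(url: str) -> str:
    # Turn a URL into a filesystem-safe PDF filename (helper added for this sketch).
    slug = re.sub(r"[^A-Za-z0-9]+", "_", re.sub(r"^https?://", "", url)).strip("_")
    return f"{slug}.pdf"

async def save_page_as_pdf(url: str) -> str:
    # Imported lazily so pdf_name_for works without pyppeteer installed.
    from pyppeteer import launch

    browser = await launch()
    page = await browser.newPage()
    # Wait for network activity to settle so the rendered page is complete.
    await page.goto(url, {"waitUntil": "networkidle2"})
    path = pdf_name_for(url)
    await page.pdf({"path": path, "format": "A4"})
    await browser.close()
    return path

if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        print(asyncio.run(save_page_as_pdf(sys.argv[1])))
```

The resulting PDFs can be fed straight into the same PyPDFLoader ingestion step as any other document.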

Ever wanted to chat with your PDFs or train ChatGPT on your own data? This video will show you how!

Timestamps:
0:00 - Intro
0:52 - GitHub walkthrough
3:40 - RAG
6:15 - Models
7:35 - Ingestion
10:25 - Chatbot
14:25 - Chroma
18:30 - Scraper
20:05 - QA
22:30 - Conclusion
Comments

Very nice tutorial. Gonna be super useful for all those super long PDFs

mentos.

Great explanation of all the processes. I like how you broke out the two processes of ingestion and chat, which most tutorials on the topic don't do.

justthefactsplease

Your tutorial on creating a local RAG setup with LangChain, Ollama, and Chroma for PDF docs is fantastic! Amazing 😮
I was wondering if you could consider making a video that goes a bit deeper, specifically focusing on building an offline RAG UI system for PDF files.
I was wondering if you could consider making a video that goes a bit deeper, specifically focusing on building an offline RAG UI system for PDF files.

The idea would be to create an interface where non-technical users could upload PDF files directly through the UI and then interact with the chatbot to ask questions about the content. It would be awesome if this system could run locally without relying on OpenAI or any external APIs, using a setup with Ollama (like the llama 3 model) and Chroma for the vector database.

This would really help people who want a secure, fully offline solution without needing coding knowledge.

Have a nice one 😊

samiurrehman

Thank you for your video and for the teaching 👏👏👏👏👏👏👏👏👏👏👏

sr.modanez