Generative AI / LLM - Document Retrieval and Question Answering

With Large Language Models (LLMs), we can integrate domain-specific data to answer questions. This is especially useful for data unavailable to the model during its initial training, like a company's internal documentation or knowledge base.

This article shows how to implement this architecture using an LLM and a Vector Database, and how grounding answers in retrieved documents significantly decreases the hallucinations commonly associated with LLMs.
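As a rough sketch of the overall flow, assuming a 2023-era LangChain release with Chroma standing in for the Vector Database (the inline document is a placeholder for your real data):

```python
# Minimal RAG sketch: chunk documents, embed them into a vector store,
# then let the LLM answer questions grounded only in retrieved chunks.
from langchain.docstore.document import Document
from langchain.embeddings import VertexAIEmbeddings
from langchain.llms import VertexAI
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA

# Stand-in for your real documents (e.g. an internal knowledge base).
raw_documents = [Document(page_content="Our onboarding takes two weeks...")]

# 1. Split documents into overlapping chunks.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(raw_documents)

# 2. Embed the chunks and index them in a vector database.
vectorstore = Chroma.from_documents(chunks, VertexAIEmbeddings())

# 3. Answer questions; retrieved chunks are injected into the prompt.
qa = RetrievalQA.from_chain_type(
    llm=VertexAI(),
    retriever=vectorstore.as_retriever(search_kwargs={"k": 4}),
)
print(qa.run("How long does onboarding take?"))
```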

Check out my Generative AI Series:

Generative AI — Getting Started with PaLM 2

Generative AI — The Evolution of Machine Learning Engineering

Generative AI — Best Practices for LLM Prompt Engineering

Generative AI — Document Retrieval and Question Answering with LLMs

Generative AI — Mastering the Language Model Parameters for Better Outputs

Generative AI — Understand and Mitigate Hallucinations in LLMs

If you enjoyed this video, please subscribe to the channel ❤️

🎉 Subscribe for Article and Video Updates!

You can find me here:

If you or your company is looking for advice on the cloud or ML, check out the company I work for.
We offer consulting, workshops, and training at zero cost. Think of it as an extension of your team at no additional expense.

#vertexai #googlecloud #machinelearning #mlengineer #doit

▬ My current recording equipment ▬▬▬▬▬▬▬▬

Support my channel by buying through these links on Amazon

▬ Timestamps ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
00:00 Introduction
00:27 Architecture
01:16 Languages
01:45 Indexing vs. Tuning
03:33 LangChain
04:19 Data
04:40 Get data
05:10 LangChain chunking
06:02 Embedding and Vector Database
11:06 LLM answers questions
13:00 Bye
Comments

Excellent walk-through, I'll have to give it a try. Thank you very much.

kenchang

I'm a junior engineer intern for a startup called Radical AI and I am doing exactly this process right now lol. You do a better job of explaining everything than my seniors.

christopheryoungbeck

Great video. What modifications should be done to run queries on a public index (not under a VPC)?

jcypwwo
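For readers with the same question, newer releases of the Vertex AI SDK can create index endpoints without VPC peering. A hedged sketch (the project and endpoint names are placeholders; check the current SDK docs in case parameter names have changed):

```python
# Sketch: creating a Vector Search / Matching Engine endpoint that is
# reachable publicly instead of through VPC peering.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholders

endpoint = aiplatform.MatchingEngineIndexEndpoint.create(
    display_name="my-public-endpoint",
    public_endpoint_enabled=True,  # no VPC network required
)
```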

Hey Sasha, thanks a lot for making this! It would be great to learn if there's an easy way to embed documents in Firebase. It would be extremely useful to have a workflow where embeddings are generated for each document when it's changed (e.g. a user updates content in an app) so that the query is always matched against real-time data sources.

I was also wondering if there's a way to do a semantic search query combined with regular filtering on metadata (e.g. product prices, size, etc). Would love to see a follow up tutorial on this in the future :)

itsdavidalonso
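One way to approach the first part of this question, sketched under the assumption that some trigger (for example a Cloud Function listening to Firestore writes) calls a re-embedding helper whenever a document changes; `embed_and_upsert` and the `doc_id` metadata key are hypothetical names:

```python
# Sketch: re-embed a single document whenever it changes. Whatever trigger
# you use (e.g. a Cloud Function on Firestore writes) would call this.
from langchain.embeddings import VertexAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

embeddings = VertexAIEmbeddings()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
vectorstore = Chroma(embedding_function=embeddings)  # or your existing store

def embed_and_upsert(doc_id: str, text: str) -> None:
    """Chunk the updated document, tag it with its id, and add it to the index."""
    chunks = splitter.create_documents([text], metadatas=[{"doc_id": doc_id}])
    # If the store supports deletion, remove this doc_id's stale vectors first.
    vectorstore.add_documents(chunks)
```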

Great video!! Does the LLM get trained here? This is a major doubt for me. Or is it just used as an engine for answering based on the embeddings and similarity?

shivayavashilaxmipriyas

Hi! Thanks for making this walkthrough, it was super helpful as a beginner. I was able to follow all the steps you detailed, however, when I try running the final product it produces the same context every time - regardless of the question I prompt. Do you have any idea why that might be? Thanks in advance!

campbellslund
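Identical context for every question usually means the retrieval step, not the LLM, is at fault. A small diagnostic sketch (assuming the `vectorstore` from the walkthrough) that queries the retriever in isolation:

```python
# Diagnostic sketch: query the retriever directly, bypassing the LLM,
# to check whether different questions return different chunks.
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
for question in ["What is chunking?", "How do I deploy the index?"]:
    docs = retriever.get_relevant_documents(question)
    print(question, "->", [d.page_content[:60] for d in docs])
```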

Hi! Great video!
Is there a way we can limit the model to respond only with the data that we gave it? Thank you!

vasilmutafchiev
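There is no hard guarantee, but a restrictive prompt goes a long way. A minimal sketch, assuming the `vectorstore` from the walkthrough and a 2023-era LangChain release:

```python
# Sketch: instruct the model to answer only from the retrieved context.
# This is a strong mitigation, not a hard guarantee.
from langchain.chains import RetrievalQA
from langchain.llms import VertexAI
from langchain.prompts import PromptTemplate

template = """Answer the question using ONLY the context below.
If the answer is not in the context, reply exactly: "I don't know."

Context: {context}

Question: {question}
Answer:"""

qa = RetrievalQA.from_chain_type(
    llm=VertexAI(),
    retriever=vectorstore.as_retriever(),
    chain_type_kwargs={"prompt": PromptTemplate.from_template(template)},
)
```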

Hey,

Thanks for the great content! I had 2 questions:

With this setup, what needs to happen if you want to add new data to the vector store?
First we chunk the new document, create new embeddings, and upload them to the GCS bucket. Is that all, or does something need to happen with the Matching Engine index?

Other question: do you know if the LangChain JavaScript library has any limitations in this use case?

TarsiJMaria
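On the first question, a hedged sketch of the add-new-data path, assuming the `splitter` and `vectorstore` objects from the walkthrough (the file name and `doc_id` are placeholders; with a raw Matching Engine index you would additionally run an index update with the new vectors):

```python
# Sketch: adding a new document to an existing index. The vector store
# wrapper embeds the chunks and upserts them in one call.
new_text = open("new_document.txt").read()  # placeholder source
new_chunks = splitter.create_documents([new_text],
                                       metadatas=[{"doc_id": "new-doc"}])
vectorstore.add_documents(new_chunks)  # embed + upsert
```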

At 7:38, for the embedding, is the data sent off the computer? It seems like it is, if you are using retries. If so, is there any way to completely contain this process so that no data leaves the machine? This would be relevant to at least the embedding, the vector DB, and the LLM prompts. Thank you

d_b_
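Yes, the hosted Vertex AI embedding and LLM calls do send text off the machine. A minimal sketch of a fully local alternative, assuming locally runnable substitutes (a `sentence-transformers` embedding model and a FAISS index; `chunks` as in the walkthrough):

```python
# Sketch: a fully local pipeline so no text leaves the machine.
from langchain.embeddings import HuggingFaceEmbeddings  # runs on-device
from langchain.vectorstores import FAISS                # local, in-process index

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = FAISS.from_documents(chunks, embeddings)
# For generation, a self-hosted model (e.g. served via llama.cpp) would
# replace the remote Vertex AI LLM call.
```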

Just noticed this channel. Great content, with code walkthroughs. Appreciate your effort!!

I've got a question @ml-engineer:
Is it possible to question-answer separate documents with only one index for the bucket? When retrieving from vector search, I want to specify which document/datapoint_id to query from.

Currently, when I add data points for multiple documents to the same index, the retrieval response for a query is matched globally across all the documents instead of only the required one.

P.S.: I am using the MatchingEngine utility maintained by Google.

dutpxio
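Metadata filtering is the usual answer here. A sketch using Chroma's `filter` argument, assuming each chunk was indexed with a hypothetical `doc_id` metadata field; Vertex AI Vector Search expresses the same idea with per-datapoint restricts supplied at query time:

```python
# Sketch: restrict retrieval to a single document via metadata filtering.
docs = vectorstore.similarity_search(
    "my question",
    k=4,
    filter={"doc_id": "document-42"},  # only chunks tagged with this id
)
```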

Hey sir, I want to know: if I have company documents locally, how can we use and load them? And one more thing: does it provide answers exactly as mentioned in the PDF or documents, or does it perform some type of text generation on the output?

Tech_Inside.
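On the loading question, a sketch using LangChain's document loaders for a local folder of PDFs (the path is a placeholder). As for the second question: the LLM generates a new answer from the retrieved passages rather than quoting the document verbatim, although the prompt can ask it to stay close to the source text.

```python
# Sketch: loading a local folder of company PDFs with LangChain loaders.
from langchain.document_loaders import DirectoryLoader, PyPDFLoader

loader = DirectoryLoader("/path/to/company/docs", glob="**/*.pdf",
                         loader_cls=PyPDFLoader)
documents = loader.load()  # then chunk and embed as in the walkthrough
```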

At 3:38 you say "You need a Google project", but I'm not sure what that is exactly. Do I need a GCP account and then create a VM for that?

russelljazzbeck

Hey Sascha,

I've been playing around a lot more, but I've run into accuracy issues that I wanted to solve using MMR (max marginal relevance) search.
It looks like the Vertex AI vector store (in LangChain) doesn't support this, at least not in the Node.js version, but if I'm not mistaken it's the same in Python.

Do you know what the best approach would be?
As a workaround I'm overriding the default similarity search and filtering the results before passing them as context.

TarsiJMaria
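A sketch of that workaround: over-fetch with plain similarity search, then drop near-duplicates before building the context. It assumes a vector store that implements `similarity_search_with_score`; the prefix-based fingerprint is a crude stand-in for real MMR, which also balances relevance against diversity.

```python
# Sketch: over-fetch, then filter near-duplicates before building context.
def fetch_filtered(vectorstore, query, k=4, fetch_k=20):
    candidates = vectorstore.similarity_search_with_score(query, k=fetch_k)
    seen, picked = set(), []
    for doc, _score in candidates:            # results come best-first
        key = doc.page_content[:200]          # crude near-duplicate fingerprint
        if key not in seen:
            seen.add(key)
            picked.append(doc)
        if len(picked) == k:
            break
    return picked
```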

I have content of nearly 100 pages. Each page has nearly 4,000 characters. What chunk size should I choose, and what retrieval method can I use for optimised answers?

rinkugangishetty
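As a rough sizing sketch: 100 pages at ~4,000 characters is ~400k characters, so 1,000-character chunks with 200 overlap yield on the order of 500 chunks. The numbers below are a starting point to tune against your own questions, not a recommendation:

```python
# Sizing sketch: ~100 pages x ~4,000 chars = ~400k characters.
# 1,000-char chunks with 200 overlap give on the order of 500 chunks.
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(documents)  # documents from your loader
print(f"{len(chunks)} chunks")
```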

Hello! thank you so much for the video. I have a problem at the last code cell:
_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "DNS resolution failed for :10000: unparseable host:port"
debug_error_string = "UNKNOWN:DNS resolution failed for :10000: unparseable host:port

cinhlor

Thanks for the video. How do we evaluate the model?

saisandeepkantareddy

Can this be deployed to a Vertex AI endpoint as a custom container?

wongkenny

Nice, is the data in XML format?

HailayKidu

What if I want to batch-process different sites? What would be your approach?

louis-philippekyer
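One possible approach, sketched with LangChain's `WebBaseLoader` (the URLs are placeholders): load each site in a loop, keep the loader's `source` metadata for later filtering, then chunk and embed everything in one pass as in the walkthrough.

```python
# Sketch: batch-load several sites, then index them in one pass.
from langchain.document_loaders import WebBaseLoader

urls = ["https://example.com/docs", "https://example.com/faq"]  # placeholders
documents = []
for url in urls:
    documents.extend(WebBaseLoader(url).load())  # sets "source" metadata per page
# then chunk and embed all of `documents` as in the walkthrough
```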

Thank you for the explanation, really liked it!!

I was wondering: if we use DPR (Dense Passage Retriever) on our own data and want to evaluate its performance with precision, recall, and F1 score, and we have a small reference dataset that can serve as ground truth, can we do that? I am also confused because, since DPR is trained only on wiki data as far as I know, does it make sense to measure the efficiency of DPR retrieval when I follow this RAG approach?

bivasbisht
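Yes, a small ground-truth set is enough for a first-pass retrieval evaluation, and measuring on your own data is exactly how you find out whether a retriever trained on wiki data transfers to your domain. A sketch, assuming each indexed chunk carries a hypothetical `doc_id` metadata field and `ground_truth` maps questions to the ids of chunks that should be retrieved (this also addresses the earlier question about evaluating the model):

```python
# Sketch: first-pass retrieval evaluation against a small ground-truth set.
def retrieval_scores(retriever, ground_truth, k=4):
    """ground_truth: {question: set of doc_ids that should be retrieved}."""
    precisions, recalls = [], []
    for question, relevant_ids in ground_truth.items():
        docs = retriever.get_relevant_documents(question)[:k]
        retrieved_ids = {d.metadata["doc_id"] for d in docs}
        hits = len(retrieved_ids & relevant_ids)
        precisions.append(hits / max(len(retrieved_ids), 1))
        recalls.append(hits / max(len(relevant_ids), 1))
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```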