ChatGPT For Your DATA | Chat with Multiple Documents Using LangChain


In this video, I will show you how you can chat with any document. Suppose you have a folder containing different file formats: a PDF file, a text file, a README file, and others. I will show you how to take all of your data, split it into chunks, create embeddings with the OpenAI embeddings, store them in the Pinecone vector store, and finally chat with your own documents and get insights out of them, similar to ChatGPT but with your own data. Happy Learning.
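The splitting step in the workflow above can be illustrated with a minimal sketch in plain Python. This is a simplified stand-in for LangChain's text splitters, not the library's actual implementation; the `chunk_size` and `overlap` parameters here are illustrative names:

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks of at most chunk_size characters."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step back by `overlap` so context carries over
    return chunks

doc = "A" * 2500
chunks = split_text(doc, chunk_size=1000, overlap=100)
print(len(chunks))      # 3
print(len(chunks[0]))   # 1000
```

Each chunk would then be embedded and upserted into the vector store; the overlap helps a sentence cut at a chunk boundary still appear whole in at least one chunk.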

👉🏼 Links:

🔗 Other videos you might find helpful:

💰🔗 Some links are affiliate links, meaning that when you use them, I may receive some benefit.

#openai #llm #datasciencebasics #chatwithdata #documents #chatgpt #nlp
Comments

Many Thanks for your great work!
It's very well explained and applies to real uses of AI.

gilbertomendes

So every time I need to chat with my own data, the query will have to be embedded? Doesn't that make it much more expensive?

Alimenteocerebro
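On the cost question above: yes, every query is embedded, but a query is tiny compared to the documents, which are embedded only once at indexing time. A back-of-envelope sketch, assuming a purely illustrative rate of $0.0001 per 1K tokens (check OpenAI's current pricing; the numbers here are assumptions for the arithmetic only):

```python
price_per_1k_tokens = 0.0001   # assumed illustrative rate, not current pricing
query_tokens = 20              # a typical short question
doc_tokens = 500_000           # one-off cost when the documents are first indexed

query_cost = query_tokens / 1000 * price_per_1k_tokens
doc_cost = doc_tokens / 1000 * price_per_1k_tokens
print(f"{query_cost:.6f}")  # 0.000002 dollars per query
print(f"{doc_cost:.2f}")    # 0.05 dollars, paid once at indexing time
```

So the recurring per-query embedding cost is orders of magnitude below the one-time document embedding cost; the chat-completion call usually dominates per-query spend.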

Thank you. How could I print which document (title) a result comes from and which page(s)? It is useful when there are multiple files of multiple pages in the source directory. Thank you for your time.

motopaediatheview
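On citing sources: LangChain's document loaders typically attach a `source` entry (and, for PDF loaders, a `page` entry) to each document's metadata, and retrieval chains can return the retrieved documents when constructed with `return_source_documents=True`. A minimal, library-free sketch of formatting those citations, with the retrieved documents mocked as plain dicts (real LangChain Documents carry a `.metadata` attribute instead):

```python
# Mocked retrieved documents; filenames and pages are hypothetical.
source_documents = [
    {"metadata": {"source": "report.pdf", "page": 3}},
    {"metadata": {"source": "notes.txt"}},
]

def format_citations(docs: list[dict]) -> list[str]:
    """Build 'filename, page N' strings from document metadata."""
    cites = []
    for d in docs:
        meta = d["metadata"]
        cite = meta.get("source", "unknown")
        if "page" in meta:
            cite += f", page {meta['page']}"
        cites.append(cite)
    return cites

print(format_citations(source_documents))  # ['report.pdf, page 3', 'notes.txt']
```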

I've identified the problem in your code. The issue lies in the creation of chat history. Your code expects a list of tuples, but in your Gradio app, you're creating a list of lists (nested lists), which is causing the code to malfunction.

Please try using the following code instead and replace it in your Gradio block. This updated code should resolve the issue and make it work correctly.

import gradio as gr

with gr.Blocks() as demo:
    chatbot = gr.Chatbot()
    msg = gr.Textbox()
    clear = gr.Button("Clear")

    def respond(user_message, chat_history):
        print(user_message)
        print(chat_history)
        if chat_history:
            # Convert Gradio's nested lists into the tuples the chain expects
            chat_history = [tuple(sublist) for sublist in chat_history]
            print(chat_history)

        # Get response from the QA chain (qa is the chain built earlier in the video)
        response = qa({"question": user_message, "chat_history": chat_history})
        # Append user message and response to chat history
        chat_history.append((user_message, response["answer"]))
        print(chat_history)
        return "", chat_history

    msg.submit(respond, [msg, chatbot], [msg, chatbot], queue=False)
    clear.click(lambda: None, None, chatbot, queue=False)

demo.launch(debug=True, share=True)

IamalwaysOK
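The core of the fix in the comment above is converting Gradio's list-of-lists chat history into the list of (human, ai) tuples that LangChain's conversational chain expects. In isolation, with hypothetical messages:

```python
# Gradio's Chatbot component passes history as nested lists:
gradio_history = [["Hi", "Hello! How can I help?"], ["What is X?", "X is ..."]]

# LangChain's conversational chain expects a list of (human, ai) tuples:
chat_history = [tuple(pair) for pair in gradio_history]
print(chat_history[0])  # ('Hi', 'Hello! How can I help?')
```

The two shapes look alike when printed, which is why the bug is easy to miss until the chain rejects the nested-list form.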

Great video. Thanks for your time and explanation.

peralser

I tried your tutorial, but got stuck on the Pinecone steps with this error: AttributeError: init is no longer a top-level attribute of the pinecone package. Do you have an updated notebook?

francoist
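The error in the comment above comes from pinecone-client v3, which removed the module-level `pinecone.init(...)` in favor of a `Pinecone` client class. A hedged sketch of the newer style, wrapped in a function so it only runs when you supply real credentials (the index name and key below are placeholders):

```python
def connect_index(api_key: str, index_name: str):
    """Connect to a Pinecone index using the pinecone-client v3+ style."""
    from pinecone import Pinecone  # v3+ client class replaces pinecone.init
    pc = Pinecone(api_key=api_key)
    return pc.Index(index_name)

# connect_index("YOUR_API_KEY", "langchain-demo")  # run with real credentials
```

Alternatively, pinning the older client version used in the video (e.g. `pip install "pinecone-client<3"`) keeps the original notebook code working.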

It's "chunks", not "choonks".
Just for fun, don't take it badly. The video is informative and perfect.

tattooGuri

As always, great tutorials! I would love to see this same topic but without using OpenAI.

fabsync

Hello sir,

What evaluation metrics should we use for our use case? Kindly let me know.

imranmunshi

Awesome tutorial.
Thank you for sharing!

chineduezeofor

My question would be: how would you accommodate new data that has to be introduced to this? Would we do the vectorization process all over again, or is there a better way to handle it, even for one document?

tusharbhatnagar
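On the question above: you don't need to re-embed everything. Vector stores support upserting only the new vectors (in LangChain, for example, via the vector store's `add_documents` method); existing entries are untouched. A library-free sketch of the idea, with the index mocked as a dict of id to vector and a toy stand-in for the embedding call:

```python
index = {"doc1": [0.1, 0.2], "doc2": [0.3, 0.4]}  # mocked existing index

def embed(text: str) -> list[float]:
    # Stand-in for a real embedding API call; deterministic toy vector.
    return [len(text) / 10, text.count(" ") / 10]

def upsert(index: dict, doc_id: str, text: str) -> None:
    """Embed and store only the new document; existing vectors stay as they are."""
    index[doc_id] = embed(text)

upsert(index, "doc3", "new report text")
print(sorted(index))  # ['doc1', 'doc2', 'doc3']
```

The one-document case is therefore cheap: one embedding call and one upsert, with no recomputation of the rest of the index.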

Thank you for the good video. I am curious why you stored the vectors in Chroma first and then in Pinecone again? Thank you.

ramp

How can I utilize this ChatBot for my SQL documents?

muratalarcin

Please do this video again with Streamlit.

HeroReact

Amazing tutorial! Is there a way to add in the sources as well with the responses?

mayank

Were you able to figure out the error when entering the second query? I’m running into the same issue.

nitroeh

May I know which website you are using to execute this step by step? I learnt a lot from this tutorial.

siddhu

Many thanks for the great tutorial, but it seems slow. Is there any way to make it run faster? Thanks in advance.

hoduchoa

Why do we split the data into chunks of 1000 or 1500 characters and then retrieve the 4 most relevant chunks? Why not more than 1000 or 1500 characters per chunk, or more than 4 relevant chunks? Is there a limit on how many characters of data we can feed to ChatGPT, and how much is it? After using the code I checked my API usage in OpenAI and saw that I had used InstructGPT. What is InstructGPT?

mrmortezajafari
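On the chunk-size and top-k question above: the retrieved chunks, the question, the prompt template, and the answer must all fit in the model's context window (about 4,096 tokens for the original gpt-3.5-turbo). A rough estimate, using the common heuristic of about 4 characters per token for English text (both numbers are approximations, not exact limits):

```python
chunk_chars = 1500
top_k = 4
chars_per_token = 4          # rough heuristic for English text

context_tokens = top_k * chunk_chars // chars_per_token
print(context_tokens)        # 1500 tokens of retrieved context

model_window = 4096          # original gpt-3.5-turbo context window
remaining = model_window - context_tokens
print(remaining)             # 2596 tokens left for the prompt, question, and answer
```

Larger chunks or a higher top-k crowd out the room left for the answer, which is why values like 1000 to 1500 characters and k = 4 are a common middle ground.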

Getting some numpy error: "AttributeError: module 'numpy.linalg._umath_linalg' has no attribute '_ilp64'" in all your LangChain-related Colab notebooks.

snehitvaddi