Using ChatGPT with YOUR OWN Data. This is magical. (LangChain OpenAI API)



Follow me on social media for more tips & fun:

Disclaimer: This description may contain affiliate links. Cryptocurrencies are not investments and are subject to market volatility.
Comments

This is rare. TechLead is actually uploading a useful coding tutorial instead of his opinions.

davidl.e

This is the way to explain LangChain, in true TechLead style. You nailed it. Hopefully more of this stuff in the future. Thanks.

jayhu

This is exactly what I wanted to do with my own data, but I hadn't yet spent any time researching and figuring out a way to do it. I'm glad there is a publicly available way to do it.

TheRealTommyR

Hours and hours of ChatGPT courses... I learned more by watching 5 minutes of your video. Congratulations on the clarity and the practical approach 👍

e-matesecom

Awesome seeing TechLead do programming, the Maestro at work.

charleswhite

This one video alone saves so much time compared to watching hours of some of the playlists out there. It's better to start here and then go straight to the LangChain docs to work out other use cases. Excellent, TechLead.

jsnmad

By far one of the best ChatGPT video tutorials I've seen on YouTube. Great work

adasi

Glad to know I'm not the only one doing this.

As a student, I've been feeding ChatGPT all my previous coursework; it's able to answer essay prompts and other homework-related tasks in my writing style and/or in similar formats, as if I were the one writing it. I'm able to save a lot of time by doing this.

john.

So after digging into the code, I found that LangChain is actually doing the following things:
1. For all your data, it stores them in vector storage using embeddings.
2. When you query something, it first does a similarity search in the embeddings database and finds the files related to your question.
3. After finding the related files, it takes all the text of those files and sends it, together with a context message, as the first system message: "Use the following pieces of context to answer the user's question. \nIf you don't know the answer, just say that you don't know, don't try to make up an answer. {your text data}".
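A minimal sketch of that load → embed → retrieve → "stuff" flow, using the LangChain APIs as they existed around mid-2023 (the file path and model name here are illustrative assumptions, not taken from the video):

```python
# Sketch of the flow described above; assumes OPENAI_API_KEY is set and a
# local file "data.txt" exists (hypothetical path), with chromadb installed.
from langchain.document_loaders import TextLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# 1. Load your data and store it in a vector store using embeddings.
docs = TextLoader("data.txt").load()
db = Chroma.from_documents(docs, OpenAIEmbeddings())

# 2 + 3. At query time, similar chunks are retrieved and "stuffed" into the
# prompt (with the default context message) before calling the chat model.
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model_name="gpt-3.5-turbo"),
    chain_type="stuff",
    retriever=db.as_retriever(),
)
print(qa.run("What does my document say about X?"))
```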

This tells us a few things:
1. Why does it sometimes lack outside-world information? If the question you ask is not in your documents, or the model is not trained on data relevant to your question, it will return nothing valuable, as instructed.

2. Is there a limit on the size of your data? Yes. You can't use it with very large files, because it filters documents and sends all the related text to the API server. Recently, gpt-3.5-turbo-16k might be the right model to use, and it's best if the total size of the related docs is less than 16k tokens. That means the best practice is to group your data into different topics and try to ensure that, for any query, the total size of the documents returned by the similarity search does not exceed the model's token limit. I think 16k tokens is roughly the size of a 13-15 page paper.
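One hedged way to keep the stuffed context under the model's limit is to split documents into chunks and cap how many the retriever returns; the chunk size and `k` below are illustrative guesses, not tuned values:

```python
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

# Split documents into smaller chunks so one big file can't blow the token budget.
docs = TextLoader("data.txt").load()  # hypothetical file
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)
db = Chroma.from_documents(chunks, OpenAIEmbeddings())

# Only the top-4 most similar chunks get stuffed into the prompt, which keeps
# the request comfortably inside a 16k-token context window.
retriever = db.as_retriever(search_kwargs={"k": 4})
```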


3. By removing or changing the system message, you might get better results for common-sense questions. I really don't like the default system message: asking gpt-3.5-turbo-16k "Who is George Washington?" in the playground with an empty system message gives better answers than the LangChain solution.
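A sketch of how the default prompt could be overridden when building the chain; the template wording is just an example, and `retriever` comes from the earlier sketch:

```python
# Replace LangChain's default "use the following pieces of context..." message
# with your own, e.g. one that allows falling back to general knowledge.
from langchain.prompts import PromptTemplate
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

template = (
    "Use the context below if it is relevant; otherwise answer from your own knowledge.\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)
prompt = PromptTemplate(template=template, input_variables=["context", "question"])

qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model_name="gpt-3.5-turbo-16k"),
    chain_type="stuff",
    retriever=retriever,                      # retriever from the earlier sketch
    chain_type_kwargs={"prompt": prompt},     # hook for the "stuff" chain's prompt
)
```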

4. LangChain uses the unstructured library (it reported errors when I didn't install it), which means you can use not only txt files but also PDF files, Word files, etc. I haven't tested it, but it very likely supports querying multiple PDF files with code similar to the video's: put several PDFs in a folder, use a directory index creator, and ask questions about your papers, I think (haven't tested it).
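A sketch of that multi-PDF idea (untested, as the commenter says); `DirectoryLoader` hands files to the unstructured loaders by default, and the folder name and glob below are assumptions about your layout:

```python
# Index every PDF in a folder and query across all of them.
# Requires the `unstructured` package and its PDF-parsing extras.
from langchain.document_loaders import DirectoryLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain.chat_models import ChatOpenAI

loader = DirectoryLoader("papers/", glob="**/*.pdf")   # hypothetical folder of papers
index = VectorstoreIndexCreator().from_loaders([loader])

print(index.query(
    "Summarize the main contribution of each paper.",
    llm=ChatOpenAI(model_name="gpt-3.5-turbo-16k"),
))
```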

5. LangChain supports not only the ChatGPT models but also other models in the chat_models package. Google PaLM 2 chat is also supported as of Jul 10, 2023, so if you have a key you can use other models too. I don't think PaLM 2's common-sense knowledge is as good as ChatGPT's, but I think it is a better language-generation model than at least gpt-3.5-turbo-16k, so PaLM 2 may produce better results on your own data, while OpenAI's models are better at answering common-sense questions after you change the default system message. A few days ago OpenAI said general access to gpt-4 is starting, and people with a history of successful payments on the OpenAI API will get access immediately; access for new developers will be rolled out through the end of July.
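A hedged sketch of swapping in a different chat model; `ChatGooglePalm` is assumed here to be the PaLM 2 class exposed by the chat_models package around that date, and the key is a placeholder:

```python
# Same retrieval chain, different backing chat model (assumption: ChatGooglePalm
# is available in this LangChain version and accepts a Google API key).
from langchain.chat_models import ChatGooglePalm
from langchain.chains import RetrievalQA

qa = RetrievalQA.from_chain_type(
    llm=ChatGooglePalm(google_api_key="YOUR_KEY"),  # placeholder key
    chain_type="stuff",
    retriever=retriever,                            # retriever from the earlier sketches
)
```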

Also, I think it's quite cool to be able to use your own data. If you want to create something like an AI assistant, you can always use code to collect the current time, user information, etc., and put those in a folder, so the assistant will be able to do much more than current ones.
Another very cool thing is Auto-GPT, which works great with gpt-4; gpt-3.5 is not smart enough and behaves much worse than gpt-4. If you ask Auto-GPT something, it can google it by itself and reply with real-time information. The Auto-GPT example of creating a recipe based on the next holiday is also a cool illustration. Hopefully access to gpt-4 comes soon.

RunningBugs

Maybe 8 months late, and LangChain has been updated since, but this is one of the best videos I've watched. Thank you.

hichamalaoui

First, great video.
Second, I just had to comment on the "one language" you mentioned that some programmers claim is all they ever wanted to know.
At last count I have coded in over 15 languages since I wrote my first line of code back in 1985.
We have not deployed anything using LangChain yet (we have only been using LlamaIndex), but for the same reason that I know so many languages, we will be using LangChain soon to see what it can do.
As for plugins, I will always be for building your own so you have full control and can do the things a plugin "left out," like the ability to use your own data (and keep it on your own servers).

We have found that if you are deploying a Help feature for your application you do not want to allow the code to get information from "the outside world."

larryczerwonka

Great video. I'm a junior data scientist in Belgium and it's actually helping me with one of my projects. You're totally right when you say that everyone should learn Python. I only learned C and C# during my studies, but now that I've learned Python I'm using it almost every day.

andygilet

I'm studying law at the moment and I'm seriously scared about how this will change the legal industry. Honestly, I could see it replacing 90% of lawyering.

MacroAnarchy

Great vid, especially the end with MS's case study of customer reviews for cars, for those who are actually struggling to find real-world applications for the new AI stuff. Thank you!

fenchelteefee

Wow, this was awesome. All this information in one place. Also, I appreciate your fast dialog and sticking to the important points. I subscribed and will recommend this site to others.

andrespineda

Semantra is a pretty cool tool to analyze your documents and be able to search them with natural language. It's probably more research-oriented since it links you to the different pages and snippets that match your query.

jcollins

Thanks TechLead, it's nice to see this type of video!

seize

Eye opener! I am a tech student, and was researching whether we could make a custom GPT of our own. This was on point! Thanks @techlead!

riyaski

You made my day. I've been struggling with fine-tuning a GPT-3 model, with mediocre success and an enormous data collection and preparation effort. It never even got close to the results achieved with LangChain within 1 minute of coding and 9 minutes of data preparation.

danield.

Loved this. I am a sales guy with zero coding experience. I listen to content like yours to glean some nuggets so I can better understand the impacts and have meaningful conversations with my customers. Truly helpful content.

sr