GPT-4 Tutorial: How to Chat With Multiple PDF Files (~1000 pages of Tesla's 10-K Annual Reports)

In this video we'll learn how to use OpenAI's new GPT-4 API to 'chat' with and analyze multiple PDF files. In this case, I use three 10-K annual reports for Tesla (~1000 PDF pages).

OpenAI recently announced GPT-4 (its most powerful model), which can process up to 25,000 words – about eight times as many as GPT-3 – process images, and handle much more nuanced instructions than GPT-3.5.

You'll learn how to use LangChain (a framework that makes it easier to assemble the components of a chatbot) and Pinecone, a 'vectorstore' that stores your documents as numerical 'vectors'. You'll also learn how to create a frontend chat interface that displays the results alongside the source documents.

A similar process can be applied to other use cases you want to build a chatbot for: PDFs, websites, Excel, or other file formats.

Visuals & Code:
🖼 Visual guide download + GitHub repo (this is the base template used for this demo):

Timestamps:

01:02 PDF demo (analysis of 1,000 pages of annual reports)
06:01 Visual overview of the multiple pdf chatbot architecture
17:40 Code walkthrough pt.1
25:15 Pinecone dashboard
28:30 Code walkthrough pt.2

#gpt4 #investing #finance #stockmarket #stocks #trading #openai #langchain #chatgpt #langchainjavascript #langchaintypescript #langchaintutorial
Comments

This tutorial is legit the best. Custom Chatbots are going to be a huge biz op.

TheRonellCross

The Excalidraw diagrams are invaluable to the presentation and really help visualize the flow of data through the Pinecone database and how LangChain is used to make it all work. Thanks a lot for those diagrams; your efforts do not go unnoticed. The possibilities of this are literally endless.

byte_easel

Thanks for this, your vids are some of the best on YT for this type of thing; right now in this domain there are only two words for all the content out there - information overload!!! I am still going through your previous video on PDF ingestion and processing. Thank you for sharing your invaluable information in a style that makes it understandable.

JohnnyLonghorn

Thank you for your great videos, I really like them.
One UX improvement comment: Try serving the PDFs as static content with your web application, then when you append #page=42 to the PDF's link, a user can go directly to that page in their browser. Very easy to implement and your users will save a copy and paste and 3 clicks.
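
A minimal sketch of this suggestion in TypeScript (the folder layout and helper name are assumptions, not from the comment). In Next.js, files under public/ are served as-is, so a cited source can deep-link straight to the referenced page:

```ts
// Hypothetical helper: build a deep link into a statically served PDF.
// '#page=N' tells the browser's built-in PDF viewer to open at page N.
function sourceHref(fileName: string, page: number): string {
  // Assumes PDFs live under public/reports/ and are served at /reports/.
  return `/reports/${encodeURIComponent(fileName)}#page=${page}`;
}

// e.g. sourceHref('tesla-10k-2022.pdf', 42) -> '/reports/tesla-10k-2022.pdf#page=42'
```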

tslg

I work in an academic research setting, researching an extremely niche subject in materials science, and these kinds of tools are what I have been waiting for. I can't wait for us to be able to upload the papers and books that contain all currently available knowledge on the topic and be able to interact with that knowledge pool.

LimabeanStudios

So interesting. As a non-techie, trying to implement this is wild but fascinating.

harischsood

Thank you for this contribution to the common good. You are the man.

clockworkOMNI

Cool stuff. It is just unbelievable how much depth and breadth the computing field has taken on: too stretched out, too spread, too many subjects to explore. Chatting with books is a cool idea that could really help us access and organize scientific human knowledge in all fields.

astroid-wspy

Woah! 😱 These diagrams make it so much easier to understand the concepts than other YouTube videos do! Thanks for spending so much time and effort!

ADHDOCD

This is the best tutorial I've seen on embeddings search; the applications of this are endless, and I'm really excited to start building. Thank you so much for the work that you're doing :)

yajatgulati

You are great at teaching the entire process! Please continue this series :)
Thank you!

duhai

Just found this channel, 16k subs now, but not for long. You are making great tutorials that, I'll admit, I should have made months ago. Keep making videos; you can hit 2M subs in less than a year if you make content that's more relevant, useful, and entertaining to a wider audience.

CowCoder

Best video I found on the internet for my personal project.

rhythmgaidhani

Brother...! This is what I am talking about!
My sellers are looking for this kind of capability for research as well as for searching our internal content management system.

rugerdie

Great video. But the problem with your solution is that you hit the OpenAI API at least 5 times, which makes it costly and not scalable. Other than that, a good project.

MehdiAllahyari

Can you please show how the front end connects and how to build it? This is definitely the missing piece of the puzzle for me. Thanks for the videos. They're great!

JasonMelanconEsq

1. What is the tutorial about?
- The tutorial is about how to chat with multiple massive PDF documents, specifically Tesla's 10-K Annual Reports.
2. How many pages of PDFs are involved in this tutorial?
- The tutorial involves around a thousand pages of PDFs from the 2020, 2021, and 2022 annual reports of Tesla.
3. How does the chatbot work when searching for specific information in the PDFs?
- The chatbot is able to cross-check and provide a reference page when answering a specific question about the risk factors or financial performance of Tesla over the past three years.
4. Can the chatbot analyze multiple PDF files simultaneously?
- Yes, the chatbot is able to analyze and answer questions about multiple PDF files from different years of Tesla's annual reports.
5. Is the tutorial code available for public release?
- The code for the tutorial is not currently available for public release as it is still being tested and may be buggy.

1. What is the process of converting PDF documents into number representations?
- The PDF docs are converted into text, since PDF is a binary format and the content needs to be extracted as plain text.
- The text is split into chunks so that each fits into OpenAI's context window.
- OpenAI creates embeddings, which are number representations of the text (see the sketch below).
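
As a rough sketch, those three steps might look like this with LangChain JS (import paths and chunk sizes vary by langchain version; file names are illustrative):

```ts
import { PDFLoader } from 'langchain/document_loaders/fs/pdf';
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';

// 1. PDF -> text: the loader extracts plain text from the binary PDF.
const docs = await new PDFLoader('reports/tesla-10k-2022.pdf').load();

// 2. Text -> chunks small enough to fit the model's context window.
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const chunks = await splitter.splitDocuments(docs);

// 3. Chunks -> embeddings: each chunk becomes an array of numbers.
const embeddings = new OpenAIEmbeddings();
const vectors = await embeddings.embedDocuments(chunks.map((c) => c.pageContent));
console.log(`created ${vectors.length} embeddings`);
```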

2. What is a vector store?
- A vector store is a database that holds the number representations (vectors) of documents, organized into different categories or spaces.
- The vector store can also store the text of the documents and relevant metadata (see the sketch below).
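
A hedged sketch of that storage step using LangChain's PineconeStore (the v0-era Pinecone client shown here differs from newer SDKs; index and namespace names are assumptions):

```ts
import { PineconeClient } from '@pinecone-database/pinecone';
import { PineconeStore } from 'langchain/vectorstores/pinecone';
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';
import type { Document } from 'langchain/document';

declare const chunks: Document[]; // produced by the text splitter in the earlier sketch

const client = new PineconeClient();
await client.init({
  apiKey: process.env.PINECONE_API_KEY!,
  environment: process.env.PINECONE_ENVIRONMENT!,
});
const pineconeIndex = client.Index('pdf-chatbot'); // example index name

// Each vector is stored together with its source text and metadata.
await PineconeStore.fromDocuments(chunks, new OpenAIEmbeddings(), {
  pineconeIndex,
  namespace: 'tesla-10k-2022', // the 'box' for this year's report
  textKey: 'text',
});
```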

3. How does GPT-4 retrieve information from the vector store?
- The question is converted into numbers, and a specific namespace (or 'box') is specified to retrieve relevant documents.
- The relevant documents are combined with the question into a prompt, and GPT-4 uses that context to produce a response (see the sketch below).
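
A sketch of that retrieval step (class names follow the LangChain JS version of the era; the namespace, question, and K value are illustrative):

```ts
import { PineconeStore } from 'langchain/vectorstores/pinecone';
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';
import { ChatOpenAI } from 'langchain/chat_models/openai';
import { HumanChatMessage } from 'langchain/schema';

declare const pineconeIndex: any; // initialized index, as in the earlier sketch

const store = await PineconeStore.fromExistingIndex(new OpenAIEmbeddings(), {
  pineconeIndex,
  namespace: 'tesla-10k-2022',
});

// The question is embedded and the 4 most similar chunks are retrieved.
const question = "What were Tesla's main risk factors in 2022?";
const matches = await store.similaritySearch(question, 4);

// The chunks and the question are combined into one prompt for GPT-4.
const context = matches.map((d) => d.pageContent).join('\n---\n');
const model = new ChatOpenAI({ modelName: 'gpt-4', temperature: 0 });
const answer = await model.call([
  new HumanChatMessage(`Answer using only this context:\n${context}\n\nQuestion: ${question}`),
]);
console.log(answer.text);
```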

4. What is the challenge of analyzing information across multiple namespaces or years?
- Extracting the namespace from the question is required to search for information in the relevant namespaces.
- The model needs to dynamically determine which year or namespace the user is referring to.

5. How can GPT-4 assist in extracting the namespace from a question?

- GPT-4 can be used to extract the namespace from the question and dynamically determine which year or namespace the user is referring to.
- This allows for analyzing information across multiple years and namespaces.
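
One way this can work, sketched with the plain OpenAI SDK (the prompt wording, year list, and namespace naming scheme are assumptions):

```ts
import OpenAI from 'openai';

const openai = new OpenAI();

// Ask GPT-4 which report year(s) the user means, then map years to namespaces.
async function extractNamespaces(question: string): Promise<string[]> {
  const res = await openai.chat.completions.create({
    model: 'gpt-4',
    temperature: 0,
    messages: [{
      role: 'user',
      content:
        'Which of these years does the question refer to: 2020, 2021, 2022? ' +
        'Answer with a comma-separated list of years only.\n\nQuestion: ' + question,
    }],
  });
  const years = (res.choices[0].message.content ?? '').match(/20\d{2}/g) ?? [];
  return years.map((y) => `tesla-10k-${y}`); // hypothetical namespace scheme
}
```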

1. What is the purpose of the script called "ingest data"?

- The script called "ingest data" is used to load each PDF report in the reports folder into text.
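
The actual script is not reproduced in the comment, but a minimal sketch of that loading step might look like this (the reports folder comes from the video; everything else is illustrative):

```ts
import fs from 'node:fs';
import path from 'node:path';
import { PDFLoader } from 'langchain/document_loaders/fs/pdf';

const reportsDir = 'reports';
const pdfFiles = fs.readdirSync(reportsDir).filter((f) => f.endsWith('.pdf'));

// Load each PDF report in the reports folder into text documents.
for (const file of pdfFiles) {
  const docs = await new PDFLoader(path.join(reportsDir, file)).load();
  console.log(`${file}: loaded ${docs.length} page documents`);
}
```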

2. How does the dynamic context work in the system being described?

- The dynamic context specifies which namespaces to look at for relevant documents.
- When a question comes in, the system converts it into embeddings and checks the specified namespaces for relevant documents.
- It retrieves the relevant documents from each namespace and then proceeds with the usual procedure (see the sketch below).
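
A sketch of that dynamic lookup: search each selected namespace and merge the matches before the usual prompt-building step (function name and K value are assumptions):

```ts
import { PineconeStore } from 'langchain/vectorstores/pinecone';
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';
import type { Document } from 'langchain/document';

declare const pineconeIndex: any; // initialized index, as in earlier sketches

async function searchNamespaces(question: string, namespaces: string[]): Promise<Document[]> {
  const results: Document[] = [];
  for (const namespace of namespaces) {
    const store = await PineconeStore.fromExistingIndex(new OpenAIEmbeddings(), {
      pineconeIndex,
      namespace,
    });
    // The question is embedded and matched within this namespace only.
    results.push(...(await store.similaritySearch(question, 4)));
  }
  return results;
}
```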

3. What is the website "Seeking Alpha" used for?

- Seeking Alpha is a website used by many investors.
- It provides information such as revenue growth figures.

4. What happened when the speaker asked the system about Tesla's estimated revenue growth for 2022?

- The system searched for and found the relevant documents from the 2022 namespace.
- The system calculated the estimated revenue growth for Tesla based on the consolidated statement of operations.
- The result was surprising.

5. What is the output of the "ingest data" script?

- The output of the "ingest data" script is a JSON file containing information on each PDF report in the reports folder.
- The file includes information on the year of the report and the name of the company.

1. What is the purpose of page numbers and references in the UI of the project?
- They allow users to easily locate and reference specific pages and original sources.

2. How are the PDFs translated into text in the project?
- The PDFs are ingested and split into different categories and chunks of 1000-1200 characters, with each chunk assigned a namespace.

3. What is the purpose of Pinecone in the project?
- Pinecone is used as a database to store the chunks of text and metadata, converted into embeddings (vectors) for similarity calculations.

4. What are some limitations of Pinecone, and how are they overcome in the project?
- Pinecone has limits on the number of vectors that can be inserted at once, so chunks are split into smaller groups (e.g. 50). The API keys, environment variable, index name, and number of dimensions must also be specified correctly.
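
A sketch of that batching workaround (the batch size of 50 comes from the answer above; other names are assumptions):

```ts
import { PineconeStore } from 'langchain/vectorstores/pinecone';
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';
import type { Document } from 'langchain/document';

declare const pineconeIndex: any;
declare const chunks: Document[]; // output of the text splitter

// Upsert in groups of 50 to stay under Pinecone's per-request limits.
const batchSize = 50;
for (let i = 0; i < chunks.length; i += batchSize) {
  const batch = chunks.slice(i, i + batchSize);
  await PineconeStore.fromDocuments(batch, new OpenAIEmbeddings(), {
    pineconeIndex,
    namespace: 'tesla-10k-2022',
    textKey: 'text',
  });
}
```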

5. What is the purpose of the dimensions in the embeddings?
- The dimensions represent different spots in an array of vectors, with each dimension containing a number representing a specific aspect of the text or metadata being analyzed. In this project, OpenAI creates 1536 dimensions for each embedding.
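
A quick way to see those dimensions (OpenAI's text-embedding-ada-002, LangChain's default at the time, returns 1536 numbers per input):

```ts
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';

const [vector] = await new OpenAIEmbeddings().embedDocuments(['hello world']);
console.log(vector.length); // 1536 -- must match the Pinecone index's dimension
```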

1. What has been trending on GitHub for the past couple of days?
- Answer: The transcript does not provide a clear answer to this question.

2. What are the indexes and namespaces in the code?
- Answer: The indexes and namespaces are components used in the code to retrieve information from different sources.

3. How can one explore what the vectors look like in the code?
- Answer: One can click "fetch" to see what the vectors look like, which includes the namespace, ID, and decimal number representations of the text.

4. What should one do to ensure successful ingestion in the code?
- Answer: One should ensure that the config has namespaces matching those in the Pinecone dashboard, set environment variables correctly, and avoid changing dependency versions to prevent breaking changes.

5. What is the purpose of the second phase in the code?
- Answer: The purpose of the second phase is to chat and retrieve information dynamically by specifying the namespace to retrieve information from.

1. What is the custom QA chain and what does it do?
- The custom QA chain is a tool created by the speaker.
- It takes the model, index, and namespace to set up the standalone question, then retrieves the relevant documents from the specified namespaces to produce a response.
- The custom QA chain is one of three chains mentioned; LangChain also provides ChatVectorDBQAChain and VectorDBQAChain (see the sketch below).
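
The custom chain itself isn't shown, but a hedged stand-in using the stock ChatVectorDBQAChain named above would look roughly like this:

```ts
import { ChatVectorDBQAChain } from 'langchain/chains';
import { ChatOpenAI } from 'langchain/chat_models/openai';
import { PineconeStore } from 'langchain/vectorstores/pinecone';
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';

declare const pineconeIndex: any; // initialized index, as in earlier sketches

const vectorStore = await PineconeStore.fromExistingIndex(new OpenAIEmbeddings(), {
  pineconeIndex,
  namespace: 'tesla-10k-2022',
});

const model = new ChatOpenAI({ modelName: 'gpt-4', temperature: 0 });
const chain = ChatVectorDBQAChain.fromLLM(model, vectorStore, {
  returnSourceDocuments: true, // surfaces the reference pages for the UI
});

const res = await chain.call({
  question: 'Summarize the key risk factors.',
  chat_history: [], // empty history, as noted in the next answer
});
console.log(res.text, res.sourceDocuments);
```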

2. What is the purpose of the chat history in the code?
- The chat history is set to empty and is not directly relevant to the custom QA chain.
- It is used in the API implementation of the same logic seen in main.ts.

3. How many files were tested with the custom QA chain, and what were the results?
- The speaker was able to read three different files totaling close to a thousand pages of in-depth financial analysis.
- GPT-4 was able to analyze all three years and provide decent analysis.

4. What is the purpose of the front end in the repo?
- The front end in the repo is already available for use.
- It is an adaptation of the custom QA chain tool.
- The speaker is experimenting with talking across different PDFs.

5. What is the compound annual growth rate for Revenue over the past three years?
- The transcript does not provide an answer to this question.

2. What kind of questions were asked about Tesla's revenue?
- The questions asked were about growth potential, profitability, and risk factors management.

3. What is K?
- K is the number of reference documents that are returned per PDF when using GPT to analyze the data (see the sketch below).
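
A sketch of how K shows up in code (K = 2 here; the namespace and query are illustrative):

```ts
import { PineconeStore } from 'langchain/vectorstores/pinecone';
import { OpenAIEmbeddings } from 'langchain/embeddings/openai';

declare const pineconeIndex: any; // initialized index, as in earlier sketches

const store = await PineconeStore.fromExistingIndex(new OpenAIEmbeddings(), {
  pineconeIndex,
  namespace: 'tesla-10k-2022',
});

// K = 2: only the two most similar chunks come back as reference documents.
const refs = await store.similaritySearch('revenue growth drivers', 2);
```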

4. What kind of future changes will the repository make?
- The repository will add features for analyzing multiple PDF files in the future.

5. Where can you find more information about GPT and its applications?
- The speaker suggests checking out their workshops and signing up for the waitlist in the description section for more in-depth step-by-step details.

labsanta

Thanks for the detailed presentation and explanation of your concept! This is really exciting to learn. Subscribed, and I'm looking forward to more videos. Thanks mate!

Lutherbaer

This is amazing man. Thank you! It will take a while to fully understand how to implement this for myself but I appreciate the knowledge.

BlaziNTrades