Tutorial #2: OpenAI Vector Embeddings and Pinecone for Retrieval-Augmented Generation

LLMs like ChatGPT are known to hallucinate. If we ground the LLM in an external memory (e.g. a document or PDF), it can generate more reliable outputs. We can also augment the output with reference links (like Bing Search)!
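A minimal sketch of the grounding idea, in pure Python (the function name, prompt wording, and example URL are my own illustration, not from the video): retrieve the most relevant stored chunks, then prepend them to the user's question so the LLM answers from the provided context and can cite its sources.

```python
# Sketch: assemble a grounded prompt from retrieved chunks (hypothetical names).
def build_grounded_prompt(question, retrieved):
    """retrieved: list of (chunk_text, source_url) pairs from the vector store."""
    context = "\n\n".join(
        f"[{i + 1}] {text} (source: {url})"
        for i, (text, url) in enumerate(retrieved)
    )
    return (
        "Answer using ONLY the context below. Cite sources like [1].\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

prompt = build_grounded_prompt(
    "What is RAG?",
    [("RAG augments LLM prompts with retrieved text.", "https://example.com/rag")],
)
print(prompt)
```

The string returned here would be sent as the LLM prompt; because the model is told to answer only from the numbered context, its citations map back to the retrieved source links.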

For this tutorial, we use OpenAI Embeddings, the tiktoken tokenizer, and Pinecone.

Disclaimer: Please do not openly show your OpenAI / Pinecone API key as I do here. I am only showing it for educational purposes and have since deleted the exposed key.

~~~~~~~~~~~~~~~~~~~

0:00 Introduction
0:48 Prepare Documents for Loading
4:15 Generate Embeddings in Chunks
9:40 Retrieval-Augmented Generation
16:04 Conclusion
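The chapters above can be sketched end to end as a toy pipeline. This is a pure-Python stand-in, not the video's actual code: a word-count chunker approximates tiktoken-based chunking, a deterministic hash-based "embedding" replaces the OpenAI embeddings endpoint, and an in-memory list replaces the Pinecone index.

```python
import hashlib
import math

def chunk_words(text, max_words=50):
    """Split text into chunks of at most max_words words
    (the video chunks by tiktoken token counts instead)."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

def toy_embed(text, dim=64):
    """Deterministic toy vector; a real system would call the
    OpenAI embeddings endpoint here."""
    vec = [0.0] * dim
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]  # unit-normalise

def top_k(query, store, k=2):
    """store: list of (chunk, vector) pairs. Cosine similarity reduces
    to a dot product because the vectors are unit-normalised; Pinecone
    performs this nearest-neighbour query for you."""
    qv = toy_embed(query)
    scored = [(sum(a * b for a, b in zip(qv, v)), chunk) for chunk, v in store]
    return [chunk for _, chunk in sorted(scored, reverse=True)[:k]]

# 1) Prepare document, 2) chunk, 3) embed + store, 4) retrieve.
doc = "Pinecone stores embedding vectors for fast retrieval. " * 30
store = [(c, toy_embed(c)) for c in chunk_words(doc)]
print(top_k("Where are embedding vectors stored?", store, k=1))
```

The retrieved chunks would then be passed to the LLM as grounding context, which is the retrieval-augmented generation step at 9:40.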

~~~~~~~~~~~~~~~~~~~

AI and ML enthusiast. Likes to think about the essences behind AI breakthroughs and explain them in a simple and relatable way. Also an avid game creator.

Comments

Do let me know if you need any clarifications!

johntanchongmin

Great, John!!!! I calculated similarity the way you shared, but I also tried spaCy similarity to see what it would return, and it gives me VERY different results. Do you have any insights or guidance on which one is right?

leoccleao
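On the comment above: a likely reason for the different numbers is not the metric but the vectors being compared. The video's approach computes cosine similarity between one dense OpenAI embedding per text, whereas spaCy's default `Doc.similarity` compares averaged per-token word vectors, so the two scores live on different scales and are not directly comparable. The cosine formula itself is the same in both cases; a minimal sketch (the example vectors are placeholders):

```python
import math

def cosine(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

print(cosine([1.0, 2.0, 0.0], [2.0, 4.0, 0.0]))  # parallel vectors, ~1.0
print(cosine([1.0, 0.0], [0.0, 1.0]))            # orthogonal vectors, 0.0
```

Neither score is "right" in the abstract; what matters is ranking consistency within one embedding model, so retrieval should use the same model for documents and queries.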