Why are vector databases so FAST?

Vector databases are fascinating, and I'm surprised more people aren't talking about what makes them so fast.

You can use a vector database to store arrays of floating-point numbers. You can then search the database using a similarity function to return any vector that's similar to another one.
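
To make the idea concrete, here is a minimal sketch of the naive approach: store a few vectors in memory and compare the query against every one of them using cosine similarity. The vector ids, the values, and the `search` helper are made up for illustration; this is not how Pinecone or any particular database implements search.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(store, query, top_k=2):
    """Brute force: score every stored vector, return the top_k (id, score) pairs."""
    scored = [(vec_id, cosine_similarity(vec, query)) for vec_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy "database": hypothetical ids mapped to made-up 3-dimensional vectors.
store = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.9, 0.1, 0.0],
    "c": [0.0, 1.0, 0.0],
    "d": [0.0, 0.0, 1.0],
}

results = search(store, [1.0, 0.05, 0.0])
print(results)  # ids "a" and "b" rank first; they point almost the same way as the query
```

Every query here touches every stored vector, so the cost grows linearly with the size of the collection. Avoiding that full scan is what the indexing discussed in the video is for.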

This video will show you how these databases work behind the scenes and how you can use serverless Pinecone in a few lines of code.

I teach a live, interactive program that'll help you build production-ready Machine Learning systems from the ground up. Check it out here:

To keep up with my content:

Comments

You are great. The people who follow you are the ones who care about understanding the root concepts, which is rare to find nowadays because everyone is copying and pasting without understanding.

ah

How do you only have 40k followers? Amazing content. Been looking for this for over a year. Thank you!

jeromegouvernel

The actual discussion about vector databases starts at 14:45. Before that, it is just a review of embeddings and the RAG framework.

MrTulufan

He is absolutely right. Unless you take a course on vector databases, it is not easy to find material on 'how a vector database works at a low level'. Thank you for your content.

tee_iam

Hey Santiago, keep going with your choice of shirts!

PuerinTheHunter

This video, from its content to your performance, is fantastic.

vinj

Thanks for the lesson. Always good to understand how things are getting done in the background. Great Explanation!!

toddroloff

Thank you very much for this amazing content!! It is so educative :)

nachoeigu

Wonderful video. Any chance of a video comparing HNSW vs Faiss vs Annoy?

LiebsterFeind

Thanks a lot, Santiago! You are one of two authors I follow on YouTube and mainly on LinkedIn. The content is pure gold.
My question is about that serverless thing. You provide the cloud and region but don't provide your AWS credentials. Does that mean it is free? As far as I understood, the cloud provider in this case is used to store the data. What if we don't delete the database at the end? Will we get billed for storing the db?

oseteg

@22:22 this really helps in understanding the efficiency of the vector search algorithms, and the drawing reminds me of SVM borders/boundaries.
By the way, great shirt! :)

emrahe

Thank you for explaining this. I had the intuition that this is how the indexing worked via clustering, but you helped crystallise my thoughts on this. One thing I think might have been missed is that the trigonometric functions used, like cosine, take into account the direction of the vector towards the next cluster. So the cosine function uses the vectors like a compass. When grouping the vectors, you're quantizing, or approximating, all related vectors to the centroid. That obviously reduces accuracy, because you're not pointing to the exact point in the cluster but to the centre. How are the results selected? Is there an attempt to re-search the selected related records using the original vector, or is it simply random selection?

justindressler
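
On the question above about how results are selected: in a common inverted-file (IVF) style index, the centroids only narrow down which clusters to look at, and the vectors inside those clusters are then compared against the query directly. The sketch below illustrates that two-step idea under loud assumptions: the points and centroids are hard-coded (in practice centroids would come from something like k-means), Euclidean distance is used instead of cosine for simplicity, and all names are hypothetical. It does not claim to show what Pinecone does internally.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_index(vectors, centroids):
    """Assign every vector id to the bucket of its nearest centroid."""
    buckets = {i: [] for i in range(len(centroids))}
    for vec_id, vec in vectors.items():
        nearest = min(range(len(centroids)), key=lambda i: dist(vec, centroids[i]))
        buckets[nearest].append(vec_id)
    return buckets

def search(vectors, centroids, buckets, query, top_k=2, n_probe=1):
    """Probe the n_probe closest clusters, then re-rank with the original vectors."""
    # Step 1 (coarse): compare the query against the centroids only.
    order = sorted(range(len(centroids)), key=lambda i: dist(query, centroids[i]))
    probed = order[:n_probe]
    # Step 2 (exact): compare the query against the stored vectors in those buckets.
    candidates = [vec_id for i in probed for vec_id in buckets[i]]
    candidates.sort(key=lambda vec_id: dist(query, vectors[vec_id]))
    return candidates[:top_k]

# Made-up 2-D data forming two obvious clusters.
vectors = {
    "p1": [0.0, 0.0], "p2": [0.0, 1.0], "p3": [1.0, 0.0],
    "q1": [10.0, 10.0], "q2": [10.0, 11.0], "q3": [11.0, 10.0],
}
# Assumption: hard-coded centroids standing in for a clustering step.
centroids = [[0.33, 0.33], [10.33, 10.33]]

buckets = build_index(vectors, centroids)
hits = search(vectors, centroids, buckets, query=[9.8, 10.9])
print(hits)  # only the "q" cluster is scanned; "q2" is the closest point
```

The accuracy loss comes from step 1, not step 2: if the true nearest neighbour sits in a cluster that was not probed, it is never considered. Raising `n_probe` trades speed for recall.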

In many ways, when you calculate the embeddings, and you reduce a fragment of data to a single vector, you are calculating a kind of hash.

ernestuz
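
The hash analogy above has a concrete counterpart in locality-sensitive hashing, where vectors that point in similar directions tend to receive the same hash bits. The following is a minimal sketch of the random-hyperplane variant; the number of bits, the seed, and the function names are arbitrary choices for illustration, and nothing here is taken from the video.

```python
import random

def make_planes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes, each represented by its normal vector."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def lsh_hash(vector, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    bits = []
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vector))
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)

planes = make_planes(num_bits=8, dim=3)
v = [0.2, -1.3, 0.7]

base = lsh_hash(v, planes)
same_direction = lsh_hash([2 * x for x in v], planes)  # longer, same direction
opposite = lsh_hash([-x for x in v], planes)           # opposite direction

print(base == same_direction)  # direction, not length, determines the hash
print(base, opposite)          # every bit differs for the opposite vector
```

Unlike a cryptographic hash, where a small input change scrambles the output, this kind of hash is designed so that nearby inputs collide, which is what makes it useful for similarity search.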

I have nothing to say other than: you are fantastic!

riemannderakhshan

"ok so I'm going to execute this" <cut> "BOOM it's just that fast!"
really?... really?.... You add a cut between those two sentences? I'm hoping this was unintentional. (thankfully the next search didn't have a cut)

Great video otherwise. I'd love to see you dive into the actual indexing though so we can actually see how it works. This was quite high level.

nope

Awesome as always. I live in Florida as well. What are my chances of meeting you in person, and how did you automate your ♥ responses to all the comments you get? Please write something as well 🙂

lokeshsharma

Nice... Can you do one on graph databases too?

KumR

Thanks for the cool video, which helped me better understand this topic. If I do not want to put my data into the cloud, what other vector DB could you recommend? ChromaDB?

uwegenosdude

This guy is creating amazing content and only has 40k subscribers??

rally_furymoments

Love the shirt, where did you buy it?

delvoneu