Matthijs Douze on Quantization and FAISS - Weaviate Podcast #29

Hey everyone, thank you so much for watching another episode of the Weaviate podcast! This episode features Matthijs Douze, one of the most talented and accomplished scientists we've hosted on the Weaviate podcast! Matthijs has pioneered the use of Product Quantization to compress vector representations and enable even faster and more efficient approximate nearest neighbor vector search. Matthijs told an incredible story about the history of this research, from searching over SIFT vectors for Computer Vision search applications like real-time CD cover search to the problems facing modern IVF-PQ systems and the use of PQ in graph-based HNSW search. This is also a very special episode, as Abdel Rodriguez makes his debut on the Weaviate podcast to discuss Weaviate's efforts to integrate PQ support and the unique challenges of this algorithm and of the incremental updates required for a Vector Database. On this topic, Etienne Dilocker also returned to discuss Vector Database vs. Library with Matthijs, who is one of the lead developers of the Faiss library. This was a really information-heavy podcast, so please don't hesitate to ask us any questions or share any of your ideas! Thanks again for listening!

Chapters
0:00 Welcome Matthijs!
0:30 Background
1:30 Initial work with SIFT feature vectors
4:20 Image Vectors enabled Large-Scale Indexing of Continuous Data
5:00 CD Cover Image search in real-time
5:30 Inverted Index of Visual Bag of Words features
6:50 Similarity of Inverted Index and Quantization
7:30 Adding Hamming Embeddings to the Inverted Lists
11:00 Comparison to Locality Sensitive Hashing
12:10 Better ways to Quantize
18:45 Chunk Vector -- “Product” Quantization
22:00 Efficiency Trick for Query Quantization
27:30 Lloyd’s Optimality Conditions
30:40 Open-Source and Benchmarks
32:40 Pain Points of IVF-PQ Method
38:14 Connection to Graph-based ANN (i.e. HNSW)
46:30 FAISS - Facebook AI Similarity Search
59:45 Welcome Abdel Rodriguez!
1:02:55 Challenge of Online PQ for Databases
1:06:55 Vector Database versus Library
Comments

If you added some editing and animation here, it would be much more helpful.

only_friendship