What Are Matryoshka Embedding Models? Similar accuracy with a smaller embedding size, plus speedups

For those unfamiliar, "Matryoshka dolls", also known as "Russian nesting dolls", are a set of wooden dolls of decreasing size that are placed inside one another. In a similar way, Matryoshka embedding models aim to store more important information in earlier dimensions, and less important information in later dimensions. This characteristic of Matryoshka embedding models allows us to truncate the original (large) embedding produced by the model, while still retaining enough of the information to perform well on downstream tasks.
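
As a rough illustration of the truncation described above, the sketch below keeps the first few dimensions of an embedding and rescales the result to unit length so that cosine similarity still behaves sensibly. The vectors are made-up toy values rather than real model outputs, and `truncate_embedding` and `cosine` are hypothetical helper names, not part of any library. Truncation of this kind is only expected to preserve quality for models trained with a Matryoshka-style objective.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero prefix: nothing to rescale
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 8-dimensional vectors standing in for real embeddings (assumed values).
query = [0.9, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01, 0.01]
doc = [0.8, 0.4, 0.1, 0.2, 0.01, 0.03, 0.02, 0.00]

# Compare the similarity at the full size and at two truncated sizes.
for dim in (8, 4, 2):
    q = truncate_embedding(query, dim)
    d = truncate_embedding(doc, dim)
    print(dim, round(cosine(q, d), 4))
```

With a real Matryoshka model, the similarity at the smaller sizes should stay reasonably close to the full-size value, which is what makes the shorter vectors usable for search.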

Comments

The exact term is subspace rather than subset. For instance, if we have a 1024-dimensional embedding vector, we can have 512-, 256-, and 128-dimensional subspace vectors in MRL (Matryoshka Representation Learning) embeddings.

RitheshSreenivasan

Amazing video. I would love to see more videos comparing different embeddings such as Ada, ELSER, Mistral, etc.

jackmartin

Great video. I have a question, though: if I want to convert the embeddings to smaller dimensions, how do I do that? I want to make use of the same idea. Say I have a vector DB with 3072-dimensional embeddings, and I want to use Matryoshka embeddings so that I first retrieve 10 documents using 512 dimensions and then rerank those 10 documents with the full embeddings. Does this mean I need two indices, one with the full embeddings and one with the truncated ones? I am confused here. Can you please help me? Any code would also help. I couldn't find anything on the internet.

susovandey
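
One possible way to approach the shortlist-then-rerank flow asked about above is sketched here with a brute-force search over toy vectors. Because a truncated Matryoshka embedding is just the leading slice of the full one, this sketch stores only the full vectors and slices them for the first stage. All names (`two_stage_search`, `short_dim`, `shortlist_size`) and the data are hypothetical. With an approximate-nearest-neighbour index in a vector database, whether a second index over the truncated vectors is needed depends on that database, so treat the single-store approach here as an assumption rather than a general answer.

```python
import math


def normalize(vec):
    """Rescale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return vec if norm == 0.0 else [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def two_stage_search(query, docs, short_dim, shortlist_size, top_k):
    """Shortlist with truncated vectors, then rerank with full vectors."""
    # Stage 1: cheap scoring on the first `short_dim` dimensions only.
    q_short = normalize(query[:short_dim])
    coarse = sorted(
        range(len(docs)),
        key=lambda i: dot(q_short, normalize(docs[i][:short_dim])),
        reverse=True,
    )[:shortlist_size]

    # Stage 2: rescore only the shortlisted documents with all dimensions.
    q_full = normalize(query)
    reranked = sorted(
        coarse,
        key=lambda i: dot(q_full, normalize(docs[i])),
        reverse=True,
    )
    return reranked[:top_k]


# Toy 4-dimensional "full" embeddings (assumed values, not model outputs).
docs = [
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.4],
    [0.0, 1.0, 0.0, 0.0],
    [0.8, 0.2, 0.5, 0.0],
]
query = [1.0, 0.1, 0.4, 0.0]

# Shortlist 3 documents on 2 dimensions, then keep the best 2 after reranking.
print(two_stage_search(query, docs, short_dim=2, shortlist_size=3, top_k=2))
```

In the scenario from the question, `short_dim` would be 512, the full vectors would have 3072 dimensions, and `shortlist_size` would be 10. The first stage does most of the work on short vectors, and the full vectors are only touched for the handful of shortlisted documents.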