The Biggest Misconception about Embeddings

The biggest misconception I had about embeddings!

Visuals created using Excalidraw:

Icon references:

Bird icons created by Mihimihi - Flaticon

Whale icons created by Freepik - Flaticon

Carrot icons created by Pixel perfect - Flaticon

Kale icons created by Freepik - Flaticon

Book icons created by Good Ware - Flaticon

Book icons created by Pixel perfect - Flaticon

Sparkles icons created by Aranagraphics - Flaticon

Flower icons created by Freepik - Flaticon

Feather icons created by Freepik - Flaticon

Communication icons created by Freepik - Flaticon

Student icons created by Freepik - Flaticon

Lunch icons created by photo3idea_studio - Flaticon
Comments
It feels like the shorter your video is, the more informative it is 😅. You not only explain what an embedding is, but also how it can differ based on the problem statement, all in less than 5 minutes.

shoaibsh
Great explanation, Ritvik. This is true for context-aware embeddings, which became possible with the development of Transformers. In the pre-Transformer era, embeddings were static, meaning that a word had the same embedding regardless of the context.

DecodingtheEverythingSTEM
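The static-vs-contextual distinction can be illustrated with a toy sketch in NumPy. The "contextual" mixing rule below is a made-up stand-in for a real model, purely for illustration:

```python
# Toy illustration (not a real model): a static embedding is a fixed
# lookup, while a contextual embedding also depends on surrounding words.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["bank", "river", "money"]
static = {w: rng.normal(size=4) for w in vocab}  # one vector per word, forever

def contextual(word, sentence):
    # Crude stand-in for a transformer: mix the word's static vector
    # with the mean of its neighbours' static vectors.
    neighbours = [static[w] for w in sentence if w != word and w in static]
    ctx = np.mean(neighbours, axis=0) if neighbours else np.zeros(4)
    return 0.5 * static[word] + 0.5 * ctx

v1 = contextual("bank", ["river", "bank"])
v2 = contextual("bank", ["money", "bank"])
print(np.allclose(static["bank"], static["bank"]))  # True: static never changes
print(np.allclose(v1, v2))                          # False: context shifts it
```

The static table plays the role of pre-Transformer (word2vec-style) embeddings; the mixing function stands in for what a contextual model adds on top.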
I've never commented on any of your videos before, but thought it was time to do so after this one.
Thank you so much for all the great work!
For me, you're the best at explaining data science and ML concepts on YouTube.
I also love how broad your range of topics is. I've used your content to understand concepts in NLP and general data science, but also RL and Bayesian approaches to deep learning.
Your real-life and intuition-based explanations are really strong. Keep it up!

SierraSombrero
Dude, your videos are so damn mind-opening.

jfndfiunskj
In a RAG-based Q&A system, the efficiency of query processing and the quality of the results are paramount. One key challenge is the system’s ability to handle vague or context-lacking user queries, which often leads to inaccurate results. To address this, we’ve implemented a fine-tuned LLM to reformat and enrich user queries with contextual information, ensuring more relevant results from the vector database. However, this adds complexity, latency, and cost, especially in systems without high-end GPUs.

Improving algorithmic efficiency is crucial. Integrating techniques like LoRA into the LLM can streamline the process, allowing it to handle both context-aware query reformulation and vector searches. This could significantly reduce the need for separate embedding models, enhancing system responsiveness and user experience.

Also, incorporating a feedback mechanism for continuous learning is vital. This would enable the system to adapt and improve over time based on user interactions, leading to progressively more accurate and reliable results. Such a system not only becomes more efficient but also more attuned to the evolving needs and patterns of its users.

Pure_Science_and_Technology
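A minimal sketch of the query-enrichment flow described above, assuming toy stand-ins: `llm_rewrite` here is simple string concatenation and `vector_search` ranks by word overlap, in place of a real fine-tuned LLM and vector database:

```python
# Hedged sketch of query enrichment for RAG; both steps are toy
# placeholders, not a real LLM call or embedding-based retrieval.
def llm_rewrite(query, history):
    # Stand-in: prepend recent conversation so a vague query
    # ("what do they eat?") becomes self-contained.
    return f"{' '.join(history[-2:])} {query}".strip()

def vector_search(query, corpus, top_k=2):
    # Stand-in: rank documents by word overlap instead of vector distance.
    q = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:top_k]

history = ["user asked about parrot diets"]
docs = ["parrot diets and seeds", "whale migration", "kale recipes"]
print(vector_search(llm_rewrite("what do they eat?", history), docs, top_k=1))
```

The point of the structure is that enrichment happens before retrieval, so the vector store sees a query that carries its own context.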
this is a fantastic video. I found myself confused as to why NNs needed an embedding layer each time and why we didn't just import some universal embedding dictionary. This made that super simple! Parrots and carrots and kales and whales and cocks and rocks!

gordongoodwin
I would like to point out an important distinction: the *concepts* described by the symbols, in the context of other symbols, can have vastly different embeddings. The *symbols* themselves, however, need absolute/fixed embeddings. If you use multiple symbols in a sequence, like words in a sentence, each symbol can use all the others for context.

So the raw input embeddings are always the same. In that case, I would argue that the initial "common misconception" is actually accurate.

Using a model like a transformer allows you to input a sequence of (fixed) symbol embeddings and end up with contextualized embeddings in place of those symbols. The transformer then iteratively applies *transformations* to those embedding vectors depending on the *context*.

The symbol "carrot" always starts as the same fixed embedding vector, no matter in which context it appears. But depending on the context, the repeated transformations done by the transformer eventually *map* that vector to another vector close to "parrot" if the context is a poem, or to yet another vector close to "kale" if the context is a cooking recipe.

This is why word2vec alone was not enough back then: it only computed something similar to those input embeddings and stopped there, without applying those transformations.

nüchtern_betrachtet
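The point above, that the same fixed input vector gets moved to different places by context, can be sketched as a single self-attention step in NumPy. This is a toy with random weights, not a trained model, and the tiny vocabulary is made up for illustration:

```python
# One-head self-attention pass (untrained, random weights) showing that
# the *same* fixed input embedding ends up at different output vectors
# depending on which context tokens surround it.
import numpy as np

rng = np.random.default_rng(1)
d = 8
emb = {w: rng.normal(size=d) for w in ["carrot", "poem", "recipe"]}
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

def attend(tokens):
    X = np.stack([emb[t] for t in tokens])        # fixed input embeddings
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)   # numerically stable softmax
    weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
    return weights @ V                            # contextualized embeddings

out_poem = attend(["carrot", "poem"])[0]    # "carrot" in a poem context
out_food = attend(["carrot", "recipe"])[0]  # "carrot" in a cooking context
print(np.allclose(out_poem, out_food))      # False: same input, different output
```

Real transformers stack many such layers (plus feed-forward blocks and learned weights), but the mechanism by which context reshapes a fixed input vector is the same.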
Would love to see more about embeddings

lechx
This is enlightening. It conveys how embedding works in an intuitive way.

shirleyhu
This was perfectly explained. Cheers mate

shawn.builds
So now I know how to make a hip-hop AI agent that can rap. Watch out, Kendrick!

Kivoswag
One thing I don't understand: why can embeddings learned through deep networks, with non-linearities in between, be compared with linear metrics such as the commonly used cosine similarity? I can't find a good discussion anywhere.

zeroheisenburg
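One common answer to the question above: cosine similarity is meaningful not because the encoder is linear, but because many embedding models (for example, SBERT-style bi-encoders) are trained with objectives that directly optimize dot products or cosines between embeddings, which shapes the otherwise nonlinear output space to behave well under that linear comparison. The metric itself is a few lines:

```python
# Cosine similarity: the cosine of the angle between two vectors,
# i.e. their dot product normalized by their magnitudes.
import numpy as np

def cosine_similarity(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity([1, 0], [1, 0]))   # 1.0  (same direction)
print(cosine_similarity([1, 0], [0, 1]))   # 0.0  (orthogonal)
```

Because it ignores vector length, cosine similarity compares only direction, which is why embedding spaces trained against it tend to encode meaning in direction rather than magnitude.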
Excellent video. Thanks for taking the time to share.

polikalepotuaileva
Best explanation of embeddings I have seen by far. Thanks 🌻

baharrezaei
Love love your videos! Very clear with meaningful examples!

adaoraenemuo
Thanks for the explanation! Really easy to understand after watching this video!! Keep up the good work.

andreamorim
Wow, that's quite a revelation and an unlock for me. Thanks a ton and cheers.

behrampatel
I got a good grade on my ML exam because of you!

blairt
This is amazing. Thank you for doing this video. It drives home a very important point. Can we fine-tune the already available state-of-the-art embeddings for a specific domain or context? Also, I would really like to know, at least conceptually, how some of these popular embeddings are created, like SBERT, RoBERTa, etc.

pratik.patil
When the sample size is large, do the embeddings for individual words start to converge?

randoff