Rasa Algorithm Whiteboard - BytePair Embeddings

BytePair embeddings are a really cool idea. They can be seen as a lightweight variant of FastText: they need less memory because they are more selective about which subtokens they remember. That selectivity also makes them useful in certain scenarios, because they can ignore subwords as well. They're also available in 275 languages!
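
To make the "selective subtokens" idea concrete, here is a minimal sketch of how byte-pair merges can be learned from a tiny word list. It is a toy illustration in plain Python, not the BPEmb or Rasa implementation. The function names (`learn_bpe`, `merge_pair`, `segment`) and the example words are made up for this sketch, and it assumes the number of merges is fixed up front, which is what bounds how many subtokens get remembered.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_bpe(words, num_merges):
    """Learn at most `num_merges` merge rules from a list of words (toy sketch)."""
    # Start from single characters; keep a frequency for each distinct word.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # nothing left to merge: every word is already one symbol
        # Pick the most frequent pair; ties are broken by the pair itself
        # so that the result is deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[tuple(merge_pair(symbols, best))] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split a word into subtokens by replaying the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


merges = learn_bpe(["hug", "hug", "hugs", "pug"], num_merges=2)
print(merges)                   # [('u', 'g'), ('h', 'ug')]
print(segment("hugs", merges))  # ['hug', 's']
print(segment("bug", merges))   # ['b', 'ug']
```

In this sketch the loop stops either when the merge budget is used up or when no adjacent pairs remain. A small budget keeps the subtoken vocabulary, and therefore any embedding table built on top of it, small.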

If you want to see the Rasa NLU examples repo, go here:

If you want to see the Whatlies repo for these embeddings, go here:

If you want to see the BPEmb repo, go here:
Comments

Thanks bro, best explanation I can find.

alanliang

This channel is a gold mine! Thank you very much for sharing your insights!

faangsde

Great video! I think the video description lacks a word in the sentence "They need way memory..."?


How does the algorithm know when to stop merging tokens?

distrologic

Thanks, sir. One query: what's the difference between byte pair and WordPiece tokenization?

piyalikarmakar