Fuzzy String Matching with BERT, TF-IDF using PolyFuzz

PolyFuzz performs fuzzy string matching, string grouping, and contains extensive evaluation functions. PolyFuzz is meant to bring fuzzy string matching techniques together within a single framework.
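To give a feel for what a TF-IDF based matcher does, here is a minimal, standard-library-only sketch. It is not PolyFuzz's own code: the helper names (`char_ngrams`, `tfidf_vectors`, `match`), the trigram size, and the smoothed IDF formula are assumptions chosen for illustration. Each string is split into character trigrams, weighted by TF-IDF, and every entry in the "from" list is paired with its most similar entry in the "to" list by cosine similarity. In PolyFuzz itself the equivalent step is, to my knowledge, `PolyFuzz("TF-IDF").match(from_list, to_list)` followed by `get_matches()`.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Split a string into overlapping character n-grams (padded with spaces)."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(max(len(padded) - n + 1, 1))]


def tfidf_vectors(strings, n=3):
    """Build one sparse, L2-normalised TF-IDF vector (a dict) per string."""
    docs = [Counter(char_ngrams(s, n)) for s in strings]
    # Document frequency: in how many strings does each n-gram occur?
    doc_freq = Counter(gram for doc in docs for gram in doc)
    total = len(docs)
    vectors = []
    for doc in docs:
        # Smoothed IDF (an assumption; other weightings would work too).
        vec = {
            gram: tf * (math.log((1 + total) / (1 + doc_freq[gram])) + 1)
            for gram, tf in doc.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({gram: w / norm for gram, w in vec.items()})
    return vectors


def match(from_list, to_list, n=3):
    """Return (source, best target, cosine similarity) for each source string.

    Assumes to_list is not empty.
    """
    vectors = tfidf_vectors(from_list + to_list, n)
    from_vecs = vectors[:len(from_list)]
    to_vecs = vectors[len(from_list):]
    results = []
    for source, f_vec in zip(from_list, from_vecs):
        # Vectors are unit length, so the dot product is the cosine similarity.
        scores = [
            sum(w * t_vec.get(gram, 0.0) for gram, w in f_vec.items())
            for t_vec in to_vecs
        ]
        best = max(range(len(to_list)), key=scores.__getitem__)
        results.append((source, to_list[best], round(scores[best], 3)))
    return results


# Hypothetical example data, not taken from the video.
from_list = ["apple", "apples", "appl", "house"]
to_list = ["apple", "apples", "mouse"]
for row in match(from_list, to_list):
    print(row)
```

This sketch compares every pair of strings in plain Python, so it is only meant for small lists. A library implementation would normally rely on sparse matrix operations to stay fast on larger inputs.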

If you have any questions about what we covered in this video, feel free to ask in the comment section below & I'll do my best to answer them.

If you enjoy these tutorials & would like to support them, the easiest way is to like the video & give it a thumbs up. It's also a huge help to share these videos with anyone you think would find them useful.

Please consider clicking the SUBSCRIBE button to be notified of future videos, & thank you all for watching.

#NLP #Stringmatching
Comments
Author

Amazing! I suddenly needed this and found your video. Thank you 🙏

soumyaranjansethi
Author

Hi Bhavesh, thank you. fuzzywuzzy or PolyFuzz: which one is more powerful and accurate?

vinayradhi
Author

Hi @bhavesh, I tried PolyFuzz with 3 models: BERT, TF-IDF, and EditDistance. It ran for around 49 minutes and then Colab went offline. I was trying to match a from-list of 28k strings against a to-list of 1,400. Do you have any better suggestions?

mmeeran
Author

I had to do something like this: matching a name against a large dictionary that holds the ethnicity for every name. I used the Fuzzywuzzy library. It worked well, but given the size of the dictionary it was painfully slow. Not sure whether that functionality is available in this library. Will definitely try this out. Thanks for the video.

madhu
Author

I have a question about the BERT approach. To get the similarity of "apple" and "app" (for example), you need to convert each of them into a numeric vector (an embedding). But to get a contextual embedding for "apple" (and likewise for "app"), the model should see the token inside a whole sentence. How do they get these embeddings if only a single word is provided (for example, just the token "apple")?

nicolasmontes
Author

What is this notebook app that you are using?

bbkmujo