BERTopic: Topic Modelling with Transformer Embeddings, arXiv dataset Python demo #NLP #tutorial

In this video I discuss BERTopic, a topic modelling technique that leverages Hugging Face transformers and c-TF-IDF to create dense clusters, allowing for easily interpretable topics whilst keeping important words in the topic descriptions.
I explain the functionality of BERTopic with a Python demo on the arXiv dataset.
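
To make the c-TF-IDF step mentioned above more concrete, here is a minimal sketch in plain Python. It is not BERTopic's own implementation: it assumes the documents have already been clustered into topics, uses naive whitespace tokenization, and normalizes term frequency by class length. The weighting follows the class-based form tf(t, c) * log(1 + A / f(t)), where A is the average number of words per class and f(t) is the frequency of term t across all classes. The names `c_tf_idf` and `top_terms` are illustrative, not library functions.

```python
import math
from collections import Counter


def c_tf_idf(docs_per_topic):
    """Compute class-based TF-IDF weights.

    docs_per_topic: dict mapping a topic id to a list of document strings.
    Returns a dict mapping each topic id to a {term: weight} dict.
    """
    # Treat all documents of one topic as a single concatenated "class document".
    term_counts = {
        topic: Counter(" ".join(docs).lower().split())
        for topic, docs in docs_per_topic.items()
    }

    # f(t): frequency of each term summed over all classes.
    total_per_term = Counter()
    for counts in term_counts.values():
        total_per_term.update(counts)

    # A: average number of words per class.
    avg_words = sum(total_per_term.values()) / len(term_counts)

    weights = {}
    for topic, counts in term_counts.items():
        n_words = sum(counts.values())
        weights[topic] = {
            # Assumption: term frequency is normalized by the class length.
            term: (tf / n_words) * math.log(1 + avg_words / total_per_term[term])
            for term, tf in counts.items()
        }
    return weights


def top_terms(term_weights, n=3):
    """Return the n highest-weighted terms of one topic."""
    ranked = sorted(term_weights.items(), key=lambda kv: kv[1], reverse=True)
    return [term for term, _ in ranked[:n]]


# Toy clusters standing in for the output of the clustering step.
clusters = {
    0: ["neural network training", "deep neural network"],
    1: ["galaxy cluster survey", "dark matter galaxy"],
}
weights = c_tf_idf(clusters)
for topic_id in clusters:
    print(topic_id, top_terms(weights[topic_id], n=2))
```

Terms that are frequent inside one cluster but rare elsewhere receive the highest weights, which is why they end up in the topic descriptions.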

Comments

Great demo! Very clear explanation and troubleshooting tips! Well done!

prabhacar

Hi, great video!
Is it possible to map each topic to their respective documents?

parthrangarajan
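
On the question of mapping topics to documents: BERTopic's `fit_transform` returns one topic id per input document, in input order, with -1 marking outliers. Assuming you have that list, a minimal sketch of grouping documents by topic in plain Python could look like the following. The helper name `group_docs_by_topic` and the sample data are hypothetical.

```python
from collections import defaultdict


def group_docs_by_topic(docs, topics):
    """Group documents by their assigned topic id.

    docs: list of document strings.
    topics: list of topic ids, one per document, in the same order
            (the shape of the first value returned by fit_transform).
    """
    if len(docs) != len(topics):
        raise ValueError("docs and topics must have the same length")
    grouped = defaultdict(list)
    for doc, topic in zip(docs, topics):
        grouped[topic].append(doc)
    return dict(grouped)


# Hypothetical documents and topic assignments for illustration only.
docs = [
    "Attention-based models for translation",
    "Survey of distant galaxy clusters",
    "Pretraining transformers on large corpora",
    "Miscellaneous note",
]
topics = [0, 1, 0, -1]  # -1 marks an outlier document

grouped = group_docs_by_topic(docs, topics)
for topic_id, members in sorted(grouped.items()):
    print(topic_id, members)
```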

1. It is mentioned that existing topic modelling methods such as LDA and NMF have too many parameters to tune, and this seems to be the main motivation for the BERTopic approach. How does BERTopic solve this challenge? How does the number of parameters differ between LDA/NMF and BERTopic?

2. Why has UMAP been used for dimensionality reduction? Why is it the most effective clustering algorithm? What is the best way to balance the loss of information from reducing to low dimensions against poor clustering? How are the parameters tuned for this? Why has HDBSCAN been used?

seemarani

Very well explained. Can you suggest a source for text preprocessing before BERTopic?

mmishrafaculty

Very detailed explanation, thank you sir 🙏

adityay