Count Vectorization in Python | CountVectorizer | Natural Language Processing with Python and NLTK

[NLP with Python]: Count Vectorization in Python nltk

CountVectorizer explained
n grams
tf-idf
#Count #Vectorization #python
Comments

You saved my a**. Thanks a lot sir. <3

yashsgupta

Is there data leakage calling fit_transform on both train and test data? Should you call it on just the training data?

tjbwhitehea
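On the fit_transform question above: the usual scikit-learn practice is to call fit_transform on the training data only and plain transform on the test data, so the vocabulary is learned from training text alone and both matrices share the same columns. A minimal sketch, assuming toy messages invented here for illustration (they are not the video's dataset):

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy messages standing in for the real train/test split.
train_msgs = ["free prize call now", "are we meeting today", "call me later today"]
test_msgs = ["free meeting prize", "unseen words only"]

vectorizer = CountVectorizer()

# Learn the vocabulary from the training messages only.
X_train = vectorizer.fit_transform(train_msgs)

# Reuse that vocabulary for the test messages; words never seen
# during fitting are simply ignored.
X_test = vectorizer.transform(test_msgs)

print(sorted(vectorizer.vocabulary_))
print(X_train.shape, X_test.shape)
```

Calling fit_transform a second time on the test data would build a different vocabulary, so the test columns would no longer line up with the ones a model was trained on.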

Sir, I have a question and I hope you will answer it. clean_text() is just tokenization and stemming combined. In this video you read the raw (untouched) dataset and passed clean_text as the analyzer to CountVectorizer.
Unlike your previous videos, where you reopened the dataset again and again, I kept adding output columns to the main dataset.
The job of clean_text() is to stem 'msg', but since I already have a stemmed_msg column, can't I pass stemmed_msg directly to .fit_transform(), without analyzer=clean_text? If yes, please tell me how, because I tried and received an error.

shivendrayadav
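On the stemmed_msg question above: the error message is not quoted, so the following is a guess. If stemmed_msg holds lists of tokens rather than strings, the default analyzer fails because it expects strings it can lowercase and split. Two possible workarounds are sketched below, assuming hypothetical data and a hypothetical helper named identity: pass an analyzer that returns the token list unchanged, or join the tokens back into strings and use the default settings.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical pre-stemmed messages, one list of tokens per message.
stemmed_msg = [["run", "fast", "today"], ["run", "run"]]

# Option 1: the data is already tokenized, so the analyzer just
# hands each token list back unchanged (hypothetical helper).
def identity(tokens):
    return tokens

vec_tokens = CountVectorizer(analyzer=identity)
X_tokens = vec_tokens.fit_transform(stemmed_msg)

# Option 2: join the tokens into plain strings and let the default
# analyzer split them again.
joined = [" ".join(tokens) for tokens in stemmed_msg]
vec_text = CountVectorizer()
X_text = vec_text.fit_transform(joined)

print(sorted(vec_tokens.vocabulary_))
print(X_tokens.toarray())
```

A pandas column such as the stemmed_msg column described above is an iterable of documents, so it could be passed to fit_transform in place of the plain list used here.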