End to End Text Classification using Python and Scikit learn

#datascience #textclassification #nlp

In this video we will create an end-to-end NLP pipeline, covering cleaning text data, setting up the NLP pipeline, model selection, model evaluation, and handling imbalanced datasets, among other topics.
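The steps the description lists can be sketched as a minimal scikit-learn pipeline. The dataset, labels, and `clean` helper below are toy placeholders for illustration, not the data or exact code used in the video:

```python
# Minimal sketch of an end-to-end text classification pipeline:
# clean text -> vectorize -> train -> evaluate.
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

def clean(text):
    """Basic cleaning: lowercase and strip non-letter characters."""
    return re.sub(r"[^a-z\s]", " ", text.lower())

# Toy stand-in data (repeated so a split is possible).
texts = ["Great movie, loved it", "Terrible film, waste of time",
         "Loved the acting", "Awful plot and terrible pacing",
         "What a great experience", "Waste of money, awful"] * 5
labels = [1, 0, 1, 0, 1, 0] * 5

X_train, X_test, y_train, y_test = train_test_split(
    [clean(t) for t in texts], labels, test_size=0.25,
    random_state=42, stratify=labels)

pipe = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    # class_weight="balanced" is one simple lever for imbalanced data.
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])
pipe.fit(X_train, y_train)
print("test accuracy:", pipe.score(X_test, y_test))
```

Swapping the classifier step, or exporting the TF-IDF features to another framework such as H2O, keeps the rest of the pipeline unchanged.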

In the next set of videos we will use more complex models to see how we can improve the performance of this model.
Comments

Lovely presentation, Sir. Thank you so much for the detailed video. Hats off to you!

anuradhabalasubramanian

Extremely useful tutorial. The explanation and pointers make it especially easy to follow and learn, particularly the usefulness of H2O... Thank you very much.

ijeffking

Wow, it is a great tutorial; eagerly waiting for the next video.

deepakkumar-zcok

Nice tutorial, providing details of every argument and their usefulness. The use of H2O AutoML is a new learning for me. Thank you.

sukamal

I really liked watching the video. One approach, as you mentioned, could be word2vec, but its drawback is that it is very data-specific and generalizing it to unseen data is tough. Another pipeline could be:
1. Transforming using GloVe + the H2O pipeline you just mentioned
2. Using ELMo or BERT for transforming (look into the flair package) and then implementing the H2O pipeline
These two approaches could be very good for boosting your F1 score.
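The GloVe-style idea from this comment boils down to turning each document into a fixed-length vector by averaging pretrained word vectors, then handing those dense features to any downstream model (H2O, sklearn, ...). The tiny `vectors` dict below is a placeholder for real embeddings loaded from a file such as `glove.6B.50d.txt`:

```python
# Sketch: average pretrained word vectors into a document embedding.
import numpy as np

vectors = {  # placeholder for real GloVe vectors loaded from disk
    "good": np.array([0.9, 0.1]),
    "great": np.array([0.8, 0.2]),
    "bad": np.array([0.1, 0.9]),
    "awful": np.array([0.2, 0.8]),
}
DIM = 2  # real GloVe files are 50-300 dimensional

def embed(doc):
    """Average the vectors of known words; zeros if none are known."""
    words = [vectors[w] for w in doc.lower().split() if w in vectors]
    return np.mean(words, axis=0) if words else np.zeros(DIM)

# One dense row per document -- ready to feed to any classifier.
X = np.vstack([embed(d) for d in ["good great", "awful bad", "great", "bad"]])
print(X.shape)  # (4, 2)
```

ELMo/BERT-based document embeddings (e.g. via flair) follow the same pattern: each document becomes one dense vector, and the rest of the pipeline is unchanged.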

karimbaig

Congrats on hitting the 20K mark, sir! Glad that I am a part of this.

sharanbabu

Great tutorial.. would oversampling or undersampling methods help with the imbalanced classes in this case? I have used them for non-NLP use cases but not yet for NLP.
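In sketch form, the simplest of the methods this comment asks about is plain random oversampling of the minority class, which works for text too as long as you resample only the training split. This uses `sklearn.utils.resample`; imblearn's SMOTE is another option, though interpolation-based SMOTE is less natural on sparse text vectors:

```python
# Random oversampling of the minority class to balance a training set.
from collections import Counter

from sklearn.utils import resample

texts = ["a"] * 8 + ["b"] * 2          # toy data: 8 majority, 2 minority docs
labels = [0] * 8 + [1] * 2

majority = [(t, y) for t, y in zip(texts, labels) if y == 0]
minority = [(t, y) for t, y in zip(texts, labels) if y == 1]

# Sample the minority class with replacement up to the majority size.
minority_up = resample(minority, replace=True,
                       n_samples=len(majority), random_state=42)

balanced = majority + minority_up
print(Counter(y for _, y in balanced))  # both classes now have 8 samples
```

Undersampling is the mirror image: resample the majority class down with `replace=False`, trading data for balance.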

shrikantkulkarni

Amazing tutorial! I have just one query. The aml.train() step just takes too long, more than an hour and I haven't been able to proceed. Do I need to change the runtime type or something to make it run faster?

Mayuresh

Excellent tutorial, Srivatsan. Learnt something new about H2O. Thanks very much for the explanation. However, one request: if you could go a bit slower; sometimes I need to pause the video to understand the concept.

aboseutube

Great content. Happy that I learnt something new!! Thanks so much, Srivatsan.

vinothkumar-xwwy

Extremely useful content, Sir! Thanks a lot for the demonstration (y)
Also, Sir, I recall that in one of your videos (related to NLP) you used named entities as features while training a classification model; was that the case? I am asking, Sir, because I am unable to find that video now.

Please let me know your suggestions.

siddhantkhare

Thank you for the great tutorial. What if we combine the last two classes into one and later have another binary classifier for them? Will that help improve the recall?
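The two-stage idea in this comment can be sketched as follows. The random data and the choice of labels 2 and 3 as "the last two classes" are placeholders for illustration:

```python
# Two-stage classification: merge two confusable classes for the main
# model, then train a binary model that only separates those two.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = rng.integers(0, 4, size=200)           # classes 0..3 (toy labels)

# Stage 1: collapse classes 2 and 3 into a single merged label "2".
y_merged = np.where(y >= 2, 2, y)
stage1 = LogisticRegression(max_iter=1000).fit(X, y_merged)

# Stage 2: binary model trained only on rows from classes 2 and 3.
mask = y >= 2
stage2 = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])

def predict(x):
    """Stage 1 picks 0/1/merged; stage 2 splits the merged class."""
    out = stage1.predict(x).copy()
    needs_split = out == 2
    if needs_split.any():
        out[needs_split] = stage2.predict(x[needs_split])
    return out

preds = predict(X)
print(sorted(set(preds)))
```

Whether this improves recall depends on whether the two merged classes are mainly confused with each other rather than with the remaining classes; the confusion matrix of the single-stage model is the place to check that.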

sushantpenshanwar

Just one question: you used a test size of 80%, but generally I know we take something like the reverse for the test size.

rockshubham

Thanks for the lecture. I think the train size must be 75%, but you have taken test_size as 0.75; please check it once.
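The last two comments point at the same thing: in scikit-learn's `train_test_split`, `test_size` is the fraction *held out* for testing, so `test_size=0.75` trains on only 25% of the data. A conventional split keeps most data for training:

```python
# test_size is the held-out fraction: 0.25 keeps 75% for training.
from sklearn.model_selection import train_test_split

X = list(range(100))
y = [i % 2 for i in X]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y)

print(len(X_train), len(X_test))  # 75 25
```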

sasikiran

Hi sir, this class weights concept works for every boosting technique, right?
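Largely yes: most boosting implementations accept per-row weights even when they have no `class_weight` argument. In scikit-learn, `compute_sample_weight("balanced", y)` derives weights from class frequencies and can be passed to `fit` via `sample_weight` (XGBoost similarly accepts `sample_weight`, plus `scale_pos_weight` for binary problems). A sketch on toy data:

```python
# Balanced sample weights for a boosting model on imbalanced data.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.utils.class_weight import compute_sample_weight

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))
y = np.array([0] * 100 + [1] * 20)       # imbalanced: 100 vs 20

# weight = n_samples / (n_classes * class_count):
# class 0 rows get 0.6, class 1 (minority) rows get 3.0.
weights = compute_sample_weight("balanced", y)

clf = GradientBoostingClassifier(random_state=0)
clf.fit(X, y, sample_weight=weights)
print("train accuracy:", clf.score(X, y))
```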

kishorereddy

As you converted the H2O XGBoost parameters to actual parameters, how can I convert the H2O GBM parameters to actual parameters for running a manual GBM algorithm with the same parameters as H2O used?

bhaveshsalvi

Hey Srivatsan! I was trying to find the notebook file for this project on your GitHub. Can you please help here? Thank you :)

Akash

Awesome explanation. Could you please provide the GitHub link for this code?

kimayashah

How do I add L1/L2 regularization in H2O AutoML itself?

karimbaig

How do I convert the H2O hyperparameters into GBM hyperparameters instead of XGBoost hyperparameters? (The GBM model gave the least mean per-class error for my dataset.)

moushtaqahmad