13.4.1 Recursive Feature Elimination (L13: Feature Selection)


In this video, we start our discussion of wrapper methods for feature selection. In particular, we cover Recursive Feature Elimination (RFE) and see how we can use it in scikit-learn to select features based on linear model coefficients.
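For reference, here is a minimal sketch of what this might look like in scikit-learn; the breast cancer dataset, the five-feature target, and the pipeline details are illustrative assumptions rather than the exact setup from the video.

```python
# A minimal RFE sketch, assuming the Wisconsin breast cancer dataset and
# a five-feature target purely for illustration.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=123, stratify=y)

# RFE repeatedly refits the estimator and drops the feature with the
# smallest absolute coefficient until n_features_to_select remain.
rfe = RFE(estimator=LogisticRegression(max_iter=1000),
          n_features_to_select=5)

# Standardize first so the coefficient magnitudes are comparable.
pipe = make_pipeline(StandardScaler(), rfe)
pipe.fit(X_train, y_train)

print("Selected features:", rfe.support_)   # boolean mask
print("Feature ranking:", rfe.ranking_)     # 1 = selected
print("Test accuracy:", pipe.score(X_test, y_test))
```

Note that scikit-learn's RFE works with any estimator that exposes coef_ or feature_importances_ after fitting, so tree-based models can serve as the core estimator as well.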




Logistic regression lectures:

L8.0 Logistic Regression – Lecture Overview (06:28)

L8.1 Logistic Regression as a Single-Layer Neural Network (09:15)

L8.2 Logistic Regression Loss Function (12:57)

L8.3 Logistic Regression Loss Derivative and Training (19:57)

L8.4 Logits and Cross Entropy (06:47)

L8.5 Logistic Regression in PyTorch – Code Example (19:02)

L8.6 Multinomial Logistic Regression / Softmax Regression (17:31)

L8.7.1 OneHot Encoding and Multi-category Cross Entropy (15:34)

L8.7.2 OneHot Encoding and Multi-category Cross Entropy Code Example (15:04)

L8.8 Softmax Regression Derivatives for Gradient Descent (19:38)

L8.9 Softmax Regression Code Example Using PyTorch (25:39)

-------

This video is part of my Introduction to Machine Learning course.

-------

Comments

Prof Raschka, this is fantastic content! I've been following your course for a while now and always end up learning something new. Really appreciate all your hard work that goes into creating this material and sharing it with the world.

djethereal

Thank you, it helped me a lot in my major project.

afreenbegum

Thank you for this great lecture. In your other lecture about sequential feature selection, you showed that backward selection (SBS) is superior to forward selection (SFS) according to a study. How does recursive feature elimination compare to SBS and SFS?

emsif

Thank you so much for the lectures, Prof. Sebastian!
I was wondering, what is the difference between the L1-regularized method and RFE? I understand that one is embedded and the other is a wrapper, but they look pretty similar. Thanks in advance!

camilafragoso

Thank you very much!
1) Can you say more about wrapper methods: can I use just a fraction of the training set, since running them on the whole dataset can be very expensive? If so, what are the best practices?
2) Is there any use of the test data for feature selection, or should I select features based on the training set only?

ocamlmail

Doesn't RFE also work with non-linear models like decision trees, extra-trees classifiers, random forests, and k-NN, since every model has coefficients or weights?

rajeshkalakoti

Hello Sebastian, really great content! I have two questions, and it would be great to get an answer.
1. Is RFE actually model-agnostic? Why or why not?
2. Could I use random forests, gradient boosting, or neural networks as the core RFE algorithm, and is it recommended? If not, why not?

Thank you a lot!!!

sasaglamocak

Don't we need to do hyperparameter tuning of the estimator inside RFE? If not, why not?

shubhamtalks

Is it a must to split the data into X_train, X_test, y_train, y_test when using RFE?

weii

Hi Prof. Raschka, can we use coefficients for feature importance even when the features are normalized? It feels like the wrong parameter to use when p-values are available.

adityask