Understanding and Applying XGBoost Regression Trees in R

Ever wonder if you could reach the top of the leaderboard in a Kaggle competition? Look no further! Check out the XGBoost regression model, an ensemble boosting method known for its robustness. This is an industry-grade model that is heavily applied across many fields, and knowing this algorithm can get your foot in the door of many data science roles!

Github link:

Data Set Link:

--------------------------------------------------
Additional material to check out!

Ensemble Method Boosting:

Ensemble Method Stacking:

Ensemble Method Bagging:

Methods of Sampling:
--------------------------------------------------

0:00 - Algorithmic procedure of XGBoost Regression
5:53 - Understanding Data & Applying XGBoost
7:57 - Explaining XGBoost Regression Gridsearch Parameters
10:17 - TrainControl & Final model & Evaluation
Comments

Thank you, I spent hours searching tutorials for this model and none of them worked for me.

mateoq

Exceptional video, thank you so much❤

MrTuck

Good tutorial. Thanks for sharing. I have a question on feature importance: how can we get feature importance from XGBoost? Can you add the xgb.importance object for our reference?

srinivasanbalan
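For reference, feature importance in R's xgboost package comes from calling xgb.importance() on a fitted model. A minimal sketch, using synthetic data as a stand-in for the video's dataset:

```r
library(xgboost)

# Synthetic data standing in for the video's dataset
set.seed(1)
X <- matrix(rnorm(200), ncol = 4,
            dimnames = list(NULL, c("f1", "f2", "f3", "f4")))
y <- 3 * X[, "f1"] + rnorm(50)

dtrain <- xgb.DMatrix(data = X, label = y)
model  <- xgb.train(params = list(objective = "reg:squarederror"),
                    data = dtrain, nrounds = 20)

# Gain / Cover / Frequency per feature actually used by the trees
imp <- xgb.importance(model = model)
print(imp)
# xgb.plot.importance(imp)  # optional bar chart
```

Since y is driven almost entirely by f1 here, f1 should dominate the Gain column.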

Thank you for this video. It helped me in immeasurable ways. Please, how can I get the R²?

Joga_teve
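One way to get R² is to compute it from held-out predictions in base R. The numbers below are made up for illustration; in the tutorial you would use predict() on your test set and the true test values:

```r
# Placeholder vectors; swap in your test-set truth and predictions
actual <- c(3.1, 4.0, 5.2, 6.8, 7.9)
preds  <- c(3.0, 4.2, 5.0, 7.0, 8.1)

ss_res <- sum((actual - preds)^2)          # residual sum of squares
ss_tot <- sum((actual - mean(actual))^2)   # total sum of squares
r2 <- 1 - ss_res / ss_tot
r2  # about 0.989 for these illustrative numbers
```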

Hi Spencer, could you possibly do a model with the AdaBoost algorithm for a continuous quantitative variable? Is it necessary to transform a numeric target variable into categories to apply that algorithm? Thanks for your content, it is wonderful for us!

micalaravena

Hi,

I tried running your code; however, when I ran xgb_tune I got this error: "Error: Please make sure that the outcome column is a factor or numeric. The class(es) of the column: 'tbl_df', 'tbl', 'data.frame'". What do I do now?

bhuvanaelango
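That caret error usually means the outcome was passed as a one-column tibble/data frame rather than an atomic vector. Extracting the column with [[ ]] (or dplyr::pull) before training typically fixes it; a base-R illustration with hypothetical column names:

```r
# Hypothetical stand-in for the tutorial's data
df <- data.frame(x = 1:10, y = (1:10) + 0.5)

y_bad  <- df["y"]     # still a data frame -> the class caret complains about
y_good <- df[["y"]]   # atomic numeric vector -> what caret expects

class(y_bad)        # "data.frame"
is.numeric(y_good)  # TRUE
```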

Fantastic video. BTW, how do you calculate the AUC of this model?

kellychen
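One caveat: AUC is a classification metric, so it doesn't directly apply to a regression target like the one in the video; RMSE or R² are the usual choices there. If you do fit a binary target, AUC can be computed in base R via the rank (Mann-Whitney) formula:

```r
# Rank-based AUC for binary labels (1 = positive, 0 = negative)
auc <- function(labels, scores) {
  r  <- rank(scores)                 # midranks handle ties
  n1 <- sum(labels == 1)
  n0 <- sum(labels == 0)
  (sum(r[labels == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

auc(c(0, 0, 1, 1), c(0.1, 0.4, 0.35, 0.8))  # 0.75
```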

Hi Spencer Pao:

Thank you for your video. I have two questions:
1. How can I tell which independent attributes are important in the regression?
2. Why do other people use the following code for GBDT, and why is yours so different from theirs?
bst_model <- xgb.train(params = xgb_params,
                       data = train_matrix,
                       nrounds = 1000,
                       watchlist = watchlist,
                       eta = 0.001,
                       max.depth = 6,
                       gamma = 0,
                       subsample = 1,
                       colsample_bytree = 1,
                       missing = NA)

juanwang
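On question 2: both approaches fit the same booster. xgb.train() trains one model with fixed hyperparameters, while the video wraps the xgbTree method in caret to cross-validate over a grid of candidates. A sketch of such a grid, with illustrative values rather than the video's exact ones:

```r
# Candidate hyperparameter combinations (illustrative values)
grid <- expand.grid(nrounds          = c(500, 1000),
                    max_depth        = c(4, 6),
                    eta              = c(0.01, 0.1),
                    gamma            = 0,
                    colsample_bytree = 1,
                    min_child_weight = 1,
                    subsample        = 1)

# With the caret package installed, the cross-validated search would be:
# model <- caret::train(target ~ ., data = train_df, method = "xgbTree",
#                       trControl = caret::trainControl(method = "cv", number = 5),
#                       tuneGrid  = grid)
nrow(grid)  # 8 candidate combinations
```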

Can you also show how to get Gini for the model?

nikeshnavele
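For a regression model, the "Gini" usually reported on Kaggle is the normalized Gini coefficient, which scores how well the predictions rank the actual values (1 = perfect ranking, 0 = random). A base-R sketch:

```r
gini <- function(actual, predicted) {
  ord <- order(predicted, decreasing = TRUE)  # sort by prediction, best first
  n   <- length(actual)
  cum <- cumsum(actual[ord]) / sum(actual)    # cumulative share of actuals
  sum(cum) / n - (n + 1) / (2 * n)
}
normalized_gini <- function(actual, predicted) {
  gini(actual, predicted) / gini(actual, actual)
}

a <- c(1, 2, 4, 3, 5)
normalized_gini(a, a)  # 1: a perfect ranking scores 1
```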

Hey Spencer, Thanks a lot for the video. Really liked it.
I wanted to point out, though, that you encoded the 'county' and 'state' columns as numeric in your data pre-processing stage. This seems like an incorrect way to encode these columns, as XGBoost will treat them as ordinal values rather than nominal ones. This can result in a brittle model that over-fits easily.
Hope this helps.
Please keep creating more content, much appreciate the work!

Aman-cyzj
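To follow up on that point: one-hot encoding with base R's model.matrix() keeps nominal columns nominal. The column name 'state' below is a hypothetical stand-in for the video's columns:

```r
df <- data.frame(state = factor(c("CA", "NY", "TX", "CA")),
                 y     = c(1.2, 3.4, 2.1, 1.8))

# One 0/1 indicator column per level; "- 1" drops the intercept so
# every level gets its own column
onehot <- model.matrix(~ state - 1, data = df)
colnames(onehot)  # "stateCA" "stateNY" "stateTX"
```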

Hey Spencer! Sorry to be commenting so long after this post haha. This is very impressive; your walk-through was so much more in depth than any other xgboost-for-regression guide I've found. I've got a question for you: I have 14 predictor variables, most are binary, some are continuous, and three are categorical. I'm worried about one of those categorical variables because it has 14 levels. Will including that variable (after I've changed all categorical variables to numeric) mess anything up with the model? I was thinking I should just not include it, but it looks like your data has multiple categorical variables that also have more than a few levels.
For background, I'm only using the xgboost decision tree for variable selection and an insight into variable importance. I will be plugging in the recommended variables into a LR model for interpretability purposes.
Let me know what you think! Great content! I'm glad I found your page!

jakewhitworth