7.5 Gradient Boosting (L07: Ensemble Methods)

In this video, we take the concept of boosting a step further and talk about gradient boosting. Whereas AdaBoost re-weights the training examples to boost the trees in the next round, gradient boosting uses the gradients of the loss to compute the residuals on which the next tree in the sequence is fit.
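
As a rough illustration of that loop (a minimal sketch, not the exact procedure from the video), here is gradient boosting for regression with squared-error loss, assuming NumPy and scikit-learn are available; the function names are my own. Each round fits a small tree to the current residuals, which for squared error are exactly the negative gradients, and adds its shrunken predictions to the ensemble.

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def fit_gradient_boosting(X, y, n_trees=100, learning_rate=0.1, max_depth=3):
        # Start from the best constant prediction under squared error: the mean of y.
        f0 = float(np.mean(y))
        pred = np.full(len(y), f0)
        trees = []
        for _ in range(n_trees):
            # For squared-error loss, the negative gradient w.r.t. the current
            # prediction is just the residual y - pred.
            residuals = y - pred
            tree = DecisionTreeRegressor(max_depth=max_depth)
            tree.fit(X, residuals)
            # Add the new tree's shrunken predictions to the ensemble.
            pred = pred + learning_rate * tree.predict(X)
            trees.append(tree)
        return f0, trees

    def predict_gradient_boosting(X, f0, trees, learning_rate=0.1):
        pred = np.full(X.shape[0], f0)
        for tree in trees:
            pred = pred + learning_rate * tree.predict(X)
        return pred

scikit-learn's GradientBoostingRegressor implements essentially this idea with more options (different losses, subsampling, early stopping).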

-------

This video is part of my Introduction to Machine Learning course.

-------

Comments

This is the first video of yours I have come across, and it's by far the best I have found on this topic. Will be binging everything you have to offer from now on. Thanks for all the content, man!

deltax

I really liked the way you explained the steps with numbers. It helped me a lot to understand the notations of the equations.

nazmuzzamankhan

Thank you for the great explanation! I liked the way you say "prediction" :)

yerhoam

Well, I understood the Gradient Boosting part, as in we focus on the residuals and fit further trees to lower the loss of the previously made trees.

But I couldn't grasp how XGBoost achieves this via parallel computation. Guess I'll have to read the paper :)

newbie

Hi Professor, thank you very much for the educational video! Do you have any thoughts on how this stepwise additive model compares to fitting a very large model with many parameters in a "stepwise" fashion based on gradient descent? For example, freezing and additively training subnetworks of a neural model.

justonecomment

Why does the tree in step 2 not have a third decision node to split Waunakee and Lansing?

urthogie

Very nice video :) I was wondering why, for gradient boosting, we fit the derivative instead of the residual? Intuitively that's what I would do :/

asdf_
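
(A side note on the question above, not an answer from the video: for the common squared-error loss the two coincide. With L(y, F(x)) = ½ (y − F(x))², the negative gradient with respect to the current prediction is −∂L/∂F(x) = y − F(x), i.e., exactly the residual. For other losses, the negative gradient plays the role of a "pseudo-residual", so fitting the gradient generalizes fitting the residual.)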

I'm still wondering, at minute 13:19: why did you choose age >= 30 as the root node? Is that based on the residuals or something else?

muhammadlabib