Gradient Descent algorithm Simplified
- Gradient descent is an optimization algorithm used to find the optimized network weight and bias values
- It works by iteratively minimizing the cost function
- It does this by calculating the gradient of the cost function and moving in the negative gradient direction until a local/global minimum is reached. (Taking the positive gradient direction instead leads to a local/global maximum.)
- The size of the steps taken is controlled by the learning rate. A larger learning rate covers more of the search space per step, so we may reach the global minimum faster; however, we can overshoot the target.
- With a small learning rate, training takes much longer to reach the optimized weight values.
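The learning-rate trade-off described above can be seen on a toy problem. This is a minimal sketch (not from the video), using the one-dimensional loss f(w) = w^2, whose gradient is 2w and whose minimum is at w = 0:

```python
# Sketch: effect of the learning rate on gradient descent for f(w) = w**2.
# The gradient of f is 2*w, so each update is w_new = w_old - lr * 2 * w_old.
def descend(lr, steps=20, w=1.0):
    for _ in range(steps):
        w = w - lr * 2 * w  # move in the negative gradient direction
    return w

print(descend(0.01))  # small lr: 20 steps later, still far from the minimum
print(descend(0.4))   # moderate lr: converges to ~0 quickly
print(descend(1.1))   # too large: each step overshoots, |w| grows (diverges)
```

With lr = 1.1 the step is so large that the update jumps past the minimum to a point farther away than where it started, which is exactly the overshooting failure mode mentioned above.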
GRADIENT DESCENT WORKS AS FOLLOWS:
1. Calculate the gradient (derivative) of the loss function, ∂loss/∂w
2. Pick random starting values for the weights (m, b) and substitute them into the gradient
3. Calculate the step size (how much the parameters will be updated):
step size = learning rate * gradient = α * ∂loss/∂w
4. Update the parameters and repeat:
new weight = old weight − step size
w_new = w_old − α * ∂loss/∂w
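The four steps above can be sketched in code. This is an illustrative example (not from the video): fitting the line y = m*x + b to a few points with a mean-squared-error loss; the data, learning rate, and iteration count are assumptions chosen for the demo.

```python
# Sketch of the gradient descent steps for fitting y = m*x + b (MSE loss).
import random

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]      # generated by y = 2x + 1

# Step 2: pick random starting values for the weights (m, b)
m, b = random.random(), random.random()
lr = 0.05                      # learning rate (alpha)

for _ in range(2000):
    # Step 1: gradient of the MSE loss with respect to m and b
    n = len(xs)
    dm = sum(2 * (m * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (m * x + b - y) for x, y in zip(xs, ys)) / n
    # Steps 3-4: step size = lr * gradient; update the weights and repeat
    m -= lr * dm
    b -= lr * db

print(round(m, 2), round(b, 2))  # approaches m = 2, b = 1
```

Note that the gradient is recomputed from the current (m, b) on every iteration, so the "substitute" in step 2 happens implicitly each time through the loop.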
I hope you enjoy this video and find it useful and informative.
Thanks and happy learning!
Ryan
#gradientdescent #deeplearning #machinelearning #AI