Adam Optimization Algorithm (C2W2L08)

Comments

Clarification about Adam Optimization

Please note that at 2:44 the Sdb equation is correct. However, from 2:48 onward, the db² term loses its square.

The bottom-right equation should still be:

Sdb = β₂Sdb + (1 - β₂)db²

manuel
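
Following up on the clarification above, here is a minimal sketch of one full Adam step in NumPy. The names follow the video's notation (dW, db, VdW, Sdb, and so on); the function itself and its defaults (β₁ = 0.9, β₂ = 0.999, ε = 10⁻⁸, as recommended in the lecture and the Adam paper) are illustrative, not code from the course:

```python
import numpy as np

def adam_step(W, b, dW, db, VdW, Vdb, SdW, Sdb, t,
              alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a weight matrix W and bias vector b."""
    # Momentum-like first moments of the gradients
    VdW = beta1 * VdW + (1 - beta1) * dW
    Vdb = beta1 * Vdb + (1 - beta1) * db
    # RMSprop-like second moments: both dW and db are squared here,
    # including the db**2 that the board loses after 2:48
    SdW = beta2 * SdW + (1 - beta2) * dW**2
    Sdb = beta2 * Sdb + (1 - beta2) * db**2
    # Bias correction (t is the 1-based iteration count)
    VdW_hat = VdW / (1 - beta1**t)
    Vdb_hat = Vdb / (1 - beta1**t)
    SdW_hat = SdW / (1 - beta2**t)
    Sdb_hat = Sdb / (1 - beta2**t)
    # Parameter update; eps keeps the denominator away from zero
    W = W - alpha * VdW_hat / (np.sqrt(SdW_hat) + eps)
    b = b - alpha * Vdb_hat / (np.sqrt(Sdb_hat) + eps)
    return W, b, VdW, Vdb, SdW, Sdb
```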

Any time I want to implement ML from scratch, I watch all of Andrew's videos from beginning to end! I don't know how to express my appreciation for this great man.

mostafanakhaei

This video is closely related to the video "Bias Correction of Exponentially Weighted Averages". Please revisit that video if you feel this is too confusing.

pipilu
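
For anyone who skipped that video, a toy sketch of what bias correction does (a minimal example, assuming β = 0.9 and a constant stream of 10s; not code from the course):

```python
beta = 0.9   # decay rate of the exponentially weighted average
v = 0.0      # running average, initialized to zero
for t, theta in enumerate([10.0, 10.0, 10.0], start=1):
    v = beta * v + (1 - beta) * theta
    v_corrected = v / (1 - beta**t)   # bias correction
    print(t, round(v, 2), round(v_corrected, 2))
# t=1: v=1.0,  v_corrected=10.0
# t=2: v=1.9,  v_corrected=10.0
# t=3: v=2.71, v_corrected=10.0
```

Because v starts at zero, the raw average is badly biased low for small t; dividing by 1 − βᵗ recovers the true level from the very first step.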

From 0:00 to 4:36, S_db is missing the square on the db term; it should be Sdb = β₂Sdb + (1 − β₂)db².

danlan

I don't understand why some people are hating. Yes, the professor missed a couple of symbols (once in a lifetime).
The truth of the matter is that without his or Geoffrey's videos to watch, we would be totally lost ))

IgorAherne

I am confused to the maximum level. Can I buy more brain power like I buy more RAM?

douglaskaicong

The very best and most succinct explanation of ADAM I've ever seen. Things become crystal clear if one watches L06 to L08 in a row.

mllo

Why did you erase the square at 2:46? Shouldn't RMSprop have a squared term for the bias gradient as well?

jerrylin

The only thing I understood is that his friend has nothing to do with Adam optimization!

sahanmendis

This nailed down the Adam paper. Thanks a lot.

jerekabi

Please apply a low-pass filter to the audio of this video.

aamad

Haha showing Adam there was hilarious :>

EranM

Eve Optimization Algorithm will come soon!

Troglodyte

Could anyone give me a list of the notations he mentions in the video, or direct me towards a video that explains them? The main issue with understanding the concept in this video is the lack of explanation of the notation used.

GRMREAPR
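
A quick notation key for the question above, assuming the conventions used throughout this course:

- dW, db: gradients of the cost with respect to the weights W and the bias b on the current mini-batch
- VdW, Vdb: exponentially weighted averages of the gradients (the momentum terms), with decay rate β₁
- SdW, Sdb: exponentially weighted averages of the squared gradients (the RMSprop terms), with decay rate β₂
- t: the iteration number, used in the bias-correction factors 1 − β₁ᵗ and 1 − β₂ᵗ
- α: the learning rate; ε: a small constant (the video suggests 10⁻⁸) that keeps the update's denominator away from zero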

You are so sweet. Thank you Sir, for these awesome videos!

submagr

Why do we split W and b here? They are the weight matrix and the bias vector, if I understand correctly. Can't we just combine them into one overall matrix and work with that?

llst-shjf

I assume we use epsilon to avoid dividing by 0?

bayesed

Do you really not think that a statement of the problem Adam solves is relevant when you are introducing Adam?

omidtaghizadeh

Please explain what v and w are instead of just naming the terms; some of us are beginners.

NicksonMugo-tgsg

What is t? I do not completely understand.

sashakobrusev