Dive Into Deep Learning, Lecture 2: PyTorch Automatic Differentiation (torch.autograd and backward)

In this video, we discuss PyTorch’s automatic differentiation engine that powers neural networks and deep learning training (for stochastic gradient descent). In this section, you will get a conceptual understanding of how autograd works to find the gradient of multivariable functions. We start by discussing derivatives, partial derivatives, and the definition of gradients. We then discuss how to compute gradients using requires_grad=True and the backward() method. Thus, we cover classes and functions implementing automatic differentiation of arbitrary scalar-valued and non-scalar-valued functions. We also discuss the Jacobian matrix in PyTorch. Differentiation is a crucial step in nearly all machine learning and deep learning optimization algorithms. While the calculations for taking these derivatives are straightforward, working out the updates by hand can be a painful and tedious task.
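
A minimal sketch of the workflow described above, assuming a simple illustrative function (y = 2·xᵀx, not necessarily the exact example used in the video); the analytic gradient 4x can be worked out by hand and compared with what autograd returns:

import torch

x = torch.arange(4.0, requires_grad=True)  # x = [0., 1., 2., 3.], track gradients w.r.t. x
y = 2 * torch.dot(x, x)                    # scalar-valued function y = 2 * x^T x
y.backward()                               # populate x.grad with dy/dx
print(x.grad)                              # tensor([ 0.,  4.,  8., 12.])
print(x.grad == 4 * x)                     # analytic gradient is 4x -> tensor([True, True, True, True])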

#Autograd #PyTorch #DeepLearning
Comments

In my opinion, what makes this a great video is its (quick) review of the prerequisite knowledge, and the inclusion of simple examples that we can compute by hand and then verify that we actually get the same thing when using PyTorch.

Thanks a lot!

adnanhashem

This is what I had been searching for for months. I'm so thankful, Dr. Science, for breaking down this concept for us.

michaelscience

This is what a proper tutorial should be. Thanks a lot. Subscribed

sktdebnath

Truly thankful to you. To the point without confusion. Thank you once again

reddysekhar

Excellent videos and textbook; I deeply admire your contributions.

atanudasgupta

Why do you need to multiply the v vector by the Jacobian matrix? And what is the v vector exactly?

edd-bpmj

Thanks man, I was finding it extremely difficult to understand the maths behind backward and detach (although I had done it in high school) because no one was explaining them the way you do 😍😍

nitinsrivastav

Thanks for the video, it really cleared up PyTorch autograd; now I will be making notes on this gold nugget.

azzyfreeman

Amazing as always!! Very helpful and valuable video!

parisahajibabaee

14:50 x.grad contains the values of ∂y/∂x
17:50 x.grad.zero_()
25:00 Gradient for multiple inputs -> multiple outputs. Since the Jacobian is a matrix, we need to pass in a 1-D tensor to get a valid vector output. => But our loss function has always been a scalar, which is why I am not accustomed to this form.
34:10 Explaining .detach() => treat those values as constants, not as variables that we differentiate w.r.t.

홍성의-iy
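
A small sketch illustrating the 17:50 and 34:10 notes above (gradient zeroing and .detach()); the tensors here are illustrative, not necessarily the ones from the lecture:

import torch

x = torch.arange(4.0, requires_grad=True)
y = 2 * torch.dot(x, x)
y.backward()
print(x.grad)          # tensor([ 0.,  4.,  8., 12.])

x.grad.zero_()         # gradients accumulate, so clear them before the next backward pass
y = x * x
u = y.detach()         # same values as y, but treated as a constant (cut out of the graph)
z = u * x
z.sum().backward()     # dz/dx = u elementwise, because u is a constant
print(x.grad == u)     # tensor([True, True, True, True])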

Nice explanation. Found it through the YouTube recommendations.

ГеоргийЧерноусов-оъ

24:10
y.backward(torch.ones(3))
The input to y.backward is the upstream gradient, and the size of the upstream gradient should be the same as the size of y.

In the previous cases we don't need to pass an upstream gradient because y.backward() is the same as y.backward(torch.tensor(1.0)).

ItahangLimbu
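
A hedged sketch expanding on the comment above: for a non-scalar y, the tensor passed to backward is a vector v with the same shape as y, and autograd computes the vector-Jacobian product vᵀJ rather than the full Jacobian (the tensors below are illustrative):

import torch

x = torch.arange(3.0, requires_grad=True)   # x = [0., 1., 2.]
y = x * x                                   # vector output, y_i = x_i^2, so J = diag(2x)

v = torch.ones(3)                           # upstream gradient, same shape as y
y.backward(v)                               # computes v^T J
print(x.grad)                               # tensor([0., 2., 4.]) = 2 * x

x.grad.zero_()
s = (x * x).sum()                           # scalar output
s.backward()                                # shorthand for s.backward(torch.tensor(1.0))
print(x.grad)                               # tensor([0., 2., 4.]) again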

First of all, I want to express my gratitude to you for the work you have done. There is one thing I want to ask you: why do we write the partial derivatives of a scalar function as a column, whereas, following the logic of the Jacobian matrix, it should be a row? Thanks in advance!

Gibson-xnxk
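
A short note on the convention asked about above (standard definitions, not a claim about the video's particular notation): for a scalar-valued f : R^n -> R, the Jacobian is the 1 x n row vector of partial derivatives, and the gradient is defined as its transpose, ∇f(x) = J_f(x)ᵀ, a column vector. Both contain the same numbers, and PyTorch simply stores x.grad with the same shape as x.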

Thank you sooo much! This finally clicked for me.

asfandiyar

Thank you very much for uploading this video! Very helpful!

rohithpeesapati

Great explanation! I hope it will get the likes it deserves.

nicolaemaria

Hi!! Thanks for the wonderful video :) Can you please explain why the vector v is 1? And what is the derivative w.r.t. self?

arpit

Thanks, my friend. In the last section, what is z.sum? What is the sum() function for? Why did you put the sum there?

ramincybran
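
A hedged sketch of why a .sum() typically appears right before backward() (assuming z is a non-scalar tensor, as in the usual autograd examples): the sum reduces z to a scalar so backward() can be called without an explicit gradient argument, which is equivalent to passing an all-ones upstream gradient:

import torch

x = torch.arange(3.0, requires_grad=True)
z = x * x                                   # z is a vector; z.backward() alone would raise an error

z.sum().backward()                          # sum() makes a scalar, so no gradient argument is needed
print(x.grad)                               # tensor([0., 2., 4.])

x.grad.zero_()
z = x * x
z.backward(torch.ones_like(z))              # equivalent: explicit all-ones upstream gradient
print(x.grad)                               # tensor([0., 2., 4.])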

Please do cover a playlist on Graph Neural Networks (at least discuss all the basics and methods of GNNs). The internet lacks quality content on this topic.

callpie

29:49 - As far as I understand, 'a.grad' should turn out to be [12., 18.].

greender