PYTORCH COMMON MISTAKES - How To Save Time 🕒

In this video I show you 10 common PyTorch mistakes; avoiding these will save you a lot of time debugging models. This was inspired by a tweet by Andrej Karpathy, and that's why I said it was approved by him :)

Andrej Karpathy Tweet:

❤️ Support the channel ❤️

Paid Courses I recommend for learning (affiliate links, no extra cost for you):

✨ Free Resources that are great:

💻 My Deep Learning Setup and Recording Setup:

GitHub Repository:

✅ One-Time Donations:

▶️ You Can Connect with me on:

OUTLINE:
0:00 - Introduction
0:21 - 1. Didn't overfit batch
2:45 - 2. Forgot toggle train/eval
4:47 - 3. Forgot .zero_grad()
6:15 - 4. Softmax when using CrossEntropy
8:09 - 5. Bias term with BatchNorm
9:54 - 6. Using view as permute
12:10 - 7. Incorrect Data Augmentation
14:19 - 8. Not Shuffling Data
15:28 - 9. Not Normalizing Data
17:28 - 10. Not Clipping Gradients
18:40 - Which ones did I miss?
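
Several of these items (2, 3, 4, and 10) come down to a few lines in a standard training loop. A minimal sketch, with a hypothetical model, optimizer, and loaders standing in for your own:

import torch
import torch.nn as nn

# Hypothetical stand-ins for illustration; swap in your own model and data
model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()  # expects raw logits, so no softmax layer (4)

def train_one_epoch(loader):
    model.train()  # (2) dropout/batchnorm behave differently in train mode
    for x, y in loader:
        optimizer.zero_grad()  # (3) otherwise gradients accumulate across steps
        loss = loss_fn(model(x), y)
        loss.backward()
        # (10) cap the gradient norm to avoid exploding updates
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()

@torch.no_grad()
def evaluate(loader):
    model.eval()  # (2) switch dropout/batchnorm to inference behavior
    return sum(loss_fn(model(x), y).item() for x, y in loader) / len(loader)
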
Comments


Common mistakes for me:
Getting confused with tensor dimensions (as a new guy you can spend plenty of time before harnessing the power of unsqueeze())
Forgetting .cuda() or .to(device)
Getting confused with convnet dimensions after a conv layer is applied
Not attempting to balance the dataset (or imbalance it on purpose), which can be useful
etc.
Love your videos man, they've helped me a lot.

igordemidion
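
A minimal sketch of the first three of these, with toy shapes chosen just for illustration:

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(28, 28)          # one image, missing batch and channel dims
x = x.unsqueeze(0).unsqueeze(0)  # -> shape (1, 1, 28, 28), what Conv2d expects
x = x.to(device)                 # forgetting this gives device-mismatch errors

conv = torch.nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3).to(device)
print(conv(x).shape)  # (1, 8, 26, 26): H and W shrink by kernel_size - 1 (no padding)
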

So much good info here. I’ve been doing ML for 5 years and it is always good to review the basics every now and then.

Hoxle-

I honestly didn’t expect this video to be this professional and informative judging by the thumbnail and title

FaisalAES

This channel doesn't just provide the basic tutorials that are already in the documentation, and that's why it's very awesome. Thanks for your genuine content :D

sagnikroy

Could you clarify how, at 7:03, softmax on top of softmax leads to vanishing gradients?

Han-veuh
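
nn.CrossEntropyLoss already applies log-softmax internally, so adding an explicit softmax means the loss sees values squashed into [0, 1]; the resulting nearly-uniform distribution yields much smaller gradients. A minimal sketch with toy logits:

import torch
import torch.nn as nn

torch.manual_seed(0)
targets = torch.randint(0, 10, (8,))
logits = (5 * torch.randn(8, 10)).requires_grad_()

# Correct: pass raw logits (CrossEntropyLoss = log_softmax + NLL)
loss_good = nn.CrossEntropyLoss()(logits, targets)
grad_good = torch.autograd.grad(loss_good, logits)[0]

# Mistake: softmax first, so log-softmax gets applied on top of softmax
probs = torch.softmax(logits, dim=1)
loss_bad = nn.CrossEntropyLoss()(probs, targets)
grad_bad = torch.autograd.grad(loss_bad, logits)[0]

print(grad_good.abs().mean(), grad_bad.abs().mean())  # the second is far smaller
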

Some of my favourites are breaking the computational graph (e.g. using NumPy functions instead of PyTorch ones) or backpropagating somewhere you shouldn't.
Or getting dimensionalities wrong and getting screwed over by NumPy's automatic broadcasting.
Or, in general, not looking for existing PyTorch functions and reinventing the wheel over and over again.

MrCmon
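
A minimal toy example of the graph-breaking mistake:

import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

# Mistake: a NumPy round trip detaches from autograd. PyTorch refuses
# x.numpy() on a tensor that requires grad; x.detach().numpy() "works",
# but anything built from it has no grad_fn, so backprop stops there.
y_broken = torch.from_numpy(x.detach().numpy() ** 2)
print(y_broken.grad_fn)  # None

# Staying in torch keeps the graph intact
y = x ** 2
y.sum().backward()
print(x.grad)  # tensor([2., 4.])
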

The first tip just led me to the solution. Thanks!

emanuelhuber

Great thanks from Russia. Really love your videos. In a very short time I got the PyTorch essentials with the help of your videos. So many models have been understood and implemented with your help. Keep it going, buddy!!!!

gomeincraft

AMAZING VIDEO, THANKS VERY VERY MUCH!!!

mihneaandreescu

These practical tips are really useful.

orjihvy

Extremely informative as always.
Thank you!

sulavojha

These videos are always so fire, thank you sir

pawnagon

You are the best! Just fixed a few things.

ceo-s

Very useful tips for a novice like me.
Thank you!

jim_

Many, many thanks for your video! The contents are all gold for a newbie PyTorch user, and it's such a great guide!

siyuancheng

I made all these mistakes when I was a newbie at PyTorch, and I still make some of them now.
This is a very helpful video.

saminchowdhury

My fun mistake: adding a ReLU as the last layer (before CrossEntropyLoss). The model trains poorly for a while, then just stops training (once all the logits have been driven below zero).

xlxlxl
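
That failure mode is easy to reproduce once the logits have gone negative; a minimal sketch with toy values:

import torch
import torch.nn as nn

targets = torch.randint(0, 10, (4,))
logits = torch.full((4, 10), -1.0, requires_grad=True)

# Mistake: ReLU right before CrossEntropyLoss. Negative logits all clip
# to 0, softmax becomes uniform, and ReLU's gradient is 0 for negative
# inputs, so no learning signal reaches the layers below.
loss = nn.CrossEntropyLoss()(torch.relu(logits), targets)
loss.backward()
print(logits.grad.abs().sum())  # tensor(0.): training has stopped
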

I don't think we need to shuffle the validation or test set, right? There we're only making predictions and computing metrics like loss and accuracy, which are totally unaffected by shuffling.
Please correct me if I'm wrong, thanks.

wolfisraging
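
Right: shuffling matters during training, so batches are decorrelated and each epoch sees a different ordering, while aggregate metrics like loss and accuracy don't depend on order. The usual DataLoader setup reflects that; a sketch with toy tensors:

import torch
from torch.utils.data import DataLoader, TensorDataset

train_ds = TensorDataset(torch.randn(100, 8), torch.randint(0, 2, (100,)))
val_ds = TensorDataset(torch.randn(20, 8), torch.randint(0, 2, (20,)))

train_loader = DataLoader(train_ds, batch_size=16, shuffle=True)  # shuffle train only
val_loader = DataLoader(val_ds, batch_size=16, shuffle=False)     # order is irrelevant here
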

1:52
Low loss doesn't mean overfitting (I agree it's a good idea to run on a small dataset at first, don't get me wrong).

MorisonMs
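
Fair point: driving the loss to near zero on a single batch only shows that the model and training loop can fit data at all; it's a bug check, not a statement about generalization. A minimal sketch of the check, with a hypothetical model and toy shapes:

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x, y = torch.randn(16, 8), torch.randint(0, 2, (16,))  # one fixed batch
for _ in range(500):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
print(loss.item())  # should approach 0; if it doesn't, something is broken
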