Converting from PyTorch to PyTorch Lightning

In this video, William Falcon refactors a PyTorch VAE into PyTorch Lightning. As the video makes clear, this was an honest attempt at refactoring an unfamiliar repository without prior knowledge of it. Even so, the full conversion took under 45 minutes.

This video is meant to show all the details and issues you might run into while converting a model.

The original VAE is here:

The refactored Lightning VAE is here:

00:00 - Intro
00:55 - Why you need PyTorch Lightning (even though PyTorch is already simple)
01:51 - Advantages of 16-bit precision
02:27 - Tour of the PyTorch Lightning repo
03:28 - Finding the "magic" (i.e., the core training-loop code)
07:47 - training_step
10:34 - train_dataloader
12:09 - configure_optimizers
12:54 - training_step vs forward
14:44 - validation_step
23:55 - dataloaders passed into .fit() vs inside LightningModule
26:38 - how to structure forward
29:26 - validation_epoch_end
30:52 - Using tensorboard (or any other logger)
33:59 - automatic model checkpointing
34:44 - how to add all Trainer args to Argparse automatically
35:56 - single-GPU training
38:22 - multi-GPU training
39:32 - 16-bit precision training
40:41 - summary
Comments

I’ve been writing my own train and validation loops, logging, etc., but your refactoring video touched on so many things we do over and over in every model. Thanks for the awesome library and a very useful video showing its benefits.

katnoria

Great tutorial, looking forward to the next one. I'm currently struggling to convert a complex training loop from StyleGAN2.

bogabrain

Really good, I can definitely see the benefit of using Lightning. I've been refactoring my code as I watch. It's not very often I find a YouTube video that I've rewatched as many times as this one. One thing that I may have missed, or that you may have accidentally edited out or skipped: as part of the refactor of the data loaders, you seem to have switched to passing the args to the constructor via the hparams parameter. At 25:04 you mention that "we're going to generalise it (args) in a second". Then you move on to refactoring "forward", and at 32:18 you're still using args, but when you return from tensorboard at 33:06 you're using hparams.
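For reference, the hparams pattern mentioned here means collecting the parsed argparse arguments into one object and handing that whole object to the module's constructor, instead of reading a loose `args` variable. A stdlib-only sketch of the idea; the argument names and the `Model` class are made up for illustration:

```python
from argparse import ArgumentParser

class Model:
    # sketch: a LightningModule of that era took the whole hparams object
    def __init__(self, hparams):
        self.hparams = hparams
        # hyperparameters come from the one shared object
        self.batch_size = hparams.batch_size

parser = ArgumentParser()
parser.add_argument("--batch_size", type=int, default=32)
parser.add_argument("--learning_rate", type=float, default=1e-3)
args = parser.parse_args([])  # empty list so the sketch runs anywhere

model = Model(args)
```

The payoff is that checkpointing and logging can serialize `self.hparams` in one place, rather than chasing individual constructor arguments.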

davewaterworth

I would love a video just on loggers and a maybe a complex NLP example.

SudarshanSrinivasan

Heard the TensorFlow comment and thought "oh, maybe this video is super old"... nope 😅. I do that kind of debugging in TensorFlow 2 all the time.

JackofSome

The refactored Lightning VAE (latest version) cannot automatically download MNIST even with download=True in train_dataloader; an error is raised from val_dataloader. Is this related to the sanity check? I have to download it myself or move the dataloader outside (as you did at the beginning).

CD-kdem

Great video! Could you please clarify how to get a speedup from multi-GPU? Judging from the tqdm line, the training time per epoch gets larger in the multi-GPU case; compare e.g. 37:54 and 38:50.

renatabbyazov

If you're using "cross_entropy_with_logits", shouldn't you remove the "sigmoid" as well?
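The commenter has a point: in PyTorch, `binary_cross_entropy_with_logits` applies the sigmoid internally, so feeding it already-sigmoided outputs squashes twice and changes the loss. A small check in plain PyTorch, independent of the video's code (the example values are arbitrary):

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[-1.2, 0.3, 2.0]])
targets = torch.tensor([[0.0, 1.0, 1.0]])

# the *_with_logits version applies sigmoid internally...
with_logits = F.binary_cross_entropy_with_logits(logits, targets)
manual = F.binary_cross_entropy(torch.sigmoid(logits), targets)

# ...so applying sigmoid before calling it squashes twice and gives a different loss
double_sigmoid = F.binary_cross_entropy_with_logits(torch.sigmoid(logits), targets)
```

Besides correctness, the `_with_logits` form is also the numerically stable one, which is why it exists at all.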

not_a_human_being

I am a fan of PyTorch Lightning. However, I can't find any information about train/eval modes when using batchnorm and dropout. Does PyTorch Lightning handle batchnorm and dropout management automatically? Do I need to call model.train() and model.eval() when I use batchnorm and dropout in PyTorch Lightning?
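For context: Lightning switches the module to eval mode (and disables gradients) around `validation_step` and back to train mode afterwards, so manual `model.train()`/`model.eval()` calls are normally unnecessary. What those modes actually change is plain PyTorch behaviour; dropout makes it easy to see:

```python
import torch
from torch import nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()        # training mode: roughly half the activations are zeroed
train_out = drop(x)

drop.eval()         # eval mode: dropout becomes the identity
eval_out = drop(x)
```

Batchnorm behaves analogously: `train()` updates running statistics per batch, `eval()` freezes and uses them.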

aykutcayir

Please do a video on handling vocab and multi-task learning

thak

I'm a bit busy at the moment, but I'm planning to move my SR project from PyTorch to PL soon. I've been looking at the code and examples, and it seems very straightforward. Just out of curiosity, is there support for multiple losses in the dictionary, or only 'loss'? I test multiple loss functions and usually log them individually to track their behavior. I can change that, but I'd just like to know whether that's an option. Thanks!
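On multiple losses: the key that drives the backward pass is 'loss', but nothing stops you from computing several terms, logging each one individually, and returning their sum as the trainable loss. A sketch of just the loss bookkeeping in plain PyTorch; the term names, the 0.5 weight, and the commented log calls are illustrative:

```python
import torch
import torch.nn.functional as F

def compute_losses(pred, target):
    # several loss terms tracked individually, one combined term for backward
    l1 = F.l1_loss(pred, target)
    mse = F.mse_loss(pred, target)
    total = l1 + 0.5 * mse
    # inside a training_step you would log the parts separately, e.g.
    # self.log("l1", l1); self.log("mse", mse)
    return {"loss": total, "l1": l1, "mse": mse}
```

Only the combined tensor under "loss" is backpropagated; the extra entries exist purely for monitoring.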

Phobos

I'm too lazy to read more right now. One quick question: how well does the multi-GPU support handle batchnorm?

menglin

Hey, just started using Lightning. I like that you don't have to faff about with .to(device). Just a bit scary that it handles all the optimizer stuff.
This tutorial was really helpful. I think if you did a series on different models it would gain a lot of traction, especially if this becomes the de facto standard for NeurIPS.
I had a couple of questions.
Why do you not call loss.item()? I'm hoping Lightning deals with this efficiently. I notice that when I do use .item() it doesn't work.
I was also wondering what your process is with the validation, train, and (test) loaders. Do you just do a split, not shuffle validation, do early stopping with validation, and keep a separate set just for .test() at the end?
Thanks a lot.
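On the .item() question: Lightning needs the loss as a tensor still attached to the autograd graph so it can call backward on it, whereas `.item()` returns a plain Python float with no graph, which is why returning it fails. Plain PyTorch shows the difference (the toy values are arbitrary):

```python
import torch

w = torch.tensor(2.0, requires_grad=True)
loss = (w - 1.0) ** 2   # a tensor attached to the autograd graph

loss.backward()          # works: gradients flow back to w
assert w.grad is not None

detached = loss.item()   # a plain Python float: no graph, nothing to backprop
```

So return the tensor from training_step and use `.item()` (or `self.log`) only for reporting.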

jordieclive

Why should the shuffle be turned off in the validation dataloader?

MrErkout

4:10 Do you have a video that covers more complicated models? My model is an extremely complicated multi-file, multi-package model. I hit one error about "device=self.device", which I fixed with the ugly hack of hardcoding "device='cuda:0'" (which only works if you have a CUDA GPU). I tried both to_torchscript() and to_onnx() on my model and both fail.

ThinkTank

How do we pass in multiple GPU indices, i.e. if we want to use specific GPUs like gpus = [0, 2, 3, 8], using only the 1st, 3rd, 4th, and 9th GPU? How do we do that?
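For what it's worth, the Trainer in the Lightning version shown in the video accepted exactly that kind of index list; a config sketch (newer releases renamed the argument, so check your installed version's docs):

```python
# pass explicit device indices rather than a count
trainer = Trainer(gpus=[0, 2, 3, 8])   # 1st, 3rd, 4th and 9th GPU
# equivalently, as a string: Trainer(gpus="0,2,3,8")
```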

imflash

I am unable to run on TPUs in both Kaggle and Colab... in Kaggle it isn't using the TPU.

varunsai

Can I technically skip defining a forward method and just use self.encode and self.decode directly? Someone assist, please!

Darkev

Has the return value of the "training_step" method changed? I see in the documentation that you now return just the loss, not a dictionary :)

riccardmenoli

Where do I set the number of epochs, since there's only Trainer and fit()?
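For reference, the epoch count is a Trainer argument (`max_epochs` exists across Lightning versions); a config sketch with an illustrative value:

```python
# epoch count is a Trainer argument, not a loop you write yourself
trainer = Trainer(max_epochs=10)
trainer.fit(model)
```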

faraway