Efficient PyTorch debugging with PyTorch Lightning


In this video we cover 4 features of PyTorch Lightning to make deep learning research easier.

When debugging neural networks, Lightning has these 4 Trainer flags which can help:

Trainer(num_sanity_val_steps=5)
This flag runs 5 batches of validation before any training begins, so if you have a bug in the validation loop, it won't take you a whole epoch of training to find it.
This flag is on by default (Lightning runs 2 sanity validation batches unless you override it).
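As a minimal sketch of how this fits into a training script (LitModel and make_loader here are made-up stand-ins for your own LightningModule and data):

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

# Hypothetical toy module -- substitute your own LightningModule.
class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return nn.functional.mse_loss(self.layer(x), y)

    def validation_step(self, batch, batch_idx):
        x, y = batch
        # A bug here would normally stay hidden until the first epoch
        # ends; the sanity check surfaces it before training starts.
        self.log("val_loss", nn.functional.mse_loss(self.layer(x), y))

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)

def make_loader():
    x, y = torch.randn(256, 32), torch.randn(256, 1)
    return DataLoader(TensorDataset(x, y), batch_size=32)

# Run 5 validation batches before any training begins.
trainer = pl.Trainer(num_sanity_val_steps=5, max_epochs=1)
trainer.fit(LitModel(), make_loader(), make_loader())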

Trainer(fast_dev_run=True)
This flag is a sort of "unit test": it runs through every line of your code by executing one batch of training and one batch of validation, and then the Trainer stops.
It's analogous to hitting "compile" in an IDE, in the sense that it will crash if you have any bugs, or complete if everything is fine.
Note: this doesn't catch math or logic bugs, only crashes and code errors such as shape mismatches.
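A minimal sketch, reusing any LightningModule (e.g. the hypothetical LitModel above):

import pytorch_lightning as pl

# fast_dev_run executes a single batch of training and a single batch
# of validation, then stops -- a quick smoke test of the whole loop.
trainer = pl.Trainer(fast_dev_run=True)
# trainer.fit(LitModel(), make_loader(), make_loader())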

When you finish implementing your model, you normally want to overfit a tiny percentage of your data. If you can't do this, then your model likely has bugs.

This is the same idea as overfit_pct but allows you to be more granular about how much data from train and val to use, as sketched below.
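The flag names have changed since this video: assuming a current Lightning version, overfit_pct has been replaced by overfit_batches, and the more granular per-split controls are limit_train_batches / limit_val_batches. A sketch:

import pytorch_lightning as pl

# Overfit on a tiny, fixed subset: 1% of the training data every epoch.
# If the loss won't go to ~0 on this subset, the model likely has a bug.
trainer = pl.Trainer(overfit_batches=0.01)

# More granular control over how much of each split is used per epoch:
trainer = pl.Trainer(limit_train_batches=0.1,  # 10% of train batches
                     limit_val_batches=0.05)   # 5% of val batches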

Colab demo:
