Quantization vs Pruning vs Distillation: Optimizing NNs for Inference

Four techniques to speed up your model's inference:
0:38 - Quantization
5:59 - Pruning
9:48 - Knowledge Distillation
13:00 - Engineering Optimizations

Comments

This was one of the best explanation videos I have ever seen! Well structured and pitched at the right level of complexity to follow without getting a headache. 👌

thomasschmitt

This felt very nicely taught. I loved that you pulled back for a summary/review at the end of the video; great practice. Please continue, thank you!

_gunna

Excellent video. Well spoken. Nice visualizations.

muhannadobeidat

Wonderfully explained!
Thanks for the video.

vineetkumarmishra

Great summary/outline at 17:16
This video covers a lot of relevant topics for neural networks and edge AI.

carlpeterson

Great format, succinctness, and diagrams. Thank you!

ljkeller_yt

That was really nicely done. As a non-expert, I feel like I now have a good general idea of what a quantized model is. Thank you.

bonob

Fantastic introduction and explanation!

huiwencheng

Great content, well done. Please make a video on ONNX, and another one on Flash Attention. Appreciated.

unclecode

Your teaching is excellent. We would welcome many more videos from you to understand the fundamentals of NLP.

RamBabuB-rs

Thank you for the video, Sir.
Please, is quantization just a feature-engineering task of enforcing data types that take up less space, or is it more than that?

tosinadekunle
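
To illustrate the question above: quantization is more than enforcing a smaller data type. The float values are mapped onto an integer grid via a computed scale (and, in some schemes, an offset), rounded, and clipped; a plain dtype cast would destroy the values. A minimal NumPy sketch of symmetric abs-max int8 quantization (illustrative, not taken from the video):

```python
import numpy as np

def quantize_absmax(x: np.ndarray):
    # Compute a scale so the largest-magnitude value maps to 127,
    # then round and clip: a real transformation, not a mere cast.
    scale = 127.0 / np.max(np.abs(x))
    q = np.clip(np.round(x * scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover approximate float values; the rounding error is the
    # quantization noise that can cost a little accuracy.
    return q.astype(np.float32) / scale

w = np.array([-0.8, -0.1, 0.0, 0.3, 0.9], dtype=np.float32)
q, scale = quantize_absmax(w)
print(q)                     # e.g. [-113  -14    0   42  127]
print(dequantize(q, scale))  # close to w, but not exact
```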

Excellent video, learnt a lot! However, the definition of zero-point quantization is off: what you're showing in the video is abs-max quantization instead.

yunlu
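
On the correction above: the usual distinction is that abs-max quantization uses a single scale and keeps the integer grid symmetric around zero (as in the earlier sketch in this thread), while zero-point quantization adds an offset so that an asymmetric range, e.g. all-positive activations, can use the full int8 range. A minimal NumPy sketch of the zero-point variant (illustrative, not the video's exact formulation):

```python
import numpy as np

def quantize_zero_point(x: np.ndarray):
    # Map [x.min(), x.max()] onto [-128, 127]; the zero-point offset
    # shifts the integer grid so the range need not straddle zero.
    scale = 255.0 / (x.max() - x.min())
    zero_point = np.round(-x.min() * scale) - 128
    q = np.clip(np.round(x * scale + zero_point), -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) / scale

x = np.array([0.1, 0.4, 1.2, 2.0], dtype=np.float32)  # all-positive range
q, s, z = quantize_zero_point(x)
print(q)                    # uses the full int8 range despite x > 0
print(dequantize(q, s, z))  # approximately recovers x
```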

And what if one were to quantize a distilled model? Is the outcome any good?

ricardokullock

I heard multiply-by-zero operations are faster to process. Are you sure all operations take the same amount of time?

julians
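
On the question above: on dense hardware a multiply by zero takes the same time as any other multiply, so unstructured pruning, which merely zeroes individual weights, does not speed up a dense matmul by itself; the savings come from sparse kernels or structured pruning that removes whole rows, columns, or heads. A minimal NumPy sketch of unstructured magnitude pruning (illustrative, not the video's code):

```python
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    # Zero out the smallest-magnitude fraction of weights. The result
    # is still a dense array: without a sparse format or structured
    # pruning, every zero is still multiplied at inference time.
    k = int(w.size * sparsity)
    threshold = np.sort(np.abs(w), axis=None)[k]
    return w * (np.abs(w) >= threshold)

w = np.random.randn(4, 4).astype(np.float32)
print(magnitude_prune(w, sparsity=0.5))  # about half the entries are 0
```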

The explanation of distillation stays at the surface; it is not enough to really understand it.

andrea-mjce
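
To add a little depth on distillation: the student is trained not only on the hard labels but also on the teacher's temperature-softened output distribution, which carries information about how the teacher ranks the wrong classes. A minimal PyTorch sketch of a distillation loss in the style of Hinton et al. (2015); the temperature T and mixing weight alpha here are illustrative choices, not values from the video:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T: float = 2.0, alpha: float = 0.5):
    # Soft part: KL divergence between the temperature-softened
    # teacher and student distributions, scaled by T^2 to keep
    # gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard part: the usual cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Usage: the teacher is frozen; gradients flow only into the student.
student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
distillation_loss(student_logits, teacher_logits, labels).backward()
```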