Adrian Boguszewski - Beyond the Continuum: The Importance of Quantization in Deep Learning

Quantization is the process of mapping continuous values to a finite set of discrete values. It is a powerful technique that can significantly reduce the memory footprint and computational requirements of deep learning models, making them more efficient and easier to deploy on resource-constrained devices. In this talk, we will explore the different types of quantization techniques and discuss how they can be applied to deep learning models. In addition, we will cover the basics of NNCF and the OpenVINO Toolkit and see how they work together to achieve outstanding performance - everything in a Jupyter Notebook, so you can try it at home.
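The talk's notebook is not reproduced here, but a rough sketch of what 8-bit post-training quantization with NNCF and OpenVINO typically looks like is shown below. The tiny model and random calibration data are placeholders for illustration only; a real workflow would use a trained network and a representative calibration set.

```python
# Minimal sketch of post-training quantization with NNCF + OpenVINO.
# Assumes the nncf and openvino packages are installed; the model and
# calibration data below are toy stand-ins, not part of the talk.
import nncf
import openvino as ov
import torch

# Placeholder FP32 model and random calibration samples
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, 3, padding=1),
    torch.nn.ReLU(),
    torch.nn.Flatten(),
    torch.nn.Linear(8 * 32 * 32, 10),
).eval()
calibration_loader = [torch.rand(1, 3, 32, 32) for _ in range(100)]

# Wrap the data source so NNCF can iterate over it; the transform
# function returns exactly what the model's forward() expects
calibration_dataset = nncf.Dataset(calibration_loader, lambda item: item)

# Post-training quantization: NNCF inserts FP32 -> INT8 quantizers
# (q = round(x / scale) + zero_point) and calibrates their ranges on
# the samples above - no retraining required
quantized_model = nncf.quantize(model, calibration_dataset)

# Convert to OpenVINO IR and compile for CPU inference
ov_model = ov.convert_model(quantized_model, example_input=torch.rand(1, 3, 32, 32))
compiled_model = ov.Core().compile_model(ov_model, device_name="CPU")
```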
Comments

Nice talk! I think that if Post-Training Quantization costs so little accuracy (while greatly reducing model size), and you hint that Quantization-Aware Training is better but obviously cannot be applied to already released models, the LLM field will soon have to switch to it. Unless quantization is just one of many techniques - not the optimal one, but a convenient patch for the present time.

DanieleO.