Quantization in Neural Networks - Basics Explained | Affine and Symmetric Quantization

This tutorial covers the basics of affine (asymmetric) and symmetric quantization, explaining the math and the intuition behind each approach. It shows how values are mapped from float32 precision to int8 precision.
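As a rough illustration of the float32-to-int8 mapping described above, here is a minimal sketch in plain Python. It is not taken from the video: the function names are hypothetical, the scale is calibrated from the simple min/max of the data, and the target is assumed to be the signed int8 range [-128, 127]. Affine quantization uses a scale and a zero point, while symmetric quantization fixes the zero point at 0.

```python
def affine_params(x_min, x_max, q_min=-128, q_max=127):
    """Scale and zero point for affine (asymmetric) quantization."""
    # Widen the range to contain 0.0 so that zero maps to an exact integer.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / (q_max - q_min)
    if scale == 0.0:
        scale = 1.0  # degenerate case: every value is zero
    zero_point = round(q_min - x_min / scale)
    zero_point = max(q_min, min(q_max, zero_point))
    return scale, zero_point


def symmetric_params(x_min, x_max, q_max=127):
    """Scale for symmetric quantization; the zero point is always 0."""
    max_abs = max(abs(x_min), abs(x_max))
    scale = max_abs / q_max if max_abs > 0 else 1.0
    return scale, 0


def quantize(values, scale, zero_point, q_min=-128, q_max=127):
    """Map floats to integers: q = clamp(round(x / scale) + zero_point)."""
    # Python's round() rounds exact halves to the nearest even integer.
    return [max(q_min, min(q_max, round(v / scale) + zero_point))
            for v in values]


def dequantize(q_values, scale, zero_point):
    """Map integers back to approximate floats: x ~ scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in q_values]


# Illustrative values only (assumed, not from the video).
weights = [-1.0, -0.25, 0.0, 0.5, 2.0]

a_scale, a_zp = affine_params(min(weights), max(weights))
a_q = quantize(weights, a_scale, a_zp)
print("affine   :", a_q, dequantize(a_q, a_scale, a_zp))

s_scale, s_zp = symmetric_params(min(weights), max(weights))
s_q = quantize(weights, s_scale, s_zp)
print("symmetric:", s_q, dequantize(s_q, s_scale, s_zp))
```

In this sketch the affine mapping spreads the observed range [-1.0, 2.0] over all 256 integer levels, whereas the symmetric mapping centers the range on zero and leaves part of the negative integer range unused when the data is skewed. That is the usual trade-off between the two schemes: affine gives finer resolution on skewed data, symmetric avoids the zero-point arithmetic.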

----------------------------------------------------------------------------------------------------------------

Reference materials for further reading.
----------------------------------------------------------------------------------------------------------------

BGM Credits
🔻
Song: "Sappheiros - Falling (Ft. eSoreni) [Chill]" is under a Creative Commons license (CC-BY)
🔺

Comments

Great tutorial! Could you share the third reference, "Nvidia docs on Quantisation Basics"? The page was not found. Thanks!

zhou

Cool! If you increased font size and showed actual use I think that'd add a lot of visibility to this video. The content and explanations are great.

jeffr_ac

I would prefer it without the background music 🥲

ThePdcaster

Quantization in deep learning | Deep Learning Tutorial 49 (Tensorflow, Keras & Python)

Quantization vs Pruning vs Distillation: Optimizing NNs for Inference

Quantization in Deep Learning (LLMs)

tinyML Talks: A Practical Guide to Neural Network Quantization

Downsizing Neural Networks by Quantization - Introduction to Deep Learning

GTC 2021: Systematic Neural Network Quantization

Quantization explained with PyTorch - Post-Training Quantization, Quantization-Aware Training

Understanding int8 neural network quantization

The Evolution of Neural Networks

ICLR Paper: Learn Step Size Quantization

Introduction to Quantization in Deep Neural Networks

Lecture 05 - Quantization (Part I) | MIT 6.S965

LoRA explained (and a bit about precision and quantization)

Quantization in Neural Networks - May 27, 2020

On Quantizing Implicit Neural Representations

Introduction to the quantization of neural networks

Part 1-Road To Learn Finetuning LLM With Custom Data-Quantization,LoRA,QLoRA Indepth Intuition

Neural network quantization with AdaRound

Model Quantization in Deep Neural Network (Post Training)

LLaMa GPTQ 4-Bit Quantization. Billions of Parameters Made Smaller and Smarter. How Does it Work?

Hessian AWare Quantization V3: Dyadic Neural Network Quantization

Quantizing a Deep Learning Network in MATLAB

Learning Highly Sparse Deep Neural Networks through Pruning and Quantization
