Model Quantization in Deep Neural Network (Post Training)


#machinelearning #neuralnetwork #quantization

In this video we talk about post-training model quantization, which allows reduced-precision representations of weights and possibly activations. This helps reduce the model's storage footprint as well as its computational needs. Quantization has many advantages, especially when models are expected to be deployed in low-compute or low-storage environments such as mobile, edge, or IoT devices.
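To make the storage claim concrete, here is a minimal sketch of post-training weight quantization using plain NumPy. It assumes a simple per-tensor affine (asymmetric) int8 scheme; the helper names `quantize_int8` and `dequantize` are hypothetical and not taken from the video or from any particular framework.

```python
import numpy as np

def quantize_int8(weights):
    """Map a float32 array to int8 with a per-tensor scale and zero point.

    Hypothetical helper for illustration: a simple affine scheme,
    not the exact method of any specific library.
    """
    w_min, w_max = float(weights.min()), float(weights.max())
    # Widen the range to include zero so that 0.0 is representable.
    w_min, w_max = min(w_min, 0.0), max(w_max, 0.0)
    scale = (w_max - w_min) / 255.0          # 256 int8 levels span the range
    if scale == 0.0:                         # all-zero tensor: avoid dividing by zero
        scale = 1.0
    zero_point = int(round(-128 - w_min / scale))
    q = np.clip(np.round(weights / scale) + zero_point, -128, 127)
    return q.astype(np.int8), scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float32 values from the int8 representation."""
    return (q.astype(np.float32) - zero_point) * scale

# Stand-in for a trained layer's weights (random values, assumed for the demo).
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.5, size=(256, 256)).astype(np.float32)

q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)

print("float32 bytes:", weights.nbytes)
print("int8 bytes:   ", q.nbytes)
print("max abs error:", float(np.abs(weights - restored).max()))
```

The int8 copy takes a quarter of the float32 storage, and the rounding error per weight stays within about half of `scale`. Whether that error matters for accuracy depends on the model and has to be measured on real data.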

Below are some articles where you can read more on this topic.

There are also options for quantization-aware training, which you can read about below.
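As a rough illustration of the idea behind quantization-aware training, the sketch below shows "fake quantization": weights are rounded to an integer grid and immediately converted back to float during the forward pass, so the model sees the rounding error while it trains. This is only the forward-pass half, under the assumption of symmetric per-tensor 8-bit quantization; the function name `fake_quantize` is hypothetical, and gradient handling (commonly a straight-through estimator) is left out.

```python
import numpy as np

def fake_quantize(x, num_bits=8):
    """Quantize then dequantize: the output is float32 but lies on the integer grid.

    Hypothetical helper for illustration; symmetric, per-tensor scaling.
    """
    qmax = 2 ** (num_bits - 1) - 1                    # 127 for 8 bits
    max_abs = float(np.abs(x).max())
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax, qmax)     # integer-valued floats
    return (q * scale).astype(np.float32)

# Random stand-ins for a layer's weights and a batch of inputs (assumed data).
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.5, size=(64, 32)).astype(np.float32)
x = rng.normal(0.0, 1.0, size=(8, 64)).astype(np.float32)

y_float = x @ w                    # ordinary forward pass
y_fake = x @ fake_quantize(w)      # forward pass that sees quantization error

print("distinct weight values:", np.unique(fake_quantize(w)).size)
print("max output difference: ", float(np.abs(y_float - y_fake).max()))
```

Training against `y_fake` instead of `y_float` is what lets the weights adapt to the rounding, which is why this approach is usually described as more robust than quantizing only after training.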


Comments

Thanks for the helpful video! Very clear~

kyoungsub

I hope this channel gains popularity within the AI community.

RamkrishanYT

I am reviewing it again, and I just hope the voice quality can be better next time!
One question: at 1:18, you mention that quantization can increase the performance of the model.
May I ask in what sense you mean that?

kyoungsub

Is it true that the accuracy of the model drops because of quantization?

prudhvirajboddu

Great video, sir!
Are you on Twitter? I'd love to follow you!

HeyFaheem
