Accelerate Transformer inference on GPU with Optimum and Better Transformer

In this video, I show you how to accelerate Transformer inference with Optimum, an open-source library by Hugging Face, and Better Transformer, a fastpath execution API built into PyTorch since version 1.12.
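
As a rough illustration of that workflow, here is a minimal sketch of the one-line conversion using Optimum's `BetterTransformer.transform` API; the checkpoint name is a public stand-in for the fine-tuned DistilBERT model used in the video:

```python
from transformers import AutoModelForSequenceClassification
from optimum.bettertransformer import BetterTransformer

# Illustrative public checkpoint; the video uses a privately fine-tuned DistilBERT model.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english"
).to("cuda")

# The single line that swaps the model's encoder layers for PyTorch's
# Better Transformer fastpath kernels.
model = BetterTransformer.transform(model)
```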

Using an AWS instance equipped with an NVIDIA V100 GPU, I start from two models that I previously fine-tuned: a DistilBERT model for text classification and a Vision Transformer model for image classification. I first benchmark the original models, then use Optimum and Better Transformer to optimize them with a single line of code, and benchmark them again. This simple process delivers a 20-30% speedup with no accuracy drop!
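
For the benchmarking step, a simple sketch like the one below is enough to compare average GPU latency before and after the conversion. The ViT checkpoint, batch size, and iteration counts are illustrative assumptions, not the exact setup from the video:

```python
import time
import torch
from transformers import AutoModelForImageClassification
from optimum.bettertransformer import BetterTransformer

device = "cuda"  # the video benchmarks on an NVIDIA V100

# Illustrative public checkpoint; the video uses a privately fine-tuned ViT model.
model = AutoModelForImageClassification.from_pretrained(
    "google/vit-base-patch16-224"
).to(device).eval()

# Dummy batch of 8 RGB images at ViT's expected 224x224 resolution.
pixel_values = torch.randn(8, 3, 224, 224, device=device)

def avg_latency_ms(m, n_iters=100, warmup=10):
    # Warm up, then measure the average forward-pass latency in milliseconds.
    with torch.inference_mode():
        for _ in range(warmup):
            m(pixel_values=pixel_values)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(n_iters):
            m(pixel_values=pixel_values)
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / n_iters * 1000

baseline = avg_latency_ms(model)

# Convert a copy of the model so the original stays available for comparison.
bt_model = BetterTransformer.transform(model, keep_original_model=True)
optimized = avg_latency_ms(bt_model)

print(f"original: {baseline:.1f} ms | Better Transformer: {optimized:.1f} ms "
      f"({(baseline / optimized - 1) * 100:.0f}% faster)")
```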

⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos ⭐️⭐️⭐️

Comments

Hey, I need some help. I like to mess around with these AI models on Hugging Face just for fun. I had an NVIDIA GPU with 12 GB of VRAM and decided to go for an AMD 7900 XT, which has 20 GB of VRAM, to see if I could do a bit more with the models. Now I can't run PyTorch on the GPU. It appears I need Linux to do it, but I've also heard Linux doesn't run well on my GPU. I suppose I could try dual booting, but the problem there is that my PC is also a media server in my house, and I have programs I need that don't work on Linux. I'd really prefer one OS that does it all, but I can't seem to figure it out.

theonerm