NLP Cloud course: Deploy Mistral 7b on an NVIDIA A10 GPU on AWS

This NLP cloud course shows how to deploy and use the Mistral 7b generative AI model on an NVIDIA A10 GPU on AWS.

The Mistral 7b model beats LLaMA 2 7b on all benchmarks and LLaMA 2 13b in many benchmarks. It is actually even on par with the LLaMA 1 34b model.
Deploying and using it requires at least 15GB of VRAM, which is why we need a GPU with at least 24GB of VRAM, such as the A10.
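The 15GB figure follows from the weight size in fp16: two bytes per parameter. A quick back-of-the-envelope check (the helper name and the ~7.24B parameter count are my own, not from the video):

```python
def fp16_weights_gb(n_params: float) -> float:
    """VRAM taken by the model weights alone: 2 bytes per parameter in fp16."""
    return n_params * 2 / 1e9

# Mistral 7b has roughly 7.24 billion parameters, so the weights alone need
# ~14.5GB -- before activations and the KV cache, hence the 24GB A10.
print(fp16_weights_gb(7.24e9))  # → 14.48
```

This also explains the LM Studio comparison below: quantized 4-bit builds shrink that footprint to a fraction of the fp16 size.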

Here is the structure of the course:

00:00 - Intro
01:33 - Creating the right AWS EC2 machine
05:38 - Checking that the A10 GPU is detected
07:13 - Writing a short script that downloads Mistral 7b, converts it to fp16, and performs inference
11:17 - Conclusion
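The script written at 07:13 can be sketched roughly as follows with Hugging Face transformers (a sketch, not the video's exact code; the checkpoint name, prompt, and generation settings are my assumptions, and it assumes `pip install transformers torch accelerate` on the A10 machine):

```python
# Download Mistral 7b, load its weights in fp16 on the GPU, and run inference.
MODEL_ID = "mistralai/Mistral-7B-v0.1"  # assumption: the base (non-Instruct) checkpoint

def generate(prompt: str, max_new_tokens: int = 100) -> str:
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # torch_dtype=torch.float16 converts the weights to fp16 as they load,
    # so the model fits comfortably in the A10's 24GB of VRAM.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, device_map="cuda"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("Generative AI is"))
```

The first call downloads the model weights from the Hugging Face Hub, so expect the initial run to take a while on a fresh EC2 instance.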

Useful links:
Comments

Saw so many videos but didn't find exactly what I was looking for. This one seems perfect, straightforward, and easy to follow. Thank you. I would like to integrate this into my Django Project and use it with my own API. Thank you again.

remo

Much appreciated - exactly what I was looking for, thx.

navicore

Thank you very much for the video, extremely helpful!

amethyst

For anyone who can't find the AMI he's using, they hid it under the "community AMIs" tab :)

wege

Hi, thanks for the video!
A question: why do you need 64GB RAM on the instance if Mistral 7B runs locally on LM Studio using ~4.5GB RAM only?

lycjfue

Is it free, or do I have to pay for an EC2 instance?

ilyassemssaad