LLMOps: OpenVINO Toolkit INT4 quantization of Llama 3.2 3B, CPU inference #datascience #machinelearning

In this video I will show you how to convert the Llama 3.2 3B model to the OpenVINO IR format and quantize it to INT4. Then we will run inference on CPU using chain-of-thought (CoT) prompts.
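The conversion and INT4 quantization step can be done in one command with the `optimum-cli` tool from Optimum Intel. This is a sketch, not necessarily the exact command used in the video; the model ID and output directory name are assumptions:

```shell
# Export Llama 3.2 3B to OpenVINO IR with INT4 weight compression.
# Assumes optimum-intel is installed: pip install "optimum[openvino]"
# The model ID and output directory below are illustrative.
optimum-cli export openvino \
  --model meta-llama/Llama-3.2-3B-Instruct \
  --weight-format int4 \
  llama-3.2-3b-int4-ov
```

INT4 weight-only quantization shrinks the model roughly 4x compared to FP16 weights, which is what makes CPU inference of a 3B-parameter model practical.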

Notebook:
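For the CPU inference step, a minimal sketch using the Optimum Intel `OVModelForCausalLM` class is shown below. The model directory name and the CoT prompt are assumptions for illustration; this is not necessarily the code from the notebook:

```python
# Hypothetical sketch: load the INT4 OpenVINO IR model and run a
# chain-of-thought (CoT) prompt on CPU. Requires: pip install "optimum[openvino]"
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_dir = "llama-3.2-3b-int4-ov"  # assumed output dir from the export step

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = OVModelForCausalLM.from_pretrained(model_dir, device="CPU")

# A CoT-style prompt: asking the model to reason step by step.
prompt = (
    "Q: A train travels 60 km in 1.5 hours. What is its average speed?\n"
    "A: Let's think step by step."
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The "Let's think step by step" suffix is the standard zero-shot CoT trigger, which nudges the model to produce intermediate reasoning before the final answer.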