Boost YOLO Inference Speed and Reduce Memory Footprint Using ONNX Runtime | Part-1

In this stream we convert Ultralytics YOLO models from their standard PyTorch (.pt) format to ONNX and run them through ONNX Runtime, to improve their inference speed and reduce their memory footprint. Our goal is to get the predictions to match exactly across both compute sessions and to benchmark the improvements. For this first video of the series we work with OBB (Oriented Bounding Box) models.
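
For reference, a minimal sketch of the export-and-run workflow described above, assuming the ultralytics and onnxruntime packages are installed. The model name "yolov8n-obb.pt" and the 640x640 input size are illustrative assumptions, not necessarily what the stream uses:

import numpy as np
import onnxruntime as ort
from ultralytics import YOLO

# Export the PyTorch (.pt) OBB model to ONNX; export() returns the .onnx path.
model = YOLO("yolov8n-obb.pt")  # assumed model name, for illustration
onnx_path = model.export(format="onnx")

# Run the exported model through ONNX Runtime.
session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

# Dummy input matching the default 640x640 export resolution.
dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)
outputs = session.run(None, {input_name: dummy})
print(outputs[0].shape)

Feeding the same preprocessed image to both the PyTorch model and this session is what lets you check that the predictions match before benchmarking.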
Comments

Bro, why is my uint8-quantized ONNX model, which I quantized with onnxruntime's dynamic quantization, slower than the default (float32) ONNX model? Btw, I export from the Ultralytics .pt model and run inference through Ultralytics too, not onnxruntime directly.

gomgom
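
For context, the dynamic quantization step the commenter describes is typically done with onnxruntime's quantization tool. A minimal sketch; the model paths here are placeholders:

from onnxruntime.quantization import quantize_dynamic, QuantType

# Quantize weights to uint8; activation scales are computed dynamically at runtime.
quantize_dynamic(
    model_input="model.onnx",        # placeholder path to the float32 model
    model_output="model_int8.onnx",  # placeholder output path
    weight_type=QuantType.QUInt8,    # uint8 weights, as in the comment
)

Whether this actually speeds anything up depends on the hardware and on which ops get quantized: if the integer kernels for a model's dominant ops (convolutions, in YOLO's case) are less optimized than the float32 paths, the quantize/dequantize overhead can make the model slower overall, which may explain the slowdown the commenter sees.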