How to Achieve the Fastest CPU Inference Performance for Object Detection YOLO Models

Topics covered:

1. Object detection background, including history and current solutions
2. Easy ways to optimize/sparsify YOLOv5 models
3. Applying your own data with sparse transfer learning or sparsifying from scratch (see the sketch after this list)
4. Deploying YOLOv5 by exporting to ONNX and running inference in the DeepSparse Engine on commodity CPUs at GPU speeds
5. Future research, next steps, and open discussion
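For context on item 3: sparsification with SparseML is driven by a recipe file that schedules pruning and quantization during training. The following is a minimal sketch, not the video's exact code, of how a recipe wraps a standard PyTorch loop; the model, dataset, and "recipe.yaml" path are illustrative stand-ins for your own YOLOv5 module and data.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from sparseml.pytorch.optim import ScheduledModifierManager

# Stand-ins for a real YOLOv5 module and dataset; replace with your own.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Flatten(), nn.Linear(16 * 62 * 62, 10)
)
data = TensorDataset(torch.randn(32, 3, 64, 64), torch.randint(0, 10, (32,)))
train_loader = DataLoader(data, batch_size=8)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# The recipe (assumed local "recipe.yaml") declares the pruning/quantization
# schedule; the manager wraps the optimizer so modifiers fire as training runs.
manager = ScheduledModifierManager.from_yaml("recipe.yaml")
optimizer = manager.modify(model, optimizer, steps_per_epoch=len(train_loader))

for epoch in range(int(manager.max_epochs)):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()

manager.finalize(model)  # strip training-time hooks, leaving the sparse weights
```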

After watching this video, you’ll be able to optimize your computer vision models, apply your own data with a few lines of code, and deploy them on commodity CPUs at GPU-level speeds.
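As a concrete example of that deployment step, here is a minimal sketch using the DeepSparse Python API, assuming `deepsparse[yolo]` is installed; the SparseZoo model stub and image filename are illustrative assumptions, not code taken from the video.

```python
from deepsparse import Pipeline

# SparseZoo stub for a pruned+quantized YOLOv5s; treat the exact stub as an
# assumption and substitute your own exported ONNX file path if it differs.
yolo = Pipeline.create(
    task="yolo",
    model_path="zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/pruned_quant-aggressive_94",
)

# Run CPU inference on a local image (hypothetical filename).
predictions = yolo(images=["sample.jpg"])
print(predictions)  # bounding boxes, class labels, and confidence scores
```

The same pipeline can also be exposed over HTTP via the `deepsparse.server` CLI, which is what the "Deploying - Python & server" segment of the timeline covers.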

Timeline:

00:00 - Intro & agenda
04:15 - Background - solutions & history
06:57 - Background - Ultralytics & compound scaling
09:50 - Why sparsify object detection models
11:00 - Sparse transfer learning
17:20 - Sparsifying from scratch
25:18 - Deploying - exporting to ONNX
26:39 - Deploying - setup, annotation, and benchmarking
31:24 - Deploying - Python & server
34:33 - Deploying - DeepSparse licensing
35:03 - Future research
37:21 - Open discussion & Q&A

Comments

Can you please tell me how I can serve this pruned+quantized YOLO model as an API?

soumyaprusty

This is great, but could we expect a detailed architecture or the steps that were performed for the sparsification?

dhruvrajlathiya

Is SparseML available for ARM-architecture CPUs?

anime_on_data

We need YOLOv7 in DeepSparse. Is that too much to ask for?

modeltrainer