How to Achieve the Fastest CPU Inference Performance for Object Detection YOLO Models

Topics covered:

1. Object detection background, including history and current solutions
2. Easy ways to optimize/sparsify YOLOv5 models
3. Applying your own data with sparse transfer learning or sparsifying from scratch (see the sketch after this list)
4. Deploying YOLOv5 by exporting to ONNX and running inference in the DeepSparse Engine on commodity CPUs at GPU speeds
5. Future research, next steps, and open discussion
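For context on item 3: sparsification with SparseML is driven by a recipe file that schedules pruning and quantization during training. The following is a minimal sketch, not the video's exact code, of how a recipe wraps a standard PyTorch loop; the model, dataset, and "recipe.yaml" path are illustrative stand-ins for your own YOLOv5 module and data.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from sparseml.pytorch.optim import ScheduledModifierManager

# Stand-ins for a real YOLOv5 module and dataset; replace with your own.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Flatten(), nn.Linear(16 * 62 * 62, 10)
)
data = TensorDataset(torch.randn(32, 3, 64, 64), torch.randint(0, 10, (32,)))
train_loader = DataLoader(data, batch_size=8)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# The recipe (assumed local "recipe.yaml") declares the pruning/quantization
# schedule; the manager wraps the optimizer so modifiers fire as training runs.
manager = ScheduledModifierManager.from_yaml("recipe.yaml")
optimizer = manager.modify(model, optimizer, steps_per_epoch=len(train_loader))

for epoch in range(int(manager.max_epochs)):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()

manager.finalize(model)  # strip training-time hooks, leaving the sparse weights
```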

After watching this video, you’ll be able to optimize your computer vision models, apply your own data with a few lines of code, and deploy them on commodity CPUs at GPU-level speeds.
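As a concrete example of that deployment step, here is a minimal sketch using the DeepSparse Python API, assuming `deepsparse[yolo]` is installed; the SparseZoo model stub and image filename are illustrative assumptions, not code taken from the video.

```python
from deepsparse import Pipeline

# SparseZoo stub for a pruned+quantized YOLOv5s; treat the exact stub as an
# assumption and substitute your own exported ONNX file path if it differs.
yolo = Pipeline.create(
    task="yolo",
    model_path="zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/pruned_quant-aggressive_94",
)

# Run CPU inference on a local image (hypothetical filename).
predictions = yolo(images=["sample.jpg"])
print(predictions)  # bounding boxes, class labels, and confidence scores
```

The same pipeline can also be exposed over HTTP via the `deepsparse.server` CLI, which is what the "Deploying - Python & server" segment of the timeline covers.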

Timeline:

00:00 - Intro & agenda
04:15 - Background - solutions & history
06:57 - Background - Ultralytics & compound scaling
09:50 - Why sparsify object detection models
11:00 - Sparse transfer learning
17:20 - Sparsifying from scratch
25:18 - Deploying - exporting to ONNX
26:39 - Deploying - setup, annotation, and benchmarking
31:24 - Deploying - Python & server
34:33 - Deploying - DeepSparse licensing
35:03 - Future research
37:21 - Open discussion & Q&A

Comments

Can you please tell me how I can serve this pruned+quantized YOLO model as an API?

soumyaprusty

This is great, but could we expect a detailed architecture or the steps that were performed for the sparsification?

dhruvrajlathiya

Is SparseML available for ARM-architecture CPUs?

anime_on_data

We need YOLOv7 in DeepSparse. Is that too much to ask for?

modeltrainer