State of the Art Deep Learning Based Object Detection in 2D

Dr. James G. Shanahan - Church and Duncan Group and UC Berkeley

Abstract:
Object detection, one of the most challenging problems in computer vision (CV), aims to predict a bounding box and a category label for each object of interest in an image or point cloud. It has a variety of exciting downstream applications, including self-driving cars, checkout-less shopping, smart cities, and cancer detection. Deep learning has revolutionized the field over the past five years, during which two-stage approaches to object detection have given way to simpler, more efficient one-stage models. Mean average precision (mAP) on benchmarks such as the COCO object detection dataset has improved almost 4x over that period, from 15% (Fast R-CNN, a two-stage approach) to 55% (EfficientDet-D7x, a one-stage approach). This talk looks under the hood of state-of-the-art object detection systems: two-stage models, one-stage models, and more recent transformer-based approaches. It provides a cheat sheet on how to jumpstart your detection pipelines or upgrade them to the state of the art, and highlights some of the key challenges that remain. Examples and associated detection pipelines will be provided in Jupyter Notebooks using Python, PyTorch, and OpenCV. While the primary focus is on object detection in digital images from cameras and videos, the talk also introduces object detection in 3D point clouds.
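As a point of reference for the mAP figures quoted above: detection benchmarks such as COCO decide whether a predicted box matches a ground-truth box by their intersection over union (IoU). The snippet below is a minimal sketch of that overlap measure in plain Python, not code from the talk's notebooks. The function name `iou` and the `(x1, y1, x2, y2)` corner format are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are assumed to be (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region; zero if the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of the two areas minus the doubly counted overlap.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0
```

A prediction is typically counted as a true positive when its IoU with a ground-truth box of the same category exceeds a chosen threshold. COCO-style mAP averages precision over a range of such thresholds and over all categories.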