DETR: End-to-End Object Detection with Transformers (Paper Explained)

Object detection in images is a notoriously hard task! Objects can be of a wide variety of classes, can be numerous or absent, and can occlude each other or be out of frame. All of this makes it even more surprising that the architecture in this paper is so simple. Thanks to a clever loss function, a single Transformer stacked on a CNN is enough to handle the entire task!
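For a rough sense of how simple the pipeline is, here is a minimal sketch in PyTorch, loosely in the spirit of the demo code in the paper's appendix. The class name, layer sizes, and number of queries are illustrative assumptions, and positional encodings and other details are omitted, so this is not the authors' exact implementation.

```python
import torch
from torch import nn
from torchvision.models import resnet50

class MinimalDETR(nn.Module):
    """Illustrative sketch: CNN backbone -> transformer -> fixed-size set of (class, box) predictions."""
    def __init__(self, num_classes, hidden_dim=256, num_queries=100):
        super().__init__()
        backbone = resnet50()
        self.backbone = nn.Sequential(*list(backbone.children())[:-2])  # drop pooling + fc
        self.proj = nn.Conv2d(2048, hidden_dim, kernel_size=1)          # project features to model width
        self.transformer = nn.Transformer(hidden_dim, nhead=8,
                                          num_encoder_layers=6, num_decoder_layers=6)
        self.query_embed = nn.Parameter(torch.rand(num_queries, hidden_dim))  # learned object queries
        self.class_head = nn.Linear(hidden_dim, num_classes + 1)  # +1 for the "no object" class
        self.bbox_head = nn.Linear(hidden_dim, 4)                  # (cx, cy, w, h), normalized

    def forward(self, images):
        feat = self.proj(self.backbone(images))               # (B, D, H, W) feature map
        B, D, H, W = feat.shape
        src = feat.flatten(2).permute(2, 0, 1)                 # (H*W, B, D): one token per pixel
        tgt = self.query_embed.unsqueeze(1).repeat(1, B, 1)    # (num_queries, B, D)
        hs = self.transformer(src, tgt)                        # decoder output, one vector per query
        return self.class_head(hs), self.bbox_head(hs).sigmoid()
```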

OUTLINE:
0:00 - Intro & High-Level Overview
0:50 - Problem Formulation
2:30 - Architecture Overview
6:20 - Bipartite Match Loss Function
15:55 - Architecture in Detail
25:00 - Object Queries
31:00 - Transformer Properties
35:40 - Results

ERRATA:
When I introduce bounding boxes, I say they consist of x and y, but you also need the width and height.
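In other words, a box needs four numbers, not two. DETR predicts them as normalized (center x, center y, width, height); a tiny illustrative helper (the function name is just for this example) shows the conversion to corner coordinates:

```python
def cxcywh_to_xyxy(cx, cy, w, h):
    """Convert a (center_x, center_y, width, height) box to (x_min, y_min, x_max, y_max)."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

# A box centered in the image and covering half of it in each direction:
print(cxcywh_to_xyxy(0.5, 0.5, 0.5, 0.5))  # (0.25, 0.25, 0.75, 0.75)
```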

Abstract:
We present a new method that views object detection as a direct set prediction problem. Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components like a non-maximum suppression procedure or anchor generation that explicitly encode our prior knowledge about the task. The main ingredients of the new framework, called DEtection TRansformer or DETR, are a set-based global loss that forces unique predictions via bipartite matching, and a transformer encoder-decoder architecture. Given a fixed small set of learned object queries, DETR reasons about the relations of the objects and the global image context to directly output the final set of predictions in parallel. The new model is conceptually simple and does not require a specialized library, unlike many other modern detectors. DETR demonstrates accuracy and run-time performance on par with the well-established and highly-optimized Faster RCNN baseline on the challenging COCO object detection dataset. Moreover, DETR can be easily generalized to produce panoptic segmentation in a unified manner. We show that it significantly outperforms competitive baselines. Training code and pretrained models are available at this https URL.
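To make the "set-based global loss ... via bipartite matching" concrete, here is a rough sketch of the matching step using the Hungarian algorithm from SciPy. The function name and cost weights are illustrative assumptions, and the real matching cost also includes a generalized-IoU term; this is not the authors' code.

```python
import torch
from scipy.optimize import linear_sum_assignment

def hungarian_match(pred_logits, pred_boxes, gt_labels, gt_boxes, w_class=1.0, w_l1=5.0):
    """Match each ground-truth object to exactly one of the fixed set of predictions."""
    # pred_logits: (num_queries, num_classes + 1), pred_boxes: (num_queries, 4)
    # gt_labels:   (num_targets,),                 gt_boxes:   (num_targets, 4)
    prob = pred_logits.softmax(-1)
    cost_class = -prob[:, gt_labels]                     # high class probability -> low cost
    cost_bbox = torch.cdist(pred_boxes, gt_boxes, p=1)   # L1 distance between boxes
    cost = w_class * cost_class + w_l1 * cost_bbox       # (the paper also adds a GIoU term)
    pred_idx, gt_idx = linear_sum_assignment(cost.detach().numpy())
    return pred_idx, gt_idx  # indices of matched (prediction, ground-truth) pairs
```

Predictions left unmatched are trained to output the "no object" class, which is what removes the need for post-processing such as non-maximum suppression.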

Authors: Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko

Comments

This is a gift. The clarity of the explanation, the speed at which it comes out. Thank you for all of your work.

slackstation

Yup. Subscribed with notifications. I love that you enjoy the content of the papers. It really shows! Thank you for these videos.

aashishghosh

Really appreciate the effort you are putting into this. Your paper explanations make my day, every day!

rishabpal

Greatest find on YouTube for me to date!! Thank you for the great videos!

sahandsesoot

I had seen your Attention Is All You Need video, and now, watching this, I am astounded by the clarity of your videos. Subscribed!

ankitbhardwaj

The attention visualizations are practically instance segmentations; very impressive results, and great job untangling it all.

Phobos

A great paper and a great review of the paper! As always nice work!

michaelcarlon

Wow, the way you've explained and broken down this paper is spectacular.
Thanks, mate!

chaouidhuzgen

Great!!! Absolutely great! Fast, to the point, and extremely clear. Thanks!!

opiido

This video was absolutely amazing. You explained this concept really well, and I loved the bit at 33:00 about flattening the image twice and using the rows and columns to create an attention matrix where every pixel can relate to every other pixel. Also loved the bit at the beginning where you explained the loss in detail; a lot of other videos just gloss over that part. Have liked and subscribed.

hackercop
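The flattening mentioned in the comment above is easy to see with toy shapes: the H x W feature map becomes a sequence of H*W tokens, and self-attention then produces an (H*W) x (H*W) matrix in which every pixel can attend to every other pixel. The feature-map size below is just an assumed example, not taken from the paper.

```python
import torch

B, D, H, W = 1, 256, 25, 34                  # assumed backbone output size
feat = torch.randn(B, D, H, W)
tokens = feat.flatten(2).transpose(1, 2)     # (B, H*W, D): one token per pixel
scores = tokens @ tokens.transpose(1, 2) / D ** 0.5
attn = scores.softmax(dim=-1)                # (B, H*W, H*W) attention matrix
print(attn.shape)                            # torch.Size([1, 850, 850])
```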

Awesome video. I highly recommend reading the paper first and then watching this to solidify your understanding. This definitely helped me understand the DETR model better.

adisingh

Thank you for your wonderful video. When I first read this paper, I couldn't understand what the input to the decoder (the object queries) was, but after watching your video I finally got it: they are learned vectors!

zeljnrp

Thank you for this content! I have recommended this channel to my colleagues.

renehaas

Thanks so much for making it so easy to understand these papers.

AishaUroojKhan

Fantastic explanation 👌 Looking forward to more videos ❤️

pranabsarkar

Was waiting for this. Thanks a lot! Also dude, how many papers do you read every day?!!!

ramandutt

"Maximal benefit of the doubt" - love it!

edwarddixon

Very well done and understandable. Thank you!

Gotrek

34:08 GOAT explanation of the bounding boxes in the attention feature map.

oldcoolbroqiuqiu

You are a godsend! Please keep up the good work!

biswadeepchakraborty