YOLO Object Detection (TensorFlow tutorial)

You Only Look Once - this object detection algorithm is currently the state of the art, outperforming R-CNN and its variants. I'll go into some of the different object detection algorithm improvements over the years, then dive into YOLO theory and a programmatic implementation using TensorFlow!
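To give a rough idea of what such an implementation has to do, here is a minimal sketch of decoding a YOLO-v1-style output tensor (an S x S grid where every cell predicts B boxes and C class probabilities). The tensor layout, function name, and default values are illustrative assumptions, not the exact code from the video:

import numpy as np

def decode_yolo_output(preds, S=7, B=2, C=20, conf_threshold=0.2):
    """Decode a YOLO-v1-style prediction tensor of shape (S, S, B*5 + C).

    Assumed layout per cell: B boxes as (x, y, w, h, confidence),
    followed by C conditional class probabilities. Returns a list of
    (class_id, score, x, y, w, h) tuples above the confidence threshold.
    """
    detections = []
    for row in range(S):
        for col in range(S):
            cell = preds[row, col]
            class_probs = cell[B * 5:]                 # conditional class probabilities
            for b in range(B):
                x, y, w, h, conf = cell[b * 5: b * 5 + 5]
                scores = conf * class_probs            # class-specific confidence
                best = int(np.argmax(scores))
                if scores[best] > conf_threshold:
                    detections.append((best, float(scores[best]), x, y, w, h))
    return detections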

Code for this video:

Please Subscribe! And like. And comment. That's what keeps me going.

Want more inspiration & education? Follow me:

More learning resources:

Join us in the Wizards Slack channel:

And please support me on Patreon:
Signup for my newsletter for exciting updates in the field of AI:
Comments

I literally just sat down to do an assignment on this. Siraj, your timing is impeccable

yetBnAmd

TBH, I only clicked this because it said YOLO. Now my brain is exploding.
But joking aside, you're a great explainer and this is all starting to make sense. Thanks for the video!

Loopyengineeringco

Man! You are amazing. Your kind of presentation makes me stay completely focused!

medmed

These videos are great! Also a lot easier to focus on when there aren't memes popping up all the time. I enjoy the lecture style.

schulca

At 4:10: the "Gradient" in HOG actually is a gradient in the same sense as in backprop. An image is just a discrete representation of a continuous 2D signal, and the gradient of the continuous signal at a point can be approximated from the discrete representation by taking finite differences between neighbouring pixels.
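As a concrete illustration of that point, here is a small NumPy sketch (my own, not from the video) that approximates the image gradient with central finite differences between neighbouring pixels, which is the first step of HOG feature extraction:

import numpy as np

def image_gradients(img):
    """Approximate the 2D gradient of a grayscale image with
    central finite differences between neighbouring pixels."""
    img = img.astype(float)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = (img[:, 2:] - img[:, :-2]) / 2.0   # d/dx ~ (I(x+1) - I(x-1)) / 2
    gy[1:-1, :] = (img[2:, :] - img[:-2, :]) / 2.0   # d/dy ~ (I(y+1) - I(y-1)) / 2
    magnitude = np.hypot(gx, gy)                      # gradient strength per pixel
    orientation = np.arctan2(gy, gx)                  # gradient direction in radians
    return magnitude, orientation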

JossWhittle

Bro you might not know this... but you're pretty good at this YouTube thing lol. Thanks man, you're the best

georgebockari

I thank God that I started studying programming/math; it's so much fun and so fascinating to be able to take part in such cool technological advancements.

RatherBeCancelledThanHandled

CNNs work this time because of:
1- Computation power
2- Large amounts of image data available

yashchandraverma

It seems that there's a faster algorithm called SSD (Single Shot MultiBox Detector) for object detection; it even runs somewhat fast on Android.

planktonfun

That was an excellent description of a topic that has been confusing the heck out of me for many hours. Thank you!

jbuist

The whole video is very thorough and comprehensive, which makes such an intimidating subject a no-brainer for beginners. Not sure how I will use YOLO in my future projects, but I really learned a lot from this video!

Lavimoe

Gotta send a link of this to my ex-wife! Maybe she can finally detect that I am a person.

Lunsterful

Object detection made easy <3 Thanks Siraj 😁

intrvrt

You, sir, are the reason my company is headed into software development, coding, and programming. This video is worth more than gold.

josephfoltz

For videos, I think the algorithms should take the time dimension into account (i.e. increasing the probability that an object detected in one frame will be there again in the next frame) to decrease computation cost.
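One simple way to act on this idea (purely an illustrative sketch of my own, not something from the video) is to smooth per-class detection scores across consecutive frames, so an object seen in one frame raises its prior for the next:

def smooth_scores(prev_scores, new_scores, alpha=0.7):
    """Exponentially smooth per-class detection scores across video frames.

    prev_scores / new_scores: dicts mapping class_id -> confidence in [0, 1].
    A class detected in the previous frame keeps part of its old confidence,
    which stabilises detections over time. (Hypothetical helper, not from
    the video's code.)
    """
    smoothed = {}
    for class_id in set(prev_scores) | set(new_scores):
        old = prev_scores.get(class_id, 0.0)
        new = new_scores.get(class_id, 0.0)
        smoothed[class_id] = alpha * new + (1.0 - alpha) * old
    return smoothed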

oliviersaint-jean

Your videos are so amazing. You cover all the fields of CS practically, with a state of the art approach.
So helpful, keep it up

jazzpote

Hi Siraj,

Thanks for your video. I had never heard of the YOLO detector before and find this approach very interesting, as I'm used to the good old brute-force method of detecting objects. I have a few remarks concerning the two pre-deep-learning algorithms you mentioned.

Regarding the Viola-Jones detector: the features are hand-coded (Haar-like features, which are basically gray-scale value differences between neighboring rectangular regions), but their locations are not selected by the researchers themselves, as suggested in your video; they are selected by the training algorithm. They also did not use a support vector machine for classification, but a cascade of simple classifiers trained using AdaBoost. Maybe you confused it with the HOG approach.

What made the Viola-Jones detector so efficient were the features and the cascade. The features could be computed very efficiently using an integral image (only three additions to compute the sum of gray-scale values over any axis-aligned rectangular region). The cascade was trained so that image windows which did not contain a face would be discarded very quickly, so only very few windows needed all the features computed and went through all cascade stages.
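To make the integral-image trick concrete, here is a small NumPy sketch of my own (not from the video or the original paper) showing how the sum over any axis-aligned rectangle reduces to a few lookups and three additions/subtractions on the summed-area table:

import numpy as np

def integral_image(img):
    """Summed-area table: ii[y, x] = sum of img[0:y+1, 0:x+1]."""
    return np.cumsum(np.cumsum(img.astype(float), axis=0), axis=1)

def rect_sum(ii, top, left, bottom, right):
    """Sum of gray-scale values over img[top:bottom, left:right],
    using four lookups on the integral image (three add/subtract ops)."""
    total = ii[bottom - 1, right - 1]
    if top > 0:
        total -= ii[top - 1, right - 1]
    if left > 0:
        total -= ii[bottom - 1, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total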

The image on your slides is also a bit misleading. It mentions local binary patterns, which is another feature-extraction method, and it shows face recognition, in this case used to find out whether a face belongs to the person it claims to be.

The Dalal-Triggs detector uses histograms of oriented gradients, as you mention. They build a histogram over each cell, so a cell does not only keep the strongest gradient direction of its pixels but the whole distribution of orientations.
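A minimal sketch of such a per-cell histogram (details assumed by me; the real descriptor also interpolates votes between neighbouring bins and normalises over blocks):

import numpy as np

def cell_histogram(magnitude, orientation, n_bins=9):
    """Orientation histogram for one HOG cell.

    magnitude and orientation are arrays for the pixels of a single cell
    (orientation in radians). Each pixel votes into a bin weighted by its
    gradient magnitude, so the cell keeps the full distribution of
    directions rather than just the strongest one.
    """
    angles = np.mod(orientation, np.pi)                                 # unsigned gradients in [0, pi)
    bins = np.minimum((angles / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), magnitude.ravel())                    # magnitude-weighted votes
    return hist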

exratt

Thanks for your work, it's the first time I've found proper and clear explanations of how to interpret the network output!

yannickmolinghen

I was looking for this just a few days ago, and it was a great coincidence that you decided to upload this video, thanks!!

MrZouzan

Anyone got any opinions/warnings regarding YOLOv3? About to start a project and don't wanna make my life more difficult than it already is.

mirandaclace
join shbcf.ru