Training a Cascade Classifier - OpenCV Object Detection in Games #8
In this tutorial, we train an OpenCV Cascade Classifier entirely on Windows to detect objects in a video game in real-time. Making your own Haar Cascade isn't complicated, but it can be a lot of work. This project is a great intro for beginners to Machine Learning. In this video I explain the Machine Learning basics, and walk you through the implementation of training and using your own model for computer vision in Python.
0:42 Machine Learning basics
2:28 Haar Cascade Classifier explained
4:04 Gathering the positive and negative images
7:06 Creating the negative description file
8:59 The positive description file
9:41 Installing opencv_annotation, opencv_createsamples, and opencv_traincascade on Windows
12:19 How to use opencv_annotation
15:17 Fixing Error: Assertion failed...
15:45 How to use opencv_createsamples
18:06 How to train a model with opencv_traincascade
22:24 How machine learning training works for image classifiers
23:06 How to use a trained model for object detection
25:59 How to train a better cascade classifier
29:00 What is overfitting?
30:27 Arguments used for my best Haar Cascade Classifier model
Normally when we program something, say a function, we expect certain inputs (the function's parameters), and at the end we get some output (the function's return value). In the middle, to get from the input to the output, we write some logic (if statements, loops, all that stuff).
With Machine Learning it's the same, except that the middle part is replaced by a Machine Learning model. So you're not writing that logic yourself anymore. Instead, you're trusting this mysterious dark jumble of multi-dimensional calculus to transform your inputs into your desired output.
At first your model won't know how to do what you want it to do. Its output will be no better than random guesses. To get the output we want, we must first train our model. We do that by showing it lots of input examples, and for each example we tell the model what we want the output to be. Once it has seen enough examples, a well-trained model will be able to accurately predict what the output should be given some new set of inputs.
That's the super summarized version of how this kind of Machine Learning (learning from labeled examples) works.
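To make the "show it labeled examples" idea concrete, here is a minimal sketch in plain Python. It is not how OpenCV trains a cascade; it just learns a single cut-off value from made-up examples. The function names and the example numbers are assumptions for illustration only.

```python
def train_threshold(examples):
    """Pick the cut-off that misclassifies the fewest labeled examples.

    examples: list of (value, label) pairs, where label is True or False.
    """
    best_threshold = None
    best_errors = len(examples) + 1
    for candidate in sorted(value for value, _ in examples):
        # Count how many examples this candidate cut-off gets wrong.
        errors = sum((value >= candidate) != label for value, label in examples)
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold


def predict(threshold, value):
    """The 'model': everything at or above the learned cut-off is a match."""
    return value >= threshold


# Made-up training data: (input value, desired output).
examples = [(0.1, False), (0.3, False), (0.4, False),
            (0.6, True), (0.8, True), (0.9, True)]
threshold = train_threshold(examples)

# The trained model now predicts outputs for inputs it has never seen.
print(predict(threshold, 0.7), predict(threshold, 0.2))
```

Before training there is no sensible threshold at all; after seeing the examples, the logic in the middle has been filled in by the data rather than by hand.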
In our case, our input is going to be screenshot images from the video game we're playing. And the output we want is a list of rectangles that identify the objects we're trying to detect. And fortunately for us, OpenCV's Cascade Classifiers are designed to do exactly that.
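As a rough sketch of that input and output in Python: `cv2.CascadeClassifier` and its `detectMultiScale` method are real OpenCV calls, but the helper names and the file paths passed to them are assumptions for illustration, not the files from the video.

```python
def detect_objects(cascade, screenshot):
    """Return detections as a list of (x, y, width, height) tuples.

    cascade: any object with a detectMultiScale(image) method, such as
    a cv2.CascadeClassifier loaded from a trained model file.
    """
    rectangles = cascade.detectMultiScale(screenshot)
    # OpenCV returns an array of rows (or an empty tuple when nothing is
    # found), so normalize to plain Python tuples.
    return [tuple(int(v) for v in rect) for rect in rectangles]


def detect_in_file(cascade_path, image_path):
    """Load a trained cascade and a screenshot from disk, then detect.

    Both paths are hypothetical; point them at your own files.
    """
    import cv2  # imported here so detect_objects itself has no dependency

    cascade = cv2.CascadeClassifier(cascade_path)
    screenshot = cv2.imread(image_path)
    return detect_objects(cascade, screenshot)
```

Calling something like `detect_in_file("cascade/cascade.xml", "screenshot.jpg")` would then give one rectangle per detected object.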
The way a Haar classifier works is it looks for features in an image, loosely like the ORB feature detection we talked about in the last video. And it looks for these features in different layers, or stages. At the top layer it checks just a few coarse features that span large parts of the image window, and by the bottom layer it's checking many more features for very fine details. This makes the end model fast enough to detect objects in real-time, because it can quickly reject areas of the image that fail to match the features in the top-most layers. And it can spend more time analyzing areas of the image that are good candidates, by studying those finer details.
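The early-rejection idea can be sketched in a few lines of plain Python. This is only an illustration: the two stage functions and their thresholds are invented here, whereas a real Haar cascade learns its features and thresholds during training.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def coarse_stage(window):
    # Cheap check over the whole window: is it bright enough overall?
    return mean(p for row in window for p in row) > 50


def fine_stage(window):
    # Finer, Haar-like check: is the bottom half clearly brighter than the top?
    half = len(window) // 2
    top = mean(p for row in window[:half] for p in row)
    bottom = mean(p for row in window[half:] for p in row)
    return bottom - top > 40


STAGES = [coarse_stage, fine_stage]


def passes_cascade(window, stages=STAGES):
    """Return (accepted, stages_evaluated), stopping at the first failed stage."""
    for index, stage in enumerate(stages):
        if not stage(window):
            return False, index + 1
    return True, len(stages)


dark_window = [[10, 10], [10, 10]]       # rejected by the first stage
candidate_window = [[60, 60], [200, 200]]  # survives every stage

print(passes_cascade(dark_window))       # (False, 1)
print(passes_cascade(candidate_window))  # (True, 2)
```

Most windows in a screenshot look like `dark_window` here: they fail the first cheap check, so the more expensive later stages never run on them.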
Hopefully you now have a general understanding of how Machine Learning works and what makes a Haar Cascade Classifier unique. The great part is, the code for all of this is very straightforward.
The art of doing this well actually isn't so much in the code; it's more in gathering the data to train your model with. To get good results, you need quality data, and you need lots of it.
Now we need two types of data: We need the positive images - which are images that contain the object we're trying to detect... and we need negative images - which will be screenshots from the game that don't contain our object at all. The Machine Learning algorithm needs to see both what is and what is not the object in order for it to learn.
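For the negative images, the OpenCV training tools expect a plain text description file with one image path per line. Here is a minimal sketch of generating it in Python; the function name, the folder name `negative`, and the output name `neg.txt` are assumptions for illustration.

```python
import os


def write_negative_description(negative_dir, output_path):
    """Write one image path per line for every image in negative_dir.

    Returns the number of images listed.
    """
    image_extensions = (".jpg", ".jpeg", ".png")
    filenames = sorted(
        name for name in os.listdir(negative_dir)
        if name.lower().endswith(image_extensions)
    )
    with open(output_path, "w") as description_file:
        for name in filenames:
            # Forward slashes work on Windows as well.
            description_file.write(f"{negative_dir}/{name}\n")
    return len(filenames)
```

Running `write_negative_description("negative", "neg.txt")` from the project folder would list every screenshot in the hypothetical `negative` folder.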