Machine Learning in R - Supervised vs. Unsupervised



In the previous video, you learned about three machine learning techniques: Classification, Regression and Clustering. As you might have noticed, there are quite a few similarities between Classification and Regression. For both, you try to find a function, or a model, which can later be used to predict labels or values for unseen observations. It is important that labeled observations are available to the algorithm while the function is being trained. We call these techniques supervised learning.
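To make the idea of training on labeled observations concrete, here is a minimal sketch in base R. It assumes the built-in `cars` dataset and a simple linear regression with `lm()`; both are illustrative choices, not something taken from the video. The stopping distance `dist` plays the role of the known value, and the `unseen` data frame is a hypothetical set of new observations.

```r
# Supervised learning sketch: the training data carries the value to predict.
# `cars` ships with base R: speed (mph) and stopping distance (ft).
data(cars)

# Train: learn a function from speed to the known outcome `dist`.
model <- lm(dist ~ speed, data = cars)

# Predict: apply the learned function to unseen observations
# (hypothetical speeds chosen only for illustration).
unseen <- data.frame(speed = c(12, 21))
predicted <- predict(model, newdata = unseen)
print(predicted)
```

The same train-then-predict pattern applies to classification; only the type of the outcome (a label instead of a number) and the model change.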

Labeling can be tedious work and is often done by puny humans. There are other techniques which don't require labeled observations to learn from data. These techniques are called unsupervised learning. You've already become acquainted with one of these techniques in the previous video, namely Clustering. Clustering finds groups of observations that are similar, and thus does not require labeled observations.
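As a rough illustration of clustering without labels, the sketch below runs `kmeans()` from base R on the built-in `iris` measurements with the `Species` column left out. The dataset, the choice of k-means, and `centers = 3` are assumptions made for this example only.

```r
# Unsupervised learning sketch: no labels are given to the algorithm.
data(iris)

# Keep only the four numeric measurements; the Species label is dropped.
features <- iris[, 1:4]

# k-means starts from random centers, so fix the seed for repeatability.
set.seed(1)

# Ask for 3 groups (an assumption; in practice the number is often unknown).
clusters <- kmeans(features, centers = 3, nstart = 20)

# Each observation now has a cluster number, found from similarity alone.
print(table(clusters$cluster))
```

Note that the cluster numbers are arbitrary identifiers; they carry no meaning beyond "these observations belong together".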

In the next chapter we'll talk about assessing the performance of your trained model. In supervised learning, we can use the real labels of the observations and compare them with the labels we predicted. Naturally, you want your model's predictions to be as similar as possible to the real labels. With unsupervised learning, however, measuring the performance gets more difficult: we don't have real labels to compare anything to. You'll learn some neat techniques to assess the quality of a clustering in the next chapter.
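Comparing real labels with predicted labels can be sketched in a few lines of base R. The two label vectors below are made up for illustration (a hypothetical spam filter); accuracy is just one possible measure, used here as an example.

```r
# Hypothetical real labels and model predictions for six observations.
label_levels <- c("ham", "spam")
true_labels <- factor(c("spam", "ham", "spam", "ham", "ham", "spam"),
                      levels = label_levels)
predicted_labels <- factor(c("spam", "ham", "ham", "ham", "ham", "spam"),
                           levels = label_levels)

# Cross-tabulate real vs. predicted labels (a confusion matrix).
confusion <- table(truth = true_labels, predicted = predicted_labels)
print(confusion)

# Accuracy: the share of observations where prediction and real label agree.
accuracy <- mean(predicted_labels == true_labels)
print(accuracy)
```

With clustering there is no `true_labels` vector to put in the first argument, which is why this comparison is not available for unsupervised learning.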

As you get more experienced as a data scientist, you might notice that things aren't always black and white. In machine learning, some techniques overlap between supervised and unsupervised learning. With semi-supervised learning, for example, you can have a lot of observations which are not labeled, and a few which are. You can then first perform clustering to group all observations which are similar. Afterwards, you can use information about the clusters and about the few labeled observations to assign a class to unlabeled observations. This will give you more labeled observations to perform supervised learning on.
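The semi-supervised steps described above can be sketched as follows. This is one simple way to do it, not necessarily the method the course has in mind: it assumes the `iris` data, pretends that only five observations per species are labeled, clusters with `kmeans()`, and gives each unlabeled observation the most common known label in its cluster. The helper `majority_label` and all variable names are hypothetical.

```r
data(iris)
features <- iris[, 1:4]
set.seed(42)

# Pretend only a few observations are labeled: 5 per species (an assumption).
labeled_idx <- unlist(lapply(split(seq_len(nrow(iris)), iris$Species),
                             function(idx) sample(idx, 5)))
known <- rep(NA_character_, nrow(iris))
known[labeled_idx] <- as.character(iris$Species[labeled_idx])

# Step 1: cluster all observations, labeled or not.
km <- kmeans(features, centers = 3, nstart = 20)

# Step 2: within each cluster, find the most common known label.
# Returns NA if a cluster happens to contain no labeled observation.
majority_label <- function(labels) {
  labels <- labels[!is.na(labels)]
  if (length(labels) == 0) return(NA_character_)
  names(which.max(table(labels)))
}
cluster_label <- tapply(known, km$cluster, majority_label)

# Step 3: unlabeled observations inherit the label of their cluster;
# observations that were already labeled keep their original label.
assigned <- unname(ifelse(is.na(known),
                          cluster_label[as.character(km$cluster)],
                          known))

print(table(assigned, useNA = "ifany"))
```

The `assigned` vector could then serve as the label column for a supervised model, keeping in mind that labels inferred this way can be wrong where clusters mix several classes.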

Enough talking, let's do some more exercises!
Comments

I am a newbie to machine learning projects. Can anyone tell me how I can proceed with unlabeled raw data? Like, how do I find a weightage model and so on for an unsupervised ML classification?
