K-Means clustering with Python Example

This video is divided into 5 parts:

00:00 Introduction to Data
00:59 Clean data and choose variables for clustering
05:30 Sklearn K-Means example code
07:50 Interpreting clustering results
11:30 Wrapping up

k-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean (the cluster center or centroid), which serves as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells. k-means clustering minimizes within-cluster variances (squared Euclidean distances), but not regular Euclidean distances, which would be the more difficult Weber problem: the mean optimizes squared errors, whereas only the geometric median minimizes Euclidean distances. For instance, better Euclidean solutions can be found using k-medians and k-medoids.
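As a rough illustration of the definition above, the sketch below fits scikit-learn's `KMeans` on made-up data and checks two things: every point is labelled with its nearest centroid, and `inertia_` is the within-cluster sum of squared Euclidean distances. The three blobs, the seed, and the variable names are assumptions for this example, not the dataset used in the video.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Three synthetic 2-D blobs (made-up data, for illustration only)
true_centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in true_centers])

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Euclidean distance from every point to every fitted centroid
dists = np.linalg.norm(X[:, None, :] - km.cluster_centers_[None, :, :], axis=2)

# Each observation should belong to the cluster with the nearest centroid
nearest = dists.argmin(axis=1)

# Within-cluster sum of squared distances, computed by hand
wcss = (dists[np.arange(len(X)), km.labels_] ** 2).sum()

print(np.array_equal(nearest, km.labels_))
print(np.isclose(wcss, km.inertia_))
```

Both checks should print `True` on data like this, where the clusters are well separated and no point sits exactly between two centroids.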

The problem is computationally difficult (NP-hard); however, efficient heuristic algorithms converge quickly to a local optimum. These heuristics are usually similar to the expectation-maximization algorithm for mixtures of Gaussian distributions: both k-means and Gaussian mixture modeling use an iterative refinement approach. Both also use cluster centers to model the data; however, k-means clustering tends to find clusters of comparable spatial extent, while the Gaussian mixture model allows clusters to have different shapes.
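The iterative refinement mentioned above can be sketched with the standard heuristic, usually called Lloyd's algorithm: assign each point to its nearest centroid, recompute each centroid as the mean of its points, and repeat until nothing moves. This is a minimal NumPy sketch under simple assumptions (random initialisation from the data, a hypothetical function name `lloyd_kmeans`), not the code from the video and not what scikit-learn runs internally.

```python
import numpy as np

def lloyd_kmeans(X, k, n_iter=100, seed=0):
    """Minimal Lloyd-style k-means; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids as k distinct observations chosen at random
    centers = X[rng.choice(len(X), size=k, replace=False)]

    def assign(c):
        # Squared Euclidean distance from every point to every centroid
        d2 = ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)  # assignment step
        # Update step: mean of each cluster (keep old centroid if empty)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break  # centroids stopped moving: a local optimum
        centers = new_centers

    return centers, assign(centers)

# Two well-separated synthetic blobs (made-up data)
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(100, 2)),
    rng.normal(loc=10.0, scale=0.5, size=(100, 2)),
])
centers, labels = lloyd_kmeans(X, k=2)
print(centers)
```

Because the result is only a local optimum, a different seed can give a different answer on harder data; library implementations typically rerun from several initialisations and keep the best one.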

The unsupervised k-means algorithm has a loose relationship to the k-nearest neighbor classifier, a popular supervised machine learning technique for classification that is often confused with k-means because of the similar name. Applying the 1-nearest neighbor classifier to the cluster centers obtained by k-means classifies new data into the existing clusters. This is known as the nearest centroid classifier or the Rocchio algorithm.
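The nearest centroid idea is short enough to write out directly. In the sketch below, the centroids are hypothetical values standing in for the output of a k-means run, and `nearest_centroid_predict` is an illustrative helper name, not a library function.

```python
import numpy as np

def nearest_centroid_predict(centers, X_new):
    """Label each new point with the index of its closest centroid."""
    d2 = ((X_new[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# Hypothetical centroids, as if obtained from a previous k-means fit
centers = np.array([[0.0, 0.0], [10.0, 10.0]])

# New, unseen observations to place into the existing clusters
X_new = np.array([[1.0, -0.5], [9.0, 11.0], [4.0, 4.0]])

labels = nearest_centroid_predict(centers, X_new)
print(labels)  # [0 1 0]
```

With scikit-learn, calling `predict` on a fitted `KMeans` object does the same job of assigning new samples to the closest existing cluster.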

Created by
Kunaal Naik
