14. Mahalanobis distance with complete example and Python implementation


Mahalanobis distance is the distance between a point (vector) and a distribution. It is useful in multivariate anomaly detection and in classification on skewed (imbalanced) datasets.

Prasanta Chandra Mahalanobis was an Indian scientist and statistician who founded the Indian Statistical Institute.

Euclidean distance works well as long as the features are equally important and independent of each other.

If the variables are strongly correlated, their covariance is high, and dividing by it shrinks the Mahalanobis distance. If the features of x are uncorrelated, the covariance is low, so the resulting distance comes out larger.
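The idea above can be sketched in a few lines of NumPy. This is a minimal illustration with made-up numbers, not the video's own worked example:

```python
import numpy as np

# Toy dataset (invented values): rows are observations, columns are features.
data = np.array([
    [64.0, 580.0, 29.0],
    [66.0, 570.0, 33.0],
    [68.0, 590.0, 37.0],
    [69.0, 660.0, 46.0],
    [73.0, 600.0, 55.0],
])

x = np.array([66.0, 640.0, 44.0])  # the point whose distance we want

mu = data.mean(axis=0)            # mean of the distribution
cov = np.cov(data, rowvar=False)  # covariance matrix of the features
diff = x - mu

# Mahalanobis distance: sqrt((x - mu)^T * S^(-1) * (x - mu)).
# Multiplying by the inverse covariance is what corrects for correlated,
# differently scaled features, unlike plain Euclidean distance.
md = np.sqrt(diff @ np.linalg.inv(cov) @ diff)
print(md)
```

If the covariance matrix were the identity, this would reduce to ordinary Euclidean distance.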
Comments

Having read through half a dozen articles on the subject, my pattern recognition textbook, several hours of lecture, and a handful of videos, I can say with good authority that this is the best explanation of the process I've seen so far.
This probably would have been easier if I'd ever taken linear algebra.

bpin

Wow!! You really simplified this. Like many others, I was googling through tons of 'super complicated results.' With this, I was able to do it side by side (albeit using numpy). Thanks a lot! It's a pity the website is de-registered.

jbz

Why did you transpose the data in line 19?

MeMonarch

I tried to follow your example but did not get the same values as your covariance matrix. First I did it in Excel (calculating it manually) and then a second time in Python to check myself.

import numpy as np
np.set_printoptions(suppress=True)  # suppresses the use of scientific notation

A = [1, 2, 4, 2, 5]
B = [100, 300, 200, 600, 100]
C = [10, 15, 20, 10, 30]
data = np.array([A, B, C])  # rows are variables, columns are observations
covMatrix = np.cov(data)  # sample covariance: divides by n-1
# covMatrix = np.cov(data, bias=True)  # population covariance: divides by n
print(covMatrix)


Results for the Covariance Matrix are:

[[   2.7  -110.     13. ]
 [-110.  43000.   -900. ]
 [  13.   -900.     70. ]]

craiggers
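The discrepancy in the comment above most likely comes from the `bias` parameter: `np.cov` divides by n-1 by default (sample covariance), while a manual calculation often divides by n (population covariance). A quick check on the same data:

```python
import numpy as np

A = [1, 2, 4, 2, 5]
B = [100, 300, 200, 600, 100]
C = [10, 15, 20, 10, 30]
data = np.array([A, B, C])

sample_cov = np.cov(data)                  # divides by n-1 (default, bias=False)
population_cov = np.cov(data, bias=True)   # divides by n

print(sample_cov[0, 0])      # variance of A with n-1: 2.7
print(population_cov[0, 0])  # variance of A with n:   2.16
```

The two matrices differ only by the constant factor (n-1)/n, here 4/5.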

How do I calculate the Mahalanobis distance between a point (which lies within the distribution itself) and the centre of the distribution? I need to find the outliers present in the data. Will the formula differ? Can you please help me in this regard.

robi
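On the outlier question above: the formula does not change when the point comes from the data itself. Compute the distance of every observation to the distribution's own mean and flag the large ones. A sketch with invented toy data (the chi-square cutoff is my assumption, not from the video):

```python
import numpy as np

# Invented data: rows are observations, columns are features.
data = np.array([
    [64.0, 580.0, 29.0],
    [66.0, 570.0, 33.0],
    [68.0, 590.0, 37.0],
    [69.0, 660.0, 46.0],
    [73.0, 600.0, 55.0],
])

mu = data.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(data, rowvar=False))
diffs = data - mu

# Squared Mahalanobis distance of every point to the center.
md2 = np.einsum('ij,jk,ik->i', diffs, cov_inv, diffs)

# For roughly normal data, md2 follows a chi-square distribution with
# p degrees of freedom; 7.815 is the 95% cutoff for p = 3 features.
outliers = np.where(md2 > 7.815)[0]
print(md2, outliers)
```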

The best video on YouTube discussing Mahalanobis. I have one question, please: if I'm using Mahalanobis distance for classification on a binary dataset, does this mean that for every unknown object I'll have only 2 distances? Or, as with Euclidean, will I have n distances, where n is the number of training objects?

AhmedHamed-otez
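On the question above: in the usual Mahalanobis classifier you fit one mean and covariance per class, so a binary problem yields exactly two distances per unknown object, not one per training sample. A minimal sketch with invented data:

```python
import numpy as np

def mahalanobis(x, mean, cov_inv):
    d = x - mean
    return np.sqrt(d @ cov_inv @ d)

# Invented two-class training data (rows = samples, columns = features).
class0 = np.array([[1.0, 2.0], [2.0, 1.0], [1.5, 1.5], [2.5, 2.0]])
class1 = np.array([[6.0, 7.0], [7.0, 6.0], [6.5, 6.5], [7.5, 7.0]])

# One (mean, inverse covariance) pair per class.
params = []
for cls in (class0, class1):
    mu = cls.mean(axis=0)
    params.append((mu, np.linalg.inv(np.cov(cls, rowvar=False))))

x = np.array([6.2, 6.8])  # unknown object
distances = [mahalanobis(x, mu, ci) for mu, ci in params]  # 2 classes -> 2 distances
predicted = int(np.argmin(distances))
print(distances, predicted)
```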

This video helped me a lot! I'm using Mahalanobis to calculate the distance between an RGB color and a set of RGB colors, but sometimes the MD² comes out negative, so my Python program crashes when it calculates the sqrt. Do you have any idea what I should do in this case?

diogohalmeida
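On the negative-MD² problem above: a negative squared distance usually means the estimated covariance matrix is not positive definite (common with few samples or perfectly correlated channels), compounded by floating-point error. One common workaround, sketched here with an invented singular example, is the pseudo-inverse plus clamping small negatives to zero:

```python
import numpy as np

def safe_mahalanobis(x, mean, cov):
    d = x - mean
    # pinv tolerates a singular / ill-conditioned covariance matrix,
    # where np.linalg.inv would fail or return garbage.
    md2 = d @ np.linalg.pinv(cov) @ d
    # Clamp tiny negative values caused by floating-point round-off.
    return np.sqrt(max(md2, 0.0))

# Invented RGB reference set with perfectly correlated channels,
# which makes the covariance matrix singular on purpose.
colors = np.array([[10.0, 20.0, 30.0],
                   [20.0, 40.0, 60.0],
                   [30.0, 60.0, 90.0]])
mu = colors.mean(axis=0)
cov = np.cov(colors, rowvar=False)

md = safe_mahalanobis(np.array([15.0, 30.0, 45.0]), mu, cov)
print(md)
```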

Thank you very much for this, it was precisely what I was looking for! Also, your handwriting is beautiful!

Rafaelkenjinagao

The correct covariance matrix (population version, dividing by n) for the data is:
   2.16    -88     10.4
  -88    34400   -720
   10.4   -720     56

Bmahsih-iryx