Principal Component Analysis (PCA) [Matlab]

This video describes how the singular value decomposition (SVD) can be used for principal component analysis (PCA) in Matlab.

These lectures follow Chapter 1 from: "Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control" by Brunton and Kutz

This video was produced at the University of Washington
Comments

For future reference: in the first part of the video, X's _columns_ (not rows) are the individual data points (correction for 5:02: every column, not every row, of the mean matrix equals the average vector).
Also note that the code uses the 'svd' function, not the 'pca' function.
This can be confusing, because Prof. Brunton says in the previous lecture (the first video on PCA) that PCA assumes 'rows' represent individuals (e.g., people), in contrast to SVD, which assumes 'columns' do.
*_BUT_*, in the second part (ovarian cancer), even though the code uses the 'svd' function, the 'obs' matrix is 216x4000 (216 patients), so each 'row' represents an individual patient. Here, U and V therefore play the roles that V and U played in the first part of the lecture, respectively.
In the for loop, the code then plots each patient (each dot) against the three "principal" axes (in Matlab, A' means the conjugate transpose of A).
*_However_*, the code computes these coordinates as dot products of two long vectors (4000 elements each, and this can be even larger in other examples).
We _don't_ need that calculation, because the product U*S already contains exactly the same values (this U would have been V if each patient were stored as a column instead of a row).
So we can simply use U(i,1)*S(1,1), U(i,2)*S(2,2), and U(i,3)*S(3,3) for x, y, and z in the loop, instead of computing the dot products.
(I don't use MATLAB, but this should work; in Python the only differences would be indices starting from 0 and square brackets instead of parentheses.)
Still, knowing why those dot products (projections onto orthonormal vectors, in this case) work is important for understanding SVD and PCA.
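A minimal sketch checking this, with variable names as in the ovarian cancer code: since obs = U*S*V', projecting the i-th row of obs onto V(:,k) gives exactly U(i,k)*S(k,k), so the dot products in the plotting loop are redundant.

load ovariancancer;                   % obs is 216x4000, one patient per row
[U, S, V] = svd(obs, 'econ');
i = 1;                                % any patient
x1 = V(:, 1)' * obs(i, :)';           % first coordinate via the dot product, as in the video
x2 = U(i, 1) * S(1, 1);               % the same coordinate read straight from U and S
disp(abs(x1 - x2))                    % ~0, up to round-off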

Anyway, thanks a lot for this great series of lectures, awesome.

starriet

Excellent! Your 15-minute video really captures the majority of 100 years of work on PCA. SVD works!

MageshJohn

The code made this much more understandable for me, thanks for your great work. This example shows what PCA looks like geometrically. There is also an implicit relationship between the shape of the data points and the transformation encoded by the centered matrix, which is not usually mentioned in a linear algebra course.

nwxxzchen

I just don't understand why, for the ovarian cancer example, you don't do the preprocessing steps (mean subtraction and division by sqrt(Nmeas)).

sapertuz

Great video, but one point of confusion: aren't we supposed to subtract the mean before computing the SVD in the ovarian cancer case?
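For reference, a minimal sketch of those preprocessing steps as done in the first part of the lecture; X here is an assumed stand-in, with data points stored as columns.

X = randn(2, 1000);                          % stand-in data, one point per column
nPoints = size(X, 2);                        % number of data points
Xavg = mean(X, 2);                           % mean across all columns
B = X - Xavg * ones(1, nPoints);             % subtract the mean from every column
[U, S, V] = svd(B / sqrt(nPoints), 'econ');  % scaling makes the squared singular values covariance eigenvalues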

haideralishuvo

Excellent lecture. Question: once you have determined the magnitudes of the principal components, is there a way to determine which features they represent in your original data? For instance, determining which features from the cancer data correlate most strongly with a cancer diagnosis?
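A minimal sketch of one way to do this, assuming the ovarian cancer variables from the video: with patients as rows of obs, each column of V holds one loading per original feature, so the largest entries of abs(V(:,k)) mark the features that drive principal component k.

load ovariancancer;                        % obs is 216x4000, grp holds the labels
[U, S, V] = svd(obs, 'econ');
[~, idx] = sort(abs(V(:, 1)), 'descend');  % rank features by the size of their PC1 loading
topFeatures = idx(1:10);                   % the 10 features that influence PC1 the most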

fermijman

High-quality presentation, thanks for sharing.

abolfazlabbasi

Thank you so much! My understanding increased exponentially when you explained the ovarian cancer example.

ratnaa

BTW, to add a legend for the ovarian cancer data you can make use of plot handles, as follows:

h = zeros(2, 1);
for i = 1:size(obs, 1)
    % ... compute x, y, z for patient i as in the lecture ...
    if strcmp(grp{i}, 'Cancer')   % strcmp is safer than ==, which compares char arrays elementwise
        h(1) = plot3(x, y, z, 'rx');
    else
        h(2) = plot3(x, y, z, 'bo');
    end
    hold on
end
legend(h, 'Cancer', 'Normal')

ElPrestigo

Just to clarify: when you mention the energy of the statistical data, you're referring to the extent to which it captures the trend in the data, right?
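In these lectures, "energy" usually means the fraction of the total singular-value sum (or of the total variance, if the singular values are squared) captured by the first r modes. A minimal sketch, reusing the video's data set:

load ovariancancer;
[U, S, V] = svd(obs, 'econ');
sig = diag(S);                           % singular values, largest first
energy = cumsum(sig) / sum(sig);         % cumulative energy, mode by mode
r90 = find(energy > 0.90, 1);            % smallest r capturing 90% of the energy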

zhengyangkrisweng

Hello Steve ... I would first like to thank you for your effort in sharing and teaching these amazing techniques. I would also like to ask whether you could make a video on how to find the best r value using the Gavish-Donoho method in Python. That would be very useful for me. Thanks a lot, and keep going.
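For reference, a minimal sketch of the Gavish-Donoho hard threshold for the simplest case (a square n x n matrix with known noise level sigma). It is written in MATLAB to match the rest of the thread but translates line for line to Python/NumPy; the rectangular and unknown-noise cases need the lambda(beta) and omega(beta) corrections from the Gavish-Donoho paper. The toy data here is assumed, not taken from the lecture:

n = 1000; sigma = 1;
t = (1:n)' / n;
X = 20 * (t * t') + sigma * randn(n);     % rank-1 signal buried in Gaussian noise
[U, S, V] = svd(X, 'econ');
tau = (4 / sqrt(3)) * sqrt(n) * sigma;    % optimal hard threshold for a square matrix
r = sum(diag(S) > tau);                   % number of modes above the threshold (here, 1)
Xclean = U(:, 1:r) * S(1:r, 1:r) * V(:, 1:r)';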

Danielsantos

Great explanation, thanks! May I know how to tell which genes have the highest "impact" on PC 1 (in the ovarian cancer example)? Is there a way to tell from matrix U or matrix V? I just learned PCA 3 days ago, sorry if this is a noob question :)

yourswimpal

Off-topic, but how do you get the IDE to be dark for your presentations?

Assault

What are the differences between the 2-dimensional and 3-dimensional data set plots?

ifan

I got a little bit confused: what's the intuition behind calculating x, y, and z by multiplying columns of V with b (an observation)? What are x, y, and z showing? Sorry for the silly question; thanks in advance.

AdityaDiwakarVex

Is singular value decomposition also used in 3-dimensional data plots?

ifan

Been trying to write a formula to combine both Honey Mustard [ dataSet ] and Ranch BBQ Sauce [ dataSet(2×2) ] as one component while randomly scaling calories and sugar. Don't see what The Matrix movie has to do with anything, though.

mybean

Also, is there any way to get this code for practice? Thank you in advance!

mataFot

Dear Steve, I see that in my data set two states contribute 90% of the data; how do I know which ones?

alex.ander.bmblbn

Is there a convention about signs? I was trying to convince myself, and what confused me is that the T1, T2, T3 (score) matrices in the code below have the same values with different signs. I found an article and some code about flipping the sign in svd and pca, but I couldn't be sure... I'd be very happy if you could make it clear for me, thanks!
%% CODE
clear; close all; clc;
load fisheriris                  % built-in data set; meas is 150x4
X = meas;
% X = 5*randn(300, 10);          % alternative random test matrix
[W, D] = eig(X'*X);              % eigenvectors of X'X are the right singular vectors
W = W(:, end:-1:1);              % reorder so the largest eigenvalue comes first
D = D(end:-1:1, end:-1:1);
T1 = X*W;                        % scores via the eigendecomposition
[U, S, V] = svd(X, 'econ');
T2 = U*S;                        % scores via the SVD (equal to X*V)
[coeff, score, latent] = pca(X, 'Algorithm', 'svd', 'Centered', false);
T3 = score;                      % scores via pca()
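The short answer is that there is no single convention: the columns of U and V are only defined up to a joint sign flip, so the scores from eig, svd, and pca can legitimately differ by a factor of -1 per component. A minimal sketch of one common fix, continuing from the U, S, V computed above: flip each column of V so that its largest-magnitude entry is positive, and flip the matching column of U with it.

for k = 1:size(V, 2)
    [~, j] = max(abs(V(:, k)));   % index of the largest-magnitude entry in column k
    if V(j, k) < 0
        V(:, k) = -V(:, k);       % flip the right singular vector...
        U(:, k) = -U(:, k);       % ...and its partner, leaving U*S*V' unchanged
    end
end
T2 = U * S;                       % scores, now sign-consistent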

burakyesilyurt