Correlation - What Is Correlation - What Is And How To Calculate The Correlation Coefficient r

In this video we discuss what correlation and the sample correlation coefficient are, and how to calculate the correlation coefficient r. We also cover how to tell strong correlations from weak ones.

Transcript/notes
Correlation is a relationship between 2 variables, and we use correlation to identify what type of relationship the 2 variables have with one another.

For instance, here is a scatter plot, and the data in this scatter plot is represented by ordered pairs, x and y, where x is considered the independent variable and y is considered the dependent variable.

Now, here are 3 scatter plots. In number 1, there is a positive linear correlation (a positive slope). In number 2, there is a negative linear correlation (a negative slope). Number 3 shows no correlation, or no apparent relationship, between the 2 variables.

A linear correlation has both strength and direction, and both are measured by the correlation coefficient. The symbol r is used to represent the sample correlation coefficient.

The range of the correlation coefficient is from -1 to 1, and the closer to +1 or -1 the correlation coefficient is, the stronger the relationship is between the 2 variables.

So, here are 6 graphs. In graph 1 the correlation coefficient is 1, a perfect increasing line, and in graph 4 the correlation coefficient is -1, a perfect decreasing line. Graphs 2 and 5 show strong positive and strong negative correlations, and graphs 3 and 6 show very weak correlations.
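As a quick check of the two extreme cases described above, here is a minimal Python sketch. The helper name `pearson_r` and the two small data sets are made up for illustration and are not from the video. The helper uses the deviation-from-the-mean form of r, which is algebraically equivalent to the sum-based formula used later in these notes.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired observations (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
increasing = [2 * x + 1 for x in xs]   # points on a perfect increasing line
decreasing = [10 - 3 * x for x in xs]  # points on a perfect decreasing line

r_up = pearson_r(xs, increasing)
r_down = pearson_r(xs, decreasing)
print(r_up, r_down)
```

Any data that fall exactly on an increasing line should give r equal to 1 (up to floating-point rounding), and any data exactly on a decreasing line should give -1, regardless of how steep the line is.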

Here is a rough guide, based on the value of the correlation coefficient, for judging how strong or weak the relationship is between the 2 variables being used.

The formula for calculating r, the sample correlation coefficient, is here. And here is a sample data set, with the number of passing yards in each game for a quarterback in column 1 and the number of points his team scored in each of those games in column 2. So, passing yards is the x variable, the independent variable, and points scored is the y variable, the dependent variable, and these are ordered pairs of x and y.

In our formula, n is the size of the sample, which is 16. The sum of x is the total of column 1, which is 5097, and the sum of y is the total of column 2, which is 565. We also need to know the sum of x times y, the sum of x squared, and the sum of y squared, so we can add in those 3 columns, as you see here. I have summed each of them at the bottom of the table.
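The on-screen formula is not reproduced in these notes. Given the quantities listed above (n, the sums of x and y, and the sums of xy, x squared, and y squared), it is presumably the standard computational form of the sample correlation coefficient, which can be written as:

```latex
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{\left[\,n\sum x^{2} - \left(\sum x\right)^{2}\right]
                \left[\,n\sum y^{2} - \left(\sum y\right)^{2}\right]}}
```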

Now we can plug these values into the formula, and calculating, we get 0.6178 as the r value.
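The same table-based procedure can be sketched in Python. The five (passing yards, points scored) pairs below are hypothetical, since the video's 16-game table is not reproduced in these notes, so the result will not be 0.6178. The calculation assumes the standard computational formula for r, built from the column sums described above.

```python
import math

# Hypothetical (passing yards, points scored) pairs, NOT the video's data set
games = [(200, 17), (250, 24), (300, 21), (350, 31), (400, 27)]

n = len(games)                          # sample size
sum_x = sum(x for x, _ in games)        # total of the x column
sum_y = sum(y for _, y in games)        # total of the y column
sum_xy = sum(x * y for x, y in games)   # total of the x*y column
sum_x2 = sum(x * x for x, _ in games)   # total of the x^2 column
sum_y2 = sum(y * y for _, y in games)   # total of the y^2 column

# r = [n*Sxy - Sx*Sy] / sqrt([n*Sx2 - Sx^2] * [n*Sy2 - Sy^2])
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator
print(round(r, 4))
```

Replacing `games` with the 16 actual pairs from the video's table should reproduce the column totals quoted above and, if the assumed formula matches the one on screen, the r value of 0.6178.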

So, this suggests that there is a moderate linear relationship between passing yards and points scored.

Timestamps
0:00 What Is Correlation?
0:18 Types Of Correlation
0:34 What Is The Correlation Coefficient?
1:18 Formula For Finding The Correlation Coefficient
1:35 Example Problem Of Finding The Correlation Coefficient
Comments

There is a typo on the screen. At 2:07 I have the sum of xy equaling 140,038, and it should equal 184,038. The answer is correct, as this is just a typo. Thanks to a viewer, Lisa, for pointing that out.

whatsupdude

Excellent. Thank you for the breakdown.

Nostalgia_Space

Thank you so much for this. Helped me a lot sir great explanation

Mina-bzeo

Excellent video. I believe the sum of xy should be 184038, not 140038. I think this is just a typo because your correlation coefficient is correct.

LBJ