Correlation - What Is Correlation - What Is And How To Calculate The Correlation Coefficient r
In this video we discuss what correlation and the sample correlation coefficient are, and how to calculate the correlation coefficient r. We also cover how to tell strong correlations from weak ones.
Transcript/notes
Correlation is a relationship between 2 variables, and we use correlation to identify what type of relationship the 2 variables have with one another.
For instance, here is a scatter plot. The data in this scatter plot are represented by ordered pairs (x, y), where x is considered the independent variable and y is considered the dependent variable.
Now, here are 3 scatter plots. Number 1 shows a positive linear correlation (a positive slope), number 2 shows a negative linear correlation (a negative slope), and number 3 shows no correlation, or no apparent relationship between the 2 variables.
A linear correlation has both strength and direction, and both are measured by the correlation coefficient. The symbol r is used to represent the sample correlation coefficient.
The correlation coefficient ranges from -1 to 1. The closer it is to +1 or -1, the stronger the linear relationship between the 2 variables, and the closer it is to 0, the weaker that relationship.
So, here are 6 graphs. In graph 1 the correlation coefficient is 1, a perfect increasing line, and in graph 4 it is -1, a perfect decreasing line. Graphs 2 and 5 show strong positive and strong negative correlations, and graphs 3 and 6 show very weak correlations.
Here is a rough guide, based on the value of the correlation coefficient, for judging how strong or weak the relationship is between the 2 variables.
The formula for calculating r, the sample correlation coefficient, is here. And here is a sample data set: column 1 is the number of passing yards a quarterback had in each game, and column 2 is the number of points his team scored in each of those games. So, passing yards is the x variable, the independent variable, and points scored is the y variable, the dependent variable. Each game gives one ordered pair (x, y).
In our formula, n is the size of the sample, which is 16. The sum of x is the total of column 1, which is 5097, and the sum of y is the total of column 2, which is 565. We also need the sum of x times y, the sum of x squared, and the sum of y squared, so we add those 3 columns to the table, as you see here. I have summed each of them at the bottom of the table.
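The formula itself is shown on screen and is not reproduced in these notes. Assuming it is the standard computational form of the sample (Pearson) correlation coefficient, which uses exactly the quantities listed above (n and the sums of x, y, xy, x squared, and y squared), it can be written as:

```latex
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{n\sum x^{2} - \left(\sum x\right)^{2}}\;
          \sqrt{n\sum y^{2} - \left(\sum y\right)^{2}}}
```

Here each sum runs over all n ordered pairs in the sample.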
Now we can plug these values into the formula, and calculating, we get r = 0.6178.
So, this suggests that there is a moderate positive linear relationship between passing yards and points scored.
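The column-sum procedure described above can be sketched in Python. This is a minimal sketch, assuming the standard computational form of the Pearson sample correlation coefficient, since the notes do not reproduce the formula. The function name `pearson_r_from_sums` and the small data set are hypothetical, chosen only for illustration; they are not the quarterback data from the video.

```python
import math


def pearson_r_from_sums(xs, ys):
    """Sample correlation coefficient r, computed from column sums.

    Assumes the standard computational form:
    r = (n*Sxy - Sx*Sy) / sqrt((n*Sx2 - Sx^2) * (n*Sy2 - Sy^2))
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")

    n = len(xs)                                   # sample size
    sum_x = sum(xs)                               # total of the x column
    sum_y = sum(ys)                               # total of the y column
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # total of the x*y column
    sum_x2 = sum(x * x for x in xs)               # total of the x^2 column
    sum_y2 = sum(y * y for y in ys)               # total of the y^2 column

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when x or y has no variation")
    return numerator / denominator


# Hypothetical illustration data (not the data set from the video).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(pearson_r_from_sums(xs, ys), 4))  # prints 0.7746
```

With the video's table, `xs` would hold the 16 passing-yard values and `ys` the 16 points-scored values. A perfectly increasing line gives r = 1 and a perfectly decreasing line gives r = -1, which matches graphs 1 and 4 described earlier.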
Timestamps
0:00 What Is Correlation?
0:18 Types Of Correlation
0:34 What Is The Correlation Coefficient?
1:18 Formula For Finding The Correlation Coefficient
1:35 Example Problem Of Finding The Correlation Coefficient