Session 39 - Descriptive Statistics Part 2 | DSMP 2023

-----------------------
| **Chapters** |
-----------------------
00:00:00 - Session start
00:03:42 - Recap of the previous session
00:06:03 - Agenda
00:08:43 - Quantiles & Percentiles
00:15:17 - How to calculate percentile?
00:29:51 - Doubt clearance
00:31:08 - 5 number summary in statistics
00:35:05 - Boxplot theory
00:37:45 - How to create a boxplot with an example?
00:49:12 - Side-by-side boxplot
00:52:15 - Doubt clearance
00:58:05 - Scatterplots
01:01:30 - What is Covariance and how is it interpreted?
01:07:46 - How is Covariance calculated?
01:20:19 - Disadvantages of Covariance
01:27:34 - Covariance of a variable with itself
01:29:14 - Doubt clearance
01:30:17 - What problem does Correlation solve?
01:32:05 - What is correlation?
01:36:22 - Difference between correlation & causation
01:42:14 - Doubt clearance
01:44:36 - Visualizing multiple variables
01:51:17 - Session end & Doubt clearance

#datanalytics #Stats #statistics #SQL #descriptivestatistics #campusx #dsmp2023

Comments

We want a single like button on YouTube so that all the CampusX videos get liked automatically in one click, because every video on this channel shows what good content and hard work really mean, sir.

bhushanbowlekar

We don't need to round this percentile up to 100, because 99 is within the dataset and we can clearly see that 90% of the data is below 99.
If we take 100%, we are considering a data point that is not in the data, i.e. a number that 100% of the data lies below. That's why we used N+1 in the percentile-to-rank formula.
Somebody correct me if I am wrong.

anshikasharma
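
The N+1 point raised in the comment above can be checked with a small sketch. It assumes the percentile-position rule position = p/100 × (N+1) with linear interpolation between neighbouring sorted values, which is what the comment appears to refer to. The function names and the sample data are hypothetical.

```python
def percentile_position(p, n):
    """1-based position of the p-th percentile among n sorted values, (N + 1) rule."""
    return p * (n + 1) / 100


def percentile_value(data, p):
    """Value at the p-th percentile, interpolating linearly between neighbours."""
    values = sorted(data)
    n = len(values)
    pos = percentile_position(p, n)
    if pos < 1 or pos > n:
        # The position points before the first or past the last data point.
        raise ValueError(f"position {pos} falls outside the {n} data points")
    lower = int(pos)      # 1-based index of the value at or just below pos
    frac = pos - lower    # fractional distance towards the next value
    if frac == 0:
        return values[lower - 1]
    return values[lower - 1] + frac * (values[lower] - values[lower - 1])


scores = [12, 25, 31, 47, 52, 68, 73, 85, 91, 99]  # made-up data, N = 10

# For p = 100 the rule asks for position N + 1 = 11, one past the last value,
# so no data point can sit at the 100th percentile under this rule.
print(percentile_position(100, len(scores)))
print(percentile_value(scores, 50))
```

Under this rule the largest percentile that still maps to a real position is 100 × N/(N+1), which is why the maximum of the dataset never reaches 100.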

From your explanation of variance and standard deviation, and of covariance and correlation, it looks like there is a striking similarity between these two pairs of measures. Although all four serve different purposes, it is very easy to understand their differences once we understand the difference within one pair. For example, understanding the difference between variance and standard deviation makes it easy to follow how covariance and correlation differ. My humble suggestion is that teaching covariance and correlation immediately after variance and standard deviation would make them easier to follow. Your tutorials on stats are excellent. I come from a social science background, but nowhere did I feel that I was not understanding what you explained. Keep up the good work. It is a great service to those who want to learn math and stats from the basics to an advanced level.

shujashakir
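
The parallel described in the comment above can be made concrete with a minimal sketch, assuming the usual sample formulas (dividing by n - 1). Variance is the covariance of a variable with itself, and correlation is covariance rescaled by the two standard deviations. The helper names and the data are hypothetical.

```python
from math import sqrt


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Sample covariance: average product of deviations, divided by n - 1."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def variance(xs):
    """Variance is the covariance of a variable with itself."""
    return covariance(xs, xs)


def std_dev(xs):
    """Standard deviation brings variance back to the original units."""
    return sqrt(variance(xs))


def correlation(xs, ys):
    """Correlation removes the units from covariance by dividing by both spreads."""
    return covariance(xs, ys) / (std_dev(xs) * std_dev(ys))


hours = [1, 2, 3, 4, 5]    # made-up data
marks = [2, 4, 6, 8, 10]   # made-up data, exactly 2 * hours

print(covariance(hours, marks), correlation(hours, marks))

# Rescaling one variable changes the covariance but not the correlation.
marks_scaled = [m * 100 for m in marks]
print(covariance(hours, marks_scaled), correlation(hours, marks_scaled))
```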

As to why 1.5 is chosen for the IQR rule for outlier detection: it is an approximation that assumes the data is normally distributed. The Z values for the 25th and 75th percentiles are taken from the Z distribution table (-0.675 and 0.675 respectively). Then,
solving the equation 3 = 0.675 + Value*(0.675 - (-0.675)), which is nothing but applying the 3-sigma rule for a normal distribution to detect outliers,
we get Value = 1.72.

After a few more empirical tests (since this is an approximation that assumes the distribution is normal), statisticians observed that when the value of 1.72 is rounded down to 1.5, the result generalizes better, even to non-normal distributions. Hence the 1.5×IQR rule.

Tusharchitrakar
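
The arithmetic in the comment above can be reproduced with Python's standard library. This is only a sketch of that calculation under the same assumption (a standard normal distribution); the variable names are hypothetical, and it says nothing about how the 1.5 convention was actually chosen.

```python
from statistics import NormalDist

# Z value of the 75th percentile of a standard normal (about 0.6745);
# the 25th percentile is its mirror image.
z_q3 = NormalDist().inv_cdf(0.75)
z_q1 = -z_q3
iqr = z_q3 - z_q1

# Multiplier that would place the upper fence exactly at 3 sigma:
# 3 = z_q3 + k * iqr  =>  k = (3 - z_q3) / iqr
k_for_3_sigma = (3 - z_q3) / iqr

# Where the conventional multiplier of 1.5 puts the upper fence, in sigma units.
upper_fence_with_1_5 = z_q3 + 1.5 * iqr

print(round(k_for_3_sigma, 2))         # about 1.72
print(round(upper_fence_with_1_5, 2))  # about 2.7
```

So for normally distributed data, the 1.5×IQR fences sit slightly inside the 3-sigma limits.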

Great effort. Your teaching methods are really easy to absorb.

amreshmishra-clyk

You are one of the great teachers, sir. Love from Pakistan.

abdulqadar

The greatest teaching I have ever seen on YouTube for data science. Great work, sir.

AI-Brain-or-Mind

Best lecture, as always. Please also cover the different types of distributions, if not all then at least the important ones. Please do cover this soon; eagerly waiting for the lectures.

SupriyaSingh-llhb

In the case study you discussed at 1:41:00, it was observed that the number of heatstroke cases and the sales of ice cream were correlated, but that was because of the weather.

SauravKumar-stxo
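
The situation described in the comment above, two variables correlated only through a shared cause, can be illustrated with a small simulation. Everything here is synthetic and hypothetical: the numbers are generated, not taken from the lecture, and the first-order partial correlation is just one common way to account for a third variable.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic data: temperature drives both quantities; they never affect each other.
temperature = [rng.uniform(20, 45) for _ in range(500)]
ice_cream_sales = [10 * t + rng.gauss(0, 20) for t in temperature]
heatstroke_cases = [2 * t + rng.gauss(0, 5) for t in temperature]

raw = pearson(ice_cream_sales, heatstroke_cases)
adjusted = partial_corr(ice_cream_sales, heatstroke_cases, temperature)

print(f"raw correlation: {raw:.2f}")
print(f"correlation after accounting for temperature: {adjusted:.2f}")
```

The raw correlation comes out strong even though neither variable causes the other, while the value adjusted for temperature stays close to zero.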

Thank you, sir, for this series, which clears all our doubts about the statistics used in data science.

namansethi

Even paid courses do not teach these things. Great work, sir. Love you, sir; you teach from the roots.

AI-Brain-or-Mind

You are a great teacher for poor students. I salute you.

rajanabeel

Great explanation, sir. Thank you so much.

ParthivShah

Thank you, sir, for this wonderful lecture 🙏🙏

Shashank_Shekhar_Singh

This is far far better than any paid course

nrted

Sir, I request you to continue this series. Please don't change this batch to a paid batch.

unity-tghe

Can you create a video on SHAP values and their use in interpreting machine learning models?

milindtakate

1:41:35 Warning taken!! Don't go outside in the heat 😂 jk

flakky

Answer to the question asked at 53:17: using 1.5 times the IQR is a commonly used rule of thumb for identifying potential outliers in a dataset and for visualizing the spread of the data in a box plot. It's important to note that the choice of 1.5 times the IQR is somewhat arbitrary and can be adjusted based on the specific context and requirements of the analysis. Some analyses use different multiplier values (e.g., 1.0, 2.0) depending on the dataset and the desired level of sensitivity to outliers. Consider how sensitive you want your outlier detection to be. A smaller multiplier narrows the fences, so it is more sensitive and flags more data points as outliers, while a larger multiplier widens the fences and flags fewer points.

davidhenry
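
The effect of the multiplier can be seen in a short sketch of the IQR rule. It assumes quartiles computed with `statistics.quantiles` in its default "exclusive" mode; the function name `iqr_outliers` and the data are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(data, k=1.5):
    """Return the points lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(data, n=4)  # quartiles, default "exclusive" method
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]


values = [10, 12, 13, 14, 15, 16, 17, 18, 19, 26, 40]  # made-up data

# Q1 = 13 and Q3 = 19 here, so IQR = 6.
print(iqr_outliers(values, k=1.0))  # fences at 7 and 25: flags 26 and 40
print(iqr_outliers(values, k=1.5))  # fences at 4 and 28: flags only 40
print(iqr_outliers(values, k=4.0))  # fences at -11 and 43: flags nothing
```

A smaller multiplier pulls the fences inwards and flags more points; a larger one pushes them outwards and flags fewer.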

I have completed 2 lectures at 2x speed within 4 hrs

TechnicalDrMusic