Correlation Doesn't Mean Causation

You’ve probably heard something like “correlation doesn’t imply causation.” That’s true, but then what does imply causation? In short, the only way to really make a causal claim is to run a randomized experiment. If you do that, you can claim causation all day long. If you stick around, I’ll teach you what a randomized experiment is and how it can be used to make critically important causal claims, like whether a vaccine is effective at preventing a disease or whether a flashier YouTube thumbnail gets content creators more views. And, beyond that, yes, I’ll even explain why correlation doesn’t imply causation.

So here’s the game plan. I’m going to first very quickly explain what a correlation is, then explain why we REALLY care about making causal statements, show you what actually does allow you to make all-important causal claims, and, finally, show you why correlations don’t necessarily imply causation. I’ll do all this with as little jargon as possible and an emphasis on intuition.

I do all this without any fancy jargon or complex math. I focus on intuition and insights to help you understand complex, important, and timely topics in statistics, data science, and data analytics.

Comments

I have never seen someone explain with so many examples. Thank you for the lovely content.

aakritiaggarwal

You're the best. Glad I found you😊. Keep it coming!

navaneeth

Definitely one of your best videos so far!

Inigma

Thank you, professor, you really opened my mind about correlation and causation. Maybe you could show some examples using actual research papers?

jorgenetto

Great video. However, one improvement: to say something is a pseudo-causal claim — when it in fact does assert the existence of a causal relationship — is confusing. As is saying that we "cannot" assert (use, rely on) causal claims in the absence of these three criteria. We don't have perfect information and getting it is very costly; thus, we are best served by relying on a variety of what you would call pseudo-causal claims.

LSATAngel

Hi Jeff, first of all, thank you for this detailed and at the same time clear explanation. As an M.Sc. Data Science student, the concept of causation is something I always wanted to better understand.

So, based on your video and the general consensus, the “randomised experiment” framework is the best tool we currently have. You said that it also works well in real-world scenarios where we may have infinitely many possible factors that may be the cause. However, if I understood correctly, the “trick” is based on randomly picking our volunteers/subjects. What if our group itself doesn’t reflect the population (i.e. not enough variability because of selection bias or, more commonly, due to difficulty in conducting the experiment / finding candidates)? Is there a way to take these scenarios into account and attach some “uncertainty” (and quantify it) to our results, or will all experiments far from the ideal scenario fail to lead to any robust conclusion?

Thanks in advance and have a nice day!

KhaledHechmi

This is getting me to being able to write

TheReef