A/B Testing Analysis Made Easy: How to Use Hypothesis Testing for Data Science Interviews!

This video is part 2 of hypothesis testing problems in data science interviews.

✔️ Part 1 of hypothesis testing problems in data science interviews:

🟢Get all my free data science interview resources

// Comment
Got any questions? Something to add?
Write a comment below to chat.

// Let's connect on LinkedIn:

====================
Contents of this video:
====================
00:00 Intro
00:39 Two-sample test of proportions
04:40 Statistical significance
05:11 Practical significance
07:11 Two-sample test of means
10:19 Welch's t-test
Comments

Correction:
Thanks Yidan Shang -- At 9:12, the S_pool calculated should be 1.06 instead of 1.099.
Thanks Ruby Jiang -- At 7:38, the mean of treatment is 1.7 instead of 2. The subsequent calculations should be changed accordingly.

emma_ding

What I love about your channel is that you don't charge $300 to unemployed job seekers for this information.

Quasar_Energy

My hiring manager actually recommended this series of videos. Super helpful for someone who doesn't have much business experience. Thank you!

CityInvisible

I recommended you to all my classmates. Excellent work and presentation of what is needed!

zbear

Hi Emma - Thank you so much for sharing! I have learnt a lot. I would like to clarify the formula you used to calculate the SE in the confidence interval for two sample proportions (4:12). The formula you used is the SE from the test statistic of the two-proportion z-test, but the SE for the CI should be different: sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2). Please correct me if I am wrong here. Thanks

Crtg
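
The distinction raised in the comment above can be sketched in a few lines of Python. This is a minimal illustration, not the video's own calculation: the function name and the counts are hypothetical, chosen only to show that the pooled SE (commonly used in the z-test statistic under the null) and the unpooled SE (commonly used for the confidence interval) are different quantities.

```python
import math

def two_proportion_standard_errors(x1, n1, x2, n2):
    """Return (pooled_se, unpooled_se) for the difference of two proportions.

    x1, x2 are success counts; n1, n2 are sample sizes.
    """
    p1, p2 = x1 / n1, x2 / n2
    # Pooled estimate: assumes both groups share one rate (the null hypothesis).
    p_pool = (x1 + x2) / (n1 + n2)
    pooled_se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    # Unpooled estimate: each group keeps its own rate.
    unpooled_se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return pooled_se, unpooled_se

# Hypothetical counts, not taken from the video.
pooled_se, unpooled_se = two_proportion_standard_errors(100, 1000, 130, 1000)
print(f"pooled SE   = {pooled_se:.6f}")
print(f"unpooled SE = {unpooled_se:.6f}")
```

With large samples and similar rates the two values are close, which is why the choice often does not change the conclusion.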

The CI in the second case comes out inside the practical boundary for me. Am I missing something? CI for d = 0.633 ± 2.2018 (multiplying 2.002 * 1.0998), so the range is -1.56 to 2.83. Really helpful video. Thanks Emma!

hieification

Great content!!! The best explanation of z-test and T-test on YouTube! Great examples!!! Feel very lucky to find you here🙏! Thank you!!!

TiantianGao

Hi Emma, thank you for your valuable video. In the video at 5:57, dmin = 0.05, but at another point dmin = 0.01. I am a little confused about the value of dmin. Why was it changed from 0.05 to 0.01? Is the value chosen arbitrarily, or can it be calculated with some formula? Thanks

sijunjiang

Thank you Emma, I am learning a lot, God bless you! I finally feel I understand this topic.

star_

Great explanation Emma! Nice work! For the second case, can you show the formula and calculation behind the conclusion that the lower bound is greater than dmin = 0.05?

viviangong

Thank you so much, Emma! I had trouble combining hypothesis testing and A/B testing knowledge, and your video saved me!!!

yueleji

Really great video thanks 😋 I appreciate the effort 🙏

jithendrayenugula

Hi Emma, I really appreciate all the great videos you have made; they are very helpful. I was wondering if you could make some videos about how to handle take-home challenges, such as those from Lyft and Airbnb. Any information would be highly appreciated! Thanks

leizhang

The quality of the content is top notch! Thanks for making these videos. Looking forward to learning more from you.

sooryaprakash

All your videos are a gold mine.
Keep up the good work.

arojitdas

Thanks a lot for your channel! It helped me to prepare and get a job offer! :)

proolga

Hey Emma, great video and love the decision flow chart. One quick question - at 9:12, why is S_pool = 1.099? I calculated the pooled variance as 1.13, and the square root of that would be 1.06 instead of 1.099.

yidanshang
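
The pooled standard deviation discussed in the comment above can be checked with a short Python sketch. The helper `pooled_std` is a hypothetical name and uses the standard textbook pooled-variance formula; the video's raw sample data is not reproduced here, so only the final square-root step uses a number from the comment (pooled variance = 1.13).

```python
import math

def pooled_std(s1, n1, s2, n2):
    """Pooled standard deviation of two samples.

    s1, s2 are sample standard deviations; n1, n2 are sample sizes.
    """
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)

# Sanity check with made-up inputs: equal standard deviations pool to the same value.
print(pooled_std(2.0, 10, 2.0, 20))

# Square root of the pooled variance quoted in the comment (1.13).
s_pool_from_comment = math.sqrt(1.13)
print(round(s_pool_from_comment, 2))  # 1.06
```

This agrees with the 1.06 value in the pinned correction.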

Did anyone else try to reproduce the results of the two-sample test of means? I get that the mean of the treatment is 1.7 (not 2.0 like in the video). This changes the conclusion: the result is not statistically significant.

I think it's a mistake, since my calculation for the pooled standard error (which uses the standard deviation of the treatment) matches perfectly.

elderpinzon

Hi Emma, how did you calculate the CI in the second example exactly when assuming similar variance for both control and treatment? I got the margin of error = t-score*SEpool = 2.002*1.0999 = 2.202. Then with \hat{d} = 0.6, which would then give a very big CI that includes the entire [-dmin, dmin] = [-0.05, 0.05]. But it seems like in your slides the CI is strictly on the right side of [-dmin, dmin]. I'm very confused and would appreciate some help! Otherwise, your videos have been super super helpful!! Thanks.

dantongzhu
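
The arithmetic in the comment above can be reproduced directly. The sketch below only plugs in the numbers quoted by the commenter (t-score 2.002, SE 1.0999, estimated difference 0.6, dmin 0.05); the variable names are illustrative, and it does not attempt to recompute those inputs from the video's data.

```python
# Values quoted in the comment, not recomputed from raw data.
t_crit = 2.002   # critical t-score
se = 1.0999      # standard error as quoted by the commenter
d_hat = 0.6      # estimated difference in means
d_min = 0.05     # practical significance boundary

margin = t_crit * se
ci_low, ci_high = d_hat - margin, d_hat + margin

# True when the interval contains the whole band [-d_min, d_min].
covers_practical_band = ci_low < -d_min and ci_high > d_min

print(f"margin of error = {margin:.3f}")
print(f"CI = ({ci_low:.3f}, {ci_high:.3f})")
print(f"covers [-d_min, d_min]: {covers_practical_band}")
```

With these inputs the interval does span the entire practical-significance band, as the commenter describes.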

Why do we use the same z value that was used for the p-value to calculate the confidence interval? Should we not choose the z value for the confidence interval based on the 0.01 practical significance boundary (6:50)?

korkutkaynardag
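
On the question above: in the usual setup, the critical z value comes from the confidence level (alpha), while the practical significance boundary is only compared against the resulting interval afterwards. The sketch below illustrates that separation using Python's standard library. The values for `d_hat` and `se` are hypothetical, not taken from the video; only `d_min = 0.01` matches the boundary mentioned in the comment.

```python
from statistics import NormalDist

alpha = 0.05  # significance level, giving a 95% confidence interval
# Two-sided critical value, determined by alpha alone.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Hypothetical estimate and standard error, for illustration only.
d_hat = 0.02
se = 0.004
d_min = 0.01  # practical significance boundary from the comment

ci_low = d_hat - z_crit * se
ci_high = d_hat + z_crit * se

# d_min enters only here, when the interval is interpreted.
practically_significant = ci_low > d_min

print(f"z critical = {z_crit:.3f}")
print(f"CI = ({ci_low:.4f}, {ci_high:.4f})")
print(f"lower bound above d_min: {practically_significant}")
```

Changing `d_min` changes the interpretation of the interval but leaves `z_crit` and the interval itself untouched.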