Google SQL Interview Question - Calculate Correlation Coefficient


====== ✅ Details ======

🤔 "How would you calculate the correlation coefficient of two columns, x1 and x2, in the table?"

Here's a demonstration of how Dan, a former Google/PayPal data scientist, would approach a SQL problem that requires both statistical and SQL knowledge. SQL questions that draw on statistical concepts are fair game in many interviews at FAANG companies and top startups.

👍 Make sure to subscribe, like and share!

====== ⏱️ Timestamps ======

0:00 Intro
01:15 SQL problem
03:42 Solution walkthrough
17:10 Conclusion

====== 📚 Other Useful Content ======

1. Principles and Frameworks of Product Metrics | YouTube Case Study

2. How to Crack the Data Scientist Case Interview

3. How to Crack the Amazon Data Scientist Interview

====== Connect ======

Comments

This type of video is perfect, showing the raw process of the interview. Skipping the editing of the video makes it more authentic. Even the light going off in the middle of the video mimics the random "black swans" of an interview, where the candidate has to deal with additional stress and keep solving the problem regardless of external circumstances. Great explanation of how to solve the task with SQL. Thank you for this material.

lucyk

Absolutely brilliant. You presented a very clean solution. Congratulations! I subscribed.

elatedbento

Really appreciate the effort put into the video despite the recurring cough. Looking forward to the course and upcoming videos. Cheers.

goonerboi

Hi, may I know why AVG is used for calculating the standard deviation instead of the sum of Xi - X(mean)?

khushbumehta
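
One way to look at the AVG-versus-SUM question above: in the correlation coefficient, the same 1/n factor appears in the covariance (numerator) and in the product of the standard deviations (denominator), so it cancels. The Python sketch below is not the video's solution; the function names and the sample data are made up for illustration, and it only checks that averaging and summing the deviation terms give the same r.

```python
import math

def avg(values):
    """Arithmetic mean, standing in for SQL's AVG."""
    return sum(values) / len(values)

def pearson_r(xs, ys, agg):
    """Pearson r from the definition; `agg` aggregates the deviation terms."""
    mean_x = avg(xs)
    mean_y = avg(ys)
    cov = agg([(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)])
    var_x = agg([(x - mean_x) ** 2 for x in xs])
    var_y = agg([(y - mean_y) ** 2 for y in ys])
    return cov / math.sqrt(var_x * var_y)

# Hypothetical sample columns x1 and x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]

r_avg = pearson_r(x1, x2, avg)  # deviation terms averaged (AVG)
r_sum = pearson_r(x1, x2, sum)  # deviation terms summed (SUM)
print(r_avg, r_sum)
```

Both calls return the same value up to floating-point rounding, so the choice affects the intermediate covariance and variance figures but not the final coefficient.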

Can you use corr in SQL for correlation? Thanks!

miaoxie

Thanks for the video. The calculation used is not very efficient: you basically go through the data twice, since you precalculate the means. The alternative is to use cov = E[xy] - E[x]E[y] and v = E[x^2] - (E[x])^2. Then you only need to go through the data once, calculating the sums of x, y, x^2, y^2, xy, and 1.

jingangmiao
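
The single-pass idea in the comment above can be sketched roughly as follows. This is an illustration, not the video's solution: the table name `t`, the in-memory SQLite database, and the sample rows are assumptions made so the example runs on its own. The square root is taken in Python because SQLite's math functions are not available in every build.

```python
import math
import sqlite3

# Hypothetical table `t` with columns x1 and x2, held in memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x1 REAL, x2 REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)],
)

# One scan of the table collects every sum the formulas need.
n, sx, sy, sxx, syy, sxy = conn.execute(
    """
    SELECT COUNT(*),
           SUM(x1), SUM(x2),
           SUM(x1 * x1), SUM(x2 * x2),
           SUM(x1 * x2)
    FROM t
    """
).fetchone()

# cov = E[xy] - E[x]E[y], v = E[x^2] - (E[x])^2
cov = sxy / n - (sx / n) * (sy / n)
var_x = sxx / n - (sx / n) ** 2
var_y = syy / n - (sy / n) ** 2

r = cov / math.sqrt(var_x * var_y)
print(r)
```

One trade-off to keep in mind: subtracting two large, nearly equal quantities can lose floating-point precision when the means are large relative to the spread, which is one reason the two-pass form is sometimes preferred despite the extra scan.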