Facebook and Microsoft Data Science SQL interview question walkthrough (advanced)

This data science SQL interview question from Facebook and Microsoft tests your ability to find and segment users and to join aggregated tables together. Let's walk through solving it as we would in an interview.

______________________________________________________________________

Timeline:

Intro: (0:00)
Interview Question: (0:32)
5 Step Framework: (1:20)
Explore the Data: (2:06)
List Assumptions: (2:40)
Outline Approach: (5:20)
Code in Increments: (7:32)
Calculate New Users: (7:50)
Calculate All Users: (11:02)
Join Tables: (11:36)
Calculate User Share: (12:54)
Optimize Code: (14:15)
Conclusion: (16:20)
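
The coding steps in the timeline (new users, all users, join, user share) can be sketched roughly as follows. This is not the query from the video: it is a minimal reconstruction run on SQLite through Python's `sqlite3` module, and the table name `fact_events`, the sample rows, and the output column names are assumptions made for illustration. Only `user_id` and `time_id` are named in the comments.

```python
import sqlite3

# Hypothetical sample data: one row per user event.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_events (user_id TEXT, time_id TEXT)")
conn.executemany(
    "INSERT INTO fact_events VALUES (?, ?)",
    [
        ("a", "2020-01-05"), ("a", "2020-01-20"), ("b", "2020-01-10"),
        ("a", "2020-02-03"), ("c", "2020-02-11"), ("d", "2020-02-15"),
    ],
)

query = """
WITH first_seen AS (   -- first activity date per user
    SELECT user_id, MIN(time_id) AS first_date
    FROM fact_events
    GROUP BY user_id
),
new_users AS (         -- users whose first activity falls in each month
    SELECT strftime('%Y-%m', first_date) AS month, COUNT(*) AS new_users
    FROM first_seen
    GROUP BY month
),
all_users AS (         -- distinct users active in each month
    SELECT strftime('%Y-%m', time_id) AS month,
           COUNT(DISTINCT user_id) AS all_users
    FROM fact_events
    GROUP BY month
)
SELECT a.month,
       n.new_users * 1.0 / a.all_users     AS share_new,
       1 - n.new_users * 1.0 / a.all_users AS share_existing
FROM all_users a
JOIN new_users n ON n.month = a.month
ORDER BY a.month
"""

rows = conn.execute(query).fetchall()
for month, share_new, share_existing in rows:
    print(month, round(share_new, 3), round(share_existing, 3))
```

In this sketch the "current month" is the month of each grouped row, not the month in which the query runs, and the existing-user share is taken as the complement of the new-user share.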
______________________________________________________________________

Contact:

If you have any questions, comments, or feedback, please leave them here!
______________________________________________________________________

#SQLInterview
Comments
Author

I don’t know how you do it but after watching one of your videos I always come up with a solution for one of my coding problems. I am new on my job as a business analyst and am setting up dashboards for the company. Thank you so much for this quality content - keep up the good work.

TheMHud
Author

Hey, just found out about your channel, been going through videos almost all day and wanted to give my thanks! The content is amazing.

mohitnisar
Author

Very clear explanation. Thanks Nate for creating a video for this problem.

PATRICKCHUAD
Author

Impressive.
The people who work with you and especially the team you lead are certainly very lucky.

joaopedroreissilva
Author

Your videos are helpful. Insights on how to break a problem statement are very detailed.

myrandomandboringvideos
Author

Good video. The only catch is that you don't need the 'distinct' keyword in the first CTE, because it's already grouped by user_id and user_id will always be distinct in that case.

jerrygong
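
A small check of the point made in the comment above, under assumptions: the table name `fact_events` and the sample rows are hypothetical, and the CTE is a guess at the shape of the one in the video (one row per `user_id` with `MIN(time_id)`). Once the rows are grouped by `user_id`, a plain count and a distinct count over them should agree.

```python
import sqlite3

# Hypothetical sample data; user "a" appears twice in the same month.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_events (user_id TEXT, time_id TEXT)")
conn.executemany(
    "INSERT INTO fact_events VALUES (?, ?)",
    [
        ("a", "2020-01-05"), ("a", "2020-01-20"),
        ("b", "2020-01-10"), ("c", "2020-02-11"),
    ],
)

query = """
WITH first_seen AS (   -- GROUP BY user_id leaves exactly one row per user
    SELECT user_id, MIN(time_id) AS first_date
    FROM fact_events
    GROUP BY user_id
)
SELECT strftime('%Y-%m', first_date) AS month,
       COUNT(user_id)          AS plain_count,
       COUNT(DISTINCT user_id) AS distinct_count
FROM first_seen
GROUP BY month
ORDER BY month
"""

rows = conn.execute(query).fetchall()
for month, plain_count, distinct_count in rows:
    print(month, plain_count, distinct_count)
```

The distinct count is still needed when counting users directly from the raw event rows, where a user can appear several times in one month.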
Author

I love your videos narrating your mind map of creating queries, which is very helpful for SQL learners!

leonardlau
Author

Great video! I've learned a lot through your videos! Thanks. Please keep posting this kind of video.

oliviaou
Author

I have been doing a similar task at work using my colleague's shorter approach. In the CTE I calculate the new user date, then in the main query I calculate shares like this: sum(iif(new user date>=month, 1, 0))/count(user_id), and group by month. Fewer subqueries; would it be quicker as well? And thanks for the great videos, looking forward to solving many problems with the window functions you taught.

Nadyusa
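
One way to read the approach in the comment above is conditional aggregation: compute each user's first month in a CTE, then count new users and all users in a single grouped query. The sketch below is an interpretation, not the commenter's exact code. It uses `CASE` instead of `iif`, compares months for equality, and counts distinct users because the raw events can repeat a user within a month. The table name `fact_events` and the sample rows are assumptions.

```python
import sqlite3

# Hypothetical sample data; March has one returning user and no new users.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_events (user_id TEXT, time_id TEXT)")
conn.executemany(
    "INSERT INTO fact_events VALUES (?, ?)",
    [
        ("a", "2020-01-05"), ("a", "2020-01-20"), ("b", "2020-01-10"),
        ("a", "2020-02-03"), ("c", "2020-02-11"), ("d", "2020-02-15"),
        ("a", "2020-03-01"),
    ],
)

query = """
WITH first_seen AS (   -- month of each user's first event
    SELECT user_id, strftime('%Y-%m', MIN(time_id)) AS first_month
    FROM fact_events
    GROUP BY user_id
)
SELECT strftime('%Y-%m', e.time_id) AS month,
       -- users whose first month equals the event month count as new
       COUNT(DISTINCT CASE
                 WHEN f.first_month = strftime('%Y-%m', e.time_id)
                 THEN e.user_id
             END) * 1.0
       / COUNT(DISTINCT e.user_id) AS share_new
FROM fact_events e
JOIN first_seen f ON f.user_id = e.user_id
GROUP BY month
ORDER BY month
"""

rows = conn.execute(query).fetchall()
for month, share_new in rows:
    print(month, round(share_new, 3))
```

Whether this runs faster than joining two aggregated tables depends on the database engine and the data, so it is worth comparing query plans rather than assuming.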
Author

Just came across your videos and I really like your explanations! I was checking out StrataScratch and was wondering how the difficulty level maps to the difficulty level in Leetcode Database. Are your “Hard” questions harder than Leetcode “Hard”?

maheshchandra
Author

Hello Nate, quite a good video. I just wanted to know: are the types of questions you share useful for Data Engineer SQL rounds as well?

kamakshijoshi
Author

You don't need the count(distinct user_id) at 9:47. You did a min in your inner query, which keeps one instance of that column per user_id. You said a user could show up multiple times in the same month, but the min takes care of that.

ismafoot
Author

Why do we need to use distinct user_id? We have already filtered the data in the subquery, grouped by user_id, and fetched min(time_id).
So even if a user has multiple entries in the same month, the min(time_id) step returns only one row for that user.

sujayshashank
Author

Great video, Nate! I am a new grad and I have a question: whenever I try to solve a medium or hard problem, I can't seem to think beyond 'joining' tables. I can't seem to think of using CTEs or subqueries. Can you point me to any resources or use cases where I can learn to use CTEs/subqueries intuitively? Also, do CTEs and subqueries have the same use case?

SuperLOLABC
Author

Hey Nate, is it necessary to know the time complexity of queries during the interview? I understand it is expected in a coding interview but is it necessary for SQL interviews too?

SuperLOLABC
Author

@Nate Could you please post some more data science or data engineer SQL questions?

priyankalad
Author

Hi Nate, I think there's one thing wrong about this query. What if for a month there are 0 new users? By doing an inner join you're missing out on that logic.

In my opinion, the right way to do it is a left join from all users to new users, assigning a 0 wherever there's a null (in the new users table).

Let me know what you think.

vivekkamisetti
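
The difference raised in the comment above can be seen on a small example. This is a sketch under assumptions: the table name `fact_events`, the CTE names, and the sample rows are hypothetical, chosen so that the second month has an active user but no new users. The same query is run once with an inner join and once with a left join plus `COALESCE`.

```python
import sqlite3

# Hypothetical sample data: February has activity but no first-time users.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_events (user_id TEXT, time_id TEXT)")
conn.executemany(
    "INSERT INTO fact_events VALUES (?, ?)",
    [("a", "2020-01-05"), ("b", "2020-01-10"), ("a", "2020-02-03")],
)

template = """
WITH first_seen AS (
    SELECT user_id, MIN(time_id) AS first_date
    FROM fact_events
    GROUP BY user_id
),
new_users AS (
    SELECT strftime('%Y-%m', first_date) AS month, COUNT(*) AS new_users
    FROM first_seen
    GROUP BY month
),
all_users AS (
    SELECT strftime('%Y-%m', time_id) AS month,
           COUNT(DISTINCT user_id) AS all_users
    FROM fact_events
    GROUP BY month
)
SELECT a.month,
       COALESCE(n.new_users, 0) * 1.0 / a.all_users AS share_new
FROM all_users a
{join} new_users n ON n.month = a.month
ORDER BY a.month
"""

inner_rows = conn.execute(template.format(join="JOIN")).fetchall()
left_rows = conn.execute(template.format(join="LEFT JOIN")).fetchall()
print("inner join:", inner_rows)
print("left join: ", left_rows)
```

With the inner join the February row disappears; with the left join it is kept with a new-user share of 0.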
Author

"New users are defined as users who started using services in the current month." But where are we checking that condition? Your solution checks the min value of the time_id column. As per my understanding, the current month should be the month in which we execute the query. Could you please explain?

ajeetnv
Author

Hi Nate, the problem statement says ratio of new users to "existing users". I was wondering whether we should exclude the new users from all users to get the existing users? Thanks

VinodKumar-nngo
Author

Nate, could you post a tutorial about data modelling questions?

anumitamondal