Walmart Labs SQL Interview Question for Senior Data Analyst Position | Data Analytics

preview_player
Показать описание
In this video we will discuss a very interesting problem where from a phone log history we need to find if the caller had done first and last call for the day to the same person. I will explain the solution step by step and encourage you to solve it yourself first before checking the solution.

Zero to hero(Advance) SQL Aggregation:

Most Asked Join Based Interview Question:

Solving 4 Trick SQL problems:

Data Analyst Spotify Case Study:

Top 10 SQL interview Questions:

Interview Question based on FULL OUTER JOIN:

Playlist to master SQL :

Rank, Dense_Rank and Row_Number:
script:
create table phonelog(
Callerid int,
Recipientid int,
Datecalled datetime
);

insert into phonelog(Callerid, Recipientid, Datecalled)
values(1, 2, '2019-01-01 09:00:00.000'),
(1, 3, '2019-01-01 17:00:00.000'),
(1, 4, '2019-01-01 23:00:00.000'),
(2, 5, '2019-07-05 09:00:00.000'),
(2, 3, '2019-07-05 17:00:00.000'),
(2, 3, '2019-07-05 17:20:00.000'),
(2, 5, '2019-07-05 23:00:00.000'),
(2, 3, '2019-08-01 09:00:00.000'),
(2, 3, '2019-08-01 17:00:00.000'),
(2, 5, '2019-08-01 19:30:00.000'),
(2, 4, '2019-08-02 09:00:00.000'),
(2, 5, '2019-08-02 10:00:00.000'),
(2, 5, '2019-08-02 10:45:00.000'),
(2, 4, '2019-08-02 11:00:00.000');

#sql #interview #walmart #dataanalyst
Рекомендации по теме
Комментарии
Автор

Hi Ankit my sql skills have really improved after watching your videos..thankyou for this amazing content
solution without join:
with cte as (select *,
cast(Datecalled as date) as dated
from phonelog),
cte2 as (select *,
case when first_value(Recipientid) over(partition by Callerid, dated order by Datecalled) =
first_value(Recipientid) over(partition by Callerid, dated order by Datecalled desc) then 0 else 1 end as flg
from cte)
select Callerid, dated from cte2
group by Callerid, dated
having max(flg)=0

sathwikdhanya
Автор

Thanks Ankit for wonderful explanation, here is my try.

WITH cte AS (SELECT Callerid, Recipientid, CAST(Datecalled AS DATE) as Datecalled
FROM phonelog),
cte2 AS (SELECT *,
FIRST_VALUE(Recipientid) OVER(PARTITION BY Datecalled ORDER BY Datecalled) as first_value,
LAST_VALUE(Recipientid) OVER(PARTITION BY Datecalled ORDER BY Datecalled) as last_value
FROM cte)

SELECT Callerid, Datecalled, MAX(first_value) AS Recipientid FROM cte2
WHERE first_value = last_value
GROUP BY Callerid, Datecalled;

jameskhan
Автор

Hello Ankit, thank you for your help. I resolved the question from de following way:

with cte as(
select *,
FIRST_VALUE(Recipientid) over (partition by Convert(DATE, Datecalled) order by Datecalled) as first_call,
LAST_VALUE(Recipientid) over (partition by Convert(DATE, Datecalled) order by Datecalled rows between current row and unbounded following) as last_call
from phonelog)
select distinct Callerid, first_call, last_call,
FIRST_VALUE(Datecalled) over (partition by first_call order by Datecalled) as first_call_date,
LAST_VALUE(Datecalled) over (partition by first_call order by Datecalled rows between current row and unbounded following) as last_call_date
from cte
where first_call = last_call;

What do you thing? is a valid option for a Interview answer?
Very grateful for everything you do. Regards

ezequielsaldana
Автор

You could remove the joins and use window functions instead, it is usually preferred over doing "self" joins anyways. I also think it works faster from a performance standpoint, which makes a huge difference for extremely large datasets.

dhruvemital
Автор

Great Explanation .I approached using window function
with cte as
(
select *,
cast(Datecalled as date) as Dates,
FIRST_VALUE(Recipientid) over(partition by callerid, cast(Datecalled as date) order by Datecalled) First,
LAST_VALUE(Recipientid) over(partition by callerid, cast(Datecalled as date) order by Datecalled range between unbounded preceding and unbounded following) Last
from phonelog
)

select distinct callerid, Dates, First
from cte
where First = Last;

sonalijain
Автор

Thanks for informative video. My solution using window func:
with cte as
(Select Callerid, Recipientid, cast(Datecalled as date) date,
first_value(Recipientid) over( partition by Callerid, cast(Datecalled as date) order by Datecalled) as first,
first_value(Recipientid) over( partition by Callerid, cast(Datecalled as date) order by Datecalled desc) as last
from phonelog)
Select Callerid, date, max(first) as Recipientid
from cte
where first = last
group by Callerid, date

nikharjain
Автор

Thanks Ankit for explanation, this is my below approach using window function

select Callerid, Recipientid, cast(Datecalled as date) from(
select
*
, first_Value(Recipientid) over(partition by callerid, cast(Datecalled as date) order by Datecalled) as fval
, last_Value(Recipientid) over(partition by callerid, cast(Datecalled as date) order by Datecalled
rows between unbounded preceding and UNBOUNDED FOLLOWING ) as lval
from phonelog) as a
where Recipientid = a.fval and Recipientid = a.lval
group by Callerid, Recipientid, cast(Datecalled as date)
having count(cast(Datecalled as date)) > 1

ParthKhambhayta
Автор

I started finding your videos very addictive, keep making more such great videos, Thanks!

akashvishwakarma
Автор

Hello Ankit, Thank you for sharing the question. Here is my approach:
with cte as
(select x.*,
case when n = 1 then 'first_call'
when n = max(n) over (partition by callerid, date) then 'last_call'
end flag
from
(select t.*,
row_number() over (partition by t.callerid, t.date)n from
(select *, date(datecalled) date from phonelog)t)x),

cte2 as
(select z.*,
case when recipientid = lag(recipientid) over (partition by callerid) then 1
else 0
end ck
from (select callerid, datecalled, recipientid from cte
where flag is not null)z)

select callerid, datecalled, recipientid from cte2 where ck = 1;

BismarckWangkhem
Автор

With you video am getting more interest in SQL . Your way of explaing is very simple ... Thank you

vikasvk
Автор

Ankit thanks for solution. I tried it this way below:
with cte as(select *,
first_value(Recipientid) over(partition by cast(Datecalled as date) order by Datecalled) as first_recipient,
last_value(Recipientid) over(partition by cast(Datecalled as date) order by Datecalled) as last_recipient,
rank() over(partition by cast(Datecalled as date) order by cast(Datecalled as time) desc) as time_rnk,
count(*) over(partition by cast(Datecalled as date)) as daywise_count
from phonelog)
select Callerid, cast(Datecalled as date) as date_called, Recipientid
from cte
where time_rnk=1 and and daywise_count>1

MixedUploader
Автор

with cte as (select *, ROW_NUMBER()over(partition by callerid, cast(datecalled as date) order by Datecalled) as rn,
ROW_NUMBER()over(partition by callerid, cast(datecalled as date) order by Datecalled DESC) as rnk
from phonelog),

cte2 as (select * from cte
where rn=1 or rnk=1)

select callerid, recipientid
from cte2
group by callerid, recipientid
having count(*)>1

GulshanKumar-wjoe
Автор

great explanation this one is mine

with cte as (select *
, cast(datecalled as date ) as called_date
, count(callerid) over(partition by callerid, cast(datecalled as date )) as cnt
from phonelog
)
, cte2 as (select *
, lead(Recipientid, cnt - 1) over (partition by called_date, callerid order by callerid) as last_called
from cte
)
select callerid, called_date, Recipientid
from cte2
where Recipientid = last_called
group by called_date, callerid, Recipientid
;

rahulkanojiya
Автор

Love the way you explain the query of sql sir and your soft revert of the comments of the people.

insightMantra
Автор

Thanks Ankit, here is my approach with window functions:
with cte as(
select *,
cast(Datecalled as date) as caller_date,
first_value(Recipientid) over (partition by Callerid, cast(Datecalled as date) order by Datecalled)
as first_recipient,
first_value(Recipientid) over (partition by Callerid, cast(Datecalled as date) order by Datecalled desc)
as last_recipient
from phonelog )
select Callerid, caller_date,
first_recipient
from cte where
group by Callerid, caller_date,
first_recipient

sourabhgupta
Автор

it's my third time solving this question and each time came up with a different sol and I guess this is better optimized..

this time used first and last value window function and a cte
with cte as (
Select Callerid, Recipientid, CAST(datecalled as date) as Date_Col,
FIRST_VALUE(Recipientid) over(partition by callerid, CAST(datecalled as date) order by datecalled ) as first_call,
Last_VALUE(Recipientid) over(partition by callerid, CAST(datecalled as date) order by datecalled rows between unbounded Preceding and unbounded following) as last_call
, count(*) over(partition by callerid, CAST(datecalled as date) order by datecalled rows between unbounded Preceding and unbounded following) as Total_Calls
from phonelog
)
Select distinct Callerid, Recipientid, Date_Col from cte
where Recipientid = first_call
and Recipientid = last_call
and Total_Calls > 1;

Datapassenger_prashant
Автор

Thanks ankit for this clear explaination...
My Approach using Window functions with min, max and lag

with cte as (
select *, min(Datecalled) over(partition by date(Datecalled)) as first_call, max(Datecalled) over(partition by date(Datecalled)) as last_call from phonelog),
cte1 as(
select *, lag(Recipientid) over() as lags from cte where Datecalled=first_call or Datecalled=last_call)
select Callerid, Recipientid, Datecalled, first_call, last_call from cte1 where Recipientid=lags;

mranaljadhav
Автор

Awesome! After taking your course, these questions have become a cakewalk.

NilayMukhopadhyay
Автор

select callerid, recipientid, datecalled from(
with cte as(
select *, month(datecalled) as month from phonelog
)
select *, first_value(recipientid) over(partition by month order by datecalled) as first_call
, last_value(recipientid) over(partition by month) as last_call from cte) as a where first_call=last_call and recipientid=last_call;

Manjunath_
Автор

Hi Ankit, very grateful for your efforts to keep bringing the interesting problems.

I thought to give it a try without looking at the solution.

Mysql Solution:

with first_last_call_log as (
select *,
concat(Callerid, '-', Recipientid) caller_recept,
min(datecalled) over (partition by date_format(datecalled, '%Y-%m-%d')) first_call_time,
max(datecalled) over (partition by date_format(datecalled, '%Y-%m-%d')) last_call_time
from phonelog),
min_max_dates_calls as (
select Callerid, first_call_time, Recipientid, count(caller_recept) from first_last_call_log
where datecalled = first_call_time or datecalled = last_call_time
group by Callerid, Recipientid having count(caller_recept)>1)
select Callerid, date_format(first_call_time, '%Y-%m-%d') called_date, Recipientid from min_max_dates_calls;

rakeshchaudhary
join shbcf.ru