Tricky SQL Interview Problem Asked in Amazon

In this video we discuss a SQL interview problem asked at Amazon for a Business Intelligence Engineer position.

Here is the script:
create table purchase_history
(userid int
,productid int
,purchasedate date
);
SET DATEFORMAT dmy;
insert into purchase_history values
(1,1,'23-01-2012')
,(1,2,'23-01-2012')
,(1,3,'25-01-2012')
,(2,1,'23-01-2012')
,(2,2,'23-01-2012')
,(2,2,'25-01-2012')
,(2,4,'25-01-2012')
,(3,4,'23-01-2012')
,(3,1,'23-01-2012')
,(4,1,'23-01-2012')
,(4,2,'25-01-2012')
;
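The description does not spell out the task. Judging from the comments below, it is to return the users who purchased on more than one date and never bought the same product on two different dates. A minimal sketch under that assumption (the CTE name repeat_buyers is made up for illustration, not taken from the video):

```sql
-- Assumed task, inferred from the comments: users with purchases on more than
-- one date who never bought the same product on two different dates.
with repeat_buyers as (
    -- users who bought at least one product on more than one distinct date
    select userid
    from purchase_history
    group by userid, productid
    having count(distinct purchasedate) > 1
)
select userid
from purchase_history
where userid not in (select userid from repeat_buyers)
group by userid
-- keep only users who purchased on more than one date
having count(distinct purchasedate) > 1;
```

With the sample rows above this returns userid 1 and 4: user 2 bought product 2 on both dates, and user 3 purchased on a single date.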
Zero to hero(Advance) SQL Aggregation:

Most Asked Join Based Interview Question:

Solving 4 Trick SQL problems:

Data Analyst Spotify Case Study:

Top 10 SQL interview Questions:

Interview Question based on FULL OUTER JOIN:

Playlist to master SQL :

Rank, Dense_Rank and Row_Number:

#sql #dataengineer
Comments
Author

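My approach, kudos to you for your contribution 🥇🥇
with cte as (
select *, rank() over(partition by userid, productid order by purchasedate) as same_product
from purchase_history )

select userid from cte
-- same_product > 1 marks a product bought again on a later date
where userid not in (select userid from cte where same_product > 1)
group by userid
-- the having condition is missing in the original comment; assumed from the other solutions
having count(distinct purchasedate) > 1;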

senthilkumarjeyaraj
Author

Hi Ankit...Thanks for the content
My approach
WITH all_data AS(
SELECT *, DENSE_RANK()OVER(PARTITION BY userid, productid ORDER BY purchasedate ASC) AS rn
FROM purchase_history)
SELECT userid
FROM all_data
GROUP BY userid
HAVING max(rn)=1 AND count(distinct purchasedate)>1

mohdtoufique
Author

My take on this problem -
WITH CTE AS(
SELECT *, RANK() OVER(PARTITION BY userid, productid ORDER BY purchasedate) AS rn
FROM purchase_history
)
SELECT userid
FROM CTE
WHERE userid NOT IN(
SELECT userid
FROM CTE
WHERE rn != 1
)
GROUP BY userid
HAVING COUNT(DISTINCT purchasedate)>1;

shlokagrawal
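Author

select userid
from purchase_history
where userid not in (
select userid
-- the derived table gets an alias (t) because SQL Server and MySQL require one
from (select p.*, row_number() over(partition by userid, productid order by purchasedate) as rn from purchase_history p) t
where rn > 1
)
group by userid
-- the having condition is cut off in the original comment; assumed from the other solutions
having count(distinct purchasedate) > 1;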

kannanramachandran
Author

with cte as(
select *, count(1) over(partition by userid, productid) as cnt from purchase_history),
cte2 as(
select * from cte where userid not in (select userid from cte where cnt>1))
select userid from cte2 group by userid having count(distinct purchasedate)>1;

rahulmehla
Author

with cte as(
select *, row_number() over(partition by userid, productid order by purchasedate) as rn
FROM purchase_history)
select userid
from cte
group by userid
having max(rn)=1 and count(distinct purchasedate)>1

MuskanGoyal-dbcs
Author

My solution
with cte as
(
select *,
row_number() over(partition by userid, productid order by purchasedate) as rn -- SQL Server requires order by here
from purchase_history
),
cte2 as
(
select distinct userid
from cte
where rn = 2
)
select userid
from purchase_history
where userid not in (select * from cte2)
group by userid
having count(distinct purchasedate) > 1

radhikamaheshwari
Author

with du as (select count(productid) over (partition by userid, purchasedate order by purchasedate) ctn, * from purchase_history )

select userid from du where ctn=1 group by 1

enisertem
Author

Hi Ankit, thanks for the value you are creating. Your videos really help us.

I think this solution does not match the question. What happens if a customer buys the same product twice on the same day? Example: userid 1 bought productid 1 twice on 2012-01-23. The question is about customers not buying the same product on different days, so buying it twice on the same day is fine.

lokeshkatikireddi
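
To make the edge case from the comment above concrete, here is a small sketch. It assumes the script's table and date format, and adds one hypothetical row (user 1 buying product 1 a second time on the same day) that is not part of the original data:

```sql
-- Hypothetical extra row taken from the comment's example; not in the original script.
insert into purchase_history values (1, 1, '23-01-2012');

-- For every (userid, productid) bought more than once, compare the number of
-- purchase rows with the number of distinct purchase dates.
select userid,
       productid,
       count(*) as purchase_rows,
       count(distinct purchasedate) as purchase_days
from purchase_history
group by userid, productid
having count(*) > 1;
```

On the sample data plus that row, the query lists (userid 1, productid 1) with 2 rows on 1 day and (userid 2, productid 2) with 2 rows on 2 days. A check based on row counts or row_number treats both as repeats, while a check on purchase_days > 1 flags only user 2.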
Author

Great question, brother. You are doing great work.

karan-pqjy
Author

Hi Ankit, I love your videos. Thank you very much for all the effort you put into making these videos and helping the data community.

Please check my solution
with cte1 as -- this cte returns users who bought the same product more than once
(
select userid, productid, count(1) as tot
from purchase_history
group by userid, productid
having count(1) > 1

)

select distinct userid
from purchase_history
where userid not in (select userid from cte1) -- filtering out users who bought the same product more than once

and userid not in ( select userid -- filtering out users who did not buy on different dates
from purchase_history
group by userid
having count(distinct purchasedate) = 1 )

venkataram
Author

Hi Ankit, this query will fail an edge case where a product is purchased twice on a single date but the products bought on the other dates are different. That user should be in the output, but your query will definitely leave them out.
I used a reverse approach: get the userids that bought the same product on different days, then remove those userids from the output.

MrSam
Author

with cte as (
select userid, count(distinct purchasedate) as count_purchasedate, count(productid) as new_p, count(distinct productid) as new_dp
from purchase_history
group by userid
having count(distinct purchasedate) > 1) -- SQL Server and PostgreSQL do not accept the select alias in having

select case when new_p = new_dp then userid end as userid
from cte
where (case when new_p = new_dp then userid end) is not null

Chathur
Author

Hi Ankit. I have been waiting for your course for the past few months. The fees are high and an ordinary individual cannot afford them. Please reconsider the price. I have emailed you, Ankit. Thank you and all the best. The new topics are very interesting.

rajeshramisetti
Author

with cte as (
select *, count(productid) over(partition by productid, userid) as pro_count,
count( purchasedate) over (partition by purchasedate, userid) as date_count
from purchase_history
)
select distinct userid from cte
where pro_count = 1 and Date_count = 1


I am getting the right output. Is there anything wrong with this?

kailastupe
Author

WITH MY_CTE AS
(SELECT *, COUNT(*)OVER(PARTITION BY userid, purchasedate)AS cnt FROM purchase_history)

SELECT DISTINCT userid FROM MY_CTE
WHERE cnt =1

ArnabGhosh-prbm
Author

with du as (select count(productid) over (partition by userid, purchasedate order by purchasedate) ctn,
count( purchasedate) over (partition by userid, purchasedate order by userid) ctn1,
* from purchase_history )

select userid from du where ctn=1 and ctn1=1 group by 1

enisertem
Author

Hi Ankit, your approach is better than mine. But I am sharing my approach anyway:

In the cte, for each purchase I check whether the productid matches the productid of any previous purchase by that customer. If it does not, that productid is new for that customer.

Then, in the main query, I first discard customers who purchased on only one day. Next I count the total number of product purchases made by each customer and the number of purchases of new products. I select the customer only if these two values are equal.



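The PostgreSQL syntax is as follows:

with cte as(
SELECT userid, productid, purchasedate,
DENSE_RANK() over(PARTITION by userid ORDER by purchasedate) as purchase_day_serial_num,
-- the date condition below is cut off in the original comment; reconstructed from the description above
case WHEN productid in (SELECT productid from purchase_history p2 where p2.userid=p1.userid and p2.purchasedate<p1.purchasedate)
THEN 'no' ELSE 'yes' END as new_product
FROM purchase_history p1)


SELECT userid
from cte
-- grouped per customer, as the description above says; the first having condition is also cut off
-- in the original, so MAX(...)>1 is an assumed way to discard single-day customers
group by userid
having MAX(purchase_day_serial_num)>1 AND COUNT(userid)=COUNT(case when new_product='yes' then 1 END);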

ArijitSaha_
Author

Hi Ankit Sir, Can you please give a verdict on the solution?

with cte1 as
(
select a.* from purchase_history as a left join
(
select userid, productid, count(1) as product_count from purchase_history group by userid, productid
having count(1) > 1
) as b on a.userid = b.userid where b.userid is null
)
select userid from cte1 group by userid
having count(distinct purchasedate) > 1;

saikatde
Author

I am using MySQL and I seem to be stuck on the insert step. How do I change the date format for the session in MySQL? Can anyone guide me?

mariecurie
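
On the MySQL question above: as far as I know, MySQL has no equivalent of SQL Server's SET DATEFORMAT, so one workaround is to convert each dd-mm-yyyy string explicitly with STR_TO_DATE. A minimal sketch, assuming the purchase_history table from the script has already been created in MySQL:

```sql
-- MySQL: skip SET DATEFORMAT and parse each dd-mm-yyyy string explicitly.
insert into purchase_history values
 (1,1,str_to_date('23-01-2012','%d-%m-%Y'))
,(1,2,str_to_date('23-01-2012','%d-%m-%Y'))
,(1,3,str_to_date('25-01-2012','%d-%m-%Y'))
,(2,1,str_to_date('23-01-2012','%d-%m-%Y'))
,(2,2,str_to_date('23-01-2012','%d-%m-%Y'))
,(2,2,str_to_date('25-01-2012','%d-%m-%Y'))
,(2,4,str_to_date('25-01-2012','%d-%m-%Y'))
,(3,4,str_to_date('23-01-2012','%d-%m-%Y'))
,(3,1,str_to_date('23-01-2012','%d-%m-%Y'))
,(4,1,str_to_date('23-01-2012','%d-%m-%Y'))
,(4,2,str_to_date('25-01-2012','%d-%m-%Y'))
;
```

Another option is to rewrite the literals in ISO order, for example '2012-01-23', which MySQL accepts for a date column without any conversion.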