Data Science SQL Interview Question | Recommendation System | Complex SQL 13

preview_player
Показать описание
Product recommendation. Just the basic type (“customers who bought this also bought…”). That, in its simplest form, is an outcome of basket analysis. In this video we will learn how to find products which are most frequently bought together using simple SQL. Based on the history ecommerce website can recommend products to new user.

Playlist for complex SQL questions:

Here is the ready script to practice it:

create table orders
(
order_id int,
customer_id int,
product_id int,
);

insert into orders VALUES
(1, 1, 1),
(1, 1, 2),
(1, 1, 3),
(2, 2, 1),
(2, 2, 2),
(2, 2, 4),
(3, 1, 5);

create table products (
id int,
name varchar(10)
);
insert into products VALUES
(1, 'A'),
(2, 'B'),
(3, 'C'),
(4, 'D'),
(5, 'E');
Рекомендации по теме
Комментарии
Автор

Wow ..i appreciate your simple thinking ! The use of Year function is group by was brilliant !

AmitDileepKulkarni
Автор

the step by step approach to solve a query is amazing, the steps you walk us through to put the thoughts in right direction is wonderful. Thanks for the valuable lesson and keep educating all :-) loved most of your videos as they teach learners in correct direction. Thanks again @AnkitBansal

sivasrimakurthi
Автор

Thanks Ankit and removing the duplicates was tricky .

shekharagarwal
Автор

Thank you, Ankit !

select concat(b.name, '', b1.name) as product_combo, count(concat(b.name, '', b1.name)) as purchase_freq from
(select p1.product_id as p1, p2.product_id as p2
from orders p1
join orders p2
on p1.order_id = p2.order_id and p1.product_id < p2.product_id)a
join products b on a.p1 = b.id
join products b1 on a.p2 = b1.id
group by concat(b.name, '', b1.name)

mantisbrains
Автор

amazing problem statement.. thanks for sharing this with solution.

Datapassenger_prashant
Автор

with temp1 as (
select order_id, count(order_id) coun from orders group by order_id having count(order_id)>1
),
temp2 as (
select * from orders where order_id in (select order_id from temp1)
),
temp3 as(
select a.*, b.name from temp2 a join products b on a.product_id = b.id
),
temp4 as (
select distinct a.*, concat(a.name, b.name) as combination from temp3 a inner join temp3 b on a.order_id =b.order_id
),
temp5 as (
select *, REVERSE(combination) as comb from temp4
),
temp6 as (
Select * from temp5 where combination < comb
)
select combination, count(combination) as coun from temp6 group by combination

propel
Автор

what do you think if the question is not 2 but 3, 4, 5, ... buy together

haleynguyen
Автор

Hello sir, nice video
Here is my approach :-

with cte as (
select o1.product_id as p1, o2.product_id as p2, count(*) as purchase_freq
from orders_recm as o1
inner join orders_recm as o2
on o1.order_id = o2.order_id
where o1.product_id < o2.product_id
group by o1.product_id, o2.product_id
)
select STRING_AGG(name, ' ') as product_pair, A.purchase_freq from cte as A
inner join products_recm as B
on B.id = A.p1 or b.id = A.p2
group by A.p1, A.p2, A.purchase_freq
;

devrajpatidar
Автор

Hi,
Any idea how will do suppose a user gives a product_id and according to that recommendation top 5 should pop up.

HuzaifaKhan-odmz
Автор

with t1 as (
Select a.order_id, a.customer_id, p1.name as name1, p2.name as name2, (p1.id+p2.id) as pair_sum, monotonically_increasing_id() as idf
from orders a
inner join orders b on a.order_id = b.order_id and a.product_id<>b.product_id
left join products p1 on a.product_id = p1.id
left join products p2 on b.product_id = p2.id
)
, t2 as (
Select order_id, customer_id, name1, name2, pair_sum, row_number() over(partition by order_id, pair_sum order by idf asc ) as rnk
from t1
), t3 as (
Select *,
concat(name1, ' ', name2) as pair
from t2 where rnk=1
)

Select
pair, count(distinct order_id) as frequency
from t3
group by pair
order by 2 desc

PrashantSharma-swjr
Автор

create temp table orders1 as
(
select * from orders a join products b on a.product_id = b.id);

select m2 as pair, count(distinct order_id) as purchase_freq from
(
select a.order_id, a.customer_id, concat( case when a.product_id> b.product_id then b.name else a.name end, case when a.product_id< b.product_id then b.name else a.name end) as m2 from orders1 a join orders1 b
on a.order_id = b.order_id and a.customer_id= b.customer_id and a.product_id<>b.product_id)
group by 1 order by 2 desc, 1

ujjwalvarshney
Автор

Amazing!!
self join questions are always tricky for me :-( .. any suggestion for good self join resources ?

harshSingh-ifzb
Автор

with ord as
(select order_id orid, cust cusid, product_id pid, name from products12, order2
where id = product_id)
select distinct nm2, cnt from
(select a.orid, a.cusid, a.pid,
a.name||' '||b.name nm2,
count(*) over(partition by a.name, b.name order by a.name) cnt
from ord a, ord b
where a.orid = b.orid
and a.pid <> b.pid
and a.pid <b.pid)
order by nm2;

amrutaborse
Автор

What id the product ID is not an integer but some varchar. Will the Pr-ID1>Pr-ID2 work? Or any other approach should be taken?

shahrukhkhanpatan
Автор

with cte as(
select order_id, name from orders o
join products p on o.product_id=p.id)

select concat(a.name, b.name) as pair, count(concat(a.name, b.name)) as purchase_freq from cte a
join cte b on a.order_id=b.order_id and a.name < b.name
group by concat(a.name, b.name)

ayushsrivastav
Автор

what is this bro I adicted to your real time senario sql complex queries😜

shankrukulkarni
Автор

I have tried the same but not getting the same in oracle console.. Receiving duplicate records by doing self join

vikaskasaraneni
Автор

I have used Lead() to get the next value :


with cte1 as(
select *, row_number() over(order by order_id, customer_id, product_id) as rn from orders
join products on product_id=id
),
cte2 as (
select name, lead(name) over(order by rn) as leaded from cte1
)
select concat(name, ' ', leaded) as combo, count(*) as prod_freq from cte2
group by concat(name, ' ', leaded) having len(concat(name, ' ', leaded))>1

shaikalfina-uz
Автор

Hey btw could you suggest me a course from where I could learn SQL and my primary goal is hands on learning, while building and playing with it . I couldn't find a decent course anywhere . Everywhere they are only focussing on specific topics and it's just basics mostly.

rrohit
Автор

Hi Ankit here is my solution
with cte as(
select * from orders o inner join products p on o.product_id =p.id), cte2 as(

select *, LEAD(name,1) over( partition by customer_id
order by (select null))next_name from cte),
cte3 as(
select *, CONCAT(name,next_name) d
from cte2)

select d,
COUNT(d)count_value
from cte3 where next_name is not null group by d

anushakrishnaa
welcome to shbcf.ru