Tricky SQL Challenge | SQL For Data Analytics

preview_player
Показать описание
This video explains you the SQL challenge which can give to Amazon Gift Vouchers worth Rs 1500. You need to comment your solution and 3 best answers will win Rs 500 vouchers each.

script:
create table section_data
(
section varchar(5),
number integer
)
insert into section_data
values ('A',5),('A',7),('A',10) ,('B',7),('B',9),('B',10) ,('C',9),('C',7),('C',9) ,('D',10),('D',3),('D',8);

Problem statement : we have a table which stores data of multiple sections. every section has 3 numbers
we have to find top 4 numbers from any 2 sections(2 numbers each) whose addition should be maximum
so in this case we will choose section b where we have 19(10+9) then we need to choose either C or D
because both has sum of 18 but in D we have 10 which is big from 9 so we will give priority to D.

Master the ART of SQL :

Zero to hero(Advance) SQL Aggregation:

Most Asked Join Based Interview Question:

Solving 4 Trick SQL problems:

Data Analyst Spotify Case Study:

Top 10 SQL interview Questions:

Interview Question based on FULL OUTER JOIN:

Playlist to master SQL :

Rank, Dense_Rank and Row_Number:

#sql #challenge #amazongiftcardgiveaway
Рекомендации по теме
Комментарии
Автор

The last date to submit the answer is March 12th .
I will announce the 3 winners based on best answers and in case of a tie the preference will be given to the solution posted first in the comments section.

If you want to master SQL check out my zero to hero course here :

ankitbansal
Автор

good question, here is my solution. I am not looking for any reward, it is just because you helped me to undestand complex sql question so i am just participating 🙂. Here is my sol.

with cte as
(select *, sum(number) over(partition by section) as sn from
(select *, row_number() over(partition by section order by number desc) as num_r from section_data) tbl
where num_r=1 or num_r=2
)
, max_sum_sec as
(
select *, dense_rank() over(order by sn desc) as rn from cte
)
, extract_sec as
( select *, dense_rank() over(partition by rn order by number desc) dnr from max_sum_sec where rn=1 or rn=2)

select section, number from cte where section in (select distinct section from extract_sec where dnr=1);

note: last logic is applied keep in mind that we are considering only 2 rows and if 1st row is different(in case of clash) then we will have our solution otherwise we have to consider all rows. if 1st row is different then it will be part of solution.

dibakarmandal
Автор

My approach, Thanks


with cte_section_data as
(select section, number,
rank() over (partition by section order by number desc) as rnk,
sum(number) over (partition by section order by number desc) as cumm_sum,
lag(number) over (partition by section order by number desc) as lag_num
from
section_data),
cte_top2 as
(select section from
cte_section_data
where rnk = 2
order by cumm_sum desc, lag_num desc
limit 2)

select section, number
from
cte_section_data
where section in (select section from cte_top2)
and (rnk = 1 or rnk = 2)

ramchavali
Автор

Ankit I tried this way below:
with cte as(select *,
sum(number) over(partition by section) as s_sum,
max(number) over(partition by section) as n_max
from (select *,
row_number() over(partition by section order by number desc) as rn
from section_data) A
where rn<=2)
select top 4 section, number
from cte
order by s_sum desc, n_max desc, number desc

MixedUploader
Автор

with cte as(
select *,
row_number() over(partition by section order by number desc) as rn
from section_data
),
cte2 as(
select section, number, sum(number) over (partition by section) top_2_sum from cte
where rn<=2
),
cte3 as(
select *,
dense_rank()over(order by top_2_sum desc) as drk,
max(number) over(partition by section order by number desc) as sec_max
from cte2
),
cte4 as(
select *,
row_number() over( order by drk, sec_max desc) as sec_max_rank
from cte3
where drk<=2
)
select * from cte4
where sec_max_rank<=4

MS-yzhu
Автор

Hi Ankit sir,
My solution for the above problem is :

With cte as(select *, rank() over(partition by section order by number desc)as max_2max from section_data),
cte1 as(select *, sum(number) over (partition by section)as sum, max(number) over (partition by section)as max from cte where max_2max<=2)
Select * from(
Select *, dense_rank() over(order by sum desc, max desc)as highest from cte1)as top_2 where highest<=2;

priyajanardhan
Автор

Hi Ankit, this is my solution, tried using self join & union

with cte as (
select *, ROW_NUMBER() over(partition by section order by number desc) as rn
from section_data
), top_2_pairs as (
select top 2 c1.section, c1.number as num1, c2.number as num2
from cte c1
inner join cte c2 on c1.section = c2.section and c1.rn = c2.rn-1
where c1.rn = 1
order by c1.number+c2.number desc, c1.number desc, c2.number desc
)
select section, num1
from top_2_pairs
union
select section, num2
from top_2_pairs

SachinSingh-oycq
Автор

Solved this without seeing your solution but ofcourse I am developing my sql skill bcz of your videos. plz suggest if any query performance optimisation is required.
Really want to hear from you on this solution approach.

with cte as(
select section, min(number)a from section_data
group by section),

cte1 as(select section as s, sum(number) total, max(number) max_number from section_data where number not in(select a from cte)
group by section
order by sum(number) desc, max(number) desc
limit 2)

select section, number, total, max_number from section_data
inner join cte1 on section =s
where number not in (select a from cte)

nidhisingh
Автор

hello Ankit really a great question I tried this in Big Query,

my solution along with the comments.

--The numbers are ranked in accord with section first, then in general
with cte as (SELECT section, number, row_number() over(partition by section order by number desc) as rn,
rank() over (order by number) as rns FROM section_data ),
--The top 2 numbers from each section are aggregated here
cte2 as (select section, string_agg(cast(number as string), ", ") as num from cte where rn<=2 group by 1),  
--The sum of top 2 numbers from each section is calculated and ranked over
cte3 as (select section, sum(number) as sums, rank() over(order by sum(number) desc) as rnk from cte where rn<=2 group by 1),  
-- If any of the top 2 sum is tied, then the highest number of the section is compared to determine the top 2 .
cte4 as(select c3.section, c3.sums, c3.rnk, rank() over(order by max(c1.rns) desc) as rp
from cte as c1 join cte3 as c3 on c1.section=c3.section where c3.rnk<=2 group by 1, 2, 3) 
-- The top 2 section with the top 2 numbers we get here.
select c4.section, split(c2.num, ", ") from cte4 as c4 join cte2 as c2 on c4.section=c2.section where c4.rnk<=2   and c4.rp <=2 order by c4.rnk;

sumanacharya
Автор

Hi Ankit, My solution here, Didn't seen your solution yet!!!:::
with cte as (
select *,
sum(number) over(partition by section order by number desc) as val,
row_number() over(partition by section order by number desc) as rn from section_data)
, section_info as(
select *, dense_rank() over(order by val desc, number asc)  as dnr from cte where rn<=2)
select section, number from section_info where rn<=2 and
section in (select section from section_info where dnr<=2)
order by section asc, number desc;

raghavendrabeesa
Автор

my solution -
with cte as (select *, row_number() over(partition by section order by number desc) as rk
from section_data ),
ctc as (select section, number, sum(number) over(partition by section) as s, max(number) over(partition by section) as sm from cte where rk < 3),
ctv as (select section, number, sm, dense_rank() over(order by s desc) as dk
from ctc)
select section, number
from ctv
where dk <3
order by sm desc
limit 4

adityap
Автор

my approach:

with a as
(
select *,
rank() over(partition by section order by number desc ) as rn
from section_data
where section != 'A'
),
b as
(
select *from a
where rn in ( 1, 2 )
),
agg as
(
select section, max(number) as num
from b
group by section
),
f_selection as
(
select *, rank() over( order by num desc ) as rn from agg
)
select *from b
where section in ( select section from f_selection where rn = 1)

madhuk
Автор

Hi Ankit, Thanks for the Challenge !!
;with cte as(

select *, rank() over(partition by section order by number desc) as rn from section_data )
, cte2 as(

select section, sum(number) as sum, count(distinct number) as max_count
from cte where rn<3 group by section
), cte3 as(

select a.*, b.max_count, dense_rank() over(order by b.sum desc) as rn2 from cte a left join cte2 b on a.section=b.section where a.rn<3)
select section, number from cte3 where rn2<3 and max_count=2

AbhishekYadav-pwco
Автор

Thanks, Ankit, Learnt how using dense_rank() during ties, where we use 2 criterias to rank is useful in arriving at solution

mantisbrains
Автор

select section, number from(
select *, max(number) over (partition by sw, section) as m1, dense_rank( ) over(order by sw desc ) as ro, max(number)over() as m2 from(
select section, number, sum(number) over (partition by section) as sw from(
select *, row_number() over (partition by section order by number desc ) rw from section_data)
where rw <=2))
where ro<=2 and m1=m2

ujjwalvarshney
Автор

#My approach is
with cte1 as(
select section, number,
lead(number) over(partition by section order by number desc) as next_no,
sum(number) over(partition by section order by number desc rows between current row and 1 following) as sum_num
from section_data
),
cte2 as(
select *
from cte1 c
where sum_num=(select max(sum_num) from cte1 group by section having section=c.section)
order by sum_num desc, number desc limit 2
)
select section, number from cte2
union
select section, next_no from cte2
order by section;

aaravkumarsingh
Автор

Hi, Ankit this is my solution for the problem. I have broken down the solution into modules and utilized cte's for it. This is my first solution and I am looking for ways to optimize it. Thank you for posting this challenge, hope to see more fun challenges like this in the future.
with summary as (
select *,
dense_rank() over (order by number desc) as num_rnk, -- finding the maximum number from all the data
rank() over (partition by section order by number desc) as in_rnk from section_data -- utilized this to filter out the third number for each section
),
section_sum as (
select section, sum(number) as sm, num_rnk -- finding the sum for each section
from summary
where in_rnk < 3 -- removing third record for each section
group by section
),
top_2_sum as (
select section, sm from section_sum
order by sm desc, num_rnk limit 2 -- ordered by sum of section and breaks tie based on rank of the numbers in each section since D has higher ranked number it is shown in the final output
)
select * from top_2_sum

sairaina
Автор

Here's my solution -

with cte as (
select section, number, sum(number) over (partition by section order by number desc rows between unbounded preceding and current row) as sum
from section_data
), cte2 as (
select section, number, sum, rn, sum(number) over(partition by section) as summ from (select *, ROW_NUMBER() over (partition by section order by (select 0)) as rn from cte)z
where rn <= 2
)
, cte3 as (
select top 2 * from cte2 where rn=2 order by sum desc
)
, cte4 as (
select cte2.section as section2, cte2.number as number2, cte2.sum as sum2, cte2.rn as rn2, cte2.summ as summ2 from cte3 left join cte2 on cte3.sum = cte2.summ
)
, cte44 as (
select *, lag(number2, 1) over (partition by section2 order by number2) as num2 from cte4
)
, cte5 as (
select top 2 * from (select *, number2-num2 as minus from cte44)z where z.minus is not null order by minus desc
)
, cte6 as (
select * from cte2 join cte5 on cte2.section = cte5.section2
)
select * from cte6

MohitSharma
Автор

Hi ankit sir,
Without watching your video, I almost solved the same way you have shown in the video. That is why I had to find other way to get the required ans.
Here's my solution:

with cte1 as(
SELECT t1.section as sec1, t1.number as num1,
t2.section as sec2, t2.number as num2
FROM section_data t1
cross join section_data t2
where t1.section=t2.section and t1.number>t2.number
),
cte2 as(
select sec1, num1, num2, (num1+num2) as total,
dense_rank() over(order by (num1+num2) desc) rnk
from cte1
),
cte3 as(
select *
from cte2
where rnk in (1, 2)
),
cte4 as(
select section, number,
dense_Rank() over(partition by section order by number desc) rnk
from section_data t1
join cte3 c3 on t1.section=c3.sec1
)
select section, number from cte4
where rnk in (1, 2)

yateshpatil
Автор

Hi Ankit, started exploring your content, Really appreciate your work :) Regarding this question, I know deadline is passed but I have tried to solve it purely by joins and without window function. Please at lest give your valuable input. I will wait for same. Thanks,

with cte1 as (
SELECT section, min(number) min FROM section_data
GROUP by section )
, cte2 as ( SELECT s.section, s.number FROM section_data s left join cte1 c on s.number=c.min and s.section=c.section
WHERE c.section is null )
, cte3 as ( SELECT section, sum(number) SUM
FROM cte2 GROUP by section )
SELECT top 4 c2.section, c2.number, c3.sum from cte2 c2 inner join cte3 c3 on c2.section=c3.section ORDER by sum, number DESC

chiragshah
visit shbcf.ru