Scenario Based SQL Question | Solving Using SCD Type 2 Concept | SQL Interview Question

preview_player
Показать описание
In this video we will discuss a scenario based SQL problem. I will be solving it using Analytical function. You will learn how to use Lead analytical function with partition by clause and how to deal with data ranges in SQL.

Here is the ready script:

create table billings
(
emp_name varchar(10),
bill_date date,
bill_rate int
);
delete from billings;
insert into billings values
('Sachin','01-JAN-1990',25)
,('Sehwag' ,'01-JAN-1989', 15)
,('Dhoni' ,'01-JAN-1989', 20)
,('Sachin' ,'05-Feb-1991', 30)
;

create table HoursWorked
(
emp_name varchar(20),
work_date date,
bill_hrs int
);
insert into HoursWorked values
('Sachin', '01-JUL-1990' ,3)
,('Sachin', '01-AUG-1990', 5)
,('Sehwag','01-JUL-1990', 2)
,('Sachin','01-JUL-1991', 4)

#sql #scenariobased #interview
Рекомендации по теме
Комментарии
Автор

Interesting !!! These problems look tedious initially, but then your step-by-step approach is amazing. Wonderful Work!!!

vandanaK-mhzo
Автор

Thanks Ankit! You nailed it by connecting concepts with real world problems !

sharu
Автор

learned a new terminology today.. "SCD TYPE 2" and an amazing problem statement on how to handle end date in place of null. I was able to take care of that -1 in date but wasn't sure how to deal with if there is no change in bill date..
thank you once again for this super amazing concept.

Datapassenger_prashant
Автор

Really smart. I think a lot of ways to solve this problem but it doesn't work and you give an excellent solution amazing! Thank you for your channel. Hope you make more advanced SQL excercise

tunguyenanh
Автор

select c.emp_name, sum(case when work_date between bill_date and next_date then bill_hrs*bill_rate end) as total_charges from
(select a.emp_name, work_date, bill_hrs, bill_date, bill_rate, next_date from
(select * from hoursworked1)a join
(select *, lead(bill_date, 1, '9999-12-31') over (partition by emp_name order by bill_date) next_date from billings)b
on a.emp_name = b.emp_name)c
group by c.emp_name;

Thanks Ankit!

mantisbrains
Автор

Ankit bhai can you please also explain various practical techniques for delta/incremental load.

tushardeepsingh
Автор

I follow all of your videos, it's really amazing.
This all queries are helpful to crack SQL interview

omsingh
Автор

Hi Ankit, I have been following up your videos and find them quite insightful. Thank you for this wonderful work.
I have one request to make. Please also explain the generic usage and functionality of these SQL analytical functions (like lead() in this video). This will help in understanding what other scenarios can these be used.

khushisrivastava
Автор

Thanks Ankit for your hard work. Really you made it enjoyable

shekharagarwal
Автор

Hi Ankit Sir,

Please evaluate this solution, Instead of using a dummy date, i have used filter conditions as it can be possible that we didn't update the values of end date in real life scenario

with total_billings as (
select *, dateadd(day, -1, lead(bill_date) over(partition by emp_name order by bill_date)) as last_date
from billings)

select h.emp_name, sum(bill_rate*bill_hrs) as totalCharges
from total_billings t inner join HoursWorked h
on t.emp_name=h.emp_name
where
(h.work_date between bill_date and last_date) or
(last_date is null and bill_date<work_date)
group by h.emp_name;

Will this query cover all the test cases or am i missing something. Kindly let me know?

abhishekjain
Автор

Great videos! Here is my solution before watching complete video for solution

with billing_dimension as (
select emp_name,
bill_date as effective_start_date,
lead(bill_date, 1, '9999-12-31') over(partition by emp_name order by bill_date) as effective_end_date,
bill_rate
from billings
)
select h.emp_name, sum(bd.bill_rate * h.bill_hrs) as total_charges
from HoursWorked h
inner join billing_dimension bd
on h.emp_name = bd.emp_name
and h.work_date >= effective_start_date and h.work_date < effective_end_date
group by h.emp_name;

AnandaHariharaShivamurthy
Автор

Hi ankit thanks for posting here is my solution: with billing_range as (
SELECT
emp_name,
bill_date as start_bill_date,
lead(bill_date, 1, '9999-12-31')over(partition by emp_name order by bill_date) as end_date,
bill_rate
from billings
), aggregated_result as (
SELECT
br.emp_name,
br.bill_rate,
hw.bill_hrs,
br.bill_rate * hw.bill_hrs as cost
FROM billing_range br
JOIN HoursWorked hw
ON br.emp_name = hw.emp_name
AND hw.work_date BETWEEN br.start_bill_date AND br.end_date
)

SELECT
emp_name,
sum(cost) as totalcharges
from aggregated_result
GROUP by emp_name

tahakhalid
Автор

Hi Ankit,

SELECT LEAD(dateadd(day, -1, bill_date), 1, DATE('28-SEP-1999')) OVER (partition by emp_name order by bill_date asc) as bill_date_end
FROM billings;

getting this error in Redshift
ERROR: 42601: Default parameter not be supported for window function lead

vishnukumarreddykonatcsl
Автор

Pls make a video implementing all SCD types

PriyanshuGupta-ecfv
Автор

using union-

with temp as(
select *,
count(*) over(partition by username) as total_act,
rank() over(partition by username order by startDate desc) as rn

from useractivity)

select * from temp
where total_act>1 and rn=2

union

select * from temp
where total_act=1
;

ashutoshverma
Автор

Below is select query from postgresql,
select
a.emp_name,
sum(
a.bill_hrs *
(select b.bill_rate
from billings b where a.emp_name=b.emp_name
and a.work_date > b.bill_date
order by b.bill_date desc limit 1)
)
as calc
from HoursWorked a group by a.emp_name;

murwathmehtar
Автор

select enm,
sum(case when work_date>= mn1 and work_date<= mx1 then bill_hrs*mn2 else bill_hrs*mx2 end) rat2
from
(select distinct enm, work_date, bill_hrs, min(bill_date) over(partition by enm) mn1,
max(bill_date) over(partition by enm) mx1
, min(bill_rate) over(partition by enm) mn2,
max(bill_rate) over(partition by enm) mx2 from a) group by enm ;

amrutaborse
Автор

#Using lead & Between
with cte as (select *, lead(bill_date, 1, '9999-12-30') over(partition by emp_name order by bill_date) as lead_res
from billings )
select cte.emp_name, sum(cte.bill_rate * hw.bill_hrs) as totalcharges from cte inner join hoursworked hw
on cte.emp_name = hw.emp_name and hw.work_date between cte.bill_date and cte.lead_res
group by cte.emp_name
order by 2 desc

saibhargav
Автор

with bill AS
(select emp_name, bill_rate, bill_date, lead(bill_date, 1, '9999-12-31') over(partition by emp_name order by bill_date) as enddate
from billings
)
select a.emp_name, sum(bill_hrs*bill_rate) from HoursWorked a join bill b on a.emp_name=b.emp_name and a.work_date between b.bill_date and b.enddate
group by a.emp_name

guptaashok
Автор

MY SQL Solution, only trick was to get the SCD type 2 table for billings tables

with base as (select emp_name, bill_date as start_date,
over(partition by emp_name order by bill_date), interval 1 day), '9999-12-31') as end_date,
bill_rate
from billings order by emp_name )
select t1.emp_name,
sum(t1.bill_hrs * t2.bill_rate) as totalcharges
from HoursWorked t1
join base t2 on t1.emp_name = t2.emp_name and t1.work_date between t2.start_date and t2.end_date
group by t1.emp_name

anirvansen