SQL Interview Questions & Answer - Part 59 | Google SQL Question

preview_player
Показать описание
SQL Interview Questions & Answer - Part 59 | Google SQL Question | Median Google Search Frequency 🔥🔥

Google's marketing team is making a Superbowl commercial and needs a simple statistic to put on their TV ad: the median number of searches a person made last year.
However, at Google scale, querying the 2 trillion searches is too costly. Luckily, you have access to the summary table which tells you the number of searches made last year and how many Google users fall into that bucket.
Write a query to report the median of searches made by a user. Round the median to one decimal point.

-------------------------------------------------------------------------
𝗝𝗼𝗶𝗻 𝗺𝗲 𝗼𝗻 𝗦𝗼𝗰𝗶𝗮𝗹 𝗠𝗲𝗱𝗶𝗮:🔥
-------------------------------------------------------------------------
🔴 Instagram :

🔴 Twitter:
-------------------------------------------------------------------------
🔴 Table and Insert SQL Script :
-------------------------------------------------------------------------
Create Table search_frequency (
searches int,
num_users int
)

Insert into search_frequency Values(1,2)
Insert into search_frequency Values(4,1)
Insert into search_frequency Values(2,2)
Insert into search_frequency Values(3,3)
Insert into search_frequency Values(6,1)
Insert into search_frequency Values(5,3)
Insert into search_frequency Values(7,2)

#GoogleDataEngineer #GoogleInterview #GoogleSQLInterview #FAANG #SQLInterviewQuestionsandanswers #sqlInterviewQuestions #sqlInterviewQuestionsForTesting #ITJunction4all
Рекомендации по теме
Комментарии
Автор

As always, great question from Sunil. My approach without using the inbuilt percentile_cont function:-


Declare @total int;
Set @total = (Select sum(num_users) from search_frequency)


; with cte_rec_median as (
Select searches, num_users, 1 as step from search_frequency
union all
select searches, num_users, 1+step from cte_rec_median
where step < num_users
)
Select CASE WHEN @total%2 = 0 THEN CAST(sum(CAST(searches as Decimal(38, 1))/2) as Decimal(38, 1)) ELSE sum(searches) END as median from (
Select searches, CASE WHEN @total%2 = 0 THEN CASE WHEN rn = @total/2 or rn = @total/2+1 THEN 1 ELSE 0 END
ELSE CASE WHEN rn = @total/2+1 THEN 1 ELSE 0 END
END as Indicator from (
Select searches, num_users, step, row_number() over(order by searches, num_users) as rn from cte_rec_median
)sub
)sub2
where Indicator = 1

premanandramesh
Автор

Awesome Sunil!!

Below is another solution...




with temp1 as /* holds exact median value of the sum is odd; else holds two values whose avg is median */
(select * from (
select case when sum(num_users)%2 =0 then sum(num_users)/2 end as col
from search_frequency union
select case when sum(num_users)%2 =0 then (sum(num_users)/2)+1 end as col
from search_frequency union
select case when sum(num_users)%2 =1 then sum(num_users)/2 end as col
from search_frequency
) as a
where col is not null
)
, temp2 as /* calcs running total of num_users and a lag of prev running total */
(select searches
, num_users
, cumulative_sum
, COALESCE(lag(cumulative_sum,1) over(), -1) as prev_cumulative_sum
from (
select searches
, num_users
, sum(num_users) over(order by searches rows between unbounded preceding and current row) as cumulative_sum
from search_frequency
order by
searches
) as a
)
select round(avg(searches), 1) as median
from (
select searches
, case
when ( (select min(col) from temp1) between prev_cumulative_sum+1 and cumulative_sum OR
(select max(col) from temp1) between prev_cumulative_sum+1 and cumulative_sum
) then 'Y'
else 'N'
end as choose
from temp2
) A
where choose = 'Y'

renganathanmutthiah
Автор

Another approach.

SELECT AVG(1.*searches) median

FROM (
SELECT searches, row_number() over(order by searches) rn, count(*) over() cnt
FROM search_frequency f
CROSS APPLY GENERATE_SERIES(1, f.num_users)
) x
WHERE rn BETWEEN 1.*cnt/2 AND 1.*cnt/2 + 1

reachrishav
Автор

with cte as
(select searches, num_users, 1 as temp from search_frequency
union all
select searches, num_users, temp +1 from cte where temp+1 <= num_users
)
, cte2 as
(
select *, row_number() over (order by searches) as rn from cte
)
select sum(searches)/2.0 as Median
from cte2
where rn in ((select count(*) from cte2) /2, (select count(*) from cte2)/2 +1)


Ref Notes:
n is the count of records after recursive cte
if n is odd:
Median = (n + 1)/2 th item

If the number of values is even, the median can be calculated as:

Median = ((n/2)th element + (n/2 + 1)th element)/2

arunkumarn
Автор

with recursive cte as(
select searches, num_users from search_frequency
union all
select searches, num_users-1 from cte
where num_users-1 > 0),
cte2 as (
select *, total_cnt*1.0/2 as t1, total_cnt*1.0/2 + 1 as t2
from (
select *,
row_number() over(order by searches) as rn,
count(searches) over() as total_cnt
from cte
order by searches)x)

select round(avg(searches), 2) as median from cte2
where rn between t1 and t2;

SnapMathShorts
Автор

thanks for this, can you consider posting videos in 720p or more.. this resolution is too old now.

akashgoel
welcome to shbcf.ru