SQL Test Based on Real Interview | SQL Interview Questions and Answers

preview_player
Показать описание
In this video we will solve a complete SQL test consist on 7 interview questions. This would be a great exercise to practice SQL.

script:
CREATE TABLE users (
USER_ID INT PRIMARY KEY,
USER_NAME VARCHAR(20) NOT NULL,
USER_STATUS VARCHAR(20) NOT NULL
);

CREATE TABLE logins (
USER_ID INT,
LOGIN_TIMESTAMP DATETIME NOT NULL,
SESSION_ID INT PRIMARY KEY,
SESSION_SCORE INT,
FOREIGN KEY (USER_ID) REFERENCES USERS(USER_ID)
);

-- Users Table
INSERT INTO USERS VALUES (1, 'Alice', 'Active');
INSERT INTO USERS VALUES (2, 'Bob', 'Inactive');
INSERT INTO USERS VALUES (3, 'Charlie', 'Active');
INSERT INTO USERS VALUES (4, 'David', 'Active');
INSERT INTO USERS VALUES (5, 'Eve', 'Inactive');
INSERT INTO USERS VALUES (6, 'Frank', 'Active');
INSERT INTO USERS VALUES (7, 'Grace', 'Inactive');
INSERT INTO USERS VALUES (8, 'Heidi', 'Active');
INSERT INTO USERS VALUES (9, 'Ivan', 'Inactive');
INSERT INTO USERS VALUES (10, 'Judy', 'Active');

-- Logins Table

INSERT INTO LOGINS VALUES (1, '2023-07-15 09:30:00', 1001, 85);
INSERT INTO LOGINS VALUES (2, '2023-07-22 10:00:00', 1002, 90);
INSERT INTO LOGINS VALUES (3, '2023-08-10 11:15:00', 1003, 75);
INSERT INTO LOGINS VALUES (4, '2023-08-20 14:00:00', 1004, 88);
INSERT INTO LOGINS VALUES (5, '2023-09-05 16:45:00', 1005, 82);

INSERT INTO LOGINS VALUES (6, '2023-10-12 08:30:00', 1006, 77);
INSERT INTO LOGINS VALUES (7, '2023-11-18 09:00:00', 1007, 81);
INSERT INTO LOGINS VALUES (8, '2023-12-01 10:30:00', 1008, 84);
INSERT INTO LOGINS VALUES (9, '2023-12-15 13:15:00', 1009, 79);

-- 2024 Q1
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (1, '2024-01-10 07:45:00', 1011, 86);
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (2, '2024-01-25 09:30:00', 1012, 89);
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (3, '2024-02-05 11:00:00', 1013, 78);
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (4, '2024-03-01 14:30:00', 1014, 91);
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (5, '2024-03-15 16:00:00', 1015, 83);

INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (6, '2024-04-12 08:00:00', 1016, 80);
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (7, '2024-05-18 09:15:00', 1017, 82);
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (8, '2024-05-28 10:45:00', 1018, 87);
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (9, '2024-06-15 13:30:00', 1019, 76);
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (10, '2024-06-25 15:00:00', 1010, 92);
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (10, '2024-06-26 15:45:00', 1020, 93);
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (10, '2024-06-27 15:00:00', 1021, 92);
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (10, '2024-06-28 15:45:00', 1022, 93);
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (1, '2024-01-10 07:45:00', 1101, 86);
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (3, '2024-01-25 09:30:00', 1102, 89);
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (5, '2024-01-15 11:00:00', 1103, 78);
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (2, '2023-11-10 07:45:00', 1201, 82);
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (4, '2023-11-25 09:30:00', 1202, 84);
INSERT INTO LOGINS (USER_ID, LOGIN_TIMESTAMP, SESSION_ID, SESSION_SCORE) VALUES (6, '2023-11-15 11:00:00', 1203, 80);

Zero to hero(Advance) SQL Aggregation:

Most Asked Join Based Interview Question:

Solving 4 Trick SQL problems:

Data Analyst Spotify Case Study:

Top 10 SQL interview Questions:

Interview Question based on FULL OUTER JOIN:

Playlist to master SQL :

Rank, Dense_Rank and Row_Number:

#sql #dataengineer
Рекомендации по теме
Комментарии
Автор

It takes a lot of effort to make these videos. Please do hit the like button and subscribe to the channel.

ankitbansal
Автор

Hi Ankit,
You're really awesome, helping so many.

Last Question in PostgreSQL:


with cte as(
select
generate_series(min(login_timestamp), max(login_timestamp), '1 day')::date as login_timestamp
from logins)
select login_timestamp from cte
where login_timestamp not in (select distinct login_timestamp::date
from logins)

LearnSQL
Автор

I wish more folks watch your channel before giving interviews, solid stuff. Infact, they should pay you 10% post getting a new job ;)

keshavkaranam
Автор

This is truly amazing service to the analytics community. Thank you brother

SunilKumar_
Автор

Hi Ankit Sir,

Thank you for this wonderful Case Study..
It took 70 mins for me to solve all the questions.. only for 6th question I used a bit complex query when compared to yours. other wise everthing were good. enjoyed a lot solving these problem statement..
and for last one I used recursive cte..

Datapassenger_prashant
Автор

This video is pretty simple to understand and this expertise will come with practice.. I hope all the folks practice this at least once a day.. thanks so much Ankit

nagarajshenoy
Автор

Hi Ankit,
I really appreciate your efforts. I learnt a lot from your videos and I've become confident in SQL
Thanks a ton.

uday
Автор

Hi Ankit,

I appreciate your learning session.
Could you please make some videos apart from SQL queries like SQL Dashboard operation, basic options, error handling part, trigger, procedure etc...

MazharKhan-hxxu
Автор

Just the video I needed to revise! Thanks a ton Ankit!!


Just a twist to your 2nd Q, solve the same question (#2) WITHOUT using DATETRUNC function

it's quite interesting!

Anyone reading this comment, comment your approach in above case, its fun!

varunas
Автор

Hi Ankith,

It's one of the best SQL videos, and you have amazing explanation skill.

NiranjanDevaraddi
Автор

Hi Ankit Sir
You're really awesome, its very helpful added to it
if you can add on these question in an seperate file it will be very useful

eldhokuriakose
Автор

Brilliantly Explained sir .... Mindblowing ... 🙂

sidindian
Автор

Thank you @Ankit. Please share the questions as you did the CREATE & INSERT statements. Thank you once again.

newmanokereafor
Автор

Perfect timing! I have an interview in two days.

I've solved these in Oracle.

-- 1. Usernames that did not login in the past 5 months
SELECT user_id, user_name
FROM users
WHERE user_id NOT IN (
SELECT user_id
FROM logins
WHERE login_timestamp > ADD_MONTHS(SYSDATE, -5)
);



-- 2. Quarter Analysis
SELECT
TO_CHAR(ADD_MONTHS(TO_DATE('01/01/2024', 'MM/DD/YYYY'), 3 * (TO_CHAR(LOGIN_TIMESTAMP, 'Q') - 1)), 'MM/DD')
|| '/' || TO_CHAR(MIN(LOGIN_TIMESTAMP), 'YYYY') AS quarter,
COUNT(*) AS sessions,
COUNT(DISTINCT user_id) AS unique_users
FROM
logins
GROUP BY
TO_CHAR(LOGIN_TIMESTAMP, 'Q')
ORDER BY
1;

-- 3. Users in Jan2024 but not in Nov2023
SELECT DISTINCT user_id
FROM logins
WHERE TO_CHAR(LOGIN_TIMESTAMP, 'MM/YYYY') = '01/2024'
AND user_id NOT IN (
SELECT user_id
FROM logins
WHERE TO_CHAR(LOGIN_TIMESTAMP, 'MM/YYYY') = '11/2023'
);


-- 4. Percent increase from ans2
WITH raw_data AS (
SELECT
TO_DATE(
TO_CHAR(
ADD_MONTHS(TO_DATE('01/01/2024', 'MM/DD/YYYY'), 3 * (TO_CHAR(LOGIN_TIMESTAMP, 'Q') - 1)),
'MM/DD')
|| '/' || TO_CHAR(MIN(LOGIN_TIMESTAMP), 'YYYY'),
'MM/DD/YYYY') AS quarter,
COUNT(*) AS sessions,
COUNT(DISTINCT user_id) AS unique_users
FROM
logins
GROUP BY
TO_CHAR(LOGIN_TIMESTAMP, 'Q')
ORDER BY
quarter
)
SELECT
QUARTER,
SESSIONS,
unique_users,
ROUND((SESSIONS / LAG(SESSIONS, 1, SESSIONS) OVER (ORDER BY quarter)) * 100 - 100) AS pct
FROM
raw_data;


-- 5. Users with highest scores [will show multiple entries is more than one max]
WITH cte AS (
SELECT
user_id,
TRUNC(login_timestamp) AS login_time,
SUM(session_score) AS total_score
FROM
logins
GROUP BY
user_id, TRUNC(login_timestamp)
)
SELECT
user_id,
login_time,
MAX(total_score) AS score
FROM
cte
GROUP BY
user_id, login_time
ORDER BY
user_id, login_time;



-- 6. Users logged in everyday
WITH cte AS (
SELECT
user_id,
MAX(trunc(login_timestamp)) OVER (PARTITION BY user_id) AS max_date,
MIN(trunc(login_timestamp)) OVER (PARTITION BY user_id) AS min_date,
COUNT(DISTINCT trunc(login_timestamp)) OVER (PARTITION BY user_id) AS distinct_dates_count
FROM
logins
)
SELECT
DISTINCT user_id
FROM
cte
WHERE
max_date - min_date = distinct_dates_count - 1;


-- 7. Dates without login
WITH cte (start_date) AS (
SELECT MIN(TRUNC(login_timestamp)) AS start_date
FROM logins

UNION ALL

SELECT start_date + 1
FROM cte
WHERE start_date < (SELECT MAX(TRUNC(login_timestamp)) FROM logins)
)
SELECT *
FROM cte
WHERE start_date NOT IN (
SELECT DISTINCT TRUNC(login_timestamp)
FROM logins
)
ORDER BY 1;

exanode
Автор

Thanks Ankit!

Not correcting you at all, but the 2nd question condition is to sort from newest to oldest.

Result should be order by datetrunc(quarter, login_timestamp) desc

"Nevertheless, exceptional learning as always."

thegamingtron
Автор

Learned many new date functions, from this video, do it more

GiriPrasath.D
Автор

Thank you Ankit, it's really interesting and cool stuff

sudhirsharma
Автор

Hi Ankit, it is really helpful, could you please start creating content on AWS and pyspark. Also just to add on in 1st question we need to use user tables and there might be the case user has never logged in so that entry might not come in logins table??

amberjain
Автор

❤❤❤❤❤❤❤❤ i don't have any words 🎉🎉🎉🎉🎉🎉
Love you 🤟 bro

Gareeb_
Автор

Hi Ankit! Thanks for the content
My approach from q4 onwards
4)WITH quarter_cnt AS(
select YEAR(cast(login_timestamp AS Date)) AS login_year, DATEPART(QUARTER, cast(login_timestamp AS Date)) AS login_quarter,
COUNT(DISTINCT USER_ID) AS user_cnt,
COUNT(DISTINCT SESSION_ID) AS session_cnt
from logins
GROUP BY YEAR(cast(login_timestamp AS Date)), DATEPART(QUARTER, cast(login_timestamp AS Date)))
SELECT
CASE WHEN login_quarter=1 THEN CONCAT('01-01-', login_year)
WHEN login_quarter=2 THEN CONCAT('01-04-', login_year)
WHEN login_quarter=3 THEN CONCAT('01-07-', login_year)
WHEN login_quarter=4 THEN CONCAT('01-10-', login_year)
END AS qrtr_str_date, session_cnt,
BY login_year, login_quarter)))*100.0/lag(session_cnt)OVER(ORDER BY login_year, login_quarter),
user_cnt
FROM quarter_cnt

--5)
WITH score_day AS(
SELECT cast(login_timestamp AS Date) AS login_day, user_id,
sum(session_score) AS total_score
FROM logins
GROUP BY cast(login_timestamp AS Date), user_id),
rn_cte AS(
SELECT *, DENSE_RANK()over(PARTITION BY login_day ORDER BY total_score DESC) AS rn
FROM score_day)
SELECT * FROM rn_cte
WHERE rn=1

----6)
WITH user_login_history AS(
SELECT USER_ID,
min(cast(LOGIN_TIMESTAMP AS DATE)) as FIRST_LOGIN,
MAX(CAST('2024-06-28' AS Date)) AS Last_Date,
COUNT(distinct cast(LOGIN_TIMESTAMP AS DATE)) as no_logins
FROM logins
GROUP BY USER_ID)
SELECT *, datediff(DAY, FIRST_LOGIN, Last_Date)+1 AS logins_btw_str_end
FROM user_login_history
WHERE (datediff(DAY, FIRST_LOGIN, Last_Date)+1)=no_logins

---7)
WITH master_dates as(
select cast('2023-07-15' AS Date) AS login_date
UNION all
SELECT DATEADD(day, 1, login_date)
FROM master_dates
WHERE login_date<=cast('2024-06-28' AS Date))
SELECT login_date from
master_dates left join logins
ON cast(login_timestamp AS Date)=login_date
WHERE user_id IS NULL
OPTION (MAXRECURSION 5000)

mohdtoufique