How to remove Duplicate Data in SQL | SQL Query to remove duplicate

preview_player
Показать описание
In this video, we see 10 different ways to remove duplicate records in SQL. We look at 2 different scenario for duplicate records in a table and then come up with 10 SQL queries to remove these duplicate data from the database.

Data can be consider as duplicate if all column values are duplicated or if only some of the column values are duplicated. In this video, you will get solution to remove duplicate data for both these scenarios.

The dataset, scripts and SQL Queries used in this video can be downloaded from below link:

Timestamp:
00:00 Intro about duplicate data in SQL
01:22 Explaining Scenario 1 of Duplicate Data
03:02 SOLUTION 1 - Delete using Unique Identifier.
07:19 SOLUTION 2 - Using SELF join.
12:21 SOLUTION 3 - Using Window function.
15:32 SOLUTION 4 - Using MIN function. This delete even multiple duplicate records.
19:34 SOLUTION 5 - Using backup table.
24:10 SOLUTION 6 - Using backup table without dropping the original table.
26:06 Explaining Scenario 2 of Duplicate Data
27:02 SOLUTION 7 - Delete using CTID.
30:25 SOLUTION 8 - By creating a temporary unique id column.
33:38 SOLUTION 9 - By creating a backup table.
35:30 SOLUTION 10 - By creating a backup table without dropping the original table.

🔴 My Recommended Courses:

🔴 WATCH MORE VIDEOS HERE 👇
✅ SQL Tutorial - Basic concepts:

✅ SQL Tutorial - Intermediate concepts:

✅ SQL Tutorial - Advance concepts:

✅ Practice Solving Basic SQL Queries:

✅ Practice Solving Intermediate SQL Queries:

✅ Practice Solving Complex SQL Queries:

✅ Data Analytics Career guidance:

✅ SQL Course, SQL Training Platform Recommendations:

✅ Python Tutorial:

✅ Git and GitHub Tutorial:

✅ Data Analytics Projects:

THANK YOU,
Thoufiq
Рекомендации по теме
Комментарии
Автор

For the first solution, instead of deleting records from table and repopulating table again with those records, we can use:
select * from car
where id not in
(
select max(id) from car
group by model, brand
having count(*)>1
)

KetakiGadgil_
Автор

It's blessing that i found your channel.. your way of teaching SQL concepts is really awesome. I am learning all SQL concepts from your videos which is really giving me confidence in queries. Thank you so much sir.

bollaalekyadevi
Автор

Man you've explained exactly what i needed to know. You've no idea how relieved i feel after finally understanding the concept of joins. I needed to understand how sql works with each record of each table and you've explained that really well. Thanks a lot brother for this video. I'll definitely subscribe to your channel.

NawazKhan-gofj
Автор

I believe solution 8 has a "bug", if you would have more than 1 duplicate - that is 3 or more records with the same values you would just delete one duplicate with proposed query which has biggest id value. So with simplified schema = (id, model, brand, row_num) and records = [(1, X5, BMW, 1), (1, X5, BMW, 2), (1, X5, BMW, 3)] you would just delete last record and still have [(1, X5, BMW, 1), (1, X5, BMW, 2)] left. It can be fixed by changing the query to `DELETE FROM prod WHERE row_num NOT IN (SELECT MIN(row_num) min_id FROM prod GROUP BY model, brand HAVING COUNT(*) > 1)`
This way you would delete every possible duplicate. Anyway, great video✌

mlevvy
Автор

One of the best lecture ever I have seen in my life over SQL topic, so much precise clarity, in-depth knowledge & thoroughly discussion over a amazing buddy.

Mayank-jwyy
Автор

This is the best video "How to remove the duplicate records from the table."

sudeep
Автор

Assalam O Alaikum Toufeeq!
On 7:01 I do not think that it will remove all duplicates from table. In your case duplicates were maximum of 2 but Lets say that In some table we have duplicates more than 2 rows than i.e 4 rows are duplicates, then It will remove the row with maximum id and we will be left with 3 rows. Maybe I am wrong, so could you guide me about it.
Btw, I have learned a lot from your channel and I have watched almost all basics and intermediate playlist. Thank you so much.

MuhammadSalmanKhan-lpmf
Автор

Your video is really help full for a data analyst who is struggling thank you so much ❤️👍🇮🇳

nagarapusudheeksha
Автор

Hey Thanks for your video

We can solve scenario two with below 2approaches also


Select * from cars group by 1, 2, 3, 4, 5

It will give you unique records
And second approach is using row_number by passing all variables in partition by



The above 2 are mostly use approachs

May be you find it better.

By the way I have learnt about window functions from your channel only.
Keep up the great work

rajatsethi
Автор

You took this class to heaven, hats off.

MASC_RAMBOFuggi
Автор

I completely learning SQL tips from your channel sir. Thanks for your efforts and it is amazing..

somasekharkunka
Автор

Solution 3 can also work for Scenario 2 just need to give order by clause in over( ).
Thanks for the other solutions : )

RandomlyWisdom
Автор

Superb, best part of the video is distinct method when only few values are duplicate and when entire row is duplicate. Thank you so much.

swamivivekananda-cyclonicm
Автор

For the same id we can use the below query to delete records:

with cte as
(
select *,
row_number() over ( partition by id order by id ) as row_count
from cars
)

delete from cte where row_count>1

deepakkumars
Автор

Everyone can understand your video from school boy to graduate guy
Thanks a lot

trotyoi
Автор

Your SQL content likes God gifted❣️ its same question asks in interviewer.🏆

vishalsonawane.
Автор

Wonderful presentation, I really like your presentation, thank you very much sir, infact u are making a positive impact in my life, keep it up

jamesopoku
Автор

In second solution could you not use one more condition on join. ie c1.id <> c2.id or even c1.id > c2.id

upmdosadno
Автор

This video is very practical and helpful when it comes to the real-time database. Thank you so much 👌👌👍👍❤❤

dhruvshahvlogs
Автор

To remove row level duplicates, we can simply use set operation UNION
create table cars_bk as
select * from cars
UNION
select * from cars where 1=2;

sahilbhange