Part 4 Delete duplicate rows in sql

preview_player
Показать описание
Link for all dot net and sql server video tutorial playlists

Link for slides, code samples and text version of the video

Healthy diet is very important both for the body and mind. If you like Aarvi Kitchen recipes, please support by sharing, subscribing and liking our YouTube channel. Hope you can help.

In this video, we will discuss deleting all duplicate rows except one from a sql server table.

SQL Script to create Employees table
Create table Employees
(
ID int,
FirstName nvarchar(50),
LastName nvarchar(50),
Gender nvarchar(50),
Salary int
)
GO

Insert into Employees values (1, 'Mark', 'Hastings', 'Male', 60000)
Insert into Employees values (1, 'Mark', 'Hastings', 'Male', 60000)
Insert into Employees values (1, 'Mark', 'Hastings', 'Male', 60000)
Insert into Employees values (2, 'Mary', 'Lambeth', 'Female', 30000)
Insert into Employees values (2, 'Mary', 'Lambeth', 'Female', 30000)
Insert into Employees values (3, 'Ben', 'Hoskins', 'Male', 70000)
Insert into Employees values (3, 'Ben', 'Hoskins', 'Male', 70000)
Insert into Employees values (3, 'Ben', 'Hoskins', 'Male', 70000)

The delete query should delete all duplicate rows except one.

Here is the SQL query that does the job. PARTITION BY divides the query result set into partitions.
WITH EmployeesCTE AS
(
SELECT *, ROW_NUMBER()OVER(PARTITION BY ID ORDER BY ID) AS RowNumber
FROM Employees
)
DELETE FROM EmployeesCTE WHERE RowNumber ] 1
Рекомендации по теме
Комментарии
Автор

Grate it works for me, Saved lot of time
Thank you very much....

aravindkumareriventy
Автор

I'm data engineer today because of this guy. ❣️

Can never thank you enough.

alokranjan
Автор

Does deleting the row from CTE EmployeeCTE also deletes it from the underlying Employees table? Did I miss anything here? I thought CTEs are just like views to make joins

RockoHunter
Автор

But cte is a just a temporary result so when we delete rows from cte, why is it deleting rows in main employees table. Can u tell me why cte is effecting main employees table. I don't get this logic. Please explain

meghasyam
Автор

Just one thing I don't understand:
You're deleting from the EmployeeCTE, but how come it delete the row in the Employee table?

williamxu
Автор

Seems like this query isn't working on PostgreSQL DB. Will you please check and confirm the same?
Just for reference, I have tried running two different versions to achieve this:

Version 1 :--

WITH tempEMP as (
select row_number() over(order by id, firstname, lastname, gender, salary desc) as rn, employees.* from employees
)
delete from tempEMP where rn not in (select max(rn) from tempEMP group by id, firstname, lastname, gender, salary)

Version 2 :--

with tempEMP as
(
select row_number() over (partition by id order by id ) as rn from employees
)
DELETE from tempEMP where rn > 1;


Note: I am able to run the SELECT query but in the case of the DELETE query, I am getting a common error from both of the above queries. Here is my error:
SQL Error [42P01]: ERROR: relation "tempemp" does not exist¶ Position: 383
PS: Position in error log changes with the query change.

MayankJain
Автор

Mr. Kudvenkat, you are an excellent instructor! The code was very well explained and the example was very helpful. Thanks!

afifkhaja
Автор

Kudvankat, You are giving a Public Service. I can't thank you enough for your Clear and detailed tutorial. God bless you, Sir.

searez
Автор

Hi Venkat,

You are rock star for beginners to learn SQL
Without basics there is nothing in every course.
With your course and videos about SQL i got placed in Deloitte.

In every interview, interviewers never go for advanced level without asking basic level questions. If we are very strong in Basics you can manage and convenience the interviewer with basic knowledge that will create great impact.

Your videos are reference for any SQL interview and for every concept.
Thank you so much for your help.

ramakoteswararaovemuri
Автор

Hey,
Delete query is not running, ERROR is invalid object name

kaushikdas
Автор

6 years on and still the clearest solution to the age old problem of deleting duplicate rows in a heap in SSMS.  Awesome!!!

iainmpickering
Автор

can someone please explain why when row_number>1, row numbers with 2 and 3 are not getting deleted?

cautioni
Автор

thanks a lot for this video, but I have a question. you've deleted rows based on duplicated IDs, in real environment, duplicated rows suppose to be considered on multi columns.. am I right?

ttkaf
Автор

you are excellent instructor! example was very helpful. Thanks a lot!

shrikant
Автор

i love you Kudvenkat .. developers will give you blessings always.

harshsaxena
Автор

awesome, thanks a lot for your explanation.Can u post the video for lock in sql server.

manishvasuki
Автор

As i run this command is shows erros like-Recursive common table expression 'Tbl_DeleteMulRow' does not contain a top-level UNION ALL operator

anujpal
Автор

Just one thing I don't understand:
You're deleting from the EmployeeCTE, but how come it delete the row in the Employee table?

arjunyadav-buzf
Автор

can you please write a code for the mysql, i have tried a lot on YouTube but nothing helpful

billiondollarsidea
Автор

Hi,
I followed the same steps you did.
I created Employees table and Inserted the same records as you did and run this: SELECT * FROM EmployeesCTE; query so far everything works fine but when I run DELETE FROM EmployeesCTE WHERE RowNumber > 1; I get the following errors:

ERROR: relation "employeescte" does not exist
LINE 9: DELETE FROM EmployeesCTE WHERE RowNumber > 1;
^
SQL state: 42P01
Character: 161

Please help to resolve the issue

abdulkarimdorman
join shbcf.ru