In this article, we will see how to delete duplicate rows from a table in SQL server.
This is a very common SQL interview question. Here is the table with duplicate data, notice that Avinash record is duplicated 2 times Joe record 2 times and Nancy record 2 times.
Now the delete query that we are going to write should delete all the rows except one for the employee.
It means after we execute the delete query the employee table should leave with just 3 records as you can see here.
How to delete duplicate rows from a table using Row Number and Partition?
Let us see how to achieve this using SQL server management studio.
I have already created the employee table and populated it with some test data. Here is the SQL script that can do it.
Now to write a delete query we are going to make use of common table expression called CTE.
Now we are going to select all the columns from the employee table and along with all the columns we are going to make use of row number function.
Then over we want data to be portioned by ID column and also we want the order by this ID column that generates the row number.
Now select the results from CTE and look at the results.
Now the data is partitioned by ID and the row numbers are unique within that partition.
Here when the new partition stats look at the row number, it starts with a new number and we have partitioned the data basically by the ID.
Within the partition, row numbers are unique when a new partition stats with 1.
Now using this row number column we can very easily delete all the rows except 1.
In the delete query, we can use row numbers as a filter.
Instead of selecting all the rows from the employee table what we can do is delete from employees CTE where row number greater than 1.
What is it going to do is, it is going to delete anything that has got greater than 1. Meaning it is going to leave just one record per employee.
That’s it we got the results as expected.
How to Retrieve Only Unique Records from the Employee Table Without Using Rownumber ?
Let us write a simple query for this one without using Row Number.
Select ID, First Name, Last Name, Gender, Salary, Count(*) as UNIQUE_VAL from Employee Group By ID, First Name, Last Name, Gender, Salary Having Count(*)=1;
Result for this query will be
I hope this information is useful to you.
You can leave your feedback as a comment in the comment section below.
For more information on how to delete duplicate rows from a table in SQL Server, you check this article Different Ways to SQL delete duplicate rows from a SQL table.