How to Delete Duplicate Rows in SQL?

How to Delete Duplicate Rows in SQL?

Duplicate rows in a database can cause inconsistencies, errors, and inefficiencies. Removing them is crucial for maintaining data integrity and ensuring accurate query results. This guide will cover various methods to delete duplicate rows in SQL using different approaches.

Understanding Duplicate Rows in SQL

A duplicate row occurs when all column values in one row match all column values in another row. Consider the following example:

ID	Name	Age
1	Alice	25
2	Bob	30
3	Alice	25
4	Charlie	35
5	Alice	25

In the above table, Alice (ID 1, 3, and 5) has duplicate entries.

Methods to Delete Duplicate Rows in SQL

1. Using `DISTINCT` to Select Unique Rows

If you want to view only unique records without deleting duplicates, use:

SELECT DISTINCT * FROM employees;

2. Deleting Duplicates Using `ROW_NUMBER()`

The ROW_NUMBER() function assigns a unique number to each row. You can delete duplicates while keeping only one record:

WITH CTE AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Name, Age ORDER BY ID) AS RowNum
    FROM employees
)
DELETE FROM employees
WHERE ID IN (
    SELECT ID FROM CTE WHERE RowNum > 1
);

3. Deleting Duplicates Using `DELETE` with `GROUP BY`

If you have an ID column and want to keep the lowest ID for each duplicate set:

DELETE FROM employees
WHERE ID NOT IN (
    SELECT MIN(ID)
    FROM employees
    GROUP BY Name, Age
);

4. Deleting Duplicates Using `DELETE` with `HAVING`

DELETE FROM employees
WHERE ID IN (
    SELECT ID FROM employees
    GROUP BY Name, Age
    HAVING COUNT(*) > 1
);

Best Practices for Deleting Duplicates

Always back up your data before running delete commands.
Use transactional DELETE to prevent accidental data loss.
Consider using indexes to optimize performance.
If duplicates are frequent, redesign the table with unique constraints.

Conclusion

Duplicate rows can create problems in data analysis and reporting. By using techniques like ROW_NUMBER(), DELETE with GROUP BY, and HAVING, you can efficiently clean your database.

For more SQL queries and discussions, visit our SQL Community.

For additional learning, check out this official SQL documentation.

How to Delete Duplicate Rows in SQL? | SQL Queries

Understanding Duplicate Rows in SQL

Methods to Delete Duplicate Rows in SQL

1. Using `DISTINCT` to Select Unique Rows

2. Deleting Duplicates Using `ROW_NUMBER()`

3. Deleting Duplicates Using `DELETE` with `GROUP BY`

4. Deleting Duplicates Using `DELETE` with `HAVING`

Best Practices for Deleting Duplicates

Conclusion

1 thought on “How to Delete Duplicate Rows in SQL? | SQL Queries”

Understanding Duplicate Rows in SQL

Methods to Delete Duplicate Rows in SQL

1. Using DISTINCT to Select Unique Rows

2. Deleting Duplicates Using ROW_NUMBER()

3. Deleting Duplicates Using DELETE with GROUP BY

4. Deleting Duplicates Using DELETE with HAVING

Best Practices for Deleting Duplicates

Conclusion

1 thought on “How to Delete Duplicate Rows in SQL? | SQL Queries”

1. Using `DISTINCT` to Select Unique Rows

2. Deleting Duplicates Using `ROW_NUMBER()`

3. Deleting Duplicates Using `DELETE` with `GROUP BY`

4. Deleting Duplicates Using `DELETE` with `HAVING`