Introduction
In this article, I am going to explain how to delete duplicate rows/records in an SQL server using common table expression (CTE). This is one of the most common questions asked in an SQL interview.
Here we will be using SQL Server 2017 or you can use SQL Server 2008 or above.
Read my previous Joins in SQL Server 2017 part of this article using the below links,
Prerequisites
SQL Server 2017 or you can use SQL Server 2008 or above version.
Now, first, we will create a Database and a table.
Creating a Database and a Table
Step 1. Create a Database
Open your SQL Server and use the following script to create the “chittadb” Database.
Create database chittadb
Now, select the script query then press F5 or click on Execute button to execute the above script.
You should see a message, “Command(s) completed successfully.” This means your new database has been created.
Step 2. Create a table
Open your SQL Server and use the following script to create table “tbl_Mcastudents”.
Execute the above query to create “tbl_Mcastudents “.
You should see a message, “Command(s) completed successfully.”
Now, data has been inserted into the table.
Execute the above query, you should see a message, “Command(s) completed successfully.”
Now retrieve all data from the “tbl_Mcastudents” table.
Output
![SQL Server]()
There are many duplicate rows (10, 11, 12), (13, 14), and (15, 16, 17) for the tbl_Mcastudents that have the same Name, Location, and Gender.
Delete duplicate rows/records in the SQL server using common table expression (CTE)
To delete the duplicate rows from the table in SQL Server, we follow these steps,
- Find duplicate rows using group BYclause orROW_NUMBER()
- UseDELETEstatement to remove the duplicate rows.
Query
In the above query
- First, the CTE uses theROW_NUMBER()function to find the duplicate rows specified by values in the name, Location, and gender.
- Then, the delete statement deletes all the duplicate rows but keeps only one occurrence of each duplicate group.
To execute the above query you should see a message.
(5 rows affected)
The above message indicates that the duplicate rows have been removed from the table.
Now retrieve all data from the “tbl_Mcastudents” table after the duplicate rows have been deleted.
Output
![Output]()
Conclusion
In this article, we have learned how to delete duplicate rows from a table in SQL Server.