Today one of my colleague asked me how to delete duplicate records from a SQL Server Table. So to answer his question, let’s check the following example.
Note: The below example works on SQL Server 2005 and later versions.
--Create Table
CREATE TABLE dbo.TestDuplicate (EmpNo INT, EmpName VARCHAR(10));
--Insert Test Data
INSERT INTO dbo.TestDuplicate VALUES (1,'Emp 1');
INSERT INTO dbo.TestDuplicate VALUES (1,'Emp 1');
INSERT INTO dbo.TestDuplicate VALUES (2,'Emp 2');
INSERT INTO dbo.TestDuplicate VALUES (2,'Emp 2');
INSERT INTO dbo.TestDuplicate VALUES (3,'Emp 3');
--Remove Duplicates
WITH CTE
AS (SELECT ROW_NUMBER () OVER (ORDER BY EmpNo) SrNo , EmpNo FROM dbo.TestDuplicate)
DELETE FROM CTE WHERE SrNo NOT IN (SELECT MAX(SrNo) FROM CTE GROUP BY EmpNo);
--Check the table
SELECT * FROM dbo.TestDuplicate;
-- Data before deleting the duplicate records.
-- Data after deleting the duplicate records.
Points to Note:
- This was possible because of the ROW_NUMBER() function available in SQL 2005 onwards.
- ROW_NUMBER() function Returns the sequential number of a row within a partition of a result set. This means for each row in a result set it will return one unique sequential number for that row.
- The OVER clause determines the partitioning and ordering of the row set.
- As we have used CTE (Common Table Expression) it allows us to reuse the result set like a temporary table.
Hope this article may help you in deleting the duplicate records.
Happy Learning !!!
No comments:
Post a Comment