Delete Duplicate Rows – SQL 2005
Apr 14
SQL server common table expression, delete duplicate rows, SQL server 2005 No Comments
There are different methods to remove duplicates in SQL server 2005 where we do not have primary key defined for any table.
In the example I have table with field names FirstName and LastName. To remove duplicates in any table, replace the field names with the actual names.
If we have more columns, then add those columns as well in the declaration part and in the select and where conditions in the query.
Method 1:
DECLARE @Count int
DECLARE @FirstName nvarchar(50)
DECLARE @LastName nvarchar(50)
DECLARE dublicate_cursor CURSOR FAST_FORWARD FOR
SELECT FirstName, LastName, Count(*) – 1
FROM Users
GROUP BY FirstName, LastName
HAVING Count(*) > 1
OPEN duplicate_cursor
FETCH NEXT FROM duplicate_cursor INTO @FirstName, @LastName, @Count
WHILE @@FETCH_STATUS = 0
BEGIN
DELETE TOP(@Count) FROM Users WHERE FirstName = @FirstName AND LastName = @LastName
FETCH NEXT FROM duplicate_cursor INTO @FirstName, @LastName, @Count
END
CLOSE duplicate_cursor
DEALLOCATE duplicate_cursor
Method 2: Common Table Expression CTE
WITH Duplicates_CTE(FirstName, LastName, Id)
AS
(
SELECT FirstName, LastName, Min(Id) Id
FROM Users
GROUP BY FirstName, LastName
HAVING Count(*) > 1
)
DELETE FROM Users
WHERE Id IN (
SELECT Users.Id
FROM Users
INNER JOIN Duplicates_CTE
ON Users.FirstName = Duplicates_CTE.FirstName
AND Users.LastName = Duplicates_CTE.LastName
AND Users.Id <> Duplicates_CTE.Id
)
RSS