Posted by Kyle Hankinson October 15, 2021
Duplicate rows are one of the most common data quality issues in any database. Whether they crept in through a missing unique constraint, a buggy import, or a race condition, you need to find them and clean them up. Here is how to do it across every major database.
The simplest way to find duplicates is GROUP BY combined with HAVING COUNT(*) > 1. This works identically in MySQL, PostgreSQL, SQL Server, and Oracle.
SELECT email, COUNT(*) AS cnt
FROM users
GROUP BY email
HAVING COUNT(*) > 1;
This returns every email that appears more than once, along with how many times it appears.
To see the actual duplicate rows with all their columns:
SELECT u.*
FROM users u
INNER JOIN (
SELECT email
FROM users
GROUP BY email
HAVING COUNT(*) > 1
) dupes ON u.email = dupes.email
ORDER BY u.email;
The most reliable cross-database approach uses the ROW_NUMBER() window function to number each duplicate, then delete everything except row 1.
WITH ranked AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn
FROM users
)
DELETE FROM ranked WHERE rn > 1;
PostgreSQL and SQL Server both allow deleting directly from a CTE.
MySQL does not allow deleting directly from a CTE. Use a subquery instead:
DELETE FROM users
WHERE id NOT IN (
SELECT min_id FROM (
SELECT MIN(id) AS min_id
FROM users
GROUP BY email
) AS keepers
);
DELETE FROM users
WHERE ROWID NOT IN (
SELECT MIN(ROWID)
FROM users
GROUP BY email
);
Oracle's ROWID pseudo-column makes this straightforward.
After cleaning up, add a unique constraint to prevent the problem from recurring:
ALTER TABLE users ADD CONSTRAINT uq_users_email UNIQUE (email);
In SQLPro Studio, you can run any of these queries directly in the query editor. The multiple result sets feature lets you first run the SELECT to preview duplicates, then run the DELETE in the same window — so you can verify before you commit.
You can also use SQLPro Studio's table structure editor to add unique constraints without writing DDL by hand.
About the author — Kyle Hankinson is the founder and sole developer of SQLPro Studio and the Hankinsoft Development suite of database tools. He has been building native macOS and iOS applications since 2010.
Try SQLPro Studio — A powerful database manager for MySQL, PostgreSQL, Microsoft SQL Server, SQLite, Oracle, and Snowflake. Available on macOS, iOS, and Windows.
Download Free Trial View Pricing Compare