Finding duplicates in SQL involves identifying rows in a
table where certain columns have identical values. Here is a very common method
to achieve this:
Using GROUP BY and HAVING
Let us take an example.
We have cupcake data. In this table, the flavor of the cupcake is duplicated. Because our data is small, we can say by looking at the table the duplicate records are 2-Vanilla and 2-Chocolate.
Query for cupcakes table
CREATE TABLE cupcakes (
flavor VARCHAR(50)
);
INSERT INTO cupcakes (flavor) VALUES
('Vanilla'),
('Chocolate'),
('Strawberry'),
('Vanilla'),
('Chocolate'),
('Red Velvet');
select * from cupcakes;
Example Table
Remember we intend to identify only the duplicated flavor of
cupcakes, we are not deleting the records.
First, we need to check the count of each flavor.
Query for finding the count of each flavor
select flavor, count(flavor) from cupcakes
group by flavor;
If the count is one we know the records are not duplicated,
but if the count is more than one that means it has duplicated records.
Now how do we extract only the duplicated records?
For this we give another condition in the query with having clause, the condition is that it returns the records with count >1. Along with our previous query, we add this condition using the having function. When adding having condition where count>1, it selects only those records whose count is greater than 1.
Query for finding duplicated flavors
select flavor, count(flavor) from cupcakes
group by flavor
having count(flavor)>1;
No comments:
Post a Comment