Data Genie: How to remove duplicates using SQL query?

Wednesday, February 7, 2024

How to remove duplicates using SQL query?

Ever wondered why your database feels a bit messy? Meet the troublemakers called duplicates – they slow

things down, mess up accuracy, and hog space. In this article, I will share how you can delete duplicate

rows in SQL.

Method 1- By using distinct function

The DISTNCT keyword removes duplicate rows from the results of a SELECT statement

or in technical language we can say a distinct keyword is used to remove redundant data

Query to create the example table

CREATE TABLE cupcakes (flavor VARCHAR(50));

INSERT INTO cupcakes (flavor) VALUES

('Vanilla'),

('Chocolate'),

('Strawberry'),

('Vanilla'),

('Chocolate'),

('Red Velvet');

Example Table

Query using "distinct" function

select distinct flavor from cupcakes;

Result Table

Method 2 -By using group by clause

Group by clause allows us to collapse a field into its distinct values.

Example Table

Here the vanilla is repeated 2 times, similarly chocolate is also repeated two times. What group by clause does is it will collapse the two vanilla into one group and the two chocolate into one group. Other records remains the same.

Data Genie

Wednesday, February 7, 2024

How to remove duplicates using SQL query?

Query to create the example table

Example Table

Query using "distinct" function

select distinct flavor from cupcakes;

Result Table

Example Table

Query using "group by" clause

select flavor from cupcakes group by flavor;

Result Table

No comments:

Post a Comment