Sunday, February 25, 2024

Group By in SQL

 Group by is an unavoidable clause for any data analyst who start analysis in SQL.

Let me explain the concept with an example

We have a  cupcake table that contains some records repeating like Vanilla, Chocolate, and Strawberry

What grouping means  here is arranging the cupcakes into identical groups.

All the chocolate together, all the strawberry together similar the rest of the Flavors.












In short SQL GROUP BY statement is used to arrange identical data into groups.

The GROUP BY clause combines all those records(row) that have identical values in a particular column and GROUPING can be done by column names.

Let's see how the group by works on the cupcake table

Query

select flavor from cupcakes group by flavor;

Result











Here what happened is the group by collapsed identical records into a single record

Now what are the situations where we can use group by sql clause?

The first one is, if you examine this sql query we ran earlier , with a simple group by, we saw that all the duplicated data is also removed from this given table.  Hence group by can also be used to remove duplicates from a table. This is a common interview question as well. If the interviewer gives a simple column and asks a question to remove the duplicates you may use the group by clause to get the result.

 The next important function of group by is , it is used with aggregate functions like sum, count, min, max etc in which case the aggregate produces a value for each group.

Let's see a query where group by can be used with aggregate function count.

Query

select flavor, count(flavor)from cupcakes

group by flavor;

Result







 






In the result, we get the flavor and the aggregated count.

Now we are going to discuss placing conditions on groups, here we use group by along with having clause. Lets say in our cupcake table we want to select only the records whose count is > 1.

Query for Group By with Having

select flavor, count(flavor) from cupcakes group by flavor having count(flavor)>1;

The query will select flavor and count of flavor according to groups, and then it checks for records whose count is >1. Remember Always group by is used before placing the condition with having clause.

 Result















One thing that we should keep in mind is that while grouping, we should include only those values in the SELECT list.

If we check the above example, flavor comes with the select statement and we group according to flavor.

I hope you understand how group by clause is used in SQL queries. 




No comments:

Post a Comment