GROUPING SETS

We've done aggregations on multiple levels. For example, we've looked at average salaries per company

SELECT
  company,
  avg(salary) AS average_salary
FROM employees
GROUP BY company;

`company` `VARCHAR`	`average_salary` `NUMERIC`
Footbook	154750.000000000000
Wamazon	145923.076923076923
Poogle	186461.538461538462

And we can drill down deeper and look at departments within each company.

SELECT
  company,
  department,
  avg(salary) AS average_salary
FROM employees
GROUP BY company, department
ORDER BY company, department;

`company` `VARCHAR`	`department` `VARCHAR`	`average_salary` `NUMERIC`
Footbook	Integration	157200.000000000000
Footbook	Research	145000.000000000000
Footbook	Solutions	163666.666666666667
Poogle	Functionality	187333.333333333333
Poogle	Infrastructure	167000.000000000000
Poogle	Marketing	215000.000000000000
Wamazon	Marketing	145500.000000000000
Wamazon	Quality	146750.000000000000
Wamazon	Solutions	145666.666666666667

What if we want to see both of these in a single query? We would need some way of specifying 2 different groupings, GROUP BY company and GROUP BY company, department. That's where GROUPING SETS comes in. It gives us a way to do just that.

SELECT
  company,
  department,
  avg(salary) AS average_salary
FROM employees
GROUP BY GROUPING SETS (
  (company),
  (company, department)
)
ORDER BY company, department;

`company` `VARCHAR`	`department` `VARCHAR`	`average_salary` `NUMERIC`
Footbook	Integration	157200.000000000000
Footbook	Research	145000.000000000000
Footbook	Solutions	163666.666666666667
Footbook	NULL	154750.000000000000
Poogle	Functionality	187333.333333333333
Poogle	Infrastructure	167000.000000000000
Poogle	Marketing	215000.000000000000
Poogle	NULL	186461.538461538462
Wamazon	Marketing	145500.000000000000
Wamazon	Quality	146750.000000000000
Wamazon	Solutions	145666.666666666667
Wamazon	NULL	145923.076923076923

The department column contains some NULLs.

This is how the different grouping sets are represented. This result table is a combination of rows grouped with different grouping sets, some on both (company, department), and some on just company. The way you can tell them apart is with the NULL values in those grouping columns. The rows where department is NULL are the rows grouped by just company, and the rows where department has a value are grouped by (company, department).

We could've written this query another way by UNIONing 2 different queries: 1 that does GROUP BY company and 1 that does GROUP BY company, department. This would be equivalent:

SELECT
  company,
  NULL AS department,
  avg(salary) AS average_salary
FROM employees
GROUP BY company
UNION
SELECT
  company,
  department,
  avg(salary) AS average_salary
FROM employees
GROUP BY company, department
ORDER BY company, department;

`company` `VARCHAR`	`department` `VARCHAR`	`average_salary` `NUMERIC`
Footbook	Integration	157200.000000000000
Footbook	Research	145000.000000000000
Footbook	Solutions	163666.666666666667
Footbook	NULL	154750.000000000000
Poogle	Functionality	187333.333333333333
Poogle	Infrastructure	167000.000000000000
Poogle	Marketing	215000.000000000000
Poogle	NULL	186461.538461538462
Wamazon	Marketing	145500.000000000000
Wamazon	Quality	146750.000000000000
Wamazon	Solutions	145666.666666666667
Wamazon	NULL	145923.076923076923

Empty Grouping Set

We can add another level to this: the average salary for all users overall. Usually when aggregating for a table overall, we just omit the GROUP BY clause altogether. However, remember that in Implicit Aggregation, an empty list can be given to specify that the whole table is 1 group.

SELECT
  company,
  department,
  avg(salary) AS average_salary
FROM employees
GROUP BY GROUPING SETS (
  (),
  (company),
  (company, department)
)
ORDER BY company NULLS FIRST, department NULLS FIRST;

`company` `VARCHAR`	`department` `VARCHAR`	`average_salary` `NUMERIC`
NULL	NULL	162578.947368421053
Footbook	NULL	154750.000000000000
Footbook	Integration	157200.000000000000
Footbook	Research	145000.000000000000
Footbook	Solutions	163666.666666666667
Poogle	NULL	186461.538461538462
Poogle	Functionality	187333.333333333333
Poogle	Infrastructure	167000.000000000000
Poogle	Marketing	215000.000000000000
Wamazon	NULL	145923.076923076923
Wamazon	Marketing	145500.000000000000
Wamazon	Quality	146750.000000000000
Wamazon	Solutions	145666.666666666667

The row where both company and department are NULL is the row that contains the average salary for that group, all employees for all companies.

By the way, NULL normally get's sorted last. Here, we switched it so that NULL gets sorted first by adding NULLS FIRST in the ORDER BY clause.

ROLLUP, CUBE

There are specific GROUPING SETS patterns which are common, like the one in the last query. This query would be equivalent:

SELECT
  company,
  department,
  avg(salary) AS average_salary
FROM employees
GROUP BY ROLLUP(company, department)
ORDER BY company NULLS FIRST, department NULLS FIRST;

`company` `VARCHAR`	`department` `VARCHAR`	`average_salary` `NUMERIC`
NULL	NULL	162578.947368421053
Footbook	NULL	154750.000000000000
Footbook	Integration	157200.000000000000
Footbook	Research	145000.000000000000
Footbook	Solutions	163666.666666666667
Poogle	NULL	186461.538461538462
Poogle	Functionality	187333.333333333333
Poogle	Infrastructure	167000.000000000000
Poogle	Marketing	215000.000000000000
Wamazon	NULL	145923.076923076923
Wamazon	Marketing	145500.000000000000
Wamazon	Quality	146750.000000000000
Wamazon	Solutions	145666.666666666667

The GROUP BY ROLLUP(company, department) is equivalent to the following:

GROUP BY GROUPING SETS (
  (),
  (company),
  (company, department)
)

There is another shortcut CUBE. The clause GROUP BY CUBE(company, department) is equivalent to:

GROUP BY GROUPING SETS (
  (),
  (company),
  (department),
  (company, department)
)

CUBE takes all the possible subsets of the set of columns, also known as a Power Set.