Appearance
GROUP BY GROUPING SETS
Name zero or more GROUP BY
expressions as grouping sets. Aggregates are computed for each discrete group.
GROUP BY GROUPING SETS (expression [, ...])
Within the outer pair of parentheses, expressions may be either named in a simple comma-separated list or isolated within their own inner pairs of parentheses. For example, grouping sets((city, state))
and grouping sets((city), (state))
are both valid; however, these constructs return different results:
grouping sets((city, state))
means group bycity
andstate
in one set of rows.grouping sets((city), (state))
, orgrouping sets(city, state)
, means group bycity
in a set of rows, then group by state in a separate set of rows.
You may want to use inner sets of parentheses consistently for clarity.
Usage Notes
- Duplicate grouping sets are not allowed.
- Queries are limited to a maximum of 128 grouping sets.
- The maximum length of each grouping set is 200 elements.
GROUP BY CUBE
queries are limited to a maximum of 7 columns.- If the target table is empty, a grouping set query returns 0 rows.
- Aliases defined in the
SELECT
list may be referenced in theGROUPING SETS
clause. For example:
select c1 as a, c2 as b, ...
from ...
group by grouping sets((a), (b))
...
Ordinal numbers that reference GROUP BY
expressions are also valid. For example:
select city, state, ...
from ...
group by grouping sets((1), (2))
...
Examples
Assume that you have a citypop
table that stores population numbers by city and state, and that the table contains five rows:
testdb=# select * from citypop order by 2,3;
city | state | pop
-------------+-------+----------
San Jose | CA | 1799000
Los Angeles | CA | 12459000
Naples | FL | 398000
Miami | FL | 6167000
Boise | ID | 455000
(5 rows)
Note the results of the following two queries. The first one defines (city, state)
as a single unit of aggregation:
testdb=# select city, state, sum(pop) from citypop
group by grouping sets((city, state)) order by 1;
city | state | sum
-------------+-------+----------
Boise | ID | 455000
Los Angeles | CA | 12459000
Miami | FL | 6167000
Naples | FL | 398000
San Jose | CA | 1799000
(5 rows)
The query returns only five rows, with one sum for each different city/state group. This is the exact same result that you would get with a simple group by city, state
clause. The use of GROUPING SETS
in this particular case does not add any value but is a valid alternative to the basic GROUP BY
syntax.
The second query defines city
and state
as distinct groups for aggregation within the same result set:
testdb=# select city, state, sum(pop) from citypop group by grouping sets((city), (state)) order by 1,2,3;
city | state | sum
-------------+-------+----------
Boise | NULL | 455000
Los Angeles | NULL | 12459000
Miami | NULL | 6167000
Naples | NULL | 398000
San Jose | NULL | 1799000
NULL | FL | 6565000
NULL | CA | 14258000
NULL | ID | 455000
(8 rows)
This query returns eight rows: five rows for each different city, plus three rows for each different state. In this case, the use of GROUPING SETS
provides a simple way to return results that have multiple levels of aggregation within a single query. NULL
values are used to fill in the empty columns.
To take this example one step further, you can also produce a total row for the query by specifying ()
as a third, but empty grouping set:
testdb=# select city, state, sum(pop) from citypop group by grouping sets((city), (state), ()) order by 1,3;
city | state | sum
-------------+-------+----------
Boise | NULL | 455000
Los Angeles | NULL | 12459000
Miami | NULL | 6167000
Naples | NULL | 398000
San Jose | NULL | 1799000
NULL | ID | 455000
NULL | FL | 6565000
NULL | CA | 14258000
NULL | NULL | 21278000
(9 rows)
The ninth row is the grand total for the preceding sets of rows. An empty grouping set means that all rows are aggregated in a single group. This total row is produced even when no input rows exist.
Throughout these examples, note that NULL
values occupy the empty columns when there is no match for a particular grouping column. Total rows contain NULL
values for all grouping columns.
NULL Handling with Grouping Sets
When you are running queries that group by grouping sets, you have two types of NULL
values to think about: those that exist in the table data itself, and those produced in aggregated rows.
When NULL
values exist in the data for grouping set columns and expressions, aggregated rows return NULL
values in the result. For example, the 14th row in the following query result represents the group where both position
and cob
are NULL
. There are two such rows in the player
table. In both of those rows, weekly_wages
is also NULL
. Therefore the 14th row is a series of three NULL
values.
yellowbrick=# select position, cob, sum(weekly_wages)
from player
group by grouping sets((position, cob)) order by 1,2,3;
position | cob | sum
----------+-------------+--------
D | Netherlands | 180000
F | Argentina | 572000
F | Chile | 410000
F | Egypt | 200000
F | England | 580000
G | Spain | 200000
M | Algeria | 200000
M | Belgium | 250000
M | England | 140000
M | France | 378000
M | Germany | 350000
M | Netherlands | 618000
M | Spain | 250000
[NULL] | [NULL] | [NULL]
(14 rows)
If you do not want this kind of row to be evaluated or displayed, filter out NULL
values in the WHERE
clause.
For grouping sets queries, you will see NULL
values used in two other ways:
- To fill in the blanks for subtotal and grand total rows. Total rows contain
NULL
values for all grouping columns. - To fill in the blanks when grouped aggregation occurs at different levels.
NULL
values occupy the empty columns when there is no match for a particular grouping column.
If you did not intend to return these aggregated rows, you may need to rewrite the query. If you want to return these rows only under certain conditions, you can apply a filter in the HAVING
clause.