Skip to content

GROUP BY GROUPING SETS

Name zero or more GROUP BY expressions as grouping sets. Aggregates are computed for each discrete group.

GROUP BY GROUPING SETS (expression [, ...])

Within the outer pair of parentheses, expressions may be either named in a simple comma-separated list or isolated within their own inner pairs of parentheses. For example, grouping sets((city, state)) and grouping sets((city), (state)) are both valid; however, these constructs return different results:

  • grouping sets((city, state)) means group by city and state in one set of rows.
  • grouping sets((city), (state)), or grouping sets(city, state), means group by city in a set of rows, then group by state in a separate set of rows.

You may want to use inner sets of parentheses consistently for clarity.

Usage Notes

  • Duplicate grouping sets are not allowed.
  • Queries are limited to a maximum of 128 grouping sets.
  • The maximum length of each grouping set is 200 elements.
  • GROUP BY CUBE queries are limited to a maximum of 7 columns.
  • If the target table is empty, a grouping set query returns 0 rows.
  • Aliases defined in the SELECT list may be referenced in the GROUPING SETS clause. For example:
select c1 as a, c2 as b, ...
from ...
group by grouping sets((a), (b)) 
...

Ordinal numbers that reference GROUP BY expressions are also valid. For example:

select city, state, ...
from ...
group by grouping sets((1), (2))
...

Examples

Assume that you have a citypop table that stores population numbers by city and state, and that the table contains five rows:

testdb=# select * from citypop order by 2,3;
   city     | state |   pop    
-------------+-------+----------
 San Jose    | CA    |  1799000
 Los Angeles | CA    | 12459000
 Naples      | FL    |   398000
 Miami       | FL    |  6167000
 Boise       | ID    |   455000
(5 rows)

Note the results of the following two queries. The first one defines (city, state) as a single unit of aggregation:

testdb=# select city, state, sum(pop) from citypop 
group by grouping sets((city, state)) order by 1;
   city     | state |   sum    
-------------+-------+----------
 Boise       | ID    |   455000
 Los Angeles | CA    | 12459000
 Miami       | FL    |  6167000
 Naples      | FL    |   398000
 San Jose    | CA    |  1799000
(5 rows)

The query returns only five rows, with one sum for each different city/state group. This is the exact same result that you would get with a simple group by city, state clause. The use of GROUPING SETS in this particular case does not add any value but is a valid alternative to the basic GROUP BY syntax.

The second query defines city and state as distinct groups for aggregation within the same result set:

testdb=# select city, state, sum(pop) from citypop group by grouping sets((city), (state)) order by 1,2,3;
   city     | state |   sum    
-------------+-------+----------
 Boise       | NULL  |   455000
 Los Angeles | NULL  | 12459000
 Miami       | NULL  |  6167000
 Naples      | NULL  |   398000
 San Jose    | NULL  |  1799000
 NULL        | FL    |  6565000
 NULL        | CA    | 14258000
 NULL        | ID    |   455000
(8 rows)

This query returns eight rows: five rows for each different city, plus three rows for each different state. In this case, the use of GROUPING SETS provides a simple way to return results that have multiple levels of aggregation within a single query. NULL values are used to fill in the empty columns.

To take this example one step further, you can also produce a total row for the query by specifying () as a third, but empty grouping set:

testdb=# select city, state, sum(pop) from citypop group by grouping sets((city), (state), ()) order by 1,3;
   city     | state |   sum    
-------------+-------+----------
 Boise       | NULL  |   455000
 Los Angeles | NULL  | 12459000
 Miami       | NULL  |  6167000
 Naples      | NULL  |   398000
 San Jose    | NULL  |  1799000
 NULL        | ID    |   455000
 NULL        | FL    |  6565000
 NULL        | CA    | 14258000
 NULL        | NULL  | 21278000
(9 rows)

The ninth row is the grand total for the preceding sets of rows. An empty grouping set means that all rows are aggregated in a single group. This total row is produced even when no input rows exist.

Throughout these examples, note that NULL values occupy the empty columns when there is no match for a particular grouping column. Total rows contain NULL values for all grouping columns.

NULL Handling with Grouping Sets

When you are running queries that group by grouping sets, you have two types of NULL values to think about: those that exist in the table data itself, and those produced in aggregated rows.

When NULL values exist in the data for grouping set columns and expressions, aggregated rows return NULL values in the result. For example, the 14th row in the following query result represents the group where both position and cob are NULL. There are two such rows in the player table. In both of those rows, weekly_wages is also NULL. Therefore the 14th row is a series of three NULL values.

yellowbrick=# select position, cob, sum(weekly_wages) 
from player 
group by grouping sets((position, cob)) order by 1,2,3;
 position |     cob     |  sum   
----------+-------------+--------
 D        | Netherlands | 180000
 F        | Argentina   | 572000
 F        | Chile       | 410000
 F        | Egypt       | 200000
 F        | England     | 580000
 G        | Spain       | 200000
 M        | Algeria     | 200000
 M        | Belgium     | 250000
 M        | England     | 140000
 M        | France      | 378000
 M        | Germany     | 350000
 M        | Netherlands | 618000
 M        | Spain       | 250000
 [NULL]   | [NULL]      | [NULL]
(14 rows)

If you do not want this kind of row to be evaluated or displayed, filter out NULL values in the WHERE clause.

For grouping sets queries, you will see NULL values used in two other ways:

  • To fill in the blanks for subtotal and grand total rows. Total rows contain NULL values for all grouping columns.
  • To fill in the blanks when grouped aggregation occurs at different levels. NULL values occupy the empty columns when there is no match for a particular grouping column.

If you did not intend to return these aggregated rows, you may need to rewrite the query. If you want to return these rows only under certain conditions, you can apply a filter in the HAVING clause.