Grouping Sets: the CUBE and ROLLUP subclauses

In my last blog posting I talked about the Grouping Set functionality in SQL Server. As you have seen with the examples from last week, it is very easy to perform individual groupings by simply defining the necessary grouping sets. But what if you want to have all possible groups from a given set of columns – the so-called Power Set?

Of course, you can generate the power set manually with the syntax of the grouping set functionality, but that’s still a lot of code to write. Therefore I want to show you today 2 subclauses that the grouping set functionality supports: the CUBE and ROLLUP subclauses.

CUBE Subclause

With the CUBE subclause you can generate all possible grouping sets for a given set of columns. This is the so-called Power Set. When you have the 3 columns a, b, and c, CUBE (a, b, c) produces the following groupings for you:

  • (a, b, c)
  • (a, b)
  • (a, c)
  • (b, c)
  • (a)
  • (b)
  • (c)
  • ()

The following query generates the power set for the columns CustomerID, SalesPersonID, and YEAR(OrderDate) manually through the grouping sets functionality introduced last week. 

-- Calculates the power set of CustomerID, SalesPersonID, YEAR(OrderDate)
SELECT
	CustomerID, 
	SalesPersonID, 
	YEAR(OrderDate) AS 'OrderYear', 
	SUM(TotalDue) AS 'TotalDue'
FROM Sales.SalesOrderHeader
WHERE SalesPersonID IS NOT NULL
GROUP BY GROUPING SETS
(
	(CustomerID, SalesPersonID, YEAR(OrderDate)),
	(CustomerID, SalesPersonID),
	(CustomerID, YEAR(OrderDate)),
	(SalesPersonID, YEAR(OrderDate)),
	(CustomerID),
	(SalesPersonID),
	(YEAR(OrderDate)),
	()
)
GO

As you can see from the code, you have to specify every possible combination. So it’s again a hard, cumbersome job to write such a query with a simple requirement. Things get much easier if you use the CUBE subclause instead of specifying individual grouping sets. Let’s have a look at the following code.

-- Calculates the power set of CustomerID, SalesPersonID, YEAR(OrderDate) with the CUBE subclause
SELECT
	CustomerID, 
	SalesPersonID, 
	YEAR(OrderDate) AS 'OrderYear', 
	SUM(TotalDue) AS 'TotalDue'
FROM Sales.SalesOrderHeader
WHERE SalesPersonID IS NOT NULL
GROUP BY CUBE(CustomerID, SalesPersonID, YEAR(OrderDate))
GO

As you can see from the code, you only specify the columns, and SQL Server itself generates the power set out of it. That’s very simple, straightforward code compared with the first listing.

ROLLUP Subclause

Besides the CUBE subclause, SQL Server supports also the ROLLUP subclause since SQL Server 2008. With the ROLLUP subclause you are able to define a subset of the power set. The ROLLUP subclause also assumes a hierarchy between the individual columns. Imagine that we have again our 3 columns a, b, and c. When you use ROLLUP(a, b, c), it produces the following grouping sets:

  • (a, b, c)
  • (a, b)
  • (a)
  • ()

You can very easily see from these individual grouping sets that there is a hierarchy between these columns. Just substitute the columns a, b, and c with columns like OrderYear, OrderMonth, and OrderDate, and you will get the idea of the kind of analytical queries you can perform here. It’s the poor-man implementation of SSAS (SQL Server Analysis Services). Let’s have a look at the following query that uses the ROLLUP subclause.

-- Calculates the following grouping sets:
-- => (OrderYear, OrderMonth, OrderDay)
-- => (OrderYear, OrderMonth)
-- => (OrderYear)
-- => ()
SELECT
	YEAR(OrderDate) AS 'OrderYear', 
	MONTH(OrderDate) AS 'OrderMonth',
	DAY(OrderDate) AS 'OrderDay',
	SUM(TotalDue) AS 'TotalDue'
FROM Sales.SalesOrderHeader
WHERE SalesPersonID IS NOT NULL
GROUP BY ROLLUP(YEAR(OrderDate), MONTH(OrderDate), DAY(OrderDate))
GO

The output of that query gives you the following individual grouping sets:

  • (OrderYear, OrderMonth, OrderDay)
  • (OrderYear, OrderMonth)
  • (OrderYear)
  • ()

The ROLLUP subclause has a very simple syntax, but the output that you are getting back is really powerful for data analysis!

Summary

I hope that you have enjoyed today’s blog posting about the CUBE and ROLLUP subclause introduced with the grouping sets functionality in SQL Server 2008. If you have time you can also leave a short comment, and tell me if you already use these subclauses in your databases, or if you think that they make sense in your environment.

Like or share this blog posting to get the source code.

Thanks for reading and your time!

-Klaus

1 thought on “Grouping Sets: the CUBE and ROLLUP subclauses”

Leave a Comment

Your email address will not be published. Required fields are marked *