Archive for May, 2014

Total Row with Grouping Sets

May 26, 2014

In a previous post I mentioned that under certain situations (CPU intensive calculations like log() and exp()) Columnstore Indexes can out preform SSAS MOLAP.

However, creating results sets in T-SQL maybe a little more complicated than with MDX. In MDX for example dimension attributes can have a “All” member. This “All” member can be used to create results sets that aggregate fact data by a dimension attribute that then have a “Total” or ”All” row appended to the bottom of the attributes. This “All” represents the aggregate of all the members from the dimension(s).

In the simple example below I have aggregated the LINEITEM fact table from the TPC-H standard  by SUPPKEY and also Appended a “Total” or “All” member row.

SUPPKEY Quantity
630 16026
7697 15288
8776 16089
3751 15406
Total 62809

However, getting the Total row to appear on the bottom is not as strait forward as it is with MDX. While there are several ways to produce the Total grouping/row in T-SQL because I would also like the make the best or most efficient use of the Columnstore Index that is on the table as well.

The options:

1.   Common Table Expression

with CTE  (L_SUPPKEY,Quantity)as

(

       select [L_SUPPKEY]

          ,sum([L_QUANTITY])as Quantity

       from [dbo].[LINEITEMs1]

       where [L_SUPPKEY] in(3751,8776,630,7697)

       groupby [L_SUPPKEY]

)

 select cast(L_SUPPKEY aschar)as SUPPKEY, Quantity

 from CTE

 union all

 select‘Total’,sum(Quantity)

 from CTE

While this might seem like a good option if you check out the execution plan you see that query actually scans the fact table 2 TIMES which will at least double the CPU time.

table scan

2.Table Variables

declare @tbl astable (SUPPKEY char(20), Quantity int)

 

insertinto @tbl

select [L_SUPPKEY],sum([L_QUANTITY])as Quantity

from [dbo].[LINEITEMs1]

where [L_SUPPKEY] in(3751,8776,630,7697)

groupby [L_SUPPKEY]

 

select SUPPKEY, Quantity

from @tbl

unionall

select‘Total’,sum(Quantity)

from @tbl

Even though the result set is small this is a terrible option  because inserting into a table variable forces the query into “row mode” which is much slower than “batch mode

row mode

3. Temporary Tables

select cast([L_SUPPKEY] aschar)as SUPPKEY,sum([L_QUANTITY])as Quantity

into #temp

from [dbo].[LINEITEMs1]

where [L_SUPPKEY] in (3751,8776,630,7697)

groupby [L_SUPPKEY]

 

select SUPPKEY,Quantity

from #temp

union all

select ‘Total’,sum(Quantity)

from #temp

While this is pretty good option it seems a little excessive to write the small result set all the way back down to disk (TempDB) just pull it back again for the Total row calculation. That is a unnecessary use of IO. Why go all the way back to disk when it is not required.

4. Grouping Sets 

select isnull(cast([L_SUPPKEY] aschar),‘Total’)as SUPPKEY

       ,sum([L_QUANTITY])as Quantity

from [dbo].[LINEITEMs1]

where [L_SUPPKEY] in(3751,8776,630,7697)

group by grouping sets(([L_SUPPKEY]),())

This is the best option because it has none of the limitations from the the above options.

Advertisements