I recently upgraded the OS on my laptop to windows 8 and there are things that I like and things that I don’t like about it. I think that most of the things that don’t like have more to do with some of the new navigation features and should become second nature over time.
One of my favorite surprises in win8 is the new Task Manager. When I open it up, I am presented with a much cleaner interface that also provides more information than the previous versions of Task Manager. One these new goodies is now you can see the sizes of the L1, L2, and L3 caches:
These CPU caches act like stepping stones for data as it travels from main memory (RAM) to the CPU and the closer the cache is to the CPU the faster the data can be processed by the CPU. For example L1 and L2 caches are orders of magnitude faster than the L3 cache and as you can tell from the above image the closer the cache is to the CPU (L1) the smaller the cache size is.
If you have an application that you want to process faster on the CPU then a good way to do it would be to have the application divide up its instructions and data in batches that would ideally fit in the L1 or L2 caches. Then magically your application would run faster (there is more to it than just that but I am keeping things simple).
Well that is great but what do L1, L2, and L3 caches have to with SQL Server and data warehousing? Well in a word ALOT! The new SQL Server 2012 Columnstore Indexes not only compress and store data (tables and indexes) in a column-wise format but this columnar data is also broken down into “batches” that just so happen to match the sizes of the L1 and L2 caches on the server. This means that all things being equal, the same data set will actually process faster on the CPU using Columnstore Index’s data “batches” than had the same data been process without Columnstore Indexes.
And guess what, you can actually see this happen in SSMS when you run query on a tables that has a Columnstore Index and view the Execution Plan.
The table dbo.CSI_100m_Fact has a Columnstore Index and at execution time the query uses Actual Execution Mode = BATCH for the Columnstore Index Scan operator which mean that it is doing its L1 and L2 magic behind the scenes and your queries now run much faster!
Note: If you create a columnstore index and your data warehouse query does not use the columnstore index then that mean that your data model is wrong (fact and dimension tables) or your query does not match typical data warehouse workloads. See: Columnstore Indexes on MSDN.