Performance Tuning: Data Warehouse Storage Throughput (MB/sec)

SQL Server Data Warehouse Workload Optimization

There are significant differences in how to achieve optimal resource balance for a SQL Server Data Warehouse system versus accomplishing the same optimization for lookup based online transactional system (OLTP). OTLP systems are heavily indexed in order to support high concurrency (many users) with low latency retrieval of a small number of rows from datasets that often contain very little historical data. This workload produces significant disk head movement which can be measured in random I/O reads and writes (IOPS). Data warehouse analytical queries, however, have relatively low concurrency (fewer users) and involve much larger historical datasets that benefit greatly from sequential reads and writes (disk scans). As a result, Data Warehouse systems are best measured by analyzing total throughput (sequential reads and writes) in MB/s rather than the more traditional focus on operations per second (IOPS) for OLTP Systems.

SQL Server Data Warehouse Balanced Architecture

Due to the fact the data warehouse scan centric (sequential reads/writes) workload is best measured in MB/sec, data warehouse hardware components need be architected in such way as to allow the maximum storage system data throughput (MB/sec) to be able to be consumed by the processors. Having a balanced Data Warehouse Architecture means that the storage system has been optimized for scan centric (sequential reads/writes measured in MB/sec) operations and that the processors can receive and process the data delivered by the storage at the same MB/sec rate.

Storage

In the SQL Server Data Warehouse the throughput of the storage is the most critical, complex, and costly portion of the system. Because Data Warehouse Analytical queries are too large to fit in memory, the storage must be able to quickly locate and deliver the data from disk where it can then be aggregated by the processors. In most data warehouse systems it is the storage that is the bottle neck and so careful monitoring and measuring of its performance is essential.

The best tool for measuring storage performance is a free Microsoft tool call SQLIO. It can test the performance of storage under a great variety scenarios and produces very detailed results which can then be charted out overtime. SQLIO should be used once the storage hardware is installed to validate that it has been configured properly and during regularly scheduled maintenances cycles to ensure that the storage is preforming properly.

Laptop Throughput (MB/sec)

As an example I used SQLIO to generate the performance statistics of my personal laptop. I also used David K Lee’s excellent  Storage Analysis Website to produce the following chart which summarizes the SQLIO results very nicely.

image

As you can tell from the graph my puny laptop is actually pretty well configured for Data Warehouse workloads.  The sequential reads/writers preform much better than random reads/writes.

In fact, I would guess that my laptop might preform better than SANs that cost 100x more simply because my laptop has dedicated Direct Attach Storage (DAS) that does have to fight with other resources in a huge (OLTP/IOPs centric) SAN that every application shares.

Conclusion:

If your SQL Server Data Warehouse is preforming poorly download SQLIO and see if a laptop can get better performance. If so then you might have a problem with your storage Winking smile

Additional Resources:

Fast Track Data Warehouse Reference Guide for SQL Server 2012 – The Bible for all things DW. It is a must read !!!

Dell Reference Configuration –  This Dell ”starter” Fast Track gets 4,700+ MBs/sec!!!

image

Also see: SQL Server OLTP vs. Data Warehouse Performance Tuning

Advertisements

Tags:

4 Responses to “Performance Tuning: Data Warehouse Storage Throughput (MB/sec)”

  1. Why is my DW Query so SLLLOW?! « Garrett Edmondson Says:

    […] There are standard to tools measure your Data Warehouse System’s throughput in MB/sec. As explained in a previous post. […]

  2. Performance Tuning SQL Server Database: Extended Events | Garrett Edmondson Says:

    […] is absolutely critical for architecting a data warehouse server that preforms as expected! See: Data Warehouse Storage Throughput (MB/sec) for more […]

  3. Performance Tuning SQL Server: Parallelism | Garrett Edmondson Says:

    […] it is not the proper architecture for a SQL Server Data Warehouse Server which seeks to maximize MB/second not […]

  4. Performance Tuning SQL Server: Hardware Architectures Throughput Analysis | Garrett Edmondson Says:

    […] performance capabilities of the SQL Server Data Warehouse database server that I am testing (see: Data Warehouse Storage Throughput (MB/sec) ). There are of course many other types of tests and analyses that I preform as well as part of […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


%d bloggers like this: