Posts Tagged ‘performance tuning’

Performance Tuning SQL Server: Parallelism

July 7, 2013

One of the most important architectural principles built into Dell’s latest generation of data warehouse appliances called Quickstart Data Warehouse (QSDW)is parallelism. The following post will give an over of how parallelism can be implement using QSDW as a reference.

Breaking a large data warehouse workload into smaller pieces that then can be worked on in parallel is one of the best ways of scaling SQL Server. This parallel processing technique is exemplified in Non-Uniform Memory Access (NUMA).

Hardware and Software Parallelism

Even though the NUMA architecture is used to create parallelism between main memory (RAM) the server’s CPU cores, the same principle can be applied to the entire SQL Server Data Warehouse Server as well.

However this parallelism is not present in most OLTP systems because those systems rely on a few LUNs/Disk Groups with many disks each to achieve maximum IOPs. While this approach is good for OLTP systems it is not the proper architecture for a SQL Server Data Warehouse Server which seeks to maximize MB/second not IOPS.

The best approach, therefore, is break up the disks into several separate Disk Groups or LUNs and place a SQL Server database file on each LUN. When this is done the SQL Server Database Engine can break the read and write operations into smaller chunks according to the number LUNs that were created thus achieving parallelism. The image below shows have 9 LUNs can be created and have a SQL Server Database file put on each LUN.

image

Because the data for the entire Filegroup/Database is broken  out to to multiple LUNs by placing a SQL Server database file on each LUN, all database operations will be parallelized. This has the added effect of increasing MB/sec by a factor roughly equal to the number of LUNs, in this case eight.

This architecture does create parallelism a the disk/database level, however it is not yet aligned to the number CPU cores present on the Data Warehouse Server. To accomplish this next level of parallelism, instead of creating one Filegroup for the data base, the number of Filegroups will have to align to the number of CPU cores.

The illustration below demonstrates what the configuration may look like on a server with eight cores.

image

Physically each LUN is implemented on Windows server as a Mounted Volume and each mounted volume will have eight database files, one database file for each of the eight file groups.

clip_image005

Only now can the SQL Server Data Warehouse Workload be parallelized across the entire server, from disks to the each CPU core. While this approach might seem like a lot of work, the performance benefits are tremendous and lead to much more reliable and consistent performance under a wide variety of data warehouse workload scenarios.

The parallelism techniques described above are built directly into Dell Quickstart Data Warehouse Appliance (QSDW), thus making it a highly tuned SQL Server Data Warehouse solution that can save a customer significant time and money.

Advertisements