Archive for June, 2012

Dynamically Generate SSIS Packages

June 17, 2012

The the following video describes the basics of how to generate SSIS packages from metadata. This technique uses Dynamic SQL and BiML from BIDS Helper to generate the SSIS packages.

Code from video

While the above video demonstrates a very simple example, the same technique can be used to generate more complex SSIS packages. One of the challenges to applying this technique to more complex scenarios, is that it can be difficult to create the initial BiML for the template SSIS package. In my example from the video the template SSIS package simply queries the data source and loads the data into a multi-cast.

Varigence is the company that created BiML and they have a very useful tool called Mist which has a lot of interesting  features.  One of the most important features is a the ability to convert an existing SSIS package to BiML.

Mist will enable to you generate SSIS packages dynamically without having  to know BiML:

  1. Create Template SSIS Package in BIDS/SSDT
  2. Use Mist to convert Template SSIS Package to BiML
  3. Decide which sections of the BiML you want make dynamic
  4. Build Dynamic SQL Statement using the BiML from the second step

Conclusion:

Being able to dynamically generate SSIS packages can greatly reduce SSIS package development time and administrative effort. Additionally, you  enforce standards across all of the packages in your project such standardizing SSIS execution trees and task names .

Fast Track Data Warehouse Reference Guide for SQL Server 2012

June 3, 2012

The newest batch of Microsoft White Papers was recently  released for SQL Server 2012. They should all be considered required reading. I know that I will read many of them several times.

Among the most important is the Fast Track Data Warehouse Reference Guide for SQL Server 2012.  The following comes from the article

Introduction

This document defines the component architecture and methodology for the SQL Server Fast Track Data Warehouse (FTDW) program. The result of this approach is the validation of a minimal Microsoft SQL Server database system architecture, including software and hardware, required to achieve and maintain a baseline level of out-of-box performance for many data warehousing workloads.

Audience

The target audience for this document consists of IT planners, architects, DBAs, and business intelligence (BI) users with an interest in choosing standard, proven system architectures for FTDW-conforming SQL Server workloads.

Fast Track Data Warehouse

The SQL Server Fast Track Data Warehouse initiative provides a basic methodology and concrete examples for the deployment of balanced hardware and database configuration for a data warehousing workload. For more information, see the FTDW Workload section of this document.

Balance is a measure of key system components of a SQL Server installation; storage, server, storage network, database, and operating system. Each of these components is tuned to optimal configuration. The goal is to achieve an efficient out-of-the-box balance between SQL Server data processing capability and hardware component resources. Ideally, your configuration will include minimum system hardware to satisfy storage and performance requirements for a data warehousing workload.

Fast Track

The SQL Server Fast Track brand identifies a component hardware configuration that conforms to the principles of the FTDW reference architecture (FTRA). Each FTRA is defined by a workload and a core set of configuration, validation, and database best practice guidelines. The following are key principles of the Fast Track program:

  • Workload-specific benchmarks. System design and configuration are based on real concurrent query workloads.
  • Detailed and validated hardware component specifications.
  • Component architecture balance between database capability and key hardware resources.

Value Proposition

The following principles create the foundation of the FTDW value proposition:

  • Predetermined balance across key system components. This minimizes the risk of overspending for CPU or storage resources that will never be realized at the application level.
  • Predictable out-of-the-box performance. Fast Track configurations are built to capacity that already matches the capabilities of the SQL Server application for a selected server and workload.
  • Workload-centric. Rather than being a one-size-fits-all approach to database configuration, the FTDW approach is aligned specifically with a data warehouse use case.

Methodology

Holistic Component Architecture

SQL Server FTDW reference architectures provide a practical framework for balancing the complex relationships between key components of database system architecture. Referred to generically as a stack, the component architecture is illustrated in Figure 1.

image

Figure 1: Example Fast Track database component architecture

Each component of the stack is a link in a chain of operations necessary to process data in SQL Server. Evaluating the stack as an integrated system enables benchmarking that establishes real bandwidth for each component. This ensures that individual components provide sufficient throughput to match the capabilities of the SQL Server application for the prescribed stack.

Workload Optimized Approach

Different database application workloads can require very different component architectures to achieve optimal resource balance. A classic example of this can be found in the contrast between small request, lookup-based online transaction processing (OLTP) workloads and scan-intensive, large-request, analytical data warehousing. OLTP use cases are heavily indexed to support low latency retrieval of small numbers of rows from data sets that often have little historical data volume. These types of database operations induce significant disk head movement and generate classic random I/O scan patterns. Analytical use cases, such as data warehousing, can involve much larger data requests and benefit greatly from the increased total throughput potential of sequential disk scans.

For these contrasting use cases, the implications for a balanced component stack are significant. Average, per-disk random I/O scan rates for modern SAS disk drives can be a factor of 10 times slower when compared to sequential scan rates for the same hardware. With Fast Track data warehousing workloads an emphasis is placed on achieving consistently high I/O scan rates (measured in MB/s) rather than the more traditional focus on operations per second (measured in IOPS).

The challenge of very different workloads is addressed by clearly defining the attributes of customer workloads. SQL Server Fast Track workloads comprise a qualitative list of attributes that uniquely define a common database application use case. In addition, each workload is represented by quantitative measures including standard benchmark queries. Workload-specific benchmarking is used to validate database configuration, best practices, and component hardware recommendations.

Validated SQL Server Fast Track Reference Configurations

All published Fast Track reference architectures are validated as conforming to the set of principles and guidelines provided in this reference guide. Examples of this process can be found in later sections of this document.