Posts Tagged ‘DQS’

DQS Composite Domains

June 16, 2013

Update:

I  felt a little too ambitious and actually recorded a series of 3 videos at MSBI Academy which cover all the content of this post. The videos also give a little bit more back ground information about DQS for those who have not used it yet.

1. Creating a new Data Quality Services (DQS) Domain

2. Adding Business Rules to a Data Quality Services (DQS) Domain

3. Enriching data using a Data Quality Services Project

Introduction:

Data Quality Services (DQS) is a great addition to SQL Server 2012. There were many times in the past were I had wished there was a tool which allowed Data Stewards to clean , validate, and preform matching or de-duplication on the data before it went into the Data Warehouse. In my opinion there are only two kinds of companies: those that don’t recognize that they have data quality issues and those that realize that they have data quality issues! Unfortunately, most organizations don’t find out that they have data quality issues until they attempt a data warehouse project and then those issues become all too apparent.

I am not going to go into all the functional details of DQS. (See above link for great training materials on the DQS team website) However, I want to focus on one of the most important features of DQS that seems to be misunderstood.

The ability to create Composite Domains for data correction or validation is one of the most critical features of DQS and would be required in virtually any DQS solution. However, creating Composite Domain rules in the DQS client UI can be a little tricky if you don’t know what to look for. (At least it was for me hence the reason for this post)

The following example uses the AdventureWorks Sample that is available on the DQS Team website. I suggest that you watch their training videos so that you have a good understanding of what I will describe in the next section.

Scenario:

You want to create Composite Domain rules that populates Title based on the values in Marital Status and Gender such that;

when Gender = ‘M’ male

– then Title = ‘Mr.’, or

when Marital Status = ‘S’ single and Gender = ‘F’ female

– then Title = ‘Ms.’ , or

Marital Status = ‘M’ single and Gender = ‘F’ male

– then Title = ‘Mrs.’

clip_image001

To accomplish this first you need to create all the individual domains and then create the ‘Title Enrichment’ Composite Domain. Next the Tile Enrichment Composite Domain needs a few rules to correctly populate the Title domain/column.

For the Single Female Title rule the first part of the rule or the test condition needs to be created. This is similar to the “IF” of the “IF…THEN…” syntax.

In the following example shows: if Marital Status = ‘S’ then Title = ‘Ms.’

clip_image003

Everything looks good at this point. Now all that is needed is to add an additional condition to the ‘IF’ to test when Gender = ‘F’. However, when you click on the “Adds new condition to the selected clause” button, a new clause is created for the Marital Status domain.

clip_image005

This not what is needed because the Gender domain needs to be used. It would appear that DQS is missing the very important ability to have multiple domains be tested in the same Composite Domain Rule.

The key to creating this type of Composite Domain Rule is to ensure that that no part of the current condition (“Martial Status Value is equal to S”) is in focus when you click on the “Add new condition to the selected clause”. You will know when the current condition is in focus because the vertical line will be solid blue or a blue box will around a part of the condition. See below:

clip_image007

What is required is to; 1. Click away for the part of the current condition before 2. You click on the “Add new condition to the selected clause”

clip_image008

If you do that then you can create the required condition using the Gender domain

clip_image009

Now when a project is run with this Composite Domain and the data in the Title column will have the correct values:

clip_image011