Posts Tagged ‘Tabular’

Tabular and NUMA

July 30, 2014

Recently a great white paper Entitled “An Analysis Services Case Study: Using Tabular Models in a Large-scale Commercial Solution” was released which provides a lot of information that would be helpful when on is faced with evaluating the various Microsoft BI tools (SSAS Tabular, SSAS Multi-dimensional, and ColumnStore Indexes).

I will not reproduce the whole white paper on my BLOG but below are some quotes from the white paper that I found particularly interesting:

ColumnStore Indexes:

“Unpredictable Query speed. In SQL 2012, specific query plans can really take advantage of columnstores (which are an absolute need for the volume of data of inContact). However, as soon as the query became complex (for example, mixing SUM with DISTINCT COUNT in the same query) SQL Server stopped using the columnstore and reverted to row mode, leading to poor performance. It looked nearly impossible to produce a general-purpose query engine that gave enough freedom to users in expressing their query while retaining efficiency.”

While I too had problems with SQL Server 2012 Non-Clustered ColumnStore Indexes and the query optimizer picking the much slower “row mode” instead of “batch mode”, after using SQL Server 2014 Clustered ColumnStore Indexes for almost a year, I have not seen any major issues related to “batch mode”. As a matter of fact, I have even observed about a 15% – 30% performance improvement of Clustered ColumnStore Indexes over the non-clustered variety.

The section that by far was the most surprising to was the section entitled “The complexity and ubiquity of NUMA ” as it reminded me of something that I ran across a few years ago when I was testing the then brand new SSAS Tabular with NUMA servers.

AS the article clearly states:

“Most of the servers available on the market with enough expansion to handle terabytes of RAM are based on the Non Uniform Memory Access (NUMA) architecture. This is a big issue because Tabular is not NUMA aware.”

In other words any enterprise hardware will make use of NUMA nodes and so putting a none NUMA app on this kind of hardware would be the wrong approach. What then follows in the article is a good explanation of what NUMA is and why it is important. The final two paragraphs really hit home for me why SSAS Tabular, without a significant work around, is not enterprise worthy:

“As you can see, programmers need to optimize the software to run efficiently on NUMA nodes. You can’t simply use any software on NUMA and expect good performance. The problems of memory access speeds are compounded when the software relies heavily on RAM.

Unfortunately, not only is Tabular not NUMA aware, it also uses RAM as primary storage. Thus, the marriage between Tabular and NUMA is a not an easy one.”

While this is a bit of an understatement, it is now easy to find out if your server has multiple NUMA nodes by simply opening up Task Manager to the CPU tab. If your server has multiple NUMA nodes then you will see CPU usage for each node. See example below of an 8 NUMA node server:

numa nodes

 

The solution to the Tabular NUMA node problem is (assuming that it is possible) to break the data up into separate distributions according to the number of NUMA nodes on the server and then create an equal number of VMs, each with an instance of SSAS Tabular. See below:

“· Use Hyper-V VMs to set node affinity. Tabular might not be NUMA aware, but it runs great if you install a tabular instance on VM that is affinitized to a particular node. This approach has an added benefit; you don’t need to adjust memory configuration settings.

· Distinct count performs better in a Tabular implementation. This is a well-known fact about Tabular, but if you are new to the technology, it’s worth remembering. This is especially true if the alternative you are evaluating is SSAS Multidimensional.”

Advertisements

Installing SSAS 2012

May 6, 2012

I don’t usually write posts like this but there have been a few substantial changes to how SSAS 2012 is installed as compared to the previous versions.  The changes also underscore the changes that have occurred to SSAS with the introduction of the Tabular Model.

I am not going to take you through all the steps the. But when you get to the Analysis Services Configuration screen you can now choose to install SSAS in  the traditional Multidimensional Mode or Tabular Mode. You cannot install both at the same time but you must install them one at a time.

In the example below I choose to install the Tabular Model first:

image

I am not going to walk through the rest of the steps. If you want to install the Multidimensional you will have to re-run the installation again and choose the Multidimensional Mode. See the next three screen shots:

image

image

image

If you have run through the installation two times, one for each mode, then you should see something like the following in SSMS. Notice the two different instances:

image

The above only serves as an introduction and provides a context to some of the differences you will encounter when working with a Tabular project in SQL Server Data Tools (SSDT) aka the new BIDS.

When you first fire up SSDT (I will miss BIDS because it sounds so much smoother;) and create your first Tabular project see below..

image

You will be confronted with a strange new question:

image

You will be asked to choose a workspace server. What this means is that you must choose a previously installed instance of SSAS in Tabular mode. The workspace server will be used during development of the Tabular project in SSDT.  And in fact SSAS automatically creates a new database with the project’s name with a GUID appended to it.

image

This means in order to develop a Tabular project in SSDT you must have first installed a SSAS instance in Tabular mode. (Don’t be surprised if you see this on a test sometime see my linkedin profile). With that said  I like this change because unlike the old multidimensional project in BIDS which runs in 32 bit during debug, you can now develop a Tabular project in SSDT in 64 bit mode and then deploy it to your 64 bit server. This eliminates the headache of having deal with both 32 bit and 64 bit code when deploying a project.

Additional Resource:

Cathy Dumas: Configuring a workspace database server

SSAS Tabular Model – Query Modes

March 8, 2012

One of the most compelling features of the new SSAS Tabular Model is the ability to set the model’s Query Mode to In-memory (aka Vertipaq), DirectQuery or both!

QueryMode can be set to one of the following:

  • DirectQuery – This setting specifies all queries to the model should use the relational data source only.
  • DirectQuery with In-Memory – This setting specifies, by default, queries should be answered by using the relational source, unless otherwise specified in the connection string from the client.
  • In-Memory – This setting specifies queries should be answered by using the cache only.
  • In-Memory with DirectQuery – This setting specifies, by default. queries should be answered by using the cache, unless otherwise specified in the connection string from the client.

The hybrid modes give the best of both In-memory and DirectQuery but there are a few gotchas. One of the hybrid or DirectQuery modes must be set in 2 places.

Configuration: Set Query Mode at Project level:

image

Next you can edit the table partitions if you don’t like the defaults:

image

Some DAX functions and calculated columns are not yet supported in DirectQuery Mode. If You try to deploy/process the model you will get a errors explaining which objects need to be removed.

image

In the above example  the calculated column needs to be deleted before DirectQuery Mode can be enabled.

Summary:

Configuring Tabular model to use one of the Hybrid Query Modes will allow it to scale past the memory limitation of the server and when combined with Columnstore indexes in SQL server 2012 provides a compelling reason to use a Tabular Model for enterprise BI solutions. Although there are currently some limitations I would expect them to be resolved in the future.

Additionally making Columnstore indexes available in the next version of Microsoft’s Parallel Data Warehouse would be a huge performance boost to an already very powerful appliance. The next logical step would be to simply put a Tabular Model on top of it. All this is just dream right now but won’t it be nice!

Tabular Model

January 30, 2012

One of the most exciting features in SQL Server 2012 is the introduction of the Tabular Model. The Tabular model along with the multidimensional model comprise the Business Intelligence Symantec Model (BISM).

To be honest when the Tabular Model and the BISM where introduced along with DAX I was more than a little skeptical. It seemed like yet another addition to the already complex Microsoft BI stack.  It was presented as ‘individual’ or ‘Group’ BI that allowed users to quickly and easily develop their own models. But didn’t scale because it had to fit into memory and could not handle complex requirements like Parent-Child relationships financial calculations.

However, after playing around with it for a little and talking some of the leaders in that space I have completely changed my mind. I will address my initial concerns in a general manner below:

1. Scalability – While it true that the Tabular Model is an in-memory database system. The memory limitation can be overcome by using the DirectQuery query mode (or one of the “hybrid mode”) which passes the query to the SQL Database Engine. When combined with Columnstore indexes you can build BI solutions that scale to the enterprise level.

2. Complexity – Although DAX has not reached the level sophistication as MDX to able to handle complex requirements, the gap is closing and it is closing fast! Here are some good examples:

 Parent/Child Hierarchies

Unary Operators

In summary I am now convinced that Tabular and DAX are not going to complicate MS BI but simplify and allow me to more quickly and on a more iterative basis deliver solutions to my clients. I am excited about the future.

Here are some of the best resources that I have found:

Cathy Dumas BLOG – great info.

SQLBI: Marco and Alberto have tons of great DAX information.

DAX Editor: The DAX Editor for SQL Server an extension to Visual Studio 2010 that implements a language service for the DAX language. The DAX language is the expression language used for SQL Server Analysis Services tabular models. This extension can be used when editing DAX queries and also for writing DAX measures.

MDX and DAX formatter

DAX in the BI Tabular Model Whitepaper and Samples: This whitepaper and sample workbook introduce Data Analysis Expressions (DAX), a formula expression language used to define calculations in PowerPivot for Excel® workbooks and Analysis Services tabular model projects authored in SQL Server Data Tools.

PowerPivot-info: The 1 stop shop for all things DAX/PowerPivot/Tubular!

DAX Resource Center: This Wiki includes articles, whitepapers, videos, and samples published by both Microsoft as well as experts in the Business Intelligence community. Because this is a Wiki, you too can contribute. If you have some great information about DAX and how you use it in your organization, please share it!