Posts Tagged ‘Data Warehousing’

SQL Server 2016

May 13, 2015

Last week SQL Server 2016 was announced on the SQL Server BLOG. While there are several new features mentioned on the SQL Server Fact Sheet.   I would like to focus on just those that effect Data Warehousing and those that indicate a new direction for SQL Server towards “Advance Analytics”.

Many of these features are covered and demonstrated in the Ignite (TechEd) video New Features SQL Server 2016.

Real-Time Operational Analytics & In-Memory OLTP

columnstore

As implied by the name, Non-Clustered Columnstore Indexes have now been added to In-Memory OLTP (a.k.a. Hekaton). In the Ignite video it mentions that the addition of the report queries (Non-Clustered Indexes) to the In-Memory table adds little to “no cost to the system”. While this may seems like a farfetched claim to those who have actually had to performance tune an OLTP application that is getting bogged down by reporting queries, It really is not all that unbelievable.

The reason why reads (report queries) do not block OLTP application DML operations (Updates, Deletes, and Inserts) and visa-versa is because the In-Memory OLTP structure does not use memory latches to maintain data consistence.

In-Memory OLTP (In-Memory Optimization) – “The In-Memory OLTP engine is designed for extremely high session concurrency for OLTP type of transactions driven from a highly scaled-out middle-tier. To achieve this, it uses latch-free data structures and optimistic, multi-version concurrency control. The result is predictable, sub-millisecond low latency and high throughput with linear scaling for database transactions.”

This means that an application can support near Real Time Operational Reporting without impacting the application’s performance. In others words you *may* not have to build an ODS with say Replication/AlwaysOn Availability Groups feeding data to the ODS.

*** Provided that it fits in memory

However, this does not mean that you can get away with not building a data warehouse. You will still need a data warehouse to do all the things that a data warehouse do so well  i.e. consolidate, clean and conform data from multiple source systems. Most data warehouses are far too large to fit into memory.

Built-in Advanced Analytics

advanced analytics

Basically “Advanced Analytics” is SQL Server integrated with “R” packages. While I think that SQL Server integrated with “R” is a great step (I have been using R ever since I was introduced it in my college stat classes)  There are a lot more things that can be done with R. Just browse CRAN to get an idea.

I hope that BI professionals don’t lose sight of all is possible with good old SQL.

For example in Itzik’s latest book T-SQL Querying (Developer Reference)  there is an entire chapter devoted to BI. Just check out the topics covered:

Frequencies, Descriptive statistics for continuous variables, chi-squared, analysis of variance, moving averages and entropy, histograms etc.

Polybase

SQL Server and Hadoop integration. After working with Polybase in PDW/APS I am happy that it is now making its way into SQL Server SMP. With Polybase a Data Warehouse can now use Hadoop (HDFS) as a data source and/or a data archive.

Azure SQL Data Warehouse

APS

James Serra on his BLOG provides some valuable information about Azure SQL Data Warehouse which is basically PDW/APS in the cloud.

One of the most interesting things about Azure SQL Data Warehouse is that is can be integrated with Hadoop and Azure Machine Learning/“R” to produce some very interesting solutions. See: Azure SQL Data Warehouse: Deep Dive

Advertisements