How to summarize your data using ROLLUP and CUBE in TSQL

We all know when reporiting any financial or sales information adding sub totals and grand totals to report makes it more effective but most offten develop0rs dont realize TSQL provides a effective way in doing this is using rollup and cube operators in group by clause.

So for this demo I am using Adventure works 2008 OLTP sample database

Three tables

  1. Production.Product
  2. Production.ProductCategory
  3. Production.ProductSubcategory

here is a simple querying joining three tables and grouping by product category and subcategory

select

pc.ProductCategoryID,

s.ProductSubcategoryID,

AVG(ListPrice)as Avglistprice,

MAX (ListPrice)as MAxlistprice,

MAX(StandardCost)as Maxstandardcost

from Production.Product p

inner join Production.ProductSubcategory s

on  p.ProductSubcategoryID = s.ProductSubcategoryID

inner join Production.ProductCategory pc

on  pc.ProductCategoryID = s.ProductCategoryID

group by pc.ProductCategoryID,s.ProductSubcategoryID

SAMPLE RESULT

You can see we are missing some things in the above result i.e  subtotals and grand totals

change group by cluase to include rollup operator like this

group by ROLLUP(pc.ProductCategoryID,s.ProductSubcategoryID)

trying running it with out errors

Result looks like this

CUBE is used as same way but it gives us a little different result , CUBE summarizes all the combinations of  columns in group by clause and it also gives us grandtotal which does not make sense for above example because each sub category exists once in each category.

Deep dive into statistics and why we need statistics

Statistics play an important role for the query optmizer to draw a optimizd execution plan , statistics are created against on indexed columns and and non indexed

column as default behaviour of sql server, but the point is how does sql server manage to update statistics when underlying data is updated are new data is inserted

lets see simple example when does sql server updates the statistics for indexed columns

SQL Server also updates the statistics

  • When a table has fewer than 500 rows and is increased by 500 or more rows
  • When a table has more than 500 rows and is increased by 500 rows + 20 percent of the number of rows

As a developer I was always curious to find when does sql server decides to for go Index seek or Index scan for a given table

DBCC SHOW_STATISTICS

How to enbale CLR integration mode in SQL Server

Common Language Runtime (CLR) integration is a powerful feature ofSQL Server. CLR integration was introduced in SQL Server 2005, and it has beenextended in SQL Server 2008. Prior to SQL Server 2005, it was necessary to create extended stored procedures (XP) using C++ to take advantage of functionality beyond that provided by SQL Server natively. Today, CLR integration provides you with the flexibility to define stored procedures, functions, triggers,
data types, and other database objects using either Transact-SQL or any .NET Framework language.The advantage of this approach is that you can choose the language more suited to the task at hand


sp_configure ‘show advanced options’, 1;
GO
RECONFIGURE
GO

sp_configure ‘clr enabled’,1;
go
RECONFIGURE
go

Ad hoc update to system catalogs is not supported.

Configuration option ‘show advanced options’ changed from 1 to 1. Run the RECONFIGURE statement to install.
Msg 5808, Level 16, State 1, Line 1
Ad hoc update to system catalogs is not supported.


to quick fix the above error use RECONFIGURE WITH OVERRIDE

Now, the reason why we are having that error is there is a setting called ‘allow updates’ which is a deprecated feature of SQL server 2000 , basically this allows the user to update the system tables directly in sql server although this feature is no longer available from SQL Server 2005 , but when this option is turned on or changed in SQL Sever 2005 or 2008 or R2, you will get an error message saying “Ad hoc update to system catalogs is not supported.” all though this feature is not supported then
why we are still receiving error ?, well when that option is changed, any “RECONFIGURE” is considered a side effect to the running server values but in this case if you use “WITH OVERRIDE” it worked as it overrides any warinings when we reconfigure

so do not use this feature

User defined functions SQL Server

In this post you will learn How to use
  • What is a user defined function (UDF)
  • Different types of UDF’s
  • Properties of UDF’s
  • How develop and execute UDF’s
    •  Advantages
    •  Limitations
  • How to use Scalar UDF’s in computed columns
  • What are Deterministic functions and Non Deterministic functions how do they effect UDF’s
  • How do they effect performance of the UDF’s
  • Creating indexes on the Computed columns
What is a UDF?
  • UDF(user defined function) is a prepared code that accepts parameters and performs complex calculations and returns value can be scalar or a result set
  • Depending on the what UDF returns UDF’s are categorized to two types
    • – Scalar Functions
    • – Table valued Functions
      • Inline Table Valued Functions
      • Multi statement Table Valued Functions
Benefits of UDF’s
  • Once you create the function then you can, store it in the database, and call it any number of times in your code
  • Faster execution (No re parsing, No reoptimization) by reusing the execution plans
  • Ultimately reducing the network traffic

In case of UDF it is more important to know their limitations rather than their Benefits, the difference between UDF’s and Stored procedures is very less so you have to know the limitations which will help you to  decide how to write your code

  1. Only functions which performs in Deterministic manner can be used create indexed views or creating indexes on the computed columns.
  2. No insert or update or delete statements except for table variables
  3. All the statements which effects the state of the database are not allowed in the function body
  4. You can not call stored procedures in the database only extended stored procedures
  5. A function can have a maximum of 2,100 input parameters
  6. 6.Only one output parameter in case of scalar UDF and single result set in case of In line or multi statement UDF’s.
  7. UDF’s can accept and return any data types except TIMESTAMP data type.
  8. No dynamic SQL, For xml is not allowed.
  9. Does not support error handling Ex:@@ERROR_NUMBER(), No TRY..CATCH

Now lets dissect this

For more details

>How to use Scalar Udf’s in computed columns
>What are Deterministic functions and Non Deterministic functions how do they effect UDF’s
>How do they effect performance of the UDF’s
>Creating indexes on the Computed columns

Please use the below video

System Functions – $partition

Returns the partition number into which a set of partitioning column values would be mapped for any specified partition function.

Syntax

[ database_name. ] $PARTITION.partition_function_name(expression)

It returns a Partition number of column value when mapped to a partition function

this will be useful if you want find the rows in a particular partition

so lets explain this with a simple example

 

create partition function

 

 

 

 

 

 

 

 

 

 

 

now lets create a partitioned table

 

 

 

 

 

insert some sample data

now lets see some examples on $partition function