Data Cube: A Relational Aggregation Operator : Microsoft research paper

Very nice article on group by clause , how it works and what are its limitations and how can we overcome with a CUBE operator

Data analysis applications typically aggregate data across many dimensions looking for unusual patterns. The SQL aggregate functions and the GROUP BY operator produce zero-dimensional or 1-dimensional answers. Applications need the N-dimensional generalization of these operators. This paper defines that operator, called the data cube, or simply cube. The cube operator generalizes the histogram, cross-tabulation, drill-down, and sub-total constructs found in most report writers. The cube treats each of the N aggregation attributes as a dimension of N-space. The aggregate of a particular set of attribute values is a point in this space. The set of points form an N-dimensional cube. Super-aggregates are computed by aggregating the N-cube to lower dimensional spaces. Aggregation points are represented by an “infinite value”, ALL. For example, the point would represent the global sum of all items. Each ALL value actually represents the set of values contributing to that aggregation

Source :


Download PDF file by clicking below link

MIcrosoft cube research



How to achieve Initcap functionality in SSIS , and how to use FINDSTRING(),


FINDSTRING() works as the same as CHARINDEX() in TSQL

it returns the position of character or the string in side the string that we want to search

Syntax : FINDSTRING( “search string”,”searching term”,occurrence)

to better understand we will use a simple example in validating a email address, so for the email to be valid it needs to have at least one @ character

in the email address , if not its considered as not a valid address

As we learned how to use FINDSTRING() now we will use this functionality to get the initcap functionality in SSIS , Initcap is a cool feature in oracle when this function used,

The first letter of each word into uppercase, all other letters in lowercase. Words are delimited by white space or characters that are not alphanumeric

This is how it is used in Oracle example

The following example capitalizes each word in the string:

\\***** PL\SQl****\\\
SELECT INITCAP(‘the soap’) “Capitals” FROM DUAL;

The Soap
\\*** PL/SQl*****\\

Now we will achieve this in SSIS using FINDSTRING, TRIM, UPPER, and LOWER functions in derived columns

In the derived column we are using

(UPPER(SUBSTRING(TRIM(Name),1,1)) + TRIM(LOWER(SUBSTRING(TRIM(Name),2,FINDSTRING(TRIM(Name),” “,1) – 1 < 0 ? LEN(TRIM(Name)) : FINDSTRING(TRIM(Name)," ",1) – 1)))) + " " + (FINDSTRING(TRIM(Name)," ",1) – 1 < 0 ? " " : UPPER(SUBSTRING(TRIM(SUBSTRING(TRIM(Name),FINDSTRING(TRIM(Name)," ",1),LEN(TRIM(Name)))),1,1)) + TRIM(LOWER(SUBSTRING(TRIM(Name),FINDSTRING(TRIM(Name)," ",1) + 2,LEN(TRIM(Name))))))

The above expression looks big but it is very simple

In this we are taking the first letter of the word and making it upper case , if there is a empty in the string we are considering as a second word and then we are selecting the

first letter of the world to upper case and all other characters are mapped to lower case