Importing entire Active directory information using SSIS and .net framework 3.5 and above

Goal is to extract complete active directory information with some commonly used attributes using SSIS

There are other ways achieve this using Linked server , Microsoft OLE DB provider for directory services and native TSQL but the limitation is most of the users face is it can only retrieve 100o rows because of the default limitation set by the active directory and OLEDB provider doesnt allow us to over ride the page size option.

We can also use SQL Server CLR to achieve same goal but here there no limit of how many rows you can return in your result set

Last but not least using SSIS and here also there is no limit

We will be using Script component as source with .Net framework 3.5

source type

Source columns

set all the data types to string

AD code

result

Here is the code
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.Collections;
using System.Collections.Generic;
using System.DirectoryServices.AccountManagement;
using System.DirectoryServices;
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{

public override void PreExecute()
{
base.PreExecute();
/*
Add your code here for preprocessing or remove if not needed
*/
}

public override void PostExecute()
{
base.PostExecute();
/*
Add your code here for postprocessing or remove if not needed
You can set read/write variables here, for example:
Variables.MyIntVar = 100
*/
}

public override void CreateNewOutputRows()
{
string[] ACTIVE_DIRECTORY_DOMAINS = new string[] { “WEISCORP”, “WEISSTORE” };

foreach (string s in ACTIVE_DIRECTORY_DOMAINS)
{
using (PrincipalContext pc = new PrincipalContext(ContextType.Domain, s))
{
using (PrincipalSearcher sr = new PrincipalSearcher(new UserPrincipal(pc)))
{

PrincipalSearchResult<Principal> list = sr.FindAll();
foreach (Principal result in list)
{
DirectoryEntry de = (DirectoryEntry)result.GetUnderlyingObject();
Output0Buffer.AddRow();
Output0Buffer.AD = de.Properties[“samAccountName”].Value.ToString();
Output0Buffer.objectclass = de.Properties[“objectCategory”].Value.ToString();
Output0Buffer.domain = s;

}

}
}

}
}

}

Credits to Rory

Creating, querying, tuning hierarchical ROLAP cubes in Oracle

This post will explain how to  use  Oracle grouping functions to leverage ROLAP capability in Oracle

To begin with some introduction Multi dimensional data marts or an Enterprise data warehouse has become need than necessity as in the end data that drives the business

Having right staging area and presentation layer is very important, there are many Bi tools out there these tools helps the users to create multi dimensional OLAP cubes up on statging layer by MOLAP, ROLAP, HOLAP

indepth analysis of these storage methods are outside the scope of our discussion

ROLAP stands for Relational Online Analytical Processing

ROLAP features

  • ROLAP is considered to be more scalable in handling large data volumes, especially models with dimensions with very high cardinality (i.e., millions of members).
  • With a variety of data loading tools available, and the ability to fine tune the ETL code to the particular data model, load times are generally much shorter than with the automated MOLAP loads.
  • The data are stored in a standard relational database and can be accessed by any SQL reporting tool (the tool does not have to be an OLAP tool).
  • ROLAP tools are better at handling non-aggregatable facts (e.g., textual descriptions). MOLAP tools tend to suffer from slow performance when querying these elements.
  • By decoupling the data storage from the multi-dimensional model, it is possible to successfully model data that would not otherwise fit into a strict dimensional model.
  • The ROLAP approach can leverage database authorization controls such as row-level security, whereby the query results are filtered depending on preset criteria applied, for example, to a given user or group of users (SQL WHERE clause).

ROLAP simplifies the report and analysis development by 100 times when compared to MOLAP

Oracle Provides following grouping functions

  • ROLLUP
  • CUBE
  • GROUPING SETS

Rolap cube over view

In order to understand we will take a simple example a online retail bike selling we will have simple multidimentional data model with

Dimensions

  • dimgeog(loacation)
  • dimdate(Date)
  • dimproduct(Product)

Fact

  • Sales

oracle cube

lets look at the Dimensional model at staging layer

DATA MODELS

ROLLUP enables us to calculate multiple levels of subtotals across a specified group of dimensions(Hierarchy) , allowing us to create subtotals for different levels for hierarchy’s  in dimensions

ROLLUP creates subtotals for n+1 levels of subtotals where n is the number of grouping columns

Simple Rollup

Simple ROLAP example

rollup examples1

How ROLLUP works

rollup examples2

Understanding NULLS in ROLLUP , Readability and Querying, controlling grouping

NULLS returned by GORUP BY extensions are not always always NULL, a NULL indicate that its a subtotal, in order to differentiate against a traditional NULL value

Oracle provides two functions GROUPING and GROUPING_ID

GROUPING(Column name)  will result 1 if a row is subtotaled on that given column name  and returns 0 if its null value or if row is not subtotaled

Using DECODE and GROUPING  functions we can rewrite above query for better readability

rollup examples3

rollup examples4

In order to build a complete cube we have to also need to include other dimensions as we discussed above geog, product, So we will add to these aggregation levels to our existing query

rollup examples5

SELECT DECODE (grouping(dt.fiscalyear),1,’All years’,dt.fiscalyear) AS YEAR,
DECODE (grouping(dt.fiscalquarter),1,’All Quarters’ ,dt.fiscalquarter) AS quarter,

DECODE (grouping(dt.fiscalsemester),1,’All Semesters’ ,dt.fiscalsemester) AS semister,

DECODE (grouping(dt.weeknumberofyear),1,’All Weeks’ ,dt.weeknumberofyear) AS weeknmyr,
DECODE (grouping(dg.englishcountryregionname),1,’All’,dg.englishcountryregionname) AS country,
DECODE (grouping(dg.stateprovincename),1,’All State”s’ ,dg.stateprovincename) AS state,
DECODE (grouping(dg.city),1,’All City”s’ ,dg.city) AS city,
DECODE (grouping(dp.englishproductname ),1,’All Products’,dp.englishproductname) AS product,
DECODE (grouping(prsubcat.englishproductsubcategoryname),1,’All Categories’
, prsubcat.englishproductsubcategoryname) AS productsubcat,
SUM( fs.salesamount) AS sales,
GROUPING_ID(fiscalyear, dt.fiscalquarter,dt.fiscalsemester
,dt.weeknumberofyear,englishcountryregionname,stateprovincename,city) AS GROUPID
FROM factsales fs
INNER JOIN dimdate dt
ON fs.orderdatekey = dt.datekey
INNER JOIN dimgeog dg
ON dg.geographykey = fs.salesterritorykey
INNER JOIN dimproduct dp
ON dp.productkey = fs.productkey
INNER JOIN dimproductsubcat prsubcat
ON prsubcat.productsubcategorykey = dp.productsubcategorykey
WHERE dp.englishproductname IN (‘Road Tire Tube’)
GROUP BY rollup(fiscalyear, dt.fiscalquarter,dt.fiscalsemester,dt.weeknumberofyear),
ROLLUP( prsubcat.englishproductsubcategoryname,dp.englishproductname),
ROLLUP(englishcountryregionname,stateprovincename,city);

Now we have the query we have build a cube using Materialized views

First step is to build materialized views logs

Second build materialized view

Script can be download from here

https://www.dropbox.com/s/9eljxv6p91s8hkz/materialize%20views.sql

oralce materialized view

Now we have the ROLAP cube ready we can query it using native sql

Querying ROLAP cube can be leveraged using GROUPING_ID output

ROLAP cube 1

Using grouping_id to query to ROLAP cube

Q) Retrieve total sales for all years  and country “Australia”, all city’s all states  for “Road Tire Tube”

To find group id

GROUPING_ID(FISCALYEAR, FISCALQUARTER, FISCALSEMESTER, WEEKNUMBER,COUNTRY, STATE, CITY,PRODUCTSUBCATEGORY, PRODUCTNAME)

1                                 1                                          1                                     1                    0                     1             1                         0                                               1

groupid query 1

Q)  Total Sales in 2008 , for Australia  and state New South Wales all city’s

groupid query 2

GROUPING SETS 

Grouping sets allow selectively group in to sets and can also be used with ROLLUP to customize result

grouping sets

grouping sets equavalent

Column information of all the tables in a database (SqlServer2008 R2, MySQL, Oracle)

 

Different RDBMS provides system views with information about all the columns and their definitions in the database tables 

 

SQLServer :  SYS.COLUMNS (Microsoft) 

                   INFORMATION_SCHEMA.COLUMNS (ANSI SQL)

Oracle       : ALL_TAB_COLUMNS (ANSI SQL)

My Sql       : INFORMATION_SCHEMA.COLUMNS (ANSI SQL)

 

Microsoft SQL server’s INFORMATION_SCHEMA.COLUMNS is equivalent to ALL_TAB_COLUMNS in oracle and INFORMATION_SCHEMA.COLUMNS in Mysql 

ALL_TAB_COLUMNS in oracle  is equivalent to  Microsoft SQL server’s INFORMATION_SCHEMA.COLUMNS and INFORMATION_SCHEMA.COLUMNS in Mysql 

INFORMATION_SCHEMA.COLUMNS in Mysql is equivalent to  ALL_TAB_COLUMNS in oracle  Microsoft SQL server’s INFORMATION_SCHEMA.COLUMNS and 

 

SQL Server 

select *  from INFORMATION_SCHEMA.COLUMNS

Oracle

select * from ALL_TAB_COLUMNS

 

My SQL 

select *  from INFORMATION_SCHEMA.COLUMNS

ISNULL , COALESCE in SQLSERVER, NVL in ORACLE, IFNULL() in MYSQL

as a developer and writing querys you offtn encounter in a situation where you want to replace a NULL in your result every RDBMS has its own system function to handle this today we will go through these functions in detail

We will start with SQL Server

In sql Server we have to functions one is ISNULL and other COALESCE

ISNULL is a bult in TSQL function develped by Microsoft , it replaces null value with scpecified value

ISNULL(Check expression (Column/variable/value), replacement expression(Column/Variable/value))

The replacement expression should be impilictly convertable to check expression data type, SQl server implicitly converts

replacement value to the data type of Check expression and returns the value if check expression is null

If we look at new column’s defnition it is a char datatype

However if you switch our replacement expression with check expression it throws an error saying

Msg 245, Level 16, State 1, Line 7

Conversion failed when converting the varchar value ‘a’ to data type int.

Which is not possible according to sqlserver

Sql server data type conversion chart

http://msdn.microsoft.com/en-us/library/ms191530.aspx

It can only have two input expressions

Where as if we look at COALESCE it takes multiple expressions and returns first not null value

The one big dfference between ISNULL and COALESCE is the return data type,

When using ISNULL if check expression is evaluated to NULL and return expression is NOT NULL and is different data type that of check expression SQL Server implictly converts return expression to the data type of check expressions

Where as COALESCE well it returns the first not null value but the data type it returns depends on the SQL Server data type precedence

http://msdn.microsoft.com/en-us/library/ms190309.aspx

Example:

SQL server data type precedence

http://msdn.microsoft.com/en-us/library/ms190309.aspx

about COALESCE

http://msdn.microsoft.com/en-us/library/ms190349(v=sql.105).aspx

NVL equavalent to ISNULL in SQLSERVER

NVL in ORACLE

NVL in Oracle works excatly as ISNULL in sql server

NVL (exp1 , exp2)

Exp1 is check expression and Exp2 is a replacement value

if Exp1 is NULL it returns Exp2 else it returns Exp1

Exp1 and Exp2 might not be same datatype just like in sql server but it should be implicitly convertble  to the datatype of Exp1

 

But this implicit conversion is done in a smarter way when compared to isnull

  • If expr1 is character data, then Oracle Database converts expr2 to the datatype of expr1 before comparing them and returns VARCHAR2 in the character set of expr1.
  • If expr1 is numeric, then Oracle determines which argument has the highest numeric precedence, implicitly converts the other argument to that datatype, and returns that datatype.

Numeric Precedence

Numeric precedence determines, for operations that support numeric datatypes, the datatype Oracle uses if the arguments to the operation have different datatypes. BINARY_DOUBLE has the highest numeric precedence, followed by BINARY_FLOAT, and finally by NUMBER. Therefore, in any operation on multiple numeric values:

  • If any of the operands is BINARY_DOUBLE, then Oracle attempts to convert all the operands implicitly to BINARY_DOUBLE before performing the operation.
  • If none of the operands is BINARY_DOUBLE but any of the operands is BINARY_FLOAT, then Oracle attempts to convert all the operands implicitly to BINARY_FLOAT before performing the operation.
  • Otherwise, Oracle attempts to convert all the operands to NUMBER before performing the operation.

 

 

Table 2-10

 

 

 

Implicit Type Conversion Matrix

 

Table 2-10 Implicit Type Conversion Matrix

CHAR VARCHAR2 NCHAR NVARCHAR2 DATE DATETIME/INTERVAL NUMBER BINARY_FLOAT BINARY_DOUBLE LONG RAW ROWID CLOB BLOB NCLOB
CHAR X X X X X X X X X X X X X
VARCHAR2 X X X X X X X X X X X X X
NCHAR X X X X X X X X X X X X X
NVARCHAR2 X X X X X X X X X X X X X
DATE X X X X
DATETIME/ INTERVAL X X X X X
NUMBER X X X X X X
BINARY_FLOAT X X X X X X
BINARY_DOUBLE X X X X X X
LONG X X X X X X X X
RAW X X X X X X
ROWID X X X
CLOB X X X X X X
BLOB X
NCLOB X X X X X X

Oracle Implicit Conversion Matrix

IFNULLI() in MY SQL

 

IFNULL() is equavalent to NVL() in oracle and ISNULL() in Microsoft Sql server

 

 

Data Cube: A Relational Aggregation Operator : Microsoft research paper

Very nice article on group by clause , how it works and what are its limitations and how can we overcome with a CUBE operator

Data analysis applications typically aggregate data across many dimensions looking for unusual patterns. The SQL aggregate functions and the GROUP BY operator produce zero-dimensional or 1-dimensional answers. Applications need the N-dimensional generalization of these operators. This paper defines that operator, called the data cube, or simply cube. The cube operator generalizes the histogram, cross-tabulation, drill-down, and sub-total constructs found in most report writers. The cube treats each of the N aggregation attributes as a dimension of N-space. The aggregate of a particular set of attribute values is a point in this space. The set of points form an N-dimensional cube. Super-aggregates are computed by aggregating the N-cube to lower dimensional spaces. Aggregation points are represented by an “infinite value”, ALL. For example, the point would represent the global sum of all items. Each ALL value actually represents the set of values contributing to that aggregation

Source :

http://research.microsoft.com/apps/pubs/default.aspx?id=69578

 

Download PDF file by clicking below link

MIcrosoft cube research