SQL server 2012 ,Introduction to Reporting Services 2012 – Chart Series: Areas

Data Visualization plays an important role in the BI, as we know the Reporting services is one of the popular BI reporting tool in the Microsoft BI stack

In this I will demonstrate how to use charts in SQL server reporting services , specially we are going to focus on Area charts in this post

The most important thing in using the different charts in reporting services is that you have to know when to use right type of charts, as same information can be represented with  different visual elements so all it matters is using the right visual element in order to clearly expose the data characteristics to the business users

Area charts can be used for  Time series relationships and Distribution relationships

Time series relationships : A series of data points taken at equal intervals of time , this is the most used BI chat types, eg : sales over certain period of time

Distribution relationships: When a set of values is displayed across their entire range, this is especially used for measuring process efficiency, like no days, products complete

now you learnt little bit about charts we will see different types of Area charts

Area, Smooth Area, Stacked Area, 100% Stacked Area, 3-D Area, 3-D Smooth Area, 3-D Stacked Area, 3-D 100% Stacked Area

This is the preview of modified report

Customizing report using report properties

ASCII() , CHAR() — String Functions

Before we go ahead about these functions first I will introduce little bit about ASCII

ASCII stands for American Standard Code for Information Interchange. Computers can only understand numbers, so an ASCII code is the numerical representation of a character such as ‘a’ or ‘@’ or an action of some sort. ASCII was developed a long time ago and now the non-printing characters are rarely used for their original purpose

These are divided into three categories listed below in the table

ASCII control characters (character code 0-31)

The first 32 characters in the ASCII-table are unprintable control codes and are used to control peripherals such as printers.

ASCII printable characters (character code 32-127)

Codes 32-127 are common for all the different variations of the ASCII table, they are called printable characters, represent letters, digits, punctuation marks, and a few miscellaneous symbols. You will find almost every character on your keyboard. Character 127 represents the command DEL.

 

 

 

 

 

 

 

 

 

The extended ASCII codes (character code 128-255)

There are several different variations of the 8-bit ASCII table. The table below is according to ISO 8859-1, also called ISO Latin-1. Codes 129-159 contain the Microsoft® Windows Latin-1 extended characters.

 

 

 

 

 

 

 

 

 

 

So the ASCII function in TSQL returns the (ASCII code)value of  the first character of string expression

Syntax ASCII( string expression)

Char function does this in reverse converts the ASCII code value  to an equivalent character

Syntax CHAR(integer expression)

Now we have the basics of the ASCII and CHAR let’s have some fun here

a simple task is we have to write a trigger which should not allow a  column in the  table to accept any invalid characters like “ã,ô,….” it should simply throw an error message saying invalid characters

if you look at the extended ascii codes table you will notice all the ASCII codes from  128 to 255  have invalid characters which we do not want them to be inserted in to the table

so now we have a table  DBO.PRODUCT_CATALOG and the column is the name

in this we will exact similar logic we just used in the above ASCII() , CHAR() demo

CODE:

create

TRIGGER PRODUCT_INVALIDCHAR ON dbo.PRODUCT_CATALOG INSTEAD OF INSERT,UPDATE

AS

BEGIN

DECLARE @POSITION INT= 0

DECLARE @STRING VARCHAR(100)=(SELECT NAME FROM inserted)

WHILE @POSITION <LEN(@STRING) BEGIN

if(ASCII(SUBSTRING(@STRING,@POSITION+1,@POSITION+2))>124

 AND

ASCII(SUBSTRING(@STRING,@POSITION+1,@POSITION+2))<= 255)

BEGIN

PRINT‘INVALID CHARACTER , PLEASE INSERT A VALID CAHRACTER’

RETURN

END

ELSE

BEGINSET @POSITION = @POSITION + 1

ENDEND

INSERT INTO dbo.PRODUCT_CATALOG

VALUES(@STRING)

END

http://pagead2.googlesyndication.com/pagead/js/adsbygoogle.js

(adsbygoogle = window.adsbygoogle || []).push({});

Inserting Records and Getting Identity in SSIS

Hi ,

When one of my friend asked about how can we catch a Identity value after inserting the rows in to table  having a Identity column and using that result in the same package

well there is no out of box solution in SSIS  once the data is fed in to the OLEDB destination or SQL server destination it becomes a impossible to find inserted Identity values (Unless a trigger is used on the destination table) for these components outputs are ERROR redirection, which we can not use

Problem :

During the Loading phase of ETL it is so common to have identity columns , but it becomes to tough to find the Identity values after inserting and using them in the same Data flow

Solutions :

There are many work around solutions to perform this which I will list them , I will try to demonstrate the one way which I like

1) Using trigger on the destination table catching the inserted values in another table

2) Using the OLEDB command and a stored procedure which uses Output clause to find all Identity values or we can use Identity_scope (problem: dataflow will be terminated which makes the procedure to be the control flow of data (which makes no sense of using SSIS))

Famous MVP Todd Mcdermid  tries to explain the same problem in his view point please take a look

http://toddmcdermid.blogspot.com/2011/08/inserting-records-and-getting-identity.html

3)add a unique number to a comment field to find the rows just inserted and in other data flow task use the unique number to process them as required

As it is a RBAR(row by agonizing row) which is time consuming and it degrades the package performance when the number of rows are more

4)  Using the Record-set destination and taking the rows back to the control flow , and using the for each loop enumerator and catch the inserted Identity columns and use the values for further processing in SSIS

lets look at the example

In this session we will also learn how to use  Recordset destination

How to perform cusrsor action in Dataflow

how to use Foreach loop (Foreach ADO enumerator)

how to transfer data between control flow and dataflow

so looks like nice information session on SSIS so please stay with me

Parallelism in ETL Process :: SSIS 2008 and SSIS 2012

Hi, Today we will talk about achieving  parallelism in ETL using SSIS 2008 and 2012 We can generalize this topic two ways is by
 
1) Using maximum concurrent executable s in SSIS control flow and data flow( easy way to achieve parallelism and not so dynamic everythng should be configured in design)
 
2)Partitioning the source and destination ( little bit complicated but most power full way to achieve parallelism, everything is dynamic)
    1) There is a setting in the SSIS which allows the user to specify the maximum number of executable’s it can execute in a given package Max Concurrent executable
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2)Partitioning the source and destination ( little bit complicated but most power full way to achieve parallelism, everything is dynamic) 
In order achieve parallelism in ETL
 
1) Source and destination should be partitioned according the Data load
 
2) Entire work load should be divided to individual tasks
3) Each task should be sub divided into multiple tasks
4) Single unit of task should not not depend up on the other task(dead locks, I/O waits..)
 
5) before using the maximum threshold CPU for parallelism proper testing should be done, so there are always resources available for system processes Some times Parallelism may degrade the performance
 
6) If work load is divided in to to many small tasks which may also degrade the performance (ex: Time spent by the Worker thread to wait for child threads may exceed single  execution of the process ) Declare variables  Conclusion: Parallelism can be save lot of time in ETL process if utilized properly

Sequences in SQL server 2012 – Implementing,Managing, Performance

Today we will  learn a new feature in SQL server 2012 – Sequences

What is Sequence and Why do we need in a Relational database?

Sequence is a linear progression of any numeric value with a specified interval , well sequence is not a new concept for RDMS , Oracle has been using this feature for a while and SQL server manges to use the same concept with IDENTITY column and on the other side Oracle does not have the IDENTITY column feature

So what is the advantage of using Sequence Number over Identity columns ?

  • The application requires a number before the insert into the table is made.
  • The application requires sharing a single series of numbers between multiple tables or multiple columns within a table.
  • The application must restart the number series when a specified number is reached. For example, after assigning values 1 through 10, the application starts assigning values 1 through 10 again.
  • The application requires sequence values to be sorted by another field. The NEXT VALUE FOR function can apply the OVER clause to the function call. The OVER clause guarantees that the values returned are generated in the order of the OVER clause’s ORDER BY clause.
  • An application requires multiple numbers to be assigned at the same time. For example, an application needs to reserve five sequential numbers. Requesting identity values could result in gaps in the series if other processes were simultaneously issued numbers. Calling sp_sequence_get_range can retrieve several numbers in the sequence at once.
  • You need to change the specification of the sequence, such as the increment value.

How Sequence works ?

How to create Sequences in SQL Server 2012

create sequence

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

How to use sequence

 

 

 

 

 

 

 

 

 

 

 

 

 

Using Sequences

update [dbo].[product]

set ProductID = next value for [dbo].[Masterproductid]

select name,minimum_value,maximum_value,current_value

from sys.sequences

where name = ‘Masterproductid’

select max(ProductID) as productid from dbo.product

TSQL

CREATE SEQUENCE [schema_name . ] sequence_name
[ AS [ built_in_integer_type | user-defined_integer_type ] ]
[ START WITH <constant> ]
[ INCREMENT BY <constant> ]
[ { MINVALUE [ <constant> ] } | { NO MINVALUE } ]
[ { MAXVALUE [ <constant> ] } | { NO MAXVALUE } ]
[ CYCLE | { NO CYCLE } ]
[ { CACHE [ <constant> ] } | { NO CACHE } ]
[ ; ]

Performance sequences

 

 

 

 

 

 

 

 

 

Limitations

  • Unlike identity columns, whose values cannot be changed, sequence values are not automatically protected after insertion into the table. To prevent sequence values from being changed, use an update trigger on the table to roll back changes.
  • Uniqueness is not automatically enforced for sequence values. The ability to reuse sequence values is by design. If sequence values in a table are required to be unique, create a unique index on the column. If sequence values in a table are required to be unique throughout a group of tables, create triggers to prevent duplicates caused by update statements or sequence number cycling.
  • The sequence object generates numbers according to its definition, but the sequence object does not control how the numbers are used. Sequence numbers inserted into a table can have gaps when a transaction is rolled back, when a sequence object is shared by multiple tables, or when sequence numbers are allocated without using them in tables. When created with the CACHE option, an unexpected shutdown, such as a power failure, can lose the sequence numbers in the cache.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

code

–@ SQL JUNKEI SHARE
–TABLE TO TEST THE PERFORMANCE OF SEQUENCES
CREATE TABLE [dbo].[SEQUENCEPRODUCT](
[ProductID] [int] NULL,
[Name] [nvarchar](50) NULL,

) ON [PRIMARY]

GO

ALTER TABLE [dbo].

[SEQUENCEPRODUCT] ADD CONSTRAINT [DEFSEQFORPRODUCT1] DEFAULT (NEXT VALUE FOR [MASTERPRODUCTID]) FOR [ProductID]
GO

ALTER SEQUENCE MASTERPRODUCTID

RESTART WITH 1;

–TABLE TO TEST PERFORMNACE OF IDENTITY
CREATE TABLE [dbo].[IDENTPRODUCT](
[ProductID] INT IDENTITY (1,1) ,
[Name] [nvarchar](50) NULL,

) ON [PRIMARY]

GO
CREATE TABLE SOURCEPRODUCT (NAME NVARCHAR(50))
–INSERT TEST DATA

GO
CREATE PROCEDURE INSERTIDENT
AS
BEGIN
INSERT INTO IDENTPRODUCT(NAME)
SELECT NAME FROM SOURCEPRODUCT
END
GO
CREATE PROCEDURE INSERTSEQUENCE
AS
BEGIN
INSERT INTO SEQUENCEPRODUCT(NAME)
SELECT NAME FROM SOURCEPRODUCT
END

SELECT COUNT(*) AS ‘NUMBER OF TEST ROWS’ FROM SOURCEPRODUCT
–4943950
DECLARE @STARTTIME DATETIME = GETDATE()
EXEC INSERTIDENT
SELECT DATEDIFF(SS,@STARTTIME,GETDATE()) AS ‘ENDTIME IDENTITY’
GO
DECLARE @STARTTIME DATETIME = GETDATE()
EXEC INSERTSEQUENCE
SELECT DATEDIFF(SS,@STARTTIME,GETDATE()) ‘ENDTIME SEQUENCE’
GO

— USING IN INSERT STATEMENT
INSERT INTO PRODUCT (PRODUCTID,NAME)

VALUES (NEXT VALUE FOR DBO.MASTERPRODUCTID,’TV’)
–USING IN SELECT WITH ORDER CLAUSE
SELECT NEXT VALUE FOR DBO.MASTERPRODUCTID OVER (ORDER BY NAME) AS ORDERPROD

FROM PRODUCT

–USING IN UPDATE
UPDATE DBO.PRODUCT

SET PRODUCTID = NEXT VALUE FOR DBO.MASTERPRODUCTID
–SEQUENCE IN TRANSACTIONS
—NO EFFECT ON SEQUENCE LIKE IDENTITY
SELECT CURRENT_VALUE FROM SYS.SEQUENCES WHERE NAME = ‘MASTERPRODUCTID’

BEGIN TRAN

INSERT INTO PRODUCT (PRODUCTID,NAME)

VALUES (NEXT VALUE FOR DBO.MASTERPRODUCTID,’TV’)
ROLLBACK TRAN

SELECT CURRENT_VALUE FROM SYS.SEQUENCES WHERE NAME = ‘MASTERPRODUCTID’

SELECT NAME FROM PRODUCT WHERE NAME = ‘TV’
–SQL JUNKIE SHARE

— REPLACING THE IDENTITY WITH SEQUENCES BY MODIFYING THE DEFAULT CONSTRAINT

—-CHECK CURRENT MAX VALUE IN PRODUCTS TABLE

SELECT MAX(PRODUCTID) AS MAXPRODUCTID FROM PRODUCT

— CURRENT VALUE FOR THE SEQUENCE

SELECT CURRENT_VALUE ‘BEFORE INSERT’ FROM SYS.SEQUENCES WHERE NAME =’MASTERPRODUCTID’
–MODIFY THE SEQUENCE TO STRAT FROM 504

ALTER SEQUENCE MASTERPRODUCTID

RESTART WITH 505;

–CHECK THE CURRENT VALUE FOR SEQUENCE

SELECT CURRENT_VALUE FROM SYS.SEQUENCES WHERE NAME =’MASTERPRODUCTID’

— MODIFY THE TABLE FOR DEFAULT CONSTRAINT

ALTER TABLE PRODUCT

ADD CONSTRAINT DEFSEQFORPRODUCT DEFAULT (NEXT VALUE FOR MASTERPRODUCTID) FOR PRODUCTID

–TEST IT BY INSERTING A VALUE
INSERT INTO PRODUCT (NAME)

VALUES (‘FOO’)

–CHECK PRODUCT AND SEQUENCES

SELECT PRODUCTID AS ‘PRODUCT ID AFTER INSERT’ FROM PRODUCT WHERE NAME = ‘FOO’

SELECT CURRENT_VALUE AS ‘CURRENTVALUE AFTER INSERT IN PRODUCT’ FROM SYS.SEQUENCES WHERE NAME =’MASTERPRODUCTID’

THROW statement TSQL 2012

Today,

We will go through the THROW error handling technique which Microsoft is introducing in SQL server 2012

at first we will be looking at RAISERROR () and will compare this with THROW and find out the difference and advantages , disadvantages

Highlights of RAISERROR()

  • RAISERROR allows you to throw an error based on either an error number or a message, and you can define the severity level and state of that error:
    1
    2
    3
    RAISERROR(50001, 16, 1);
    --or
    RAISERROR('Ooops', 16, 1);
  • If you call RAISERROR with an error number, that error number has to exist in sys.messages.
  • You can use error numbers between 13001 and 2147483647 (it cannot be 50000) with RAISERROR.

RAISERROR can return either:

  • A user-defined error message that has been created using the sp_addmessage system stored procedure. These are messages with a message number greater than 50000 that can be viewed in the sys.messages catalog view.
  • A message string specified in the RAISERROR statement.

RAISERROR can also:

  • Assign a specific error number, severity, and state.
  • Request that the error be logged in the Database Engine error log and the Microsoft Windows application log.
  • Substitute argument values into the message text, much like the C language printf_s function.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Drawbacks of RAISERROR()

  • One of the biggest draw back of the RAISERROR() is the in ability  of re throwing the error in catch block of TRY-CATCH  error handling technique introduced in SQL SERVER 2005

RAISERROR() in TRY-CATCH

 

BEGIN TRY
    -- RAISERROR with severity 11-19 will cause execution to
    -- jump to the CATCH block.
    RAISERROR ('Error raised in TRY block.', -- Message text.
               16, -- Severity.
               1 -- State.
               );
END TRY
BEGIN CATCH
    DECLARE @ErrorMessage NVARCHAR(4000);
    DECLARE @ErrorSeverity INT;
    DECLARE @ErrorState INT;

    SELECT
        @ErrorMessage = ERROR_MESSAGE(),
        @ErrorSeverity = ERROR_SEVERITY(),
        @ErrorState = ERROR_STATE();

    -- Use RAISERROR inside the CATCH block to return error
    -- information about the original error that caused
    -- execution to jump to the CATCH block.
    RAISERROR (@ErrorMessage, -- Message text.
               @ErrorSeverity, -- Severity.
               @ErrorState -- State.
               );
END CATCH;
   we lost whole point of catching with a fair level of complexity in the catch block

XACT_ABORT

As every T-SQL programmer worth his (or her) salt should know, an exception does not roll back a transaction by default

I.e. the following code would cause two rows to be inserted in the table


when using raise error to throw the same error lets see what happens

keeping these things in mind Microsoft has introduced a new technique to replace the RAISERROR()

THROW SYNTAX THROW

[ { error_number | @local_variable }, { message | @local_variable },     { state | @local_variable } ] [ ; ] error_number

Is a constant or variable that represents the exception. error_number is int and must be greater than or equal to 50000 and less than or equal to 2147483647.

message

Is an string or variable that describes the exception. message is nvarchar(2048).

state

Is a constant or variable between 0 and 255 that indicates the state to associate with the message. state is tinyint.

  • The statement before the THROW statement must be followed by the semicolon (;) statement terminator.
  • If a TRY…CATCH construct is not available, the session is ended. The line number and procedure where the exception is raised are set. The severity is set to 16.
  • If the THROW statement is specified without parameters, it must appear inside a CATCH block. This causes the caught exception to be raised. Any error that occurs in a THROW statement causes the statement batch to be ended.

Three Main Differences Between RAISERROR and THROW

The following table lists differences between the RAISERROR and THROW statements.

RAISERROR statement THROW statement
If a msg_id is passed to RAISERROR, the ID must be defined in sys.messages. The error_number parameter does not have to be defined in sys.messages.
The msg_str parameter can contain printf formatting styles. The message parameter does not accept printf style formatting.
The severity parameter specifies the severity of the exception. There is no severity parameter. The exception severity is always set to 16.

as we can see this how error can be re thrown in try catch block One good feature that I like in RAISERROR is Custom formatting the error message as shown below the simple one with substitution arguments please use the below link http://msdn.microsoft.com/en-us/library/ms178592(v=SQL.110).aspx

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

References:

http://msdn.microsoft.com/en-us/library/ms186788(v=SQL.110).aspx

http://msdn.microsoft.com/en-us/library/ms177497(v=SQL.105).aspx

http://msdn.microsoft.com/en-us/library/ms189583.aspx

SSIS package logging – Custom logging 2008 R2 and 2012

Logging is an important part of SSIS package development , I personaly prefer to have logging in all production level packages

Logging can be used as more than debugging errors in a scheduled package execution environment but also performs the audit functions it needs and we can also measure

performance, Out of the box SSIS 2008 and SSIS 2012 have some nice logging capabilities

The following table describes the predefined events that can be enabled to write log entries when run-time events occur. These log entries apply to executables, the package, and the tasks and containers that the package includes. The name of the log entry is the same as the name of the run-time event that was raised and caused the log entry to be written.

Events Description
OnError Writes a log entry when an error occurs.
OnExecStatusChanged Writes a log entry when the execution status of the executable changes.
OnInformation Writes a log entry during the validation and execution of an executable to report information.
OnPostExecute Writes a log entry immediately after the executable has finished running.
OnPostValidate Writes a log entry when the validation of the executable finishes.
OnPreExecute Writes a log entry immediately before the executable runs.
OnPreValidate Writes a log entry when the validation of the executable starts.
OnProgress Writes a log entry when measurable progress is made by the executable.
OnQueryCancel Writes a log entry at any juncture in the task processing where it is feasible to cancel execution.
OnTaskFailed Writes a log entry when a task fails.
OnVariableValueChanged Writes a log entry when the value of a variable changes.
OnWarning Writes a log entry when a warning occurs.
PipelineComponentTime For each data flow component, writes a log entry for each phase of validation and execution. The log entry specifies the processing time for each phase.
Diagnostic Writes a log entry that provides diagnostic information.For example, you can log a message before and after every call to an external data provider

there are some custom logging messages to some of the control flow tasks and data flow tasks

Data Flow Task

The following table lists the custom log entries for the Data Flow task.

Log entry Description
BufferSizeTuning Indicates that the Data Flow task changed the size of the buffer. The log entry describes the reasons for the size change and lists the temporary new buffer size.
OnPipelinePostEndOfRowset Denotes that a component has been given its end-of-rowset signal, which is set by the last call of the ProcessInput method. An entry is written for each component in the data flow that processes input. The entry includes the name of the component.
OnPipelinePostPrimeOutput Indicates that the component has completed its last call to the PrimeOutput method. Depending on the data flow, multiple log entries may be written. If the component is a source, this means that the component has finished processing rows.
OnPipelinePreEndOfRowset Indicates that a component is about to receive its end-of-rowset signal, which is set by the last call of the ProcessInput method. An entry is written for each component in the data flow that processes input. The entry includes the name of the component.
OnPipelinePrePrimeOutput Indicates that the component is about to receive its call from the PrimeOutput method. Depending on the data flow, multiple log entries may be written.
OnPipelineRowsSent Reports the number of rows provided to a component input by a call to the ProcessInput method. The log entry includes the component name.
PipelineBufferLeak Provides information about any component that kept buffers alive after the buffer manager goes away. This means that buffers resources were not released and may cause memory leaks. The log entry provides the name of the component and the ID of the buffer.
PipelineExecutionPlan Reports the execution plan of the data flow. It provides information about how buffers will be sent to components. This information, in combination with the PipelineExecutionTrees entry, describes what is occurring in the task.
PipelineExecutionTrees Reports the execution trees of the layout in the data flow. The scheduler of the data flow engine use the trees to build the execution plan of the data flow.
PipelineInitialization Provides initialization information about the task. This information includes the directories to use for temporary storage of BLOB data, the default buffer size, and the number of rows in a buffer. Depending on the configuration of the Data Flow task, multiple log entries may be written.

for example in a data flow task you can also log some performance counters to measure the buffer allocation and buffer leakage during the transformations

to know more about custom logging messages in SSIS click the link below

http://msdn.microsoft.com/en-us/library/ms345174.aspx

Why do we need custom logging using event handlers?

Some times it becomes overhead by using the default log providers and it often ends up using the event handlers and system variables to log some key information which some times difficult to analyze using default logging

in order to perform the logging using event handlers we need to better understand the system variables and information they can provide


The following table describes the system variables that Integration Services provides for packages.

System variable Data type Description
CancelEvent Int32 The handle to a Windows Event object that the task can signal to indicate that the task should stop running.
CreationDate DateTime The date that the package was created.
CreatorComputerName String The computer on which the package was created.
CreatorName String The name of the person who built the package.
ExecutionInstanceGUID String The unique identifier of the executing instance of a package.
InteractiveMode Boolean Indicates whether the package is run in interactive mode. If a package is running in SSIS Designer, this property is set to True. If a package is running using the DTExec command prompt utility, the property is set toFalse.
LocaleId Int32 The locale that the package uses.
MachineName String The name of the computer on which the package is running.
OfflineMode Boolean Indicates whether the package is in offline mode. Offline mode does not acquire connections to data sources.
PackageID String The unique identifier of the package.
PackageName String The name of the package.
StartTime DateTime The time that the package started to run.
UserName String The account of the user who started the package. The user name is qualified by the domain name.
VersionBuild Int32 The package version.
VersionComment String Comments about the package version.
VersionGUID String The unique identifier of the version.
VersionMajor Int32 The major version of the package.
VersionMinor Int32 The minor version of the package.

please use the below link to better understand the system variables

http://msdn.microsoft.com/en-us/library/ms141788(v=SQL.105).aspx

lets do some  custom logging using event handlers for a simple package

before digging into more details lets create the table to log the details

as

CREATE TABLE [DBO].[PACKAGELOG](
[LOGID] [INT] IDENTITY(1,1) NOT NULL,
[EXECUTIONID] [UNIQUEIDENTIFIER] NULL,
[PACKAGENAME] [NVARCHAR](255) NULL,
[PACKAGEID] [VARCHAR](50) NULL,
[USERNAME] [NVARCHAR](100) NULL,
[MACHINENAME] [NVARCHAR](255) NULL,
[STARTDATETIME] [DATETIME] NULL,
[ENDDATETIME] [DATETIME] NULL,
[SOURCENAME] [VARCHAR](255) NULL,
[MESSAGE] [NVARCHAR](MAX) NULL,
[MESSAGECODE] [INT] NULL,
[LOGDATETIME] [DATETIME] NULL,
[INTERACTIVEMODE] [BIT] NULL,
[SOURCEDESCRIPTION] [VARCHAR](255) NULL,
[TASKNAME] [VARCHAR](255) NULL,
[TASKID] [NVARCHAR](255) NULL,
[EXECSTATUS] [BIT] NULL,
[SERVERNAME] [NVARCHAR](255) NULL,
CONSTRAINT [PK__PACKAGEL__5E5499A820C44329] PRIMARY KEY CLUSTERED
(
[LOGID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
ALTER TABLE [DBO].[PACKAGELOG] ADD CONSTRAINT [DF_PACKAGELOG_EXECSTATUS] DEFAULT ((0)) FOR [EXECSTATUS]
GO

Note : there are some noted issues while using the system:: start time system variable and mapping to any SSIS date time data type

The reason is the value of the system start time looks like  this SYSTEM::STARTTIME  {18/10/2010 14:30:27}

which can not be mapped to any SSIS date time data types we will be getting a error

“The type is not supported.DBTYPE_DBDATE”. Possible failure reasons: Problems with the query, “ResultSet” property not set correctly, parameters not set correctly, or connection not established correctly

To avoid this error we have a work around is to create a variable with string and use a expression and setting the variable value to be evaluate as expression and get the parsed System start time from the variable and map that variable to the execute SQL task

as

(DT_WSTR, 4)YEAR(@[System::StartTime])

+ “-” +
RIGHT(“0” + (DT_WSTR, 2)MONTH(@[System::StartTime]), 2)

+ “-” +

RIGHT(“0″ + (DT_WSTR, 2)DAY(@[System::StartTime]), 2)

+ ” ” +

RIGHT(“0” + (DT_WSTR, 2)DATEPART(“Hour”, @[System::StartTime]), 2)

+ “:” +

RIGHT(“0” + (DT_WSTR, 2)DATEPART(“Minute”, @[System::StartTime]), 2)

+ “:” +

RIGHT(“0” + (DT_WSTR, 2)DATEPART(“Second”, @[System::StartTime]), 2)

+ “.” +

RIGHT(“00” + (DT_WSTR, 3)DATEPART(“Millisecond”,@[System::StartTime]), 3)

insert into dbo.packagelog (ExecutionID,packagename,packageid,username,machinename,interactivemode,sourcename,taskname,taskid,Message,MessageCode,execstatus,servername,startdatetime,enddatetime)
values (?,?,?,?,?,?,?,?,?,?,?,((0)),@@SERVERNAME,?,getdate())

as

log

success log

there are some nice free third party tools to dig the information from the log graphically like log analyzer

http://ssisloganalyzer.codeplex.com/

SSIS Script task and Components (Tool Kit) using – C#

With the introduction of C#, and the embedding of the new Visual Studio Tools for Applications into SSIS, you can ’ t think of using the Script Task and Script Component as scripting anymore; now it ’ s all – out programming. Typically, you ’ d code logic into these Script Tasks to control the
execution and logic flow within a package or to perform some specialized business validation.

The two Scripting Components provide access into a new scripting development environment using Microsoft Visual Studio Tools for Applications (VSTA). This change finally allows us to script logic into packages using Visual Basic 2008 or Visual C# 2008 .NET code.

It is not necessary to know the C# universe in order to write valuable C# script tasks and components . Keep in mind that it is being used as a scripting language, not a full-fledged application development language.

The ability to perform the following basic operations will allow you to do quite a lot.

  • Access SSIS variables
  • Modifying SSIS variables
  • File validation Task (File properties task)
  • Read/write ASCII files
  • Parse strings with the split function
  • Create and loop through list arrays
  • Copy, move and delete files
  • Capture a listing of specified files in a sub directory
  • Create a database connections to SQL Server
  • Execute T-SQL queries and stored procedures and capture any results

Our first task is  accessing  variables

  1. Access SSIS variables

create a variable

drag and drop a script task in control flow

Double click on the variables and use the below shown settings

click on edit script ,once you click the edit script button a window pops pup (Microsoft Visual Studio Tools for Applications (VSTA))

Result

as

Code:

string str = null;

Dts.VariableDispenser.LockForRead(“User::user”);

str = (string) Dts.Variables[“user”].Value;

MessageBox.Show(str);
Dts.TaskResult = (int)ScriptResults.Success;

2. Modifying variables in SSIS using script task

click on the script task and remove the variable “user” as read only variable and assign it to the read write variable

Modifying the variable

string str = null;

Dts.VariableDispenser.LockForWrite(“User::user”);

str = (string)Dts.Variables[“user”].Value;

MessageBox.Show(str,”variable value before changing “);

Dts.Variables[“user”].Value = “Sql junkie share”;

str = (string)Dts.Variables[“user”].Value;

MessageBox.Show(str,”variable value after changing “);

3. File Validation Task 

In any of the ETL processes when dealing with flat files, some times there will be a need to check the existence of the file so lets create a simple file validation task using C#

create a  variable “file exits” in  control flow and assign default value as False

as

Click the button Edit Script

Script

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.IO;
namespace ST_ad163f48be91459c9c185a38bccd8436.csproj
{
    [System.AddIn.AddIn(“ScriptMain”, Version = “1.0”, Publisher = “”, Description = “”)]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {
        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion
        public void Main()
        {
            string filelocation = “D:\\New Text Document.txt”;
            if (File.Exists(filelocation) == true)
            {
                Dts.Variables[“fileexists”].Value = true;
            }
            else
            {
                Dts.Variables[“fileexists”].Value = false;
            }
            Dts.TaskResult = (int)ScriptResults.Success;
        }
    }
}

File exists

File did not exist

Success

Script Component

  1. Read/write ASCII files
  2. Parse strings with the split function

we will use the earlier package , in this section  we will learn how to create a flat file source connection using script component and read the contents of the flat file and push the columns into the data flow task and  parse the contents of the text using split function

Replace the file exists part with a Data flow task , in the data flow task drag and drop a script component , click on the script component and select the component type as Source

and the flat file we will be dealing with here  looks like this

1,john,apt#3 – California USA\
2,tom,st 112 ca;lifornia USA

Task 

  • Read the flat file
  • Separate the columns which are delimited by ‘,’
  • and parse the address column and remove the special characters (apt#3 – California USA\) to (apt 3  California USA)

back to the source component

and in the script section select the scripting language as C# and click the button edit script

Code:
using System;
using System.Data;
using System.IO;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.Text;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{

public override void CreateNewOutputRows()
{

StreamReader reader = new StreamReader(“D:\\New Text Document.txt”);

string line;

while ((line = reader.ReadLine()) != null)
{
string[] columns = line.Split(‘,’);

this.Output0Buffer.AddRow();
if (columns.Length > 0)
{
Output0Buffer.id = (columns[0]);
}
if (columns.Length > 1)
{
Output0Buffer.name = (columns[1]);
}
if (columns.Length > 2)
{

string[] split = columns[2].Split(new Char[] { ‘#’, ‘,’, ‘.’, ‘:’, ‘-‘, ‘\\’, ‘;’ });

Output0Buffer.address = ConvertStringArrayToString(split);

}
}
}
static string ConvertStringArrayToString(string[] array)
{
//
// Concatenate all the elements into a StringBuilder.
//
StringBuilder builder = new StringBuilder();
foreach (string value in array)
{
builder.Append(value);
}
return builder.ToString();
}
}

Output