For XML PATH ROOT in SQL Server 2008

Now and Then I used to observe that when it comes to XML, .NET and SQL Developer start fearing. This is not because of It is VERY DIFFICULT but because of unawareness. Today I am going to show you the use of  “FOR XML PATH”  clause in SQL Server.
Suppose we have ID, FirstName and LastName columns in one of our SQL Server table and we want output something like this:
<Employees>
  <Employee>
    <ID value=1 />
    <FirstName value=Ritesh />
    <LastName value=Shah />
  Employee>
  <Employee>
    <ID value=2 />
    <FirstName value=Rajan />
    <LastName value=Jain />
  Employee>
Employees>
Let me give you TSQL to generate table in SQL Server and insert records in that.
Create Table EmployeeData
(
      ID INT Identity(1,1),
      FirstName varchar(10),
      LastName varchar(10)
)
insert into EmployeeData
select ‘Ritesh’,‘Shah’ UNION ALL
select ‘Rajan’,‘Jain’
GO
So now, here is the use of “FOR XML PATH ROOT” clause which will make our life easier in this kind of situation.
select ID as “ID/@value”
     , FirstName as “FirstName/@value”
     , LastName as “LastName/@value”
from (
   Select * from EmployeeData
) as t
for xml path(‘Employee’), root(‘Employees’);
Isn’t it easy to use?
BTW, below given are some links which will redirect you to my some of the past articles on XML subject.
Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com
Microsoft SQL Server Blog. Fight the fear of SQL with SQLHub.com. Founder is Ritesh Shah

Character Map Transformation task in SSIS 2008

When it comes to string formatting in SSIS, “Character Map” transformation task is one of the very useful tasks. It generally used to convert data to Uppercase, lowercase, byte reversal etc. let us see one small demo for the same.
Before we start our package development, let us first create one small table in SQL Server and insert some data into it by following TSQL Script.
create table Employee
(
      FirstName varchar(10),
      LastName varchar(10)
)
insert into Employee
select ‘rITesH’,‘ShAh’ union all
select ‘Rajan’,‘SHAH’
GO
Now, create one new project in BIDS (Business Intelligence Development Studio) for SSIS package. Drag “Data Flow” task on your “Control Flow” tab and double click on that “Data Flow” to configure it which will forward you to “Data Flow” tab.
Drag “Ado Net Source” from tool box and drop it into your “Data Flow” tab.  Double click on “Ado Net Source” to configure it and select your Server, database and table, “Employee” table in our case.
Now, drag “Character Map” transformation task to your work area and join it with extended green arrow coming from “Ado Net Source”. Now double click on “Character Map” transformation task to configure it. We are going to convert FirstName column to UPPERCASE and LastName column to LOWERCASE. Look at screen capture below for more information.
Once you finish configuring “Character Map” transformation task, drag “Ado Net Destination” from tool box to right below “Character Map” transformation task and join it with extended green arrow coming from “Character Map” transformation task.
Double click on “Ado Net Destination” to configure it and select your SQL server, Database and table (we are going to use same Employee table as destination too) .
Once you finish all these steps, you are ready to launch your package. Hit F5 to run the package and see EMPLOYEE table in SQL Server whether it has two more rows with different case of alphabet.
Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com
Microsoft SQL Server Blog. Fight the fear of SQL with SQLHub.com. Founder is Ritesh Shah

Fuzzy lookup in SSIS 2008 to keep data integrity

“Human makes mistakes” which is quite obvious. While making a data entry it is possible to make typo but as a database professional, it is our duty to keep data consistent.  Fuzzy Lookup is helpful in this case. Before we start making package in SSIS, let us have some pre-preparation for that. We are going to create one source table (it could be any source like excel, csv file etc. but we are making it in SLQ Server), one reference table which is guaranteed to have proper data. Here is the TSQL to create, source and reference, table and insert some dummy data.
create table fuzzyLookupSource
(
      firstName varchar(10),
      LastName varchar(10),
      BirthDate datetime
)
insert into fuzzyLookupSource
select ‘Rites’,‘Shah’,’02/07/1980′ union all
select ‘Rajen’,‘Shah’,’03/31/1983′ union all
select ‘Dharmesh’,‘Kalaria’,’04/09/1980′  union all
select ‘Jesica’,‘Cruize’,’05/05/1980′  union all
select ‘Roger’,‘Moore’,’04/15/1980′
GO
create table fuzzyLookupReference
(
      firstName varchar(10),
      LastName varchar(10),
      BirthDate datetime
)
insert into fuzzyLookupReference
select ‘Ritesh’,‘Shah’,’02/07/1980′ union all
select ‘Rajan’,‘Shah’,’03/31/1983′ union all
select ‘Jessica’,‘Cruise’,’06/05/1980′  union all
select ‘Dharmesh’,‘Kalaria’,’04/09/1980′
GO
Observe the data in both table, in first, source, table, there are some typos which you can compare with your second, reference, tables and get the purified data.
Anyway, once you are ready with both the tables , create one new project in BIDS (Business Intelligence Development Studio) and drag one “DataFlow” task from tool box to your “Control Flow” tab. Double click on “DataFlow” task to configure it so that it would redirect you to “DataFlow” tab.
Now, create one “Ado Net Source” which will refer our “fuzzyLookupSource” table in sql server database. Double click on “Ado Net Source” to configure it and look at below image to have crystal clear idea about its configuration.

Now,  drag “fuzzy lookup” transformation task below your “Ado Net Source” and connect extended green arrow from “Ado Net Source” to your fuzzy lookup. Double click on “Fuzzy Lookup” task to configure it.

In “Reference Table” tab, give reference of your database and our reference table which is “FuzzyLookupReference” in our case. Look at image below for more idea.

Click on “Columns” tab to configure which column to check with reference from source table and select “firstName” and “lastName” column and connect it so that our fuzzy lookup task will compare these two fields from source to reference table.

Once you configure “columns”, you have to click on “Advanced” tab, you can set “Similarity Threshold” which will give you how much identical both fields are…. If it is 1 than it is perfect match, if it is 0 than no match or data not present in reference table so more near to 1, good match it is. We are not going to take any decision like if it is greater than .50 then do this otherwise do that so it would be ok if you don’t change “Similarity Threshold”.

Now, drag “SQL Server Destination” task so that this matched and unmatched data could fall in SQL Server table, though we have not created any SQL Server table for this so far. Connect green extended arrow from “Fuzzy Lookup” transformation task to “SQL Server Destination” task. Before we configure “SQL Server Destination” we would like to do one more thing. Double click on GREEN arrow between Fuzzy Lookup task and SQL Server destination task.

We would like to see data in grid while running this package and before it fall into our destination table, we are going to specify this only now.
As soon as you click on green arrow, it will open “Data Flow Path Editor”, click on “Data Viewer” tab and click on “Add” button to add “Grid”.

Now, double click on “SQL Server Destination” task to configure it. Give details of your SQL Server and database into “Connection Manager” name. since we don’t have destination table already created for our data, we are going to click on “New” button besides “Use a table or view” property which will create one destination table in our SQL Server.

Now you are ready to run your package, hit F5 to run it, when it crosses Fuzzy Lookup Task, it will show you data in grid, check it and click on “Green Arrow” above the grid in same dialog box so that data falls into our SQL Server table.

you can later on check the same data into SQL Server by executing TSQL Query, while generating new table in SQL Server, if you didn’t have rename the table, it would be by default [SQL Server Destination]. So you can execute something like

SELECT * FROM [SQL Server Destination]
Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com
Microsoft SQL Server Blog. Fight the fear of SQL with SQLHub.com. Founder is Ritesh Shah

Derived Column and Union All Transformation task in SSIS 2008

Many times when we are trying to get data from different sources, there is a possibility to have same key fields. To eliminate this kind of problem, “Derived Column” transformation task is very useful. For example one company has factory at one location and back office work performed from other office. Both locations has their own employee table with IDENTITY as a primary key. Obviously Identity key generally starts with 1 by default so both location would have 1,2,3 etc. in their employee table. When you try to move both the table at one single destination, you might come across with situation where employee ID 1 would have two employee and sample with many more IDs. To eliminate this situation, you can have one more column which is not exists in source table which describes where this data come from so you have unique combination of that new column and an existing EmployeeID column.
Let us see how we could perform this. Before we head towards SSIS package development, let us first create two different table in SQL Server and one blank table as a destination of these both table. You can use below script to generate those table with/without data in your SQL Server database.
create table ForDerivedColumn1
(
      EmpID INT Identity(1,1),
      EmpName varchar(10)
)
INSERT INTO ForDerivedColumn1
SELECT ‘Ritesh’ union all
SELECT ‘Rajan’
GO
create table ForDerivedColumn2
(
      EmpID INT Identity(1,1),
      EmpName varchar(10)
)
INSERT INTO ForDerivedColumn2
SELECT ‘Rushik’ union all
SELECT ‘Rajvi’
GO
create table ForDerivedColumnNewTable
(
      SourceID INT,
      EmpID INT,
      EmpName varchar(10)
)
GO
Now, open BIDS (Business Intelligence Studio) and create new SSIS project. Draw dataflow task in your “Control Tab” and double click on it to configure it which will redirect you to “Data Flow” tab. Since we have two different SQL Server table as an source, let us first create two “Ado Net Source” in our work place and select your database and table as a source. Below screen shot will show you how to configure “Ado NET Source”. I am showing example for FIRST “ADO NET Source” you have to do it for second “ADO NET Source” too with table name “ForDerivedColumn2”.
Now take two “Derived Column” transformation task and put them under “ADO NET Source” task, double click on it to configure.

Above screen capture showing you configuration for first “DerivedColumn” do it for second “DerivedColumn” task too with value 2 in SourceID column.

Once we have our dataset ready with Derived Column, we have to merge both dataset with “UnionALL” Transformation task. “UnionAll” work exactly same as “UNION ALL” in TSQL. So drag “UnionAll” transformation task below both “DerivedColumn” task and join extended green arrow from both “DerivedColumn” to “UnionAll” task.
Now, this is a time to convert the data in proper format with “DataConversion” transformation task as many time from different sources, same kind of data coming with different datatype, let us convert it with “DataConversion” task by putting it under “UnionAll” task and join extended green arrow from “UnionAll” task to “DataConversion”.

Now, we have to perform our last steps. Data is ready to go anywhere but we have to tell where it suppose to go. Since we want it to insert it in our SQL Server table, we are going to use “SQL Server Destination” task. Double click on it to configure it. Give your database and table reference to it as below image.

Once you give reference of your table, you have to go to “Mappings” tab in the same dialog box and set the mappings there so that proper data get inserted in proper place. Look at the below image for more detailed information.

Finally you are done with package now run it by hitting F5 key and check your database table whether you have actually received the data inside or not. After running if you get “Green Signal” in every task, J you are done.

Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com

Microsoft SQL Server Blog. Fight the fear of SQL with SQLHub.com. Founder is Ritesh Shah

Merge Join Transformation Task in SSIS 2008

SSIS is the very crucial tool for Data Warehouse and collecting data from different sources and merge it into single master source is the main task in Data Warehouse.  Merge Join Transformation task is one of the useful tool in SSIS to achieve this. Let us see how it works.
Well, before we start developing package, let us create two different data sources where data come from in single source. We are also going to create one blank table which will hold the data come from two different sources.
Our first source will be SQL Server table.
Our second source will be Flat file which is comma separated.
And the destination of the both sources would be in again SQL Server single table.
Script for First Source:
Create Table OrderMaster
(
      OrderID Varchar(5),
      ClientCode Varchar(10),
      ProjectName varchar(25),
      OrderDate DateTime
)
INSERT INTO OrderMaster
SELECT ‘A1001’,‘CHEM02’,‘NY WaterField’,GETDATE() UNION ALL
SELECT ‘A1002’,‘ACCU01’,‘Plainfield Soil’,GETDATE()
GO
Create table in SQL Server database and insert data in that table via above given script.
Second Source:
Create one text file in D drive with name OrderDetails.txt with following data.
OrderID,SampleNumber,Matrix
A1001,A1001-01,Water
A1001,A1001-02,Soil
A1002,A1002-01,Water
A1002,A1002-02,Water
First row in the above data is column name.
Now, let us create destination table.
Create Table OrderMasterAndDetail
(
OrderID Varchar(5),
ClientCode Varchar(10),
SampleNumber Varchar(10),
Matrix Varchar(10)
)
GO
Ok, now we are ready to start developing SSIS package. Open new project in BIDS (Business Intelligence Development Studio). Drag “Data Flow” task from tool box to your “Control Flow” tab and double click on it to configure. As soon as you will double click on that, you will be redirected to the “Data Flow” tab. Now, drag  “ADO Net Source” and “Flat File Source” task to your “Data Flow”. Let us now configure them.
Double click on “Ado Net Source” to configure it and select your database and table to configure it. Look at the below given Image for more details.
Now, double click on “Flat File Source” to configure it. From the “Flat File Source Editor” please click on “New” button to create “Flat File Connection Manager”. Now from “Flat File Connection Manager Editor” give name “OrderDetails” in “Connection Manger Name” property. By clicking on “Browse” button, select your “OrderDetails.txt” file from D drive. Don’t forget to check on the CheckBox “Column name in first data row” as our first row in text file is our column header.  For more details, look at image below.
Now drag “Sort Transformation” task from tool box to your data flow tab and repeat this step to have two “Sort Transformation” task. One is for “Ado Net Source” and second one is for “Flat File Source”. Join both sort task with appropriate green arrow and double click on it one by one to configure it and make ascending sort on “OrderID” field in both the sort task.
Now, drag the merge join task from tool box and drop it below both sort tasks. Connect green arrow from both sort task to “Merge Join” task and double click on “Merge Join Transformation” task to configure it. “OrderID” column in both the source should be work as JOIN key.  Select “OrderID”, “ClientCode” from first sort and “SampleNumber” and “Matrix” from second sort as shown in screen capture below.
Now, you are having data from both the source to your “Merge Join”. You have to send that data to specified location, in our case it is one SQL Server table we have already created with the name “OrderMasterAndDetail”. Since we want to send our merged data to SQL Server, let us now drag “Ado Net Destination” task to our dataflow and connect it to our merge join. Now, this is the time to configure “Ado Net Destination” so double click on it and select your database connection and table named “OrderMasterAndDetail”, look at the screen shot below for more information.
So, finally you are ready to run the package, hit F5 to run that and if every task seems green as below screen capture than you are all set and you have already received data in your destination table.
Check your “OrderMasterAndDetail” table in SQL Server whether you have actually received any data there or not.
Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com
Microsoft SQL Server Blog. Fight the fear of SQL with SQLHub.com. Founder is Ritesh Shah

Conditional Split Transformation Task in SSIS 2008

Well, divide data in different destination based on certain condition, is very important task in any ETL tool. SSIS 2008 is really very rich for this kind of task. One of the tool help us to do it very quickly is “Conditional Split Transformation Task” in SSIS 2008.
Let us see one very small example which will give you demonstration of how to use “Conditional Split Transformation Task” in SSIS 2008.
We will first create one table, named “SampleReceive”, generally in environmental laboratory there are few different kind of department in which sample comes for different kind of analysis. I will use three department. 1.) VOA 2.)Semi-VOA 3.) Pesticide. We will make an entry in these three departments in “SampleReceive” table and later on, we will split data in three different table respective to particular department.
So, before we head towards SSIS project, let us first create necessary table in SQL Server. Below given is T-SQL script for the same.
create table SampleReceive
(
      Department varchar(10),
      NumberofSample int,
      DateReceived datetime,
      ClientID int
)
INSERT INTO SampleReceive
SELECT ‘VOA’,10,GETDATE(),1 UNION ALL
SELECT ‘SEMI-VOA’,7,GETDATE(),2 UNION ALL
SELECT ‘Pesticide’,16,GETDATE(),1 UNION ALL
SELECT ‘VOA’,5,GETDATE(),3 UNION ALL
SELECT ‘SEMI-VOA’,3,GETDATE(),1
GO
Create Table SampleReceiveVOA
(
      Department varchar(10),
      NumberofSample int,
      DateReceived datetime,
      ClientID int
)
Create Table SampleReceiveSemiVoa
(
      Department varchar(10),
      NumberofSample int,
      DateReceived datetime,
      ClientID int
)
Create Table SampleReceivePest
(
      Department varchar(10),
      NumberofSample int,
      DateReceived datetime,
      ClientID int
)
GO
Now, let us open new project in SSIS and start making a SSIS package.  As soon as you create new project, have “Data Flow Task” on your screen and double click on it to configure. Look at the below screen shot.
Once you double click on “Data Flow Task” you will be forwarded to the “Data Flow” tab besides “Control Flow” tab.  Now, drag “Ado Net  Source” task to your screen and double click on it to configure it. Give your server credential in configuration editor and select the source table which is “SampleReceive” in our case. For more details, have a look at below given screen shot.

Once you are done with configuration of “Ado Net Source”, drag “Conditional Split Transformation” task from tool box to your work area and drag green arrow coming from “Ado Net Source” to your “Conditional Split Transformation” to connect “Conditional Split Transformation” with “Ado Net Source”.

Now, this is the time to configure “Conditional Split Transformation” so double click on it. In “Conditional Split configuration editor” You have to set three conditions for all three different departments and give it a appropriate “Output name”. have a look at image below for more clarification.

Now, we are all set to split data into three different table from our “SampleReceive” table. Have three “Ado Net Destination”  on the screen and rename it to “VOA”, “Semi-Voa”, “Pesticide” and connect all three “Ado Net Destination” from “Conditional Split Transformation” task.  When you will try to connect that green arrow from “Conditional Split Transformation” to “VOA” Ado Net destination, it will ask you for select condition, you have to select “Voa” condition which we have already defined in “Conditional Split Transformation” task. Look at screen capture below for more information.

Once you connect it (all three destinations), You have to configure all three “Ado Net Destination” . I will show you how to do it for first one, for remaining two; you can follow the same procedure.

Double click on “Ado Net Destination” task named “VOA” and configure it with your SQL server details and table named “SampleReceiveVOA” as shown in below image.

Once you will configure “VOA” table, configure all remaining two “Ado Net Destination” for respected table. Finally your diagram will looks like below image.

Now, you are ready to run, hit F5 to run your package and check whether you have really received data in all three blank table or not by running following TSQL commands in your SQL Server database from SSMS.

select * from SampleReceiveVOA
select * from SampleReceiveSemiVoa
select * from SampleReceivePest
Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com
Microsoft SQL Server Blog. Fight the fear of SQL with SQLHub.com. Founder is Ritesh Shah