Thursday 15 August 2013

Alternate of SCD Type-I

Introduction

In my previous article we are demonstrating related to Slowly Changing Dimension. If nay one wants to review it please refer to my previous article related to SCD.
SSIS Slowly Changing Dimension (SCD) Part- I

SSIS Slowly Changing Dimension (SCD) Part –II



Here in this article we are trying to learn an alternative of Slowly Changing Dimension Type-1 methodology by using Lookup Transform.

To understand it properly we are taking the same example that we used in Type-1 of Slowly Changing Dimension (SCD).

Review from previous Article

Type-1 [ Changing Attributes ]

When we are going to overwrite the column existing records.
Here we are taking an example to understand the Type-1.

Suppose we have a situation where the first name of the Employee is misspelled and the wrong spelling is stored in the first name of the employee. For making the first name correct we need to add one more records for same employee so that we can overwrite the first name of the employee. This SCD transformation directs these rows to an output named Changing Attributes Updates Output.


EMPID
FIRST_NAME
FIRST_NAME
MIDDLE_NAME
101
SUBOOOOL
CHANDRA
DAS
101
SUBAL
CHANDRA
DAS


Case Scenario
Here we are creating two tables with same structure.
The table one is tbl_Employee_Details wich contains the original employee records and tbl_Employee_Details_Change which contains the same records, new records and the records with hanged data to the tbl_Employee_Details table objects.

Hope you understand the scenario.

How to do it by using look up transform

Step-1 [ Create the Table Objects and Insert the Values ]

-- Base Table
IF OBJECT_ID(N'dbo.tbl_Employee_Details', N'U') IS NOT NULL
   BEGIN
      DROP TABLE dbo.tbl_Employee_Details;
   END
GO 
CREATE TABLE dbo.tbl_Employee_Details
 (EMPID                INT,
  FIRST_NAME           VARCHAR(50),
  MIDDLE_NAME          VARCHAR(50),
  LAST_NAME            VARCHAR(50),
  DOB                  DATETIME,
  NATIONALITY          VARCHAR(50),
  JOB_TITLE            VARCHAR(20),
  MARITAL_STATUS       CHAR(1),
  EMPLOYEE_GENDER      CHAR(1));
            
GO
-- Table for Changing Records
IF OBJECT_ID(N'dbo.tbl_Employee_Details_Change', N'U') IS NOT NULL
   BEGIN
      DROP TABLE dbo.tbl_Employee_Details_Change;
   END
GO 
CREATE TABLE dbo.tbl_Employee_Details_Change
 (EMPID                INT,
  FIRST_NAME           VARCHAR(50),
  MIDDLE_NAME          VARCHAR(50),
  LAST_NAME            VARCHAR(50),
  DOB                  DATETIME,
  NATIONALITY          VARCHAR(50),
  JOB_TITLE            VARCHAR(20),
  MARITAL_STATUS       CHAR(1),
  EMPLOYEE_GENDER      CHAR(1));
GO          
            


-- Inserting Recors in Base Table
INSERT INTO dbo.tbl_Employee_Details
       (EMPID, FIRST_NAME, MIDDLE_NAME, LAST_NAME,   
        DOB, NATIONALITY, JOB_TITLE, MARITAL_STATUS,
        EMPLOYEE_GENDER)
VALUES (101, 'SUBOOOOL', 'CHANDRA', 'DAS',
        '12-07-1945', 'INDIAN', 'GRADE-A', 'M',
        'M'),
        (102, 'JOYDEEP', NULL, 'DAS',
        '12-17-1974', 'INDIAN', 'GRADE-B', 'M',
        'M');
GO       
-- Inserting Recors for Change
INSERT INTO dbo.tbl_Employee_Details_Change
       (EMPID, FIRST_NAME, MIDDLE_NAME, LAST_NAME,   
       DOB, NATIONALITY, JOB_TITLE, MARITAL_STATUS,
       EMPLOYEE_GENDER)
VALUES (101, 'SUBAL', 'CHANDRA', 'DAS',
        '12-07-1945', 'INDIAN', 'GRADE-A', 'M',
        'M'),
        (102, 'JOYDEEP', NULL, 'DAS',
        '12-17-1974', 'INDIAN', 'GRADE-B', 'M',
        'M');
GO       

Step-2 [ SSIS Data Flow Tasks ]

Now we are going to understand the Data flow diagram and how it's work like Type-1 of Slowly changing Dimension (SCD).



1.    Here the OLE DB Source connection Extract data from Table Object named : tbl_Employee_Details
2.    The First Lookup Transform check the records of tbl_Employee_Details With another table object called tbl_Employee_Details_Change By Business Kye. In our case it is EMPID.
3.    If match not found, in this case we can understand the new records exists in the tbl_Employee_Details_Change Table objects. So this records must be Inserted into table tbl_Employee_Details. So the data flow moves to the OLE DB Destination.
4.    If the match records found, in this case we have to check others attributes changed or not. Hence the second Lookup Transform works. It check all the records columns by columns. If No match found then the data flow moves to the OLE DB Command to update the records.

Step-3 [ Lookup Transform Configuration ]

First Lookup Transform





Second Lookup Transform



Step-4 [ OLE DB Command Editor ]



The SQL Statement that is used to Update the Records

UPDATE a
SET
FIRST_NAME=b.FIRST_NAME,
MIDDLE_NAME=b.MIDDLE_NAME,
LAST_NAME=b.LAST_NAME,
DOB=b.DOB,
NATIONALITY=b.NATIONALITY,
JOB_TITLE=b.JOB_TITLE,
MARITAL_STATUS=b.MARITAL_STATUS,
EMPLOYEE_GENDER=b.EMPLOYEE_GENDER
FROM   tbl_Employee_Details a
       INNER JOIN tbl_Employee_Details_Change b
       ON a.EMPID = b.EMPID;

Hope you like it.


Posted by: MR. JOYDEEP DAS

5 comments:

  1. Thanks Jaydeepda for this great article. After reading your last article about SCD I tried it with a huge transactional table for incremental update and it took me about 1 hour to complete. I was wandering how to do this with less time and this article came. Thanks a lot. I will let you know my current timing with this method.

    ReplyDelete
    Replies
    1. Thanks @ Debashis.
      As the name suggest Slowly and Dimension... I think u tried it just opposite with Facts.

      Delete
    2. Dear Jaydeepda,
      It took me 37 sec for each 2 lookup transformation used in this package. Data is about 1 lakh. Is it ok ? or is there any more faster alternative available?

      Delete
  2. Thanks a lot. It takes times..There is any fine tuning process.

    ReplyDelete