Sunday, 22 November 2015

SSIS – Working With Flat File

Introduction
First of all in my article of SSIS, I am not going to describe the entire step by step process. Here I am describing only those process that are very important. So this article is not for beginners and to follow this article the readers need the knowledge with SSIS.
As lot of my friend is asking me several time to write something with Flat file in SSIS. As there are different Query comes from them related to it. In this article I am trying to solve some the query related to Flat file.

When we are taking about flat file some want to read the flat file with the name that contains current data. Some wants the read the file with their creation Date, Last Access Date or Last Creation Date. We are going to make a Scenario related to it to cover all the aspect related to Flat file.

Hope it will be informative and you enjoyed it.

The Scenario

We have two folders with Flat files. One has three Flat files named like “EmployeeFile20151119.txt”, “EmployeeFile20151120.txt”, “EmployeeFile20151121.txt” if we make the close look at it we find that

“EmployeeFile” à String
“21112015”        
à  Date Like 21-11-2012 etc

So it is the combination of File name and Date String and we have to retrieve the records from current date file and sore it into a Database Table.
The second folder have Flat file like “EmployeeFile-1.txt”, “EmployeeFile-2.txt” and “EmployeeFile-3.txt”. We have to check the Creation Date/Last Access Date/Last Modification Date/File Size/File Attributes of the file and if it matches with the current date then we retrieve the records and store it in a Data base Table.
Hope you understand the scenario.

How We Do that

Step – 1 [ Create the Destination Table First ]

-- Destination Table --
IF OBJECT_ID(N'[dbo].[tbl_EmployeeDtls]', N'U')IS NOT NULL
   BEGIN
      DROP TABLE [dbo].[tbl_EmployeeDtls]
   END
GO
CREATE TABLE [dbo].[tbl_EmployeeDtls]
    (
        EmpId     INT         NOT NULL PRIMARY KEY,
        EmpName   VARCHAR(50) NOT NULL,
        City      VARCHAR(50) NOT NULL,
        Country   VARCHAR(50) NOT NULL
      );
GO

Step – 2 [ Complete Control Flow Task as Diagram ]



Step – 3 [ Execute SQL Task in our Case it is Truncate Destination Table]

It just truncate the Destination table objects.
TRUNCATE TABLE [dbo].[tbl_EmployeeDtls];

Step – 4 [ Precedence Constraint ]

As it is Actually two package we take a variable named @[User::v_SolutionValue] as Int32. If the value of the variable is one (1) then the flow goes to ForEach Loop container named “By File Name” and if the value is two (2) then it goes to ForEach Loop container named “By File Creation”



Step – 5 [ ForEach Loop Container Named By File Name ]



Step -6 [ The Expression Task ]

It just use the variable named @[User::v_EmployeeFileName] for next level of Precedence constraint.

Step -7 [ The Precedence constraint of Expression Task ]

In this We are just comparing the file name with a variable named @[User::v_CurrentDateFileNameFormat]. The expression of the variable is
"EmployeeFile"+ REPLACE((DT_WSTR, 10)(DT_DBDATE)GETDATE(),"-","")
It just take the name of Current Date File name and compare it with Actual File name




Step – 8 [ The Data Flow Task ]

We are not going to Describe the data flow task here. But one property for Flat file connection manager we have to change and it is connection String by Expression.



This is the end of one side.

Step – 9 [ The ForEach Container of By File Creation ]

It is just same as previous but we changed the value of the variable named @[User::v_EmployeeFolderPath] to locate the proper directory for that.

Step – 10 [ The Script Task ]

Here we take couple of Read only and Read and Write variables

Read Only Variables:
User::v_FileNameAndPath

Read and Write Variables:
User::v_FileAtrribute,User::v_FileCreationDate,User::v_FileIsReadOnly,User::v_FileLastAccessDate,User::v_FileLastModificationDate,User::v_FileSize

We populate the value of those variable by Script task but here we only use the User::v_FileCreationDate variable for example.

The Script:

#region Namespaces
using System;
using System.Data;
using System.IO;                        // Added to get file properties
using System.Security.Principal;        // Added to get file owner
using System.Security.AccessControl;    // Added to get file owner
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
#endregion

public void Main()
{
   // Variable for file information
   FileInfo fileInfo;

  // Fill fileInfo variable with file information
  fileInfo = new FileInfo(Dts.Variables["User::v_FileNameAndPath"].Value.ToString());

           
  // Get the rest of the file properties if the file exists
  if (fileInfo.Exists)
  {
   // Get file creation date
   Dts.Variables["User::v_FileCreationDate"].Value = fileInfo.CreationTime;

   // Get last modified date
   Dts.Variables["User::v_FileLastModificationDate"].Value = fileInfo.LastWriteTime;

   // Get last accessed date
   Dts.Variables["User::v_FileLastAccessDate"].Value = fileInfo.LastAccessTime;

   // Get size of the file in bytes
   Dts.Variables["User::v_FileSize"].Value = fileInfo.Length;

   // Get file attributes
   Dts.Variables["User::v_FileAtrribute"].Value = fileInfo.Attributes.ToString();
   Dts.Variables["User::v_FileIsReadOnly"].Value = fileInfo.IsReadOnly;
               
  }

Dts.TaskResult = (int)ScriptResults.Success;

}

If you look at the Script carefully you find the population of variables.

Step – 11 [ The Precedence Constraint ]



Step – 12 [ The Data Flow Task ]

We are not going to Describe the data flow task here. But one property for Flat file connection manager we have to change and it is connection String by Expression.




Hope you like it.





Posted by: MR. JOYDEEP DAS

2 comments:

  1. these are the amazing real time scenario. good going

    ReplyDelete
  2. these are the amazing real time scenario. good going

    ReplyDelete