Introduction
In
this article we are trying to play with filtration in Flat File. As we all know
that to make any filtration with flat file, we have to load the entire flat
file into a staging table or just use a Conditional Split Transformation after
loading the Flat file by using Flat File Data source and Flat file connection
manager.
If
the size of our flat file is huge and we need only some records depending on
specified criteria, why we are going to load the entire flat file and then
filter some records from it?
It
is always better, if we are going to filter some records before loading the
flat file and then work with those records. Something like Incremental Load
with Flat File but No staging table.
Here
we are just going to demonstrate the simple filtration in flat file before
loading it. We can make it Incremental load by using the same approach.
Look’s
interesting? To understand this article properly, we need some prerequisite
like the VB scripting language and understanding of parameterized Batch file
with SSIS.
Case Scenario
The
high-level view of our SSIS package is it just load a flat file form the source
and before loading the flat file it filters some records first and then load
it.
Please note that we are not going to load the entire flat file. We are just going to load some selected records.
Please note that we are not going to load the entire flat file. We are just going to load some selected records.
Source and Destination Details
In
source we have a flat file with huge volume of data and in destination, we have
a table object to store the records.
Sample of Flat File Source:
Filter Criteria:
Before
loading the flat file we are just going to filter the records depending on the
criteria mentioned below.
We
are going to use the last columns of the Flat File and retrieve those records
that have greater than or equal (>=) to the specified date in a variable.
Proposed Solution
We
are going
·
Call
a parameterized batch file from the SSIS Execute Process task
·
Batch
file is going to call the VB script file
·
The
VB Script file is going to filter the flat file according to criteria and make
another flat file with filtered output
·
By
using the newly created flat file we are going to store the records into our DB
table objects
Pictorial Diagram for Understanding:
[Step - 1 ] Create the VB
Script
Here
in our example we are creating the VB script in the name of Test.bvs
strFiles = WScript.Arguments.Unnamed(0)
strDate =
WScript.Arguments.Unnamed(1)
Set objFS =
CreateObject( "Scripting.FileSystemObject"
)
Set objFile
= objFS.OpenTextFile(strFiles)
Do Until objFile.AtEndOfStream
strLine = objFile.ReadLine
s = Split(strLine,"|")
if
(CDate(s(5)) >= CDate(strDate)) then
WScript.Echo strLine
end
if
Loop
If
we look at the code we can see that it’s open the flat file first. The name of
the flat file is supplied with arguments.
strFiles =
WScript.Arguments.Unnamed(0)
Then
it takes another input. In our case it is a Date.
strDate = WScript.Arguments.Unnamed(1)
Then
it reads the file line by line using a Loop until the end of the file comes.
After reading a single line of flat file, it just splits the columns.
s = Split(strLine,"|")
In
our case the column separator is (|). In our flat file, there are 5 columns
(starting columns are 0 columns). The last columns contain the Date columns
that we need to compare.
if (CDate(s(5)) >= CDate(strDate)) then
[ Step – 2 ] Create the
Batch file that called the VB Script file
Here
we are creating the batch file in the name of MyBatch.bat
The batch file calls the VB script file. To call the VB script file we need two parameters.
The batch file calls the VB script file. To call the VB script file we need two parameters.
@ECHO OFF
SET arg1 = %1
REM cscript //nologo test.vbs stdDtls.txt 05-Jan-2015
> filter.txt
cscript //nologo test.vbs StdDtls.txt "%1%" > filter.txt
@ECHO ON
Here
the filtered output is stored into another flat file named filter.txt. The file
name filter.txt is automatically created.
[ Step – 3 ] Call the Batch
file from SSIS Process Task
First,
we have to create a variable in our SSIS package which contains the date and
depending on that date we are going to extract the records.
Now
we are going to call the batch file from Execute Process Task of SSIS.
After
executing the process task, it generates another flat file. This flat file
contains the filtered records. In our case the name of the file is filter.txt
[ Step – 4 ] Retrieve
records from Newly created flat file
Now
we are extracting the records from newly created flat file. For Meta data
mapping we just execute the Execute Process time once in first time to create a
new flat file and then use this flat file form Flat File Connection Manager.
If
it is needed, we can use Expression of Flat File connection manager Property
named Connection string to make it dynamic.
Control Flow Diagram:
Data Flow Diagram:
Output in Data Viewer:
Here
we are not going to store it in a DB table object. We are just showing it in
the Data Viewer.
Hope
you like it.
Posted
By: MR. JOYDEEP DAS