Introduction
We are again
with flat file scenario based solution. Query are frequently came to provide
the solution. Here is another interesting story with Flat File. I personally prefer
when to design with SSIS take a note book and try to draw it. It gives 90% of
the solution and next 10% we have Google…
The Scenario
We have a
folder and folder contains some flat files. The name of the flat files is important
it contains Filename + Date + Serial Number.
For an example: EmployeeRecord20151124-01.txt, EmployeeRecord20151124-02.txt
etc.
We have to retrieve
only those file which have system date. I mean to say the date of package run
must match with the date for File name.
In a single
day there are three copy of the file is put into the folder, such as morning,
Evening and Night.
The package must run each file in first time
and if we re-run the package it not retrieve the data from old file of same
day. It looks for new file in same day, if found then retrieve records and else
not.
The new flat
file of same day may be contains the old modified records also. So package must
take care of that also.
Hope you
understand the scenario.
The Logic behind the Package
1.
We
need three tables object. One is Destination Table where the actual data store.
Second table keep track the File name which is already retrieved. Third table
takes all the file name and path from the source folder.
2.
First
we start works with third table and put the entire file name that match the
current date scenario.
3.
Then
we compare it with Second table and check whether the file is retrieved before
or not. If not then we retrieve the file.
4.
The
data flow logic is as before using Merge transform and Conditional split
How we do that
Step – 1 [ Create the Necessary Table Objects ]
IF OBJECT_ID(N'[dbo].[tbl_EmployeeDetails]',
N'U')IS NOT NULL
BEGIN
DROP TABLE [dbo].[tbl_EmployeeDetails];
END
GO
CREATE TABLE [dbo].[tbl_EmployeeDetails]
(
EmpID INT NOT NULL PRIMARY KEY,
EmpName VARCHAR(50) NOT NULL,
EmpDept VARCHAR(50) NOT NULL,
EmpGrade CHAR(1) NOT NULL
);
GO
IF OBJECT_ID(N'[dbo].[tbl_OLDFlatFileDetails]', N'U')IS NOT NULL
BEGIN
DROP TABLE [dbo].[tbl_OLDFlatFileDetails];
END
GO
CREATE TABLE [dbo].[tbl_OLDFlatFileDetails]
(
IdNo
INT NOT NULL IDENTITY PRIMARY KEY,
FFileName VARCHAR(50) NOT NULL,
FullPath VARCHAR(1000) NOT NULL,
);
GO
IF OBJECT_ID(N'[dbo].[tbl_NEWFlatFileDetails]', N'U')IS NOT NULL
BEGIN
DROP TABLE [dbo].[tbl_NEWFlatFileDetails];
END
GO
CREATE TABLE [dbo].[tbl_NEWFlatFileDetails]
(
IdNo
INT NOT NULL IDENTITY PRIMARY KEY,
FFileName VARCHAR(50) NOT NULL,
FullPath VARCHAR(1000) NOT NULL,
);
GO
Step
– 2 [ The Control Flow Task ]
Step
– 3 [ Execute SQL Task named TRUNCATE tbl_NEWFlatFileDetails ]
Here we
Truncate the tbl_NEWFlatfileDetails Table where we store the Flat file name and
path from the source folder.
TRUNCATE TABLE [dbo].[tbl_NEWFlatFileDetails];
Step
– 4 [ ForEach Loop Container ]
Step
– 5 [ Expression Task named SUBSTRING File Name ]
In the
Expression we used
@[User::v_FileNameSubString]=
SUBSTRING( @[User::v_FileName], 15, 8 )
Here the
SUBSTRING Function retrieves the DATE part from the File name like 20151124.
Step
– 6 [ Precedence Constraint ]
In the
Expression we used
@[User::v_FileNameSubString]==
@[User::v_CurrentDateFile]
Where the
variable named @[User::v_CurrentDateFile] has
its own Expression
REPLACE((DT_WSTR,
10)(DT_DBDATE)GETDATE(),"-","")
Step
– 7 [ Execute SQL task named Insert Data INTO tbl_NEWFlatFileDetails ]
The SQL
Statement is
INSERT INTO [dbo].[tbl_NEWFlatFileDetails]
(FFileName, FullPath)
VALUES(?, ?)
Step
– 8 [ Execute SQL Task named Retrive Records Comparing Tables ]
The SQL
Statement is
SELECT FullPath, FFileName
FROM [dbo].[tbl_NEWFlatFileDetails]
WHERE FFileName NOT IN(SELECT FFileName
FROM [dbo].[tbl_OLDFlatFileDetails]);
Step
– 9 [ For EachLoop Container ]
Step
– 10 [ Data Flow Task named Working With tbl_EmployeeDetails ]
I am not
going to describe it here. Just a pictorial diagram is given
The Flat File
Connection manager need to change one property named Connection String by Expression
Step
– 11 [ Execute SQL task named Insert Data Into tbl_OLDFlatFileDetails ]
The SQL
Statement is
INSERT INTO [dbo].[tbl_OLDFlatFileDetails]
(FFileName, FullPath)
VALUES(?, ?)
One thing I want
to mention in my SSIS article is that, I don’t want my reader just copy and
paste the steps but they must understand the beauty of the package and do it by
their own logic. If the reader has any confusion or query they can comment it
or write me mail.
Hope you like
it.
Posted by: MR. JOYDEEP DAS
Nice article related to SSIS .Also we can use lookup or Marge Statement instead of Merge join and conditional split tasks.
ReplyDeleteThanks @Jax to provide your valuable time.
DeleteI personally do not like Lookup as per performance reason in case of Large volume of data.
Please provide your valuable comments, that we all can share knowledge.
An Excellent blog
ReplyDeleteThis comment has been removed by the author.
ReplyDelete