Introduction
Here
i am taking about the Incremental Load of Data in SSIS. All developer
use there own methodology to complete the process. This artilce is
related to the UPDATE the existing records of Destination in case of
Incremental Load.
Hope
it will be informative and put some values in your professional work.
What
we Actually Do in Incremental Load
Here
we are just mentioned the Alogitham behind the Incremental Load.
- We check if the Records Exists in the Destination or Not by comparing with Primary key and if the records doesn't Exists it just insert the records in destination table.
- If the Records EXIST then it checks the other columns. If the other columns changes then it goes for UPDATION of records.
Why
the UPDATION Process is so Cumbersome
Think
we have a table and the table has more than 60 columns and one
Primary key. If we compare both Source and Destination Table columns
each other’s like this
[Source].[Column-A]
== [Destination].[Column-A]
OR
[Source].[Column-B] == [Destination].[Column-B]
OR [Source].[Column-C] == [Destination].[Column-C]
OR [Source].[Column-D] == [Destination].[Column-D]
….
60 columns like this
Think
what happens if we are going to maintain such kind of code.
So
what the Solutions
To
solve this we have to introduce Columns both is Source and
Destination Table. This column contains some vale based on
Checksum/Binary Checksum / Hash bytes values. We just compare those
values and decide whether UPDATION is needed or NOT.
So
it is quite easy to compare then compare columns by columns.
Take
an Example of CHECKSUM
SELECT EmpId,
EmpName,
EmpGrade,
CHECKSUM(EmpName,
EmpGrade) AS
[CheckSum_Val]
FROM tbl_Employee;
EmpId EmpName EmpGrade CheckSum_Val
1 Joydeep
Das A -1141869962
2 Deepasree
Das B 1652695494
3 Sukamal
Jana A 2067120924
4 Santi
Ranjan C -1747286364
Take
an Example of BINARY_CHECKSUM
SELECT EmpId,
EmpName,
EmpGrade,
BINARY_CHECKSUM(EmpName,
EmpGrade) AS
[BinaryCheckSum_Val]
FROM tbl_Employee;
EmpId EmpName EmpGrade BinaryCheckSum_Val
1 Joydeep
Das A 841019011
2 Deepasree
Das B 1196883990
3 Sukamal
Jana A 2115519786
4 Santi
Ranjan C 1412284040
Take
an Example of HashBytes
SELECT EmpId,
EmpName,
EmpGrade,
CHECKSUM(HASHBYTES('MD5',EmpName),
HASHBYTES('MD5',EmpGrade))
AS [HashBytes_Val]
FROM tbl_Employee;
EmpId EmpName EmpGrade HashBytes_Val
1 Joydeep
Das A -641922184
2 Deepasree
Das B -1681185931
3 Sukamal
Jana A 750240776
4 Santi
Ranjan C 1481016726
So
which one We Prefer
We
have to choose between CHECKSUM, BINARY_CHECKSUM and HASHBYTES
When
we go to the CHECKSUM method the MS Documentation gives us some sort
of Shocking news.
“However,
there is a small chance that the checksum will not change. For this
reason, we do not recommend using CHECKSUM to detect whether values
have changed, unless your application can tolerate occasionally
missing a change”
BINARY_CHECKSUM
is also not good reputation.
So
HASHBYTES is the safest one that we can use.
Hope
you like it.
Posted
By: MR. JOYDEEP DAS
No comments:
Post a Comment