Here i am taking about the Incremental Load of Data in SSIS. All developer use there own methodology to complete the process. This artilce is related to the UPDATE the existing records of Destination in case of Incremental Load.
Hope it will be informative and put some values in your professional work.
What we Actually Do in Incremental Load
Here we are just mentioned the Alogitham behind the Incremental Load.
- We check if the Records Exists in the Destination or Not by comparing with Primary key and if the records doesn't Exists it just insert the records in destination table.
- If the Records EXIST then it checks the other columns. If the other columns changes then it goes for UPDATION of records.
Why the UPDATION Process is so Cumbersome
Think we have a table and the table has more than 60 columns and one Primary key. If we compare both Source and Destination Table columns each other’s like this
[Source].[Column-A] == [Destination].[Column-A]
OR [Source].[Column-B] == [Destination].[Column-B]
OR [Source].[Column-C] == [Destination].[Column-C]
OR [Source].[Column-D] == [Destination].[Column-D]
…. 60 columns like this
Think what happens if we are going to maintain such kind of code.
So what the Solutions
To solve this we have to introduce Columns both is Source and Destination Table. This column contains some vale based on Checksum/Binary Checksum / Hash bytes values. We just compare those values and decide whether UPDATION is needed or NOT.
So it is quite easy to compare then compare columns by columns.
Take an Example of CHECKSUM
CHECKSUM(EmpName, EmpGrade) AS [CheckSum_Val]
EmpId EmpName EmpGrade CheckSum_Val
1 Joydeep Das A -1141869962
2 Deepasree Das B 1652695494
3 Sukamal Jana A 2067120924
4 Santi Ranjan C -1747286364
Take an Example of BINARY_CHECKSUM
BINARY_CHECKSUM(EmpName, EmpGrade) AS [BinaryCheckSum_Val]
EmpId EmpName EmpGrade BinaryCheckSum_Val
1 Joydeep Das A 841019011
2 Deepasree Das B 1196883990
3 Sukamal Jana A 2115519786
4 Santi Ranjan C 1412284040
Take an Example of HashBytes
CHECKSUM(HASHBYTES('MD5',EmpName), HASHBYTES('MD5',EmpGrade)) AS [HashBytes_Val]
EmpId EmpName EmpGrade HashBytes_Val
1 Joydeep Das A -641922184
2 Deepasree Das B -1681185931
3 Sukamal Jana A 750240776
4 Santi Ranjan C 1481016726
So which one We Prefer
We have to choose between CHECKSUM, BINARY_CHECKSUM and HASHBYTES
When we go to the CHECKSUM method the MS Documentation gives us some sort of Shocking news.
“However, there is a small chance that the checksum will not change. For this reason, we do not recommend using CHECKSUM to detect whether values have changed, unless your application can tolerate occasionally missing a change”
BINARY_CHECKSUM is also not good reputation.
So HASHBYTES is the safest one that we can use.
Hope you like it.
Posted By: MR. JOYDEEP DAS