Sunday, 27 December 2015

SSIS – Using BULK INSERT

Introduction

Bulk Insert is a very important tool for every SSIS developer and it is almost used by every developer to load the Flat file into staging table. Here we are not going to describe detail about Bulk Copy, but working with some common factors when used with Bulk Copy.

What Common Factor we faced

Some of the common factor that we faced with BULK COPY is
1.    When the Import Table have the Identity Columns
2.    When we do not want to Insert some records in a specified columns of a Table
3.    When data type mismatch occurs
4.    When Truncation Error occurs.

Sometime in the ETL process the flat file size is so heavy that we are unable to open it.

How to Solve this Problem

This type and many other type of problem with BULK COPY can be solved by using Format File. So here in this article we are going to describe the BULK COPY using format file.
Format file is a Metadata of table where we are importing the data from flat file. There are two type of format file one is in text format and another is xml format. Generally I preferred the xml format but it completely depends on developer and there are no hard and fast rules for that.

So let’s start the Demonstration

Step – 1 [ The Flat file Format ]



Step – 2 [ The Destination Table ]

IF OBJECT_ID(N'[dbo].[tbl_StudentDetails]', N'U')IS NOT NULL
   DROP TABLE [dbo].[tbl_StudentDetails];
GO

CREATE TABLE [dbo].[tbl_StudentDetails]
    (
        StdRoll   INT         NOT NULL PRIMARY KEY,
        Stdname   VARCHAR(50) NULL,
        StdClass  INT         NULL,
        Section   CHAR(1)     NULL
      );
GO

Step – 3 [ Creating the Format File ]

The XML Format File

bcp PracticeDB.dbo.tbl_StudentDetails
format nul -c -f
E:\Preparation\SSIS\BCP\FmtStudentDetails.xml
-x -S JOYDEEP-LAPTOP\JOYDEEPSQL12 -T -q -t;


Here the XML format file name is FmtStudentDetails.xml and the Instance of the SQL Server name is JOYDEEP-LAPTOP\JOYDEEPSQL12


  <?xml version="1.0" ?>
- <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
- <RECORD>
  <FIELD ID="1" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="12" />
  <FIELD ID="2" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="50" COLLATION="SQL_Latin1_General_CP1_CI_AS" />
  <FIELD ID="3" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="12" />
  <FIELD ID="4" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="1" COLLATION="SQL_Latin1_General_CP1_CI_AS" />
  </RECORD>
- <ROW>
  <COLUMN SOURCE="1" NAME="StdRoll" xsi:type="SQLINT" />
  <COLUMN SOURCE="2" NAME="Stdname" xsi:type="SQLVARYCHAR" />
  <COLUMN SOURCE="3" NAME="StdClass" xsi:type="SQLINT" />
  <COLUMN SOURCE="4" NAME="Section" xsi:type="SQLCHAR" />
  </ROW>
  </BCPFORMAT>

The TEXT Format File


bcp PracticeDB.dbo.tbl_StudentDetails
format nul -c -f E:\Preparation\SSIS\BCP\FmtStudentDetails.fmt 
-S JOYDEEP-LAPTOP\JOYDEEPSQL12 -T -q -t;

10.0
4
1       SQLCHAR             0       12      ";"      1     StdRoll              ""
2       SQLCHAR             0       50      ";"      2     Stdname              SQL_Latin1_General_CP1_CI_AS
3       SQLCHAR             0       12      ";"      3     StdClass             ""
4       SQLCHAR             0       10      "\r\n"   4     Section              SQL_Latin1_General_CP1_CI_AS


If we want we can Edit both the file. Suppose the StdRoll is an IDENTITY columns and we do not want to insert records in the table for that, we have to just edit it


  <?xml version="1.0" ?>
- <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
- <RECORD>
  <FIELD ID="0" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="12" />
  <FIELD ID="2" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="50" COLLATION="SQL_Latin1_General_CP1_CI_AS" />
  <FIELD ID="3" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="12" />
  <FIELD ID="4" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="1" COLLATION="SQL_Latin1_General_CP1_CI_AS" />
  </RECORD>
- <ROW>
  <COLUMN SOURCE="0" NAME="StdRoll" xsi:type="SQLINT" />
  <COLUMN SOURCE="2" NAME="Stdname" xsi:type="SQLVARYCHAR" />
  <COLUMN SOURCE="3" NAME="StdClass" xsi:type="SQLINT" />
  <COLUMN SOURCE="4" NAME="Section" xsi:type="SQLCHAR" />
  </ROW>
  </BCPFORMAT>

10.0
4
1       SQLCHAR             0       12      ";"      0     StdRoll              ""
2       SQLCHAR             0       50      ";"      2     Stdname              SQL_Latin1_General_CP1_CI_AS
3       SQLCHAR             0       12      ";"      3     StdClass             ""
4       SQLCHAR             0       10      "\r\n"   4     Section              SQL_Latin1_General_CP1_CI_AS

Please look at the highlighted line and you can understand it.

Step – 4 [ USING Format File for BULK INSERT ]

BULK INSERT tbl_StudentDetails
   FROM 'E:\Preparation\SSIS\BCP\test.txt'
   WITH (FORMATFILE = 'E:\Preparation\SSIS\BCP\FmtStudentDetails.xml',
         FIELDTERMINATOR = ';',
             ROWTERMINATOR = '0x0A'
         );

BULK INSERT tbl_StudentDetails
   FROM 'E:\Preparation\SSIS\BCP\test.txt'
   WITH (FORMATFILE = 'E:\Preparation\SSIS\BCP\FmtStudentDetails.fmt,
         FIELDTERMINATOR = ';',
             ROWTERMINATOR = '0x0A'
         );


Please note that the Format file is your Metadata of table so if the format file is generated properly the BULK INSERT is done successfully. We can Edit the format file according to our need and controlling the BULK INSERT.


Hope you like it.



Posted by: MR. JOYDEEP DAS


1 comment:

  1. In my Next Article i am using this BULK INSERT Methods with SSIS Components .. Hope you enjoy it.

    ReplyDelete