Monday, 11 February 2013

Introduction to Control Flow and Data Flow


Anyone learning SSIS soon runs into a common question: what is the difference between Control Flow and Data Flow? These are the two basic elements of SSIS.

We already had a glance at Control Flow and Data Flow in my previous article, Understanding the BIDS. In this article I try to differentiate between the two. Most SSIS developers spend a lot of time dealing with these two elements.

So let's start.

When we open the Business Intelligence Development Studio, or BIDS (in MS SQL Server 2005/2008/2008 R2/SQL Server Denali (CTP3)), we find two tabs named Control Flow and Data Flow.

Control Flow

Control Flow controls the flow of the package based on the completion, success, or failure of tasks. The smallest unit of the control flow is a task. Control flow does not move data from one task to another; it only decides when and whether each task runs. Tasks run in series if they are connected with precedence constraints, or in parallel if they are not. A package's control flow is made up of containers and tasks, connected with precedence constraints that control the package flow.

Data Flow

Data Flow deals with the actual data movement. Here, multiple components can process data at the same time. The smallest unit of the data flow is a component. The data flow is made up of a source (where the data is collected from, such as Excel, a flat file, or SQL Server), transformations (modification or manipulation of the data, such as data type conversion or converting lowercase to uppercase), and a destination (where the data is stored, such as SQL Server).
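The source, transformation, and destination chain can be sketched in plain Python as follows. Again, this is only an illustration under assumptions, not how SSIS is implemented: the function names, the sample `RAW` text, and the in-memory `table` list are all hypothetical. Generators are used so that rows stream through the components one at a time.

```python
import csv
import io

# Hypothetical flat-file content standing in for a real source.
RAW = "id,name\n1,alice\n2,bob\n"


def source(text):
    """Source component: read rows from flat-file text."""
    yield from csv.DictReader(io.StringIO(text))


def transform(rows):
    """Transformation component: convert the data type and the case."""
    for row in rows:
        yield {"id": int(row["id"]), "name": row["name"].upper()}


def destination(rows, table):
    """Destination component: store rows and return the row count."""
    for row in rows:
        table.append(row)
    return len(table)


table = []  # stands in for a destination table
loaded = destination(transform(source(RAW)), table)
print(loaded, table)
```

Here the data itself moves from one component to the next, which is what separates a data flow from a control flow.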

So we can say that the data flow is a child of the control flow. An SSIS package always contains a control flow, and that control flow may or may not require a data flow.
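The parent and child relationship can also be sketched in a few lines of Python. This is a simplified illustration with hypothetical names (`data_flow_task`, `run_control_flow`, and the step names); it assumes that steps run one after another and that the package stops at the first failure.

```python
def data_flow_task(source_rows, destination):
    """A data flow is just one kind of task inside the control flow."""
    for row in source_rows:
        destination.append(row.strip().upper())
    return True


def run_control_flow(steps):
    """Run each step in order and stop at the first failure."""
    log = []
    for name, step in steps:
        ok = step()
        log.append((name, ok))
        if not ok:
            break
    return log


warehouse = []

# A package whose control flow contains a data flow.
package_with_data_flow = [
    ("check_source", lambda: True),
    ("load_rows", lambda: data_flow_task([" a ", "b"], warehouse)),
]

# A package whose control flow needs no data flow at all.
package_without_data_flow = [("send_mail", lambda: True)]

log_with = run_control_flow(package_with_data_flow)
log_without = run_control_flow(package_without_data_flow)
print(log_with, log_without, warehouse)
```

Both packages have a control flow, but only the first one contains a data flow.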

Real-Life Example

Don't be afraid of the bookish things. To understand this properly, here is a simple scenario or story.
A postman named XYZ delivers mail door to door. Let's assume he has three letters, for persons A, B, and C. Person A's address is far from the post office, Person B's address is between A and C, and Person C's address is near the post office. So the postman decides to go to Person A's address first and hand over the letter, then to Person B, and finally to Person C.

Here, the postman's decision about how to distribute the mail is the Control Flow, and the actual handover of the mail is the Data Flow.

Hope you like it.

Posted by: MR. JOYDEEP DAS