Wednesday, 25 May 2016

Early Arriving Facts

Early arriving fact is a natural thing in data warehouse and we have to understand it and handle it. This article is related to it. Hope it will be interesting.
What is Early Arriving Facts
As named says the Fact table data is arrived early where we do not have any dimension data. That means when we are going to lockup Fact table by the natural key of fact table with surrogated key of Dimension table we are not finding anything.
What the reason of Early Arriving Facts
There is several reason of early arriving fact described
  1. It’s according to system or scenario design like that the data of the fact table is arrived before the dimension data. If we take a suitable example: suppose at hospital if the patient in critical position then the operational or treatment data is gathered in fact first and then it going to gather the dimension data like patient details.
  2. If the Dimension data came from Master and Fact data came from Transaction table and the fact data is gather first in a point of time.
  3. If the ETL fails when loading the Dimension data.
How to Solve it
One of the common solutions is to implement the Inferred member when processing Dimension by ETL process.
An inferred member is a dimension row that has been created as a result of an early arriving fact. The inferred member will have the natural key from the fact table and default values for the rest of the columns. We probably want to add an indicator column in the dimension to identify the row as an inferred member by using a bit column called Inferred Member
Hope you like it.

Posted by: MR. JOYDEEP DAS

Monday, 16 May 2016

Is it Easier to Design Dimension And Fact Table

Here I am going to describe a scenario. In my knowledge we have some definition that can be changed according to scenario. So nothing is thumbs in rule but all are based on Scenario.

How we design Dimension and Fact
We are trying to put all the facts columns (Numerical Columns), where we can use the aggregate fruition into Fact Table and try to put all the description related columns into Dimension Table.
At least theoretically it is correct.

Life is not so Easy
Yes, Life is not so easy when designing a big enterprise. At least based on theoretical definition. Think as a situation of Product Table, where each product have price (Numerical columns) and price is changed according to time.

So where we are going put the information of columns Price. In Fact table as the Price is the numerical. Or in Dimension as Price are changes according to date. I mean to say that within the period of time the price is changed.
I am waiting for your suitable comments.

Hope you like it.

Posted by: MR. JOYDEEP DAS

Friday, 13 May 2016

Is Surrogate Key is Required

Here in this article I am just explaining my view related to the Surrogated Key. All depends on situation based scenario. Hope you like it.

What is Surrogate Key
As I am not going to any hardcore definition of Surrogate key over here, I am just going to explain it in my ways.
Surrogate key is just like a Primary key with an incremental number. It is basically an Identity column with data type Big Int. They Columns also contain a Clustered Index.

Use of Surrogate Key in Case of Dimension
In case of dimension we can better use of Surrogate key for SCD Type II case where we maintain the Historical records.
IDNO (PK)    ( Surrogate Key)
EMPID          ( Business Key)`

Is Maintaining the Historical records in Fact Table is Good ?
One use of the surrogate Key is maintaining the Historical records in Fact table. As per my point of view maintaining the historical records in Fact table is not a good idea. As the Additive fact is destroyed and getting bad result in case of aggregation.
If we want to maintain the historical records in Fact table we must use another table to maintain the history.

Here nothing if fix, all depends on case scenario. My be others have different type of views related to it. So please share your views that we can share knowledge.

Posted by: JOYDEEP DAS

Monday, 9 May 2016

Type of Fact Table in Data ware house

As we all know that the Fact table contains Facts or numerical data where we can put the Aggregate function.
Here in this article we are trying to classify the Facts within the fact table.
Type of Facts
There are three types of facts present in the facts table. They are..
1.    Additive Facts
The definition says, it can be summed up through all dimension in the fact table. For an example:

Order Date
Sales Amount

If we look the example carefully we find that we can SUM the Sales amount according to Order Date, Product and Warehouse. So the Sales amount can be summed up with each level of Dimension.

2.    Semi-Additive Facts
The definition says, it can be summed up for some of the dimension in fact table. Not for all dimension. For an example:

Current Balance
Profit Margin

In this example, we can find the SUM of the Current Balance for each Account. But it will not make any sense if we Sum of the current Balance by Date wise.

3.    Non-Additive Facts
The definition says, it cannot be summed up for any dimension in fact table. Form an example:

Current Balance
Profit Margin

It is not meaningful if we summed up the Profit Margin either by Account or by Date.

Hope you like it.

Posted by: JOYDEEP DAS