Saturday, 6 May 2017

What Type of Index is needed for Performance Boost Up?

When we are thinking about index we have different options in mid to use them, such as Clustered / Non Clustered etc. Even Non Clustered have different type of Indexes like Covered, Filtered etc. It is very important to understand in which scenario what type of Index gives us the performance boost up.

We have to understand that too many indexes are not good for performance. Infect Index only boosts the performance of SELECT statement not any INSERT/UPDATE/DELETE statement.
In this blog post we are trying to demonstrate the scenario wise index selection process. We just want to mean that, in which scenario, what type of Index provide the best performance for our SELECT statement.

It is only a suggestion, this can be changed depends on the scenario to scenario and also depends on query construction.

This guideline provides us only a good concept of choosing proper index.

Scenario with Index Type
Here we are trying to put the Index type and possible scenario to use it.

Clustered Index

Clustered Index used in columns in the Query
·         The query that returns the range of values by using operators such as BETWEEN clause, >, >=, < and <= etc.
·         When the Query returns the larger result set.
·         When using JOIN clauses with FOREIGN key columns.
·         When ORDER BY or GROUP BY clauses in the query.

Use the Clustered Index on the following columns of Table
·         If the columns have the UNIQUE values.
·         It is better if the data type of the columns has INT, BIG INT.
·         The columns that are frequently used to retrieve the range of values.
·         The IDENTITY columns that are guaranteed to be UNIQUE and auto increasing numbers.
·         Columns that are used to sort data by using ORDER BY clause in SELECT statements.

The effective Clustered Index include:
·         Narrow – as narrow as possible, in terms of the number of bytes it stores
·         Unique – to avoid the need for SQL Server to add a “uniqueifier” to duplicate key values
·         Static – ideally, never updated
·         Ever-increasing – to avoid fragmentation and improve write performance

Clustered Index is NOT good for the following columns of Table
·         The Columns those are frequently modified or changed.
·         Columns that use the composite key.

Non Clustered Index

Non Clustered Index used in columns in the Query
·         Query that uses the JOIN or GROUP BY Clause.
·         Query that does not return large result sets or data.
·         The columns of the query where we frequently use the WHERE clauses.

Use the non-Clustered Index on the following columns of Table
·         Non-Clustered index is used in the non key columns that are frequently used in the query set.
·         In the columns where data is not distinct such as combination of columns where the clustered index is used in the other columns.

Covering Non-Clustered Index

If we have to use a non-clustered index and if we know that our application will be performing the same query over and over on the same table, consider creating a covering index on the table for the query

Guideline to create Covered Index
·         If the query or queries we run using the covering index are seldom run, then the overhead of the covering index may outweigh the benefits it provides.
·         The covering index should not add significantly to the size of the key. If it does, then it its use may outweigh the benefits it provides.
·         The covering index must include all columns found in the SELECT list, the JOIN clause, and the WHERE clause.

Covered Index is NOT good for the following scenario
Generally the Cover Index gathers all the necessary information of other columns in the SELECT statement in a single point of Index node.

Think about a scenario where we are using huge columns in a SELECT statement, so in this situation the note of the index is going to be heavy and decrease the performance.
So it is better to use traditional composite non clustered index to increase the performance.

Filtered Non-Clustered Index

Use the filtered index in the following scenario
·         There is a table that has a huge amount of data and we often select only a subset of that data.  For an example, we have a lot of NULL values in a column and we want to retrieve records with only non-NULL values. In MS SQL Server 2008 this type of columns is called Sparse Column.
·         In another scenario we have several categories of data in a particular column, but we often retrieve data only for a particular category value.
·         By this we mean to say that in the Query of Indexed column the WHERE statement is more or less fixed and the same WHERE statement is used in the non-clustered Filtered Index.

To Understand – We are taking a Simple Example

We have a table object named tbl_EmpDetails

Now in this example EmpID has unique value so we can use this EmpID to uniquely identify records. Here we are using the Clustered Index.

SELECT * FROM tbl_EmpDetails WHERE EmpID = 104;
1.           SELECT * FROM  tbl_EmpDetails  WHERE  EmpID = 104;

SELECT * FROM tbl_EmpDetails WHERE EmpID = 104;
If we look at the column named EmpID, we can find that the criteria (Narrow, Unique, Static and Ever Increasing) are met perfectly.

For EmpName column, duplicate values may exist. We want to make query filtration by using EmpName. In this situation we are using a Non Clustered Index over EmpName.

SELECT EmpID, EmpDepartment FROM tbl_EmpDetails
WHERE EmpName = 'Debalina Bhattacharya';
1.           SELECT  EmpID, EmpDepartment  FROM  tbl_EmpDetails
2.          WHERE   EmpName = 'Debalina Bhattacharya';

SELECT EmpID, EmpDepartment FROM tbl_EmpDetails
WHERE EmpName = 'Debalina Bhattacharya';
If we look at the above query we can find that, the Non Clustered Index used in where clause. But in the Select statement EmpID have clustered Index and EmpDepartment have no Index. In this situation, if we run this query may cause RID or KEY Lookup.

To avoid the RID and KEY Lookup, we used cover index by Including EmpID and EmpDepartment with Non clustered index columns named EmpName. It is called Non Clustered Covered Index.

We have other columns of tbl_EmpDetails named DOJ (Date of Joining). Suppose we are frequently used query with DOJ columns, but almost all the query we are making in between Year 2016 to Year 2017. In this scenario we can use the Filtered Index on DOJ column.

SELECT * FROM tbl_EmpDetails
WHERE DOJ  = '2016-03-31';
1.           SELECT * FROM  tbl_EmpDetails
2.           WHERE DOJ = '2016-03-31';
SELECT * FROM tbl_EmpDetails
WHERE DOJ  = '2016-03-31';

Hope you like it. Please share you view regarding this post.

Posted by: MR. JOYDEEP DAS

1 comment:

  1. SQL Query Performance