Monday 26 November 2012

Function in WHERE clause

Introduction
When we are using the function in the SELECT statement it returns data depends on the number of records retrieve by the select statement and what type of value is passed in the scalar function. The performance of the MS SQL Query is little bit degraded using the scalar function. But developer often used it for encapsulate frequently performed logic.
But if we saw some SELECT statement, we can see that the functions can be used in WHERE conditions also. Using scalar or User defines function in WHERE clause is a good idea? Is this hampering the performance?
This article is related to it.
Is it Bad?
Using function on WHERE clause causes Index scan.   The reason for this is that the function value has to be evaluated for each row of data to determine it matches our criteria.
How we understand it
To understand it properly, here I am taking an example. We compare the Actual Execution plan of the Query and understand how the function in the WHERE clause effects the query by Index Scanning.
Step-1 [ Creating the Base Table ]
 -- Base Table
IF OBJECT_ID('tbl_EMPDTLS') IS NOT NULL
   BEGIN
      DROP TABLE tbl_EMPDTLS;
   END
GO
CREATE TABLE tbl_EMPDTLS
       (EMPID      INT         NOT NULL IDENTITY PRIMARY KEY,
        EMPFNAME   VARCHAR(50) NOT NULL,
        EMPLNAME   VARCHAR(50) NOT NULL,  
        EMPGRADE   VARCHAR(1)  NOT NULL,
        EMPEMAIL   VARCHAR(50) NOT NULL,
        DOJ        DATETIME    NOT NULL);
GO

Step-2 [ Creating the Index Information ]

-- Creating Non clustered Index
IF EXISTS(SELECT *
          FROM   sys.indexes
          WHERE  object_id=OBJECT_ID('tbl_EMPDTLS')
                 AND name ='IX_NONC_EMPFNAME')
    BEGIN
      DROP INDEX tbl_EMPDTLS.IX_NONC_EMPFNAME;
    END             
GO   
CREATE NONCLUSTERED INDEX IX_NONC_EMPFNAME
ON  tbl_EMPDTLS (EMPFNAME)    
GO
IF EXISTS(SELECT *
          FROM   sys.indexes
          WHERE  object_id=OBJECT_ID('tbl_EMPDTLS')
                 AND name ='IX_NONC_EMPGRADE')
    BEGIN
      DROP INDEX tbl_EMPDTLS.IX_NONC_EMPGRADE;
    END             
GO   

CREATE NONCLUSTERED INDEX IX_NONC_EMPGRADE
ON  tbl_EMPDTLS (EMPLNAME);
GO

IF EXISTS(SELECT *
          FROM   sys.indexes
          WHERE  object_id=OBJECT_ID('tbl_EMPDTLS')
                 AND name ='IX_NONC_EMPEMAIL')
    BEGIN
      DROP INDEX tbl_EMPDTLS.IX_NONC_EMPEMAIL;
    END             
GO   

CREATE NONCLUSTERED INDEX IX_NONC_EMPEMAIL
ON  tbl_EMPDTLS (EMPEMAIL);    
GO

IF EXISTS(SELECT *
          FROM   sys.indexes
          WHERE  object_id=OBJECT_ID('tbl_EMPDTLS')
                 AND name ='IX_NONC_DOJ')
    BEGIN
      DROP INDEX tbl_EMPDTLS.IX_NONC_DOJ;
    END             
GO   

CREATE NONCLUSTERED INDEX IX_NONC_DOJ
ON  tbl_EMPDTLS (DOJ);    
GO

Step-3 [ Inserting some records in the Table ]

-- Inserting Records
INSERT INTO tbl_EMPDTLS 
       (EMPFNAME, EMPLNAME, EMPGRADE, EMPEMAIL, DOJ)
VALUES ('JOYDEEP', 'DAS',    'B', 'joydeep@abc.com','03-12-2006'),
       ('RAJECH',  'DAS',    'C', 'rajesh@abc.com', '01-12-2006'),
       ('SUKAMAL', 'JANA',   'B', 'suku@abc.com',   '03-12-2004'),
       ('TUHIN',   'SHINAH', 'B', 'tuhin@abc.com',  '07-12-2001'),
       ('SANGRAM', 'JIT',    'B', 'sangram@abc.com','01-10-2011'),
       ('SUDIP',   'DAS',    'A', 'sudip@abc.com',  '07-11-1990'),
       ('RANI',    'LAL',    'B', 'rani@abc.com',   '03-12-2006'),
       ('JOHN',    'IBRAHAM','C', 'john@abc.com',   '01-05-2007'),
       ('BHUPEN',  'SINGH',  'A', 'bhapu@abc.com',  '03-12-2006'),
       ('SAIKAT',  'SREE',   'B', 'saikat@abc.com', '01-12-1906'),
       ('SUJATA',  'LALA',   'B', 'sujata@abc.com', '03-12-2012'),
       ('RAJU',    'ROSTOGU','C', 'raju@abc.com',   '03-12-2006'),
       ('ROHIT',   'KUMAR',  'C', 'rohit@abc.com',  '01-10-2012'),
       ('VIPIN',   'PAUL',   'B', 'vipin@abc.com',  '01-11-2006'),
       ('VINODH',  'CHOPRA', 'C', 'vinodh@abc.com', '03-12-2006'),
       ('KALLU',   'SHEK',   'B', 'joydeep@abc.com','01-11-2011')
GO

Step-4 [ Exciting the Query ]

-- Example Set-1
SELECT *
FROM   tbl_EMPDTLS WITH(INDEX(IX_NONC_EMPFNAME))
WHERE EMPFNAME  LIKE 'J%';   
GO

SELECT *
FROM   tbl_EMPDTLS WITH(INDEX(IX_NONC_EMPFNAME))
WHERE LEFT(EMPFNAME,1)  = 'J'; 



-- Example Set-2
SELECT *
FROM   tbl_EMPDTLS WITH(INDEX(IX_NONC_EMPFNAME))
WHERE  EMPFNAME='JOYDEEP'
       AND EMPLNAME='DAS'
GO

SELECT *
FROM   tbl_EMPDTLS WITH(INDEX(IX_NONC_EMPFNAME))
WHERE  EMPFNAME+EMPLNAME='JOYDEEPDAS'       


  
-- Example Set-3      
SELECT *
FROM   tbl_EMPDTLS WITH(INDEX(IX_NONC_DOJ))
WHERE  DOJ = '03-12-2004' 
GO

SELECT *
FROM   tbl_EMPDTLS WITH(INDEX(IX_NONC_DOJ))
WHERE  DOJ < GETDATE()
GO

SELECT *
FROM   tbl_EMPDTLS WITH(INDEX(IX_NONC_DOJ))
WHERE  DATEDIFF(day, DOJ, '03-12-2004') =
GO



Conclusion
So from execution plan we find that using function in WHERE clause is a bad idea.

Hope you like it.

Posted by: MR. JOYDEEP DAS

No comments:

Post a Comment