Thursday 24 April 2014

Statistics


Introduction

To improve the performance of any query the query optimizer uses the statistics to create query plan.  In most of the case the query optimizer generate the necessary statistics for high quality query plan. In few cases we need to create the additional statistics or modify the query design.

What is Statistics
Statistics for query optimization are objects that contain statistical information about the distribution of values in one or more columns of a table or indexed view.

Query optimizer uses this statistical information to estimate the number of rows, in the query result set and by this way the optimizer to create a high-quality query plan.

Query optimizer uses this cardinality estimation to choose the index seek operator instead of the index scan operator, and in doing so improve query performance.

How we can Update Statistics

We can update statistics by using UPDATE STATISTICS or sp_updatestats
But Microsoft recommended that to keeping AUTO_UPDATE_STATISTICS set to ON so that the query optimizer continues to routinely update statistics.

To determine when the statistics where last update use the 
STATS_DATE ( object_id , stats_id )

Example

SELECT name AS index_name,
    STATS_DATE(object_id, index_id) AS statistics_update_date
FROM sys.indexes
WHERE object_id = OBJECT_ID('dbo.tbl_CUSTOMER');
GO

index_name                                                   statistics_update_date
PK__tbl_CUST__3D9DDA317BE3C753     2014-04-04 17:31:24.810

When to Update Statistics

We have to update the statistics in the following condition

·         Query execution times are slow.
·         Insert operations occur on ascending or descending key columns.
·         After maintenance operations.

Create Statistics
In MS SQL Server the statistics can be created by using CREATE STATICSTIC command or by CREATE INDEX command.

The statistical information created using CREATE STATISTICS command is equivalent to the statistics built by a CREATE INDEX command on the same columns.

The only difference is that the CREATE STATISTICS command uses sampling by default while the CREATE INDEX command gathers the statistics with full scan since it has to process all rows for the index anyway.

Example

CREATE STATISTICS [IX_STATS_CUSTID]
ON [dbo].[tbl_CUSTOMER]([CUSTID])
WITH SAMPLE 50 PERCENT;

As we are here using 50% sampling of the row, for any big table random sampling may not produce the actual statistics. Therefore, for bigger tables, we may need to use the resample option on UPDATE STATISTICS. The resample option will maintain the full scan statistics for the indexes and sample statistics for the rest of the columns.

The statistical information is updated when 20% of the rows changed.

Update Statistics
Updating of statistics ensures that any query that runs get the up-to-date statistics to satisfy the query needs. This is introduced in MS SQL 2005 version and it is similar to the sp_updatestats command.

UPDATE STATISTICS [dbo].[tbl_CUSTOMER]
WITH FULLSCAN, ALL
GO




Hope you like it.




Posted by: MR. JOYDEEP DAS

Friday 4 April 2014

TSQL trick only with FOR XML Support

Introduction

In this article we are going to demonstrate a TSQL trick only with FOR XML Support.

Case Study
We have three Table Objects

tbl_CUSTOMER

CUSTID
CUSTNAME
1
Joydeep Das
2
Chandan Bannerjee
3
Soumen Bhowmik

tbl_ITEMDTLS

ITEMCD
ITEMNAME
100
Tooth Paste
101
Tooth Brusg
102
Saving Lotion
103
Saving Brush

Now the customer purchase Items

tbl_SALEDTLS

SALENO
SRLNO
CUSTID
ITEMCD
201
1
1
100
201
2
1
101
201
3
1
102
201
4
1
103
202
1
2
100
202
2
2
101
203
1
3
100

We want a report like this Format

CUSTID
CUSTNAME
ITEM DETAILS
1
Joydeep Das
 Saving Brush, Saving Lotion, Tooth Brusg, Tooth Paste
2
Chandan Bannerjee
 Tooth Brusg, Tooth Paste
3
Soumen Bhowmik
 Tooth Paste
1
Joydeep Das
 Saving Brush, Saving Lotion, Tooth Brusg, Tooth Paste

How to solve it

-- Table Object Customer
IF OBJECT_ID(N'dbo.tbl_CUSTOMER', N'U')IS NOT NULL
   BEGIN
      DROP TABLE [dbo].[tbl_CUSTOMER];
   END
GO
CREATE TABLE [dbo].[tbl_CUSTOMER]
      (
         CUSTID    INT         NOT NULL IDENTITY PRIMARY KEY,
         CUSTNAME  VARCHAR(50) NOT NULL
      );
GO

-- Insert Records
INSERT INTO [dbo].[tbl_CUSTOMER]
      (CUSTNAME)
VALUES('Joydeep Das'),
      ('Chandan Bannerjee'),
      ('Soumen Bhowmik');                 

-- Table Object Item Details
IF OBJECT_ID(N'dbo.tbl_ITEMDTL', N'U')IS NOT NULL
   BEGIN
      DROP TABLE [dbo].[tbl_ITEMDTL];
   END
GO
CREATE TABLE [dbo].[tbl_ITEMDTL]
   (
     ITEMCD       INT         NOT NULL IDENTITY(100,1) PRIMARY KEY,
     ITEMNAME     VARCHAR(50) NOT NULL
   )
GO

-- Insert Records
INSERT INTO [dbo].[tbl_ITEMDTL] 
    (ITEMNAME)
VALUES('Tooth Paste'),
      ('Tooth Brusg'),
      ('Saving Lotion'),
      ('Saving Brush');
     
-- Table Object Sales Dtls
IF OBJECT_ID(N'dbo.tbl_SALEDTLS', N'U')IS NOT NULL
   BEGIN
      DROP TABLE [dbo].[tbl_SALEDTLS];
   END
GO           
CREATE TABLE [dbo].[tbl_SALEDTLS]
   (
     SALENO   INT   NOT NULL,
     SRLNO    INT   NOT NULL,
     CUSTID   INT   NOT NULL,
     ITEMCD   INT   NOT NULL,
     CONSTRAINT PK_tbl_SALEDTLS PRIMARY KEY
        (
           SALENO ASC,
           SRLNO  ASC
        )
   )           
GO

-- Insert Records
INSERT INTO [dbo].[tbl_SALEDTLS]
      (SALENO, SRLNO, CUSTID, ITEMCD)
VALUES(201, 1, 1, 100),
      (201, 2, 1, 101),
      (201, 3, 1, 102),
      (201, 4, 1, 103),
      (202, 1, 2, 100),
      (202, 2, 2, 101),
      (203, 1, 3, 100);  
     
GO
SELECT * FROM  [dbo].[tbl_CUSTOMER];
SELECT * FROM  [dbo].[tbl_ITEMDTL];
SELECT * FROM  [dbo].[tbl_SALEDTLS]; 

-- Query
SELECT a.CUSTID, a.CUSTNAME,
       STUFF((SELECT ', '+ y.ITEMNAME
              FROM   [dbo].[tbl_SALEDTLS] AS x
                     INNER JOIN [dbo].[tbl_ITEMDTL] AS y
              ON x.ITEMCD = y.ITEMCD
              WHERE  x.CUSTID = a.CUSTID
              ORDER BY ',' + y.ITEMNAME
              FOR XML PATH('')),1,1,'') AS [ITEM DETAILS]
FROM   [dbo].[tbl_CUSTOMER] AS a; 

CUSTID   CUSTNAME                      ITEM DETAILS
1              Joydeep Das                    Saving Brush, Saving Lotion, Tooth Brusg, Tooth Paste
2              Chandan Bannerjee       Tooth Brusg, Tooth Paste
3              Soumen Bhowmik           Tooth Paste



Hope you like it.


Posted by: MR. JOYDEEP DAS

Limitation of UDF

Introduction

All the developers are very well-known about the UDF in MS SQL server. The UDF is first introducing at Microsoft SQL Server 2000.
Here in this article we are trying to discuss about some limitation of UDF.

No Side-Effects
By definition, a UDF is supposed to have no side-effects on the database. 
In MS SQL Server user defined functions (UDFs) you cannot take any actions (update, delete, insert) that modify data on any object outside the scope of the UDF.  A table variable inside the UDF is, of course, allowed. 

Let’s take an Example:

IF OBJECT_ID(N'dbo.fn_MYUDF', N'FN')IS NOT NULL
   BEGIN
      DROP FUNCTION [dbo].[fn_MYUDF];
   END
GO
CREATE FUNCTION [dbo].[fn_MYUDF]
       (
         @p_EMPID   INT
       ) 
RETURNS INT
AS
BEGIN
   DECLARE @intRETURN INT = 1;
  
   UPDATE dbo.tbl_Employee
     SET EMPNAME='Joydeep Das'
   WHERE EMPID=@p_EMPID;
  
   RETURN @intRETURN; 
END  

When we compile this we get an error output.

Msg 443, Level 16, State 15, Procedure fn_MYUDF, Line 10
Invalid use of a side-effecting operator 'UPDATE' within a function.

Can NOT use Non-deterministic Built-in function
Within UDF we cannot use any non-deterministic built in function. Non deterministic functions may return different results each time they are called with a specific set of input values even if the database state that they access remains the same. We cannot influence the determinism of any built-in function. Each built-in function is deterministic or non deterministic based on how the function is implemented by SQL Server.

SQL Server 2005 has allowed some of the non-deterministic functions that were not allowed in 2000. In 2005 we can now use CURRENT_TIMESTAMP, @@MAX_CONNECTIONS, GET_TRANSMISSION_STATUS, @@PACK_RECEIVED, GETDATE, @@PACK_SENT,
GETUTCDATE, @@PACKET_ERRORS, @@CONNECTIONS, @@TIMETICKS, @@CPU_BUSY, @@TOTAL_ERRORS, @@DBTS, @@TOTAL_READ, @@IDLE,
@@TOTAL_WRITE and,  @@IO_BUSY

Limitation of Parameters
UDF can have up to 1023 input parameters; Stored Procedure can have up to 21000 input parameters. This is a relatively minor limitation because most routines require a much smaller number of parameters.

Limitation of CURSOR Declaration

IF OBJECT_ID(N'dbo.fn_MYUDF', N'FN')IS NOT NULL
   BEGIN
      DROP FUNCTION [dbo].[fn_MYUDF];
   END
GO
CREATE FUNCTION [dbo].[fn_MYUDF]
       (
         @p_OPMODE   INT
       ) 
RETURNS INT
AS
BEGIN
   DECLARE @intRETURN INT;
  
   DECLARE cur_test SCROLL CURSOR
        FOR SELECT o.id FROM sysobjects as o;
       
   RETURN @intRETURN; 
END 

Msg 1049, Level 15, State 1, Procedure fn_MYUDF, Line 0
Mixing old and new syntax to specify cursor options is not allowed.

In UDF we need ANSI SQL style to declare cursor.

In TSQL, user-defined functions cannot modify state of the database or persist information between executions. This is one of the reasons to disallow insert/update/delete on permanent tables or dynamic SQL.

This also means that UDFs will disallow cursors that are global in nature which happens to be the case with those declared using the ANSI SQL syntax. Note that there is no way to specify cursor as local using the ANSI SQL syntax. So we default all cursors declared within UDFs to be local implicitly. This creates a conflict in case of the DECLARE CURSOR statement using ANSI SQL syntax since that doesn't support the local option. Hence the error message about mixing old and new syntax for cursor options. The workaround is to use TSQL syntax for the DECLARE CURSOR statement to specify LOCAL or leave it out in which case we default to LOCAL anyway.



IF OBJECT_ID(N'dbo.fn_MYUDF', N'FN')IS NOT NULL
   BEGIN
      DROP FUNCTION [dbo].[fn_MYUDF];
   END
GO
CREATE FUNCTION [dbo].[fn_MYUDF]
       (
         @p_OPMODE   INT
       ) 
RETURNS INT
AS
BEGIN
   DECLARE @intRETURN INT;
  
   DECLARE cur_test CURSOR SCROLL
        FOR SELECT o.id FROM sysobjects as o;
       
   RETURN @intRETURN; 
END 

OR

IF OBJECT_ID(N'dbo.fn_MYUDF', N'FN')IS NOT NULL
   BEGIN
      DROP FUNCTION [dbo].[fn_MYUDF];
   END
GO
CREATE FUNCTION [dbo].[fn_MYUDF]
       (
         @p_OPMODE   INT
       ) 
RETURNS INT
AS
BEGIN
   DECLARE @intRETURN INT;
  
   DECLARE cur_test  CURSOR LOCAL SCROLL
        FOR SELECT o.id FROM sysobjects as o;
       
   RETURN @intRETURN; 
END   

Return Only One Result Set
UDFs can return only one row set to the user, whereas stored procedures can return multiple row sets.

UDF Can NOT Call Regular Stored Procedure
We can't call regular stored Procedure from functions - only other functions or some extended stored procedures.

-- Procedure
IF OBJECT_ID(N'dbo.proc_MYPROC', N'P')IS NOT NULL
   BEGIN
      DROP PROCEDURE [dbo].[proc_MYPROC];
   END
GO
CREATE PROCEDURE [dbo].[proc_MYPROC]
     (
        @p_EMPID   INT = 0
     )
 AS
 BEGIN
    SELECT *
    FROM   tbl_Employee
    WHERE  EMPID = @p_EMPID;
 END      
GO

--Function
IF OBJECT_ID(N'dbo.fn_MYUDF', N'FN')IS NOT NULL
   BEGIN
      DROP FUNCTION [dbo].[fn_MYUDF];
   END
GO
CREATE FUNCTION [dbo].[fn_MYUDF]
       (
         @p_ID   INT
       ) 
RETURNS INT
AS
BEGIN
   DECLARE @intRETURN INT;
  
   EXEC [dbo].[proc_MYPROC]
      @p_EMPID = @p_ID
       
   RETURN @intRETURN; 
END   

--Function Execution
SELECT  dbo.fn_MYUDF (1)

Msg 557, Level 16, State 2, Line 1
Only functions and some extended stored procedures
can be executed from within a function.

Can NOT Execute Dynamic SQL
UDFs also cannot execute dynamically constructed SQL statements. If we need to construct a statement dynamically based on the parameter values, we must resort to using stored procedures.

Can NOT Support SET Operation
SET options can affect not only the performance of the queries, but their output as well.
SET options cannot be altered within UDFs.

IF OBJECT_ID(N'dbo.fn_MYUDF', N'FN')IS NOT NULL
   BEGIN
      DROP FUNCTION [dbo].[fn_MYUDF];
   END
GO
CREATE FUNCTION [dbo].[fn_MYUDF]
       (
         @p_ID   INT
       ) 
RETURNS INT
AS
BEGIN
   DECLARE @intRETURN INT;
  
   SET NOCOUNT ON;
       
   RETURN @intRETURN; 
END 


Msg 443, Level 16, State 15, Procedure fn_MYUDF, Line 10
Invalid use of a side-effecting operator 'SET OPTION ON' within a function.

Error Checking Limitation
RAISERROR, PRINT statement is not allowed by UDF. Even @@ERROR global variable is not supported by UDF. If you encounter an error, UDF execution simply stops, and the calling routine fails. We are allowed to write a message to the Windows error log with xp_logevent if you have permission to use this extended procedure.




Hope you like it





Posted by: MR. JOYDEEP DAS