Friday, 28 September 2012

Sparse Columns



Introductions

In this article I am trying to discuss about a very interesting feture of MS SQL Server and it is called the Sparse Columns. This feature is started from MS SQL Server 2008 and onwards. Hope it will be interesting and informative.

Point in Focus

What is Sparse columns

Understand the Sparse columns with Example

Which columns not support it

Which datatypes not support it.

Where it is Best use

Others restrictions

What is Sparse columns

Sparse columns are nothing but a ordinary columns which is designed for optimized the NULL values. The benefits of the Sparse columns is they reduce space required for NULL values.
Another benefits is its reduce the costs of increased processing overhead to retrive the non-null values.

Let see the MSDN defination related to Sparse columns.

Sparse columns are ordinary columns that have an optimized storage for null values. Sparse columns reduce the space requirements for null values at the cost of more overhead to retrieve nonnull values. Consider using sparse columns when the space saved is at least 20 percent to 40 percent


In MS SQL Server 2008 the maximum columns supported by a table objects is 1024. But the Sparse columns does not comes under this limit. The maximum Sparse coum supported by a table is 100,000. So a table objects may conatins 1024 regular columns + 100,000 Sparse columns.

Understand the Sparse columns with Example

In this example I am using a system stored procedure sp_spaceused. Before staring the example a samll note related to sp_spaceused from MSDN is mentioned bellow. For more details, please follow the MSDN link.

Displays the number of rows, disk space reserved, and disk space used by a table, indexed view, or Service Broker queue in the current database, or displays the disk space reserved and used by the whole database.


Step-1 [ Creating the Base Table ]

-- Without SPARSE Columns
CREATE TABLE ExmapleTable1
      (EMPID   INT IDENTITY(1,1),
       FNAME   VARCHAR(50),
       LNAME   VARCHAR(50));

GO
-- With SPARSE
CREATE TABLE ExmapleTable2
      (EMPID   INT IDENTITY(1,1),
       FNAME   VARCHAR(50) SPARSE,
       LNAME   VARCHAR(50) SPARSE);

Step-2 [ Now Insert Some Records ]

DECLARE @idx INT = 0;
WHILE @idx < 50000
      BEGIN
         INSERT INTO ExmapleTable1 VALUES (NULL, NULL);
         INSERT INTO ExmapleTable2 VALUES (NULL, NULL);
         SET @idx+=1;
      END

Step-3 [ Now compre both ]

sp_spaceused 'ExmapleTable1'
GO
sp_spaceused 'ExmapleTable2'


Name
Rows
Reserved
Data
index_size
Unused
ExmapleTable1
50000
2824 KB
2808 KB
8 KB
8 KB
Name
Rows
Reserved
Data
index_size
Unused
ExmapleTable2
50000
1416 KB
1408 KB
8 KB
8 KB
                                                           
Which columns not support it

Some of the columns mentioed bellow is not supported by Sparse
Computed Columns / RowGuid / Filestream / Identity / XML

Which datatypes not support it

The following datatype not supported Sparse
Geography / Geometry / Image / Ntext / Text / Timestamp / Userdefine data type

Where it is Best use

The Sparse columns is best used in Fintered Index. Where data are filtered in the row.

Others restrictions


We can not bound rueles with sparse columns and it can not have any defualt value.
It can not be a part of Clustered or a unique Primary key indexes. It can not be used as a partition key of a clustered index or heap.

Sparse columns can not be used with userdefine table type (Table variable and Table valued parameters)



Hope you like it.




Posted By: MR. JOYDEEP DAS

Saturday, 22 September 2012

Inserted Scanning Performance




Introduction

Here there is an interesting case scenario mentioned bellow.

We're pulling values from an "inserted" table that is created in an
insert trigger. When we do an insert and view the execution plan in
Query Analyzer, the "Inserted Scan" accounts for 98% of the entire
query cost


So, in this article we are trying to a closer look related to the Inserted and Deleted table performance and how to optimize the performance of Inserted and Deleted table.

Please note that the details related to Inserted and Deleted table and how to use it in Trigger is the out of scope of this article.

Point in Focus

Facts related to Inserted and Deleted Tables

 Why the Performance is going down

How to Improve the Performance


Facts related to Inserted and Deleted Tables

In MS SQL Server 2000, these logical tables internally refer to database transaction log to provide data when user queries them.

In SQL Server 2005, these logical tables are maintained in tempdb and they are maintained using the new technology Row versioning.

Accessing of logical tables is much faster in MS SQL Server 2005 when compared to MS SQL Server 2000 as the load is removed from transaction log and moved to tempdb.

We cannot create any Index in the Logical table.

Why the Performance is going down

In general cases the performance is going too degraded if we use the Inserted and Deleted table more than once within triggers.

How to Improve the Performance

Performance can be improved, if we putting it into temp database by using temp table and index it well.


If anyone has any suggestion related to improving the performance of Inserted and deleted table, please make comments on this post.




Hope you like it.





Posted by: MR. JOYDEEP DAS

Tuesday, 11 September 2012

Foreign Key and Index Binding


Introductions
In this article I am trying to discuss about the foreign key Index binding features. Hope it will be interesting and we can learn something new from it.
Point in focus
1.    Foreign Key Definitions
2.    Why Index Binding is necessary
3.    What we find in our observations

Let's start it.

Foreign Key Definitions

Here I am just providing a definition of foreign key to understand it properly. If we look at the MSDN

"A foreign key (FK) is a column or combination of columns used to establish and enforce a link between the data in two tables. A link is created between two tables by adding the column or columns that hold one table's primary key values to the other table. This column becomes a foreign key in the second table."

In ideal conditions a foreign key must be liked with the Primary Key constraint in another table. But it is not necessary; it can be defined to reference the columns of a UNIQUE constraint in another table.  

Why Index Binding is necessary

A foreign key referencing the primary key will always be bound to the primary key index. However, we might have additional unique indexes on the primary key column or column(s) for performance reasons.

This Index binding is necessary for two reasons.

Performance Factors

If the parent table is not queried often but rows are inserted into the child table frequently and heavily, a unique non-clustered index that "covers" the referential integrity check may be more desirable than the clustered index.

In SQL server 2008 and later versions preferred unique non-clustered index rather than the clustered index. 

Control the index bound 
If we try to drop an index bound to the foreign key it generate an error message. To do so we must drop the foreign key first and then drop the index.
What we find in our observations
Now we see a simple example of foreign key Index binding and observe the result very carefully.
Step-1 [ Creating Parent Table ]
IF OBJECT_ID('ChildTable') IS NOT NULL
   BEGIN
     DROP TABLE ChildTable;
   END
GO
IF OBJECT_ID('ParentTable') IS NOT NULL
   BEGIN
     DROP TABLE ParentTable;
   END
GO

CREATE TABLE dbo.ParentTable
       (PtblID INT         NOT NULL IDENTITY
               CONSTRAINT PK_ParentTable 
               PRIMARY KEY CLUSTERED,
        Col1   INT         NOT NULL,
        Col2   VARCHAR(50) NOT NULL

       );

GO
Step-2 [ Creating non Clustered Covered Index ]
CREATE UNIQUE NONCLUSTERED INDEX idx_PTable_PtblID
       ON dbo.ParentTable(PtblID) INCLUDE(Col1);

Step-3 [ Inserting some records in Parent Table ]
INSERT INTO dbo.ParentTable VALUES(1, 'Joydeep Das');
INSERT INTO dbo.ParentTable VALUES(2, 'Sukamal Jana');
INSERT INTO dbo.ParentTable VALUES(3, 'Tuhin Shinah');

GO
Step-4 [ Creating Child table with Foreign Key ]
CREATE TABLE dbo.ChildTable
       (CtblID   INT NOT NULL IDENTITY
                 CONSTRAINT PK_ChildTable PRIMARY KEY CLUSTERED,
        PtblID   INT NOT NULL 
                 CONSTRAINT FK_ChildTable_ParentTable
                 FOREIGN KEY REFERENCES dbo.ParentTable(PtblID)

       );

GO
Step-5 [ Inserting Records to Child Table ]
INSERT INTO dbo.ChildTable VALUES(1);
INSERT INTO dbo.ChildTable VALUES(1);
INSERT INTO dbo.ChildTable VALUES(1);
INSERT INTO dbo.ChildTable VALUES(1);
INSERT INTO dbo.ChildTable VALUES(2);
INSERT INTO dbo.ChildTable VALUES(2);
INSERT INTO dbo.ChildTable VALUES(2);
INSERT INTO dbo.ChildTable VALUES(2);
INSERT INTO dbo.ChildTable VALUES(3);
INSERT INTO dbo.ChildTable VALUES(3);
INSERT INTO dbo.ChildTable VALUES(3);
INSERT INTO dbo.ChildTable VALUES(3);

GO
Step-6 [ Update the Statistics ]
UPDATE STATISTICS dbo.ParentTable;
UPDATE STATISTICS dbo.ChildTable;

GO

Step-7 [ Finding the Foreign key Binding ]
SELECT a.name As [FK Binding]
FROM   sys.foreign_keys f
       INNER JOIN sys.indexes
       ON a.object_id = f.referenced_object_id    
          AND a.index_id = f.key_index_id
WHERE  f.object_id = OBJECT_ID(N'dbo.FK_ChildTable_ParentTable');

GO
Now lets execute the above scrips in different version of the MS SQL Server.
MS SQL Server 2005
The output of the step-7 is
FK Binding
PK_ParentTable


SQL Server 2005 chooses the clustered index when possible and, if no suitable clustered index exists, then the unique non-clustered index on the referenced column is used.
SQL 2008, SQL 2008R2 and SQL 2012
FK Binding
idx_PTable_PtblID


The foreign key is bound to the unique non-clustered index on the referenced column.
Hope you like it.


Posted by: MR. JOYDEEP DAS

Friday, 7 September 2012

INFORMATION_SCHEMA


Introductions

When we are taking about any RDBMS, it should be a data dictionary or metadata. MS SQL Server has no exceptions.
In MS SQL Server there are two sources to view the metadata information.  
1.    Various System Tables
2.    INFORMATION_SCHEMA views

In this article I am trying to discuss about the INFORMATION_SCHEMA.

Point to Focus

1.    Metadata
2.    Anything wrong to pull information from system table
3.    About INFORMATION_SCHEMA view
4.    Descriptions of views
5.    Using INFORMATION_SCHEMA

Metadata

First we understand that what metadata is. In common understanding about metadata, we can say that it is "data about data". In the context of database it means "Information (data) stored about data, the structures or objects related to data".

Anything wrong to pull information from system table

As per Microsoft, information wise there is no difference but Microsoft reserves the rights to change any system table from version to version. So if an application depends on the system table can face a serious problem due to version change. Is it true? I don't find such an example in my development career. But we should remember the statutory warning of Microsoft.


About INFORMATION_SCHEMA view

SQL Server makes available the Information Schema Views through INFORMATION_SCHEMA schema. It is available in each database and storing information about all database objects contained in the respective database.  The following query is used to retrieve the views available under the INFORMATION_SCHEMA schema and their definitions.

SELECT s.name [Schema_Name],
       o.name [Object_Name],
       sm.definition [Schema_Defination]
FROM   sys.all_sql_modules sm
       INNER JOIN sys.all_objects o ON sm.object_id = o.object_id
       INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
WHERE  s.name = 'INFORMATION_SCHEMA'
ORDER BY o.name;




Here most of the views names are self describing.


Descriptions of Views

Here are the lists of views and there descriptions are mentioned bellow. For better understanding please refer to MSDN.

No
View
Description
1.
Returns one row for each CHECK constraint
2.
Returns one row for each column that has an alias data type
3.
Returns one row for each column that has a privilege that is either granted to or granted by
4.
Returns one row for each column
5.
Returns one row for each column that has a constraint defined on it
6.
Returns one row for each table that has a constraint defined on it
7.
Returns one row for each alias data type that has a rule bound to it
8.
Returns one row for each alias data type
9.
Returns one row for each column that is constrained as a key
10.
Returns one row for each parameter of a user-defined function or stored procedure
11.
Returns one row for each FOREIGN KEY constraint
12.
Returns one row for each stored procedure and function
13.
Returns one row for each column returned by table-valued functions
14.
Returns one row for each schema
15.
Returns one row for each table constraint
16.
Returns one row for each table privilege that is granted to or granted by
17.
Returns one row for each table
18.
Returns one row for each column that is used in a view definition
19.
Returns one row for each table that is used in a view
20.
Returns one row for views



Using INFORMATION_SCHEMA

Here is the simple example to understand the information schema.

1.  TABLEs

SELECT * FROM information_schema.tables
WHERE TABLE_TYPE = 'BASE TABLE';

SELECT * FROM information_schema.tables
WHERE  TABLE_TYPE = 'BASE TABLE'
       AND table_name = 'xyz';
      
SELECT * FROM information_schema.tables
WHERE  TABLE_TYPE = 'BASE TABLE'
       AND table_name = 'xyz'
       AND table_schema = 'abc';

2.  VIEWs

SELECT * FROM information_schema.tables
WHERE  TABLE_TYPE = 'VIEW';


3.  List CONSTRAINTs

it retrieves all the constraints of Table objects "XYZ"

SELECT constraint_name, constraint_type
FROM   INFORMATION_SCHEMA.TABLE_CONSTRAINTS
WHERE  table_name = 'xyz';


4.  List FUNCTIONs


SELECT routine_name
FROM   INFORMATION_SCHEMA.ROUTINES
WHERE  routine_type = N'FUNCTION';



Hope you like it.



Posted by: MR. JOYDEEP DAS