Microsoft SQL Server Knowledge Bank: September 2012

Friday, 28 September 2012

Sparse Columns

Introductions

In this article I am trying to discuss about a very interesting feture of MS SQL Server and it is called the Sparse Columns. This feature is started from MS SQL Server 2008 and onwards. Hope it will be interesting and informative.

Point in Focus

What is Sparse columns

Understand the Sparse columns with Example

Which columns not support it

Which datatypes not support it.

Where it is Best use

Others restrictions

What is Sparse columns

Sparse columns are nothing but a ordinary columns which is designed for optimized the NULL values. The benefits of the Sparse columns is they reduce space required for NULL values.

Another benefits is its reduce the costs of increased processing overhead to retrive the non-null values.

Let see the MSDN defination related to Sparse columns.

“Sparse columns are ordinary columns that have an optimized storage for null values. Sparse columns reduce the space requirements for null values at the cost of more overhead to retrieve nonnull values. Consider using sparse columns when the space saved is at least 20 percent to 40 percent”

http://msdn.microsoft.com/en-us/library/cc280604.aspx

In MS SQL Server 2008 the maximum columns supported by a table objects is 1024. But the Sparse columns does not comes under this limit. The maximum Sparse coum supported by a table is 100,000. So a table objects may conatins 1024 regular columns + 100,000 Sparse columns.

Understand the Sparse columns with Example

In this example I am using a system stored procedure sp_spaceused. Before staring the example a samll note related to sp_spaceused from MSDN is mentioned bellow. For more details, please follow the MSDN link.

“Displays the number of rows, disk space reserved, and disk space used by a table, indexed view, or Service Broker queue in the current database, or displays the disk space reserved and used by the whole database.”

http://msdn.microsoft.com/en-us/library/ms188776.aspx

Step-1 [ Creating the Base Table ]

-- Without SPARSE Columns

CREATE TABLE ExmapleTable1

(EMPID INT IDENTITY(1,1),

FNAME VARCHAR(50),

LNAME VARCHAR(50));

-- With SPARSE

CREATE TABLE ExmapleTable2

(EMPID INT IDENTITY(1,1),

FNAME VARCHAR(50) SPARSE,

LNAME VARCHAR(50) SPARSE);

Step-2 [ Now Insert Some Records ]

DECLARE @idx INT = 0;

WHILE @idx < 50000

BEGIN

INSERT INTO ExmapleTable1 VALUES (NULL, NULL);

INSERT INTO ExmapleTable2 VALUES (NULL, NULL);

SET @idx+=1;

END

Step-3 [ Now compre both ]

sp_spaceused 'ExmapleTable1'

sp_spaceused 'ExmapleTable2'

Name	Rows	Reserved	Data	index_size	Unused
ExmapleTable1	50000	2824 KB	2808 KB	8 KB	8 KB
Name	Rows	Reserved	Data	index_size	Unused
ExmapleTable2	50000	1416 KB	1408 KB	8 KB	8 KB

Which columns not support it

Some of the columns mentioed bellow is not supported by Sparse

Computed Columns / RowGuid / Filestream / Identity / XML

Which datatypes not support it

The following datatype not supported Sparse

Geography / Geometry / Image / Ntext / Text / Timestamp / Userdefine data type

Where it is Best use

The Sparse columns is best used in Fintered Index. Where data are filtered in the row.

Others restrictions

We can not bound rueles with sparse columns and it can not have any defualt value.

It can not be a part of Clustered or a unique Primary key indexes. It can not be used as a partition key of a clustered index or heap.

Sparse columns can not be used with userdefine table type (Table variable and Table valued parameters)

Hope you like it.

Posted By: MR. JOYDEEP DAS

Saturday, 22 September 2012

Inserted Scanning Performance

Introduction

Here there is an interesting case scenario mentioned bellow.

We're pulling values from an "inserted" table that is created in an
insert trigger. When we do an insert and view the execution plan in
Query Analyzer, the "Inserted Scan" accounts for 98% of the entire
query cost

So, in this article we are trying to a closer look related to the Inserted and Deleted table performance and how to optimize the performance of Inserted and Deleted table.

Please note that the details related to Inserted and Deleted table and how to use it in Trigger is the out of scope of this article.

Point in Focus

Facts related to Inserted and Deleted Tables

Why the Performance is going down

How to Improve the Performance

Facts related to Inserted and Deleted Tables

In MS SQL Server 2000, these logical tables internally refer to database transaction log to provide data when user queries them.

In SQL Server 2005, these logical tables are maintained in tempdb and they are maintained using the new technology Row versioning.

Accessing of logical tables is much faster in MS SQL Server 2005 when compared to MS SQL Server 2000 as the load is removed from transaction log and moved to tempdb.

We cannot create any Index in the Logical table.

Why the Performance is going down

In general cases the performance is going too degraded if we use the Inserted and Deleted table more than once within triggers.

How to Improve the Performance

Performance can be improved, if we putting it into temp database by using temp table and index it well.

If anyone has any suggestion related to improving the performance of Inserted and deleted table, please make comments on this post.

Hope you like it.

Posted by: MR. JOYDEEP DAS

Tuesday, 11 September 2012

Foreign Key and Index Binding

Introductions

In this article I am trying to discuss about the foreign key Index binding features. Hope it will be interesting and we can learn something new from it.

Point in focus

1. Foreign Key Definitions

2. Why Index Binding is necessary

3. What we find in our observations

Let's start it.

Foreign Key Definitions

Here I am just providing a definition of foreign key to understand it properly. If we look at the MSDN

"A foreign key (FK) is a column or combination of columns used to establish and enforce a link between the data in two tables. A link is created between two tables by adding the column or columns that hold one table's primary key values to the other table. This column becomes a foreign key in the second table."

In ideal conditions a foreign key must be liked with the Primary Key constraint in another table. But it is not necessary; it can be defined to reference the columns of a UNIQUE constraint in another table.

Why Index Binding is necessary

A foreign key referencing the primary key will always be bound to the primary key index. However, we might have additional unique indexes on the primary key column or column(s) for performance reasons.

This Index binding is necessary for two reasons.

Performance Factors

If the parent table is not queried often but rows are inserted into the child table frequently and heavily, a unique non-clustered index that "covers" the referential integrity check may be more desirable than the clustered index.

In SQL server 2008 and later versions preferred unique non-clustered index rather than the clustered index.

Control the index bound

If we try to drop an index bound to the foreign key it generate an error message. To do so we must drop the foreign key first and then drop the index.

What we find in our observations

Now we see a simple example of foreign key Index binding and observe the result very carefully.

Step-1 [ Creating Parent Table ]

IF OBJECT_ID('ChildTable') IS NOT NULL

BEGIN

DROP TABLE ChildTable;

END

IF OBJECT_ID('ParentTable') IS NOT NULL

BEGIN

DROP TABLE ParentTable;

END

CREATE TABLE dbo.ParentTable

(PtblID INT NOT NULL IDENTITY

CONSTRAINT PK_ParentTable
PRIMARY KEY CLUSTERED,

Col1 INT NOT NULL,

Col2 VARCHAR(50) NOT NULL

);

Step-2 [ Creating non Clustered Covered Index ]

CREATE UNIQUE NONCLUSTERED INDEX idx_PTable_PtblID

ON dbo.ParentTable(PtblID) INCLUDE(Col1);

Step-3 [ Inserting some records in Parent Table ]

INSERT INTO dbo.ParentTable VALUES(1, 'Joydeep Das');

INSERT INTO dbo.ParentTable VALUES(2, 'Sukamal Jana');

INSERT INTO dbo.ParentTable VALUES(3, 'Tuhin Shinah');

Step-4 [ Creating Child table with Foreign Key ]

CREATE TABLE dbo.ChildTable

(CtblID INT NOT NULL IDENTITY

CONSTRAINT PK_ChildTable PRIMARY KEY CLUSTERED,

PtblID INT NOT NULL
CONSTRAINT FK_ChildTable_ParentTable

FOREIGN KEY REFERENCES dbo.ParentTable(PtblID)

);

Step-5 [ Inserting Records to Child Table ]

INSERT INTO dbo.ChildTable VALUES(1);

INSERT INTO dbo.ChildTable VALUES(2);

INSERT INTO dbo.ChildTable VALUES(3);

Step-6 [ Update the Statistics ]

UPDATE STATISTICS dbo.ParentTable;

UPDATE STATISTICS dbo.ChildTable;

Step-7 [ Finding the Foreign key Binding ]

SELECT a.name As [FK Binding]

FROM sys.foreign_keys f

INNER JOIN sys.indexes a
ON a.object_id = f.referenced_object_id

AND a.index_id = f.key_index_id

WHERE f.object_id = OBJECT_ID(N'dbo.FK_ChildTable_ParentTable');

Now lets execute the above scrips in different version of the MS SQL Server.

MS SQL Server 2005

The output of the step-7 is

FK Binding

PK_ParentTable

SQL Server 2005 chooses the clustered index when possible and, if no suitable clustered index exists, then the unique non-clustered index on the referenced column is used.

SQL 2008, SQL 2008R2 and SQL 2012

FK Binding

idx_PTable_PtblID

The foreign key is bound to the unique non-clustered index on the referenced column.

Hope you like it.

Posted by: MR. JOYDEEP DAS

Friday, 7 September 2012

INFORMATION_SCHEMA

Introductions

When we are taking about any RDBMS, it should be a data dictionary or metadata. MS SQL Server has no exceptions.

In MS SQL Server there are two sources to view the metadata information.

1. Various System Tables

2. INFORMATION_SCHEMA views

In this article I am trying to discuss about the INFORMATION_SCHEMA.

Point to Focus

1. Metadata

2. Anything wrong to pull information from system table

3. About INFORMATION_SCHEMA view

4. Descriptions of views

5. Using INFORMATION_SCHEMA

Metadata

First we understand that what metadata is. In common understanding about metadata, we can say that it is "data about data". In the context of database it means "Information (data) stored about data, the structures or objects related to data".

Anything wrong to pull information from system table

As per Microsoft, information wise there is no difference but Microsoft reserves the rights to change any system table from version to version. So if an application depends on the system table can face a serious problem due to version change. Is it true? I don't find such an example in my development career. But we should remember the statutory warning of Microsoft.

About INFORMATION_SCHEMA view

SQL Server makes available the Information Schema Views through INFORMATION_SCHEMA schema. It is available in each database and storing information about all database objects contained in the respective database. The following query is used to retrieve the views available under the INFORMATION_SCHEMA schema and their definitions.

SELECT s.name [Schema_Name],

o.name [Object_Name],

sm.definition [Schema_Defination]

FROM sys.all_sql_modules sm

INNER JOIN sys.all_objects o ON sm.object_id = o.object_id

INNER JOIN sys.schemas s ON o.schema_id = s.schema_id

WHERE s.name = 'INFORMATION_SCHEMA'

ORDER BY o.name;

Here most of the views names are self describing.

Descriptions of Views

Here are the lists of views and there descriptions are mentioned bellow. For better understanding please refer to MSDN.

No	View	Description
1.	CHECK_CONSTRAINTS	Returns one row for each CHECK constraint
2.	COLUMN_DOMAIN_USAGE	Returns one row for each column that has an alias data type
3.	COLUMN_PRIVILEGES	Returns one row for each column that has a privilege that is either granted to or granted by
4.	COLUMNS	Returns one row for each column
5.	CONSTRAINT_COLUMN_USAGE	Returns one row for each column that has a constraint defined on it
6.	CONSTRAINT_TABLE_USAGE	Returns one row for each table that has a constraint defined on it
7.	DOMAIN_CONSTRAINTS	Returns one row for each alias data type that has a rule bound to it
8.	DOMAINS	Returns one row for each alias data type
9.	KEY_COLUMN_USAGE	Returns one row for each column that is constrained as a key
10.	PARAMETERS	Returns one row for each parameter of a user-defined function or stored procedure
11.	REFERENTIAL_CONSTRAINTS	Returns one row for each FOREIGN KEY constraint
12.	ROUTINES	Returns one row for each stored procedure and function
13.	ROUTINE_COLUMNS	Returns one row for each column returned by table-valued functions
14.	SCHEMATA	Returns one row for each schema
15.	TABLE_CONSTRAINTS	Returns one row for each table constraint
16.	TABLE_PRIVILEGES	Returns one row for each table privilege that is granted to or granted by
17.	TABLES	Returns one row for each table
18.	VIEW_COLUMN_USAGE	Returns one row for each column that is used in a view definition
19.	VIEW_TABLE_USAGE	Returns one row for each table that is used in a view
20.	VIEWS	Returns one row for views

Using INFORMATION_SCHEMA

Here is the simple example to understand the information schema.

1. TABLEs

SELECT * FROM information_schema.tables

WHERE TABLE_TYPE = 'BASE TABLE';

SELECT * FROM information_schema.tables

WHERE TABLE_TYPE = 'BASE TABLE'

AND table_name = 'xyz';

SELECT * FROM information_schema.tables

WHERE TABLE_TYPE = 'BASE TABLE'

AND table_name = 'xyz'

AND table_schema = 'abc';

2. VIEWs

SELECT * FROM information_schema.tables

WHERE TABLE_TYPE = 'VIEW';

3. List CONSTRAINTs

it retrieves all the constraints of Table objects "XYZ"

SELECT constraint_name, constraint_type

FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS

WHERE table_name = 'xyz';

4. List FUNCTIONs

SELECT routine_name

FROM INFORMATION_SCHEMA.ROUTINES

WHERE routine_type = N'FUNCTION';

Hope you like it.

Posted by: MR. JOYDEEP DAS