Microsoft SQL Server Knowledge Bank: March 2015

Wednesday, 18 March 2015

Column Store Index in MS SQL 2012 / 2014

Introduction

Column store index is the unique feature of Microsoft SQL Server starting from Microsoft SQL 2012 and improving with Microsoft SQL Server 2014. Here in this article I am going to discuss about the architecture of Column store Index.

Please note that this Column store Index architecture is suitable only for OLAP or BI environment not in OTLP environment, others the performance down.

Architecture of Column Store Index

Before going to the depth of the architecture of column store index in MS SQL 2012, we assume that the we all well known about the row based index and how it load in the memory to fulfill the request of a query.

First of all we look at the figure to understand the column store index properly

Segment Concept:

Each column in the column store index is it own segment. A Segment contains the value from one column only. Each and every column in a table that include in the column store index forms individual segment. Which allows the column data access individually. Each segment can from by multiple pages.

A segment has the limitation of 1 millions of rows. So for a big table it is not be possible to fit single entire columns within a single segment, so single column of a table may be fit with multiple segments.

Data is transferred from the disk to memory by segment, not by page. A segment is a highly compressed Large Object (LOB) that can contain up to one million rows. The data within each column’s segment matches row-by-row so that the rows can always be assembled correctly.

In our example the disk data came from three pages that is Page -1, Page-2 and Page-3. If we crate column store index in the column named “Name” then the segment is created from Page-1, Page-2 and Page-3.

Group Concept:

When a column is not fit in a single segment as the data of columns exceed the one million then columns takes multiple segments to store data. In such cases, multiple segments are created for each column and grouped into multiple row groups, one for each set of segments.

When a column store index is broken into multiple row groups, each row group contains a set of complete rows. Please look at the bellow figure to understand it properly.

Dictionary Concept:

Besides column segments a column store index consists of another data storage element: dictionaries. Dictionaries are widely used in columnar storage as a means to efficiently encode large data types, like strings. The values stores in the column segments will be just entry numbers in the dictionary, and the actual values are stored in the dictionary. This technique can yield very good compression for repeated values, but yields bad results if the values are all distinct (the required storage actually increases). This is what makes large columns (strings) with distinct values very poor candidates for column store indexes. Column store indexes contain separate dictionaries for each column and string columns contain two types of dictionaries.

Primary Dictionary

This is a global dictionary used by all segments of a column.

Secondary Dictionary

This is an overflow dictionary for entries that did not fit in the primary dictionaries. It can be shared by several segments of a column: the relation between dictionaries and column segments is one-to-many.

Data Compression:

Column store indexes are based on xVelocity (formerly known as VertiPaq), an advanced storage and compression technology that originated with Power Pivot and Analysis Services but has been adapted to SQL Server 2012 databases.

New feature of Column Store in SQL 2014

We can create only one non-clustered column store index on a table although we can include all the columns of the table in the single index.
SQL Server 2014 enhances it to add support for creating Clustered Column store Index.
When we create a column store index and it makes the table read only.
With SQL Server 2014, we can create a column store index without having much impact on write-ability on the table. This means we can issue some INSERT, UPDATE, DELETE statements with a table with clustered column store index. No more tedious workaround is required for writing data to a table with column store index in this release like the previous release.

Hope you like it.

Posted by: MR. JOYDEEP DAS

Saturday, 14 March 2015

In Memory OLT P (Memory Optimizer Table) In SQL 2014

Introduction

In SQL 2014 comes with some amazing feature that we must discuss in my blog post. One of the interesting features is In Memory OLTP (Memory Optimizer Table).

In this article we are trying to learn about it. Hope it will be interesting and informative.

What is it?

Microsoft is releasing SQL Server 2014 with a new set of In-Memory OLTP features to significantly improve the OLTP performance and reduce the processing time for servers with a large amount of memory and multi-core processors.

In Memory OLTP is a new technology introduced by Microsoft to reduce the work load of OLTP and provide the high improvement of processing. The performance gains also depends on the hardware we have and how we writing our query. By this technology we are taking the full advantage of modern hardware trends and modern application requirement.

SQL Server is generally designed for saving data in the disk and loading the data into the memory to serve the request made by the Query. This done for resource management as it is quite expensive.

Nowadays the cost of the hardware is quite low and we can manage a server with huge REM and multi core and the new feature of Microsoft called In Memory OLTP works on it.

The In-memory OLTP features are a new database engine component, which is fully integrated into SQL Server and runs side by side with the traditional database engine. It allows us to declare a table to be stored in main memory so that your OLTP workload can access this memory resident data faster.

In the memory optimized table all the data is stored in the memory not in Disk.

Type of Memory Optimize table

There are two types of memory optimize table

1. SCHEMA_AND_DATA

2. SCHEMA_ONLY

SCHEMA_AND_DATA :

The SCHEMA_AND_DATA Memory-Optimized table is a table that resides in memory where the data is available after a server crash, a shutdown or a restart of SQL Server

Usages: A point of sales transaction data might be a good fit for a SCHEMA_AND_DATA type table. We might want our point of sales transactions to run as fast as possible so the memory-optimize type table would provide this, but we wouldn’t want to lose those transactions should our server be restarted.

SCHEMA_ONLY

SCHEMA_ONLY Memory-Optimized table is a table that does not persist data should SQL Server crash, or the instance is stopped or restarted. The SCHEMA_ONLY Memory-Optimized tables do retain their table structure should the server crash, or be shutdown.

Usages: SCHEMA_ONLY table would be useful for a staging table in a data warehouse application. Typically it is fairly easily to reload a data warehouse staging table from its data source. This is why making these type of tables a SCHEMA_ONLY type table is relatively safe.

Maintaining Version in Rows

Rows in memory-optimized tables are versioned. This means that each row in the table potentially has multiple versions. All row versions are maintained in the same table data structure. Row versioning is used to allow concurrent reads and writes on the same row.

How we Crate it?

To create Memory-Optimized table we are just following our step by step process to Understand it clearly.

Step-1 [ Create Database to Support Memory Optimize Table ]

First we have to create a database to support Memory-Optimized Table. If needed, we can alter our existing database also.

IF EXISTS (SELECT *

FROM sys.databases

WHERE name = N'InMemoryExample'

)

DROP DATABASE InMemoryExample;

CREATE DATABASE InMemoryExample

ON PRIMARY

(NAME = InMemory_Data,

FILENAME = N'C:\data\InMemoryExample_Data.mdf',

SIZE = 100MB,

FILEGROWTH = 10MB),

FILEGROUP InMemoryExample_InMemory CONTAINS MEMORY_OPTIMIZED_DATA

( NAME = InMemory_InMemory,

FILENAME = N'C:\data\InMemoryExample_InMemory.mdf')

LOG ON

( NAME = InMemoryExample_Log,

FILENAME = N'C:\data\InMemoryExample_Log.ldf',

SIZE = 100MB,

FILEGROWTH = 10MB)

The script look likes same. Please review the script carefully, the only differences that we find is

FILEGROUP InMemoryExample_InMemory CONTAINS MEMORY_OPTIMIZED_DATA

( NAME = InMemory_InMemory,

FILENAME = N'C:\data\InMemoryExample_InMemory.mdf')

We created a FILEGROUP named “InMemoryExample_InMemory” that will be used to support our Memory-Optimized tables. This file group contains a single file. Without this “MEMORY_OPTIMIZED_DATA” file group I wouldn’t be able to create a Memory-Optimized table in our new database.

Step-2 [ Creating Memory Optimized Table ]

Here we are going to create two types of memory optimized table.

SCHEMA_AND_DATA table:

IF OBJECT_ID('MemoryOptimized_Schema_And_Data','U') IS NOT NULL

DROP TABLE MemoryOptimized_Schema_And_Data

CREATE TABLE MemoryOptimized_Schema_And_Data

(

Id INT NOT NULL,

Col1 CHAR(1000) NOT NULL,

CONSTRAINT PK_MemoryOptimized_Schema_And_Data

PRIMARY KEY NONCLUSTERED HASH (Id)

WITH (BUCKET_COUNT = 1024)

) WITH (MEMORY_OPTIMIZED = ON,

DURABILITY = SCHEMA_AND_DATA);

SCHEMA_ONLY Table:

IF OBJECT_ID('MemoryOptimized_Schema_Only','U') IS NOT NULL

DROP TABLE MemoryOptimized_Schema_Only

CREATE TABLE MemoryOptimized_Schema_Only

(

Id INT NOT NULL,

Col1 CHAR(1000) NOT NULL,

CONSTRAINT PK_MemoryOptimized_Schema_Only

PRIMARY KEY NONCLUSTERED HASH (Id)

WITH (BUCKET_COUNT = 1024)

) WITH (MEMORY_OPTIMIZED = ON,

DURABILITY = SCHEMA_ONLY);

Now we have to understand it.

Please look at the part of the script carefully.

CONSTRAINT PK_MemoryOptimized_Schema_Only

PRIMARY KEY NONCLUSTERED HASH (Id)

WITH (BUCKET_COUNT = 1024)

Here we are creating NON CLAUSTERED HASH Index on columns ID. The memory optimized table needs the HASH index It cannot be more than the 8. With CTP1 only columns with Windows BIN2 collation types can be used for a HASH index. Therefore on our table we are just create a HASH index on the INT column and not the char column.

Second thing is BUCKET_COUNT. The BUCKET_COUNT identifies the number of different buckets SQL Server will create in memory to store my Memory-Optimized table records. Each bucket is identified by the value created when hashing the index column. Each unique index key value that has the same hash value will be stored in the same bucket. Therefore it is recommended that we create a bucket value that is equal to or greater than the number of unique key values we expect for your Memory-Optimized table.

Memory-Optimized tables only support the following data types: bit, tinyint, smallint, int, bigint, money, smallmoney, float, real, datetime, smalldatetime, datetime2, date, time, numberic, decimal, char(n), varchar(n) ,nchar(n), nvarchar(n), sysname, binary(n), varbinary(n), and Uniqueidentifier. Notice that none of the large Binary Object data types are allowed, even the variable character “max” data types. Something worth also mentioning is the combined record length must not exceed 8060. This record length limitation will be enforced at the time we create our table.

Step-3 [ Inserting data into Memory-Optimized Table ]

SET NOCOUNT ON;

USE InMemoryExample;

DELETE FROM MemoryOptimized_Schema_And_Data;

DELETE FROM MemoryOptimized_Schema_Only;

SET STATISTICS IO Off;

SET STATISTICS TIME Off;

DECLARE @s datetime = getdate()

-- Load Normal Table

DECLARE @I int = 0;

WHILE @I < 1000

BEGIN

SET @I+=1;

INSERT INTO Normal(Id,C1)
VALUES (@i,cast(@I as varchar(4)) + 'A');

END;

SELECT DATEDIFF(ms,@s,getdate()) as Normal;

-- Load SchemaAnadData table --

SET @s = getdate();

SET @I = 0;

WHILE @I < 1000

BEGIN

SET @I+=1;

INSERT INTO MemoryOptimized_Schema_And_Data

(Id, Col1)

VALUES (@i,cast(@I as varchar(4)) + 'A');

END;

SELECT DATEDIFF(ms,@s,getdate()) as SchemaAndData;

-- Load SchemaOnly table

SET @s = getdate();

SET @I = 0;

WHILE @I < 1000

BEGIN

SET @I+=1;

INSERT INTO MemoryOptimized_Schema_Only

(Id, Col1)

VALUES (@i,cast(@I as varchar(4)) + 'A');

END;

SELECT DATEDIFF(ms,@s,getdate()) As SchemaOnly;

Some limitation

NO TRUNCATE TABLE statement against my Memory-Optimized tables.
NO ALTER TABLE statement against my Memory-Optimized tables.
NO UPDATE of primary key columns of my Memory-Optimized tables.
NO FOREIGN KEY or CHECK constraints.
NO UNIQUE constraints other than the PRIMARY KEY.

Hope you like it.

Posted by: MR. JOYDEEP DAS

Sunday, 8 March 2015

Recursive Trigger Fire

Introduction

We never design database like that the recursive trigger fire. But what happens if recursive trigger happens? We have to understand it clearly and solve such kind of problem. This article is related to it. Hope it will be informative.

First We understand it?

We have two tables named Test_1 and Test_2. The Test_1 have trigger named trg_Test_1 which insert data to table Test_2 and Test_2 table have trigger named trg_Test_2 which Insert data to Test_1 Table.

Example of Recursive Trigger

Step-1 [ Creating Base Table ]

CREATE TABLE test_1

(

ID INT,

MARKS INT

)

CREATE TABLE test_2

(

ID INT,

MARKS INT

)

Step-2 [ Creating Trigger ]

CREATE TRIGGER trg_test_1 ON test_1

AFTER INSERT

BEGIN

INSERT INTO test_2

SELECT * FROM Inserted;

END

CREATE TRIGGER trg_test_2 ON test_2

AFTER INSERT

BEGIN

INSERT INTO test_1

SELECT * FROM Inserted;

END

Step-3 [ Insert Value in First Table so that Trigger can Fire ]

INSERT INTO test_1

VALUES(1, 20);

Msg 217, Level 16, State 1, Procedure trg_test_2, Line 5

Maximum stored procedure, function, trigger, or view nesting level exceeded (limit 32).

If we look at the table we find no records is affected

SELECT * FROM test_1

SELECT * FROM test_2

Step-4 [ Now we Rectified the Trigger ]

DROP TRIGGER trg_test_1

CREATE TRIGGER trg_test_1 ON test_1

AFTER INSERT

BEGIN

IF trigger_nestlevel() < 2

BEGIN

INSERT INTO test_2

SELECT * FROM Inserted;

END

DROP TRIGGER trg_test_2

CREATE TRIGGER trg_test_2 ON test_2

AFTER INSERT

BEGIN

IF trigger_nestlevel() < 2

BEGIN

INSERT INTO test_1

SELECT * FROM Inserted;

END

Step-5 [Observation ]

INSERT INTO test_1

VALUES(1, 20);

SELECT * FROM test_1

SELECT * FROM test_2

ID MARKS

----------- -----------

1 20

(1 row(s) affected)

ID MARKS

----------- -----------

1 20

(1 row(s) affected)

Hope you like it.

Posted by: MR. JOYDEEP DAS

Saturday, 7 March 2015

Index on Computed Columns ?

Introduction

The Question that I always find in the bog post or in community post is
“Can we create the Index on Computed Columns”

The answer is not so easy. To explain it properly let’s try an Example. Hope you find it informative

Simple Test to Understand the Index in Computed Columns

Step-1

Create a Function with SCHEMA Binding

IF OBJECT_ID(N'dbo.func_TOTMARKSWITHPRACTICAL', N'FN')IS NOT NULL

BEGIN

DROP FUNCTION [dbo].[func_TOTMARKSWITHPRACTICAL];

END

CREATE FUNCTION [dbo].[func_TOTMARKSWITHPRACTICAL]

(

@p_MARKS INT

)

RETURNS INT

WITH SCHEMABINDING

BEGIN

DECLARE @v_TOTALMARKS INT;

SET @v_TOTALMARKS = @p_MARKS + 50;

RETURN @v_TOTALMARKS;

END

Step-2

Create the Base Table to Use SCHEMA Binding Function and Insert Records

IF OBJECT_ID(N'dbo.tbl_STUDENTDTLS', N'U')IS NOT NULL

BEGIN

DROP TABLE [dbo].[tbl_STUDENTDTLS];

END

CREATE TABLE [dbo].[tbl_STUDENTDTLS]

(

STDID INT NOT NULL PRIMARY KEY,

STDNAME VARCHAR(50) NOT NULL,

STDMARKS INT NOT NULL,

STDTOTALMARKS AS [dbo].[func_TOTMARKSWITHPRACTICAL](STDMARKS)

);

INSERT INTO [dbo].[tbl_STUDENTDTLS]

(STDID, STDNAME, STDMARKS)

VALUES (101, 'Joydeep Das', 100),

(102, 'Anirudha Dey', 150);

Step-3

Check the IsIndexTable Property of Computed Columns

SELECT (SELECT CASE COLUMNPROPERTY( OBJECT_ID('dbo.tbl_STUDENTDTLS'),

'STDTOTALMARKS','IsIndexable')

WHEN 0 THEN 'No'

WHEN 1 THEN 'Yes'

END) AS 'STDTOTALMARKS is Indexable ?'

STDTOTALMARKS is Indexable ?

----------------------------

Yes

Step-4

Check the IsDeterministic Property of Computed Columns

SELECT (SELECT CASE COLUMNPROPERTY( OBJECT_ID('dbo.tbl_STUDENTDTLS'),

'STDTOTALMARKS','IsDeterministic')

WHEN 0 THEN 'No'

WHEN 1 THEN 'Yes'

END) AS 'STDTOTALMARKS is IsDeterministic?'

STDTOTALMARKS is IsDeterministic?

---------------------------------

Yes

Step-5

Check the USERDATTACCESS Property of Computed Columns

SELECT (SELECT CASE COLUMNPROPERTY( OBJECT_ID('dbo.tbl_STUDENTDTLS'),

'STDTOTALMARKS','USERDATAACCESS')

WHEN 0 THEN 'No'

WHEN 1 THEN 'Yes'

END) AS 'STDTOTALMARKS is USERDATAACCESS?'

STDTOTALMARKS is USERDATAACCESS?

--------------------------------

Step-6

Check the IsSystemVerified Property of Computed Columns

SELECT (SELECT CASE COLUMNPROPERTY( OBJECT_ID('dbo.tbl_STUDENTDTLS'),

'STDTOTALMARKS','IsSystemVerified')

WHEN 0 THEN 'No'

WHEN 1 THEN 'Yes'

END) AS 'STDTOTALMARKS is IsSystemVerified?'

STDTOTALMARKS is IsSystemVerified?

----------------------------------

Yes

Step-7

Analyzing All Property output of Computed Columns

Property Name	Output
IsIndexable	Yes
IsDeterministic	Yes
USERDATAACCESS	No
IsSystemVerified	Yes

Step-8

So we can Crete Index on Computed Columns in this Situation

CREATE NONCLUSTERED INDEX IX_NON_tbl_STUDENTDTLS_STDTOTALMARKS

ON [dbo].[tbl_STUDENTDTLS](STDTOTALMARKS);

Step-9

Now Check the Same thing with Function Without Schema Binding

IF OBJECT_ID(N'dbo.func_TOTMARKSWITHPRACTICAL', N'FN')IS NOT NULL

BEGIN

DROP FUNCTION [dbo].[func_TOTMARKSWITHPRACTICAL];

END

CREATE FUNCTION [dbo].[func_TOTMARKSWITHPRACTICAL]

(

@p_MARKS INT

)

RETURNS INT

BEGIN

DECLARE @v_TOTALMARKS INT;

SET @v_TOTALMARKS = @p_MARKS + 50;

RETURN @v_TOTALMARKS;

END

Step-10

Now analyze the same property again

Property Name	Output
IsIndexable	No
IsDeterministic	No
USERDATAACCESS	Yes
IsSystemVerified	No

Step-11

In this scenario we are unable to Create index on Computed Columns

CREATE NONCLUSTERED INDEX IX_NON_tbl_STUDENTDTLS_STDTOTALMARKS

ON [dbo].[tbl_STUDENTDTLS](STDTOTALMARKS);

Error:

Msg 2729, Level 16, State 1, Line 1

Column 'STDTOTALMARKS' in table 'dbo.tbl_STUDENTDTLS'

cannot be used in an index or statistics or as a partition key

because it is non-deterministic.

Hope you like it.

Posted by: MR. JOYDEEP DAS