Microsoft SQL Server Knowledge Bank: February 2015

Saturday 28 February 2015

TempDB for Performance

Introduction

TempDB plays a very informant role in case of Performance is concern. In this article we are trying to learn something related to TempDB. We are taking the references from Microsoft MSDN for this article.

Hope it will be informative.

When we Use the TempDB

TempDB system database is the global recourses for all users connected with SQL Server Interface. The following objects are stored in the TempDB is mentioned bellow.

User defined Objects is explicitly created by user and the scope of the user object is specific session dependent or in the scope of the routine where it is created. Here the routine means the Stored Procedure (SP), Trigger or User define Function (UDF).

The example of user define objects are mentioned bellow

· User-defined tables and indexes

· System tables and indexes

· Global temporary tables and indexes

· Local temporary tables and indexes

· Table variables

· Tables returned in table-valued functions

Internal Objects are created as necessary by the SQL Server Database Engine to process SQL Server statements. Internal objects are created and dropped within the scope of a statement.

Internal objects can be one of the following:

Work tables for cursor or spool operations and temporary large object (LOB) storage.
Work files for hash join or hash aggregate operations.
Intermediate sort results for operations such as creating or rebuilding indexes (if SORT_IN_TEMPDB is specified), or certain GROUP BY, ORDER BY, or UNION queries.

Version Stores is a collection of data pages that hold the data rows that are required to support the features that use row versioning. There are two version stores: a common version store and an online-index-build version store.

The version stores contain the following:

· Row versions that are generated by data modification transactions in a database that uses snapshot or read committed using row versioning isolation levels.

· Row versions that are generated by data modification transactions for features such as: online index operations, Multiple Active Result Sets (MARS), and AFTER triggers.

Managing the TempDB

So we have to take special care of TempDB to maintain performance of Database. Here Microsoft provides us some recommendation that we are going to discuss.

1. Set the recovery model of TempDB to SIMPLE. This model automatically reclaimed log space.

2. Allow for TempDB files to automatically grow as required. This allows for the file to grow until the disk is full. Try to avoid TempDB file to grow with small values as auto grow takes a certain amount of time and it is not tolerable by our application.

Here is the recommendation chart from Microsoft.

TempDB file size	FILEGROWTH increment
0 to 100 MB	10 MB
100 to 200 MB	20 MB
200 MB or more	10%*

3. Pre allocate space for all TempDB files by setting the file size to a value large enough to accommodate the typical workload in the environment. This prevents TempDB from expanding too frequently, which can affect performance.

4. Create as many files as needed to maximize disk bandwidth.

5. Put the TempDB database on a fast I/O subsystem. Use disk striping if there are many directly attached disks.

6. Put the TempDB database on disks that differ from those that are used by user databases.

How to Measure the TempDB Size and Growth Pattern

SELECT name AS FileName, size*1.0/128 AS FileSizeinMB,

CASE max_size WHEN 0 THEN 'Autogrowth is off.'

WHEN -1 THEN 'Autogrowth is on.'

ELSE 'Log file will grow to a maximum size of 2 TB.'

END,

growth AS 'GrowthValue',

'GrowthIncrement' =

CASE WHEN growth = 0 THEN 'Size is fixed and will not grow.'

WHEN growth > 0 AND is_percent_growth = 0

THEN 'Growth value is in 8-KB pages.'

ELSE 'Growth value is a percentage.'

END

FROM tempdb.sys.database_files;

FileName FileSizeinMB (No column name) GrowthValue GrowthIncrement

tempdev 8.000000 Autogrowth is on. 10 Growth value is a percentage.

templog 0.500000 Autogrowth is on. 10 Growth value is a percentage.

Hope you like it.

Posted by: MR. JOYDEEP DAS

Saturday 14 February 2015

All about Statistics [ Part – I ]

Introduction

I got a query from one of my friends. He is asking about when to use sp_updatestats and how frequently we can use it. To answer his query, I decide to write some words related to Statistics. Hope it will be informative for all of us.

Use of Statistics Objects

We all know about the important of statistics object to make efficient execution plan by optimizer. If the statistics object is not updated properly causes poor execution plan and down the performance factors of query.

When the Statistics Object Created

When we are creating the Index the Statistics object is created automatically called the Index Statistics. These statistics will exist as long as the index exists.

Second, assuming the database option Auto Create Statistics is enabled, which it is by default, SQL Server will create single-column statistics whenever a column, which is not already the leading column in an existing index, is used in a query predicate (e.g. in the search condition of a WHERE clause, or in a JOIN condition). We refer to these as Column Statistics. We can also use the CREATE STATISTICS command to create single- and multi-column statistics manually.

When the Statistics Object Updated

When we Insert/Update/Delete records from Table objects the SQL Server automatically Insert/Update/Delete corresponding rows in the Index.

But the Statistics is not updated like this, here we mean to say that when we Insert/Update/Delete the records from Table objects the Index updated automatically but Statistics not. Although the Auto Update Statistics option of the Database is enabled.

So we have to understand when the statistics is updated. It depends on certain volume threshold.

As data changes in our tables, the statistics - all the statistics - will be updated based on the following formula:

· When a table with no rows gets a row

· When 500 rows are changed to a table that is less than 500 rows

· When 20% + 500 are changed in a table greater than 500 rows

Every time we modify a record in a table, SQL Server tracks it via the rcmodified column in a hidden system table. SQL Server 2005 tracked this information in the sys.rowsetcolumns table, In SQL Server 2008 (and later) sys.rowsetcolumns merged with sys.syshobtcolumns and became sys.sysrscols.

When we create or rebuild (not reorganize, just rebuild) an index, SQL Server generates the statistics with a FULLSCAN, i.e. it scans all the rows in the table to create a histogram that represents the distribution of data in the leading column of the index. Likewise, SQL will auto-create column statistics with a full sample.

When to Update Statistics Manually

Suppose we have table objects of 100 millions of records and SQL server is going to Update the Statistics Objects when 20% of the records of 100 million is effected by Insert/Update/Delete. So we have to wait long for Statistics Object update and result is poor exaction plan creation by SQL server. In this situation we have to update the statistics object manually.

In cases where we know data distribution in a column is "skewed", it may be necessary to update statistics manually with a full sample, or create a set of filtered statistics, in order to generate query plans of good quality.

Examining the Statistics

sp_helpstats 'tbl_CUSTOMERDTLS', 'ALL'

statistics_name statistics_keys

--------------- ----------------

_WA_Sys_00000001_03BB8E22 CUSTID

_WA_Sys_00000002_03BB8E22 CUSTNAME

_WA_Sys_00000003_03BB8E22 TOTALSALES

_WA_Sys_00000004_03BB8E22 GRADE

IX_CUSTID_tbl_CUSTOMERDTLS CUSTID

The better approach

SELECT [sch].[name] + '.' + [so].[name] AS [TableName] ,

[si].[index_id] AS [Index ID] ,

[ss].[name] AS [Statistic] ,

STUFF(( SELECT ', ' + [c].[name]

FROM [sys].[stats_columns] [sc]

JOIN [sys].[columns] [c]

ON [c].[column_id] = [sc].[column_id]

AND [c].[object_id] = [sc].[OBJECT_ID]

WHERE [sc].[object_id] = [ss].[object_id]

AND [sc].[stats_id] = [ss].[stats_id]

ORDER BY [sc].[stats_column_id]

FOR

XML PATH('')

), 1, 2, '') AS [ColumnsInStatistic] ,

[ss].[auto_Created] AS [WasAutoCreated] ,

[ss].[user_created] AS [WasUserCreated] ,

[ss].[has_filter] AS [IsFiltered] ,

[ss].[filter_definition] AS [FilterDefinition] ,

[ss].[is_temporary] AS [IsTemporary]

FROM [sys].[stats] [ss]

JOIN [sys].[objects] AS [so] ON [ss].[object_id] = [so].[object_id]

JOIN [sys].[schemas] AS [sch] ON [so].[schema_id] = [sch].[schema_id]

LEFT OUTER JOIN [sys].[indexes] AS [si]

ON [so].[object_id] = [si].[object_id]

AND [ss].[name] = [si].[name]

WHERE [so].[object_id] = OBJECT_ID(N'tbl_CUSTOMERDTLS')

ORDER BY [ss].[user_created] ,

[ss].[auto_created] ,

[ss].[has_filter];

In next version we are trying to discuss more about it.

Hope you like it.

Posted by: MR. JOYDEEP DAS