I got a query
from one of my friends. He is asking about when to use sp_updatestats and how
frequently we can use it. To answer his query, I decide to write some words
related to Statistics. Hope it will be informative for all of us.
Use of Statistics Objects
We all know
about the important of statistics object to make efficient execution plan by
optimizer. If the statistics object is not updated properly causes poor
execution plan and down the performance factors of query.
When the Statistics Object
Created
When we are
creating the Index the Statistics object is created automatically called the Index Statistics. These statistics will exist
as long as the index exists.
Second,
assuming the database option Auto
Create Statistics is
enabled, which it is by default, SQL Server will create single-column
statistics whenever a column, which is not already the leading column in an
existing index, is used in a query predicate (e.g. in the search condition of a WHERE clause, or in a JOIN condition). We refer to these as Column Statistics. We can also use the CREATE STATISTICS command to create single- and multi-column statistics
manually.
When the Statistics Object
Updated
When we
Insert/Update/Delete records from Table objects the SQL Server automatically
Insert/Update/Delete corresponding rows in the Index.
But the Statistics is not updated like this,
here we mean to say that when we Insert/Update/Delete the records from Table
objects the Index updated automatically but Statistics not. Although the Auto
Update Statistics option of the Database is enabled.
So we have to
understand when the statistics is updated. It depends on certain volume
threshold.
As
data changes in our tables, the statistics - all the statistics - will be
updated based on the following formula:
·
When
a table with no rows gets a row
·
When
500 rows are changed to a table that is less than 500 rows
·
When
20% + 500 are changed in a table greater than 500 rows
Every time we modify a record in a
table, SQL Server tracks it via the rcmodified column in a hidden system table. SQL
Server 2005 tracked this information in the sys.rowsetcolumns table, In SQL Server 2008 (and later) sys.rowsetcolumns merged with sys.syshobtcolumns and became sys.sysrscols.
When we create or rebuild (not reorganize,
just rebuild) an index, SQL Server generates the statistics with a FULLSCAN, i.e. it scans all the rows
in the table to create a histogram that represents the distribution of data in
the leading column of the index. Likewise, SQL will auto-create column
statistics with a full sample.
When
to Update Statistics Manually
Suppose we have table objects of 100
millions of records and SQL server is going to Update the Statistics Objects
when 20% of the records of 100 million is effected by Insert/Update/Delete. So
we have to wait long for Statistics Object update and result is poor exaction
plan creation by SQL server. In this situation we have to update the statistics
object manually.
In cases where we know data
distribution in a column is "skewed", it may be necessary to update
statistics manually with a full sample, or create a set of filtered statistics,
in order to generate query plans of good quality.
Examining
the Statistics
sp_helpstats 'tbl_CUSTOMERDTLS', 'ALL'
statistics_name statistics_keys
--------------- ----------------
_WA_Sys_00000001_03BB8E22 CUSTID
_WA_Sys_00000002_03BB8E22 CUSTNAME
_WA_Sys_00000003_03BB8E22 TOTALSALES
_WA_Sys_00000004_03BB8E22 GRADE
IX_CUSTID_tbl_CUSTOMERDTLS CUSTID
The better approach
SELECT [sch].[name] + '.' + [so].[name] AS [TableName] ,
[si].[index_id] AS [Index ID] ,
[ss].[name] AS [Statistic] ,
STUFF(( SELECT ', ' + [c].[name]
FROM
[sys].[stats_columns] [sc]
JOIN [sys].[columns] [c]
ON [c].[column_id] = [sc].[column_id]
AND [c].[object_id] = [sc].[OBJECT_ID]
WHERE
[sc].[object_id] = [ss].[object_id]
AND [sc].[stats_id] = [ss].[stats_id]
ORDER BY
[sc].[stats_column_id]
FOR
XML PATH('')
), 1, 2, '') AS [ColumnsInStatistic] ,
[ss].[auto_Created] AS [WasAutoCreated] ,
[ss].[user_created] AS [WasUserCreated] ,
[ss].[has_filter] AS [IsFiltered] ,
[ss].[filter_definition] AS
[FilterDefinition] ,
[ss].[is_temporary] AS [IsTemporary]
FROM [sys].[stats] [ss]
JOIN [sys].[objects] AS [so] ON [ss].[object_id] = [so].[object_id]
JOIN [sys].[schemas] AS [sch] ON [so].[schema_id] = [sch].[schema_id]
LEFT OUTER JOIN [sys].[indexes] AS [si]
ON [so].[object_id] = [si].[object_id]
AND [ss].[name] = [si].[name]
WHERE [so].[object_id] = OBJECT_ID(N'tbl_CUSTOMERDTLS')
ORDER BY [ss].[user_created] ,
[ss].[auto_created] ,
[ss].[has_filter];
GO
In next version we are trying to
discuss more about it.
Hope you like it.
Posted
by: MR. JOYDEEP DAS
No comments:
Post a Comment