Introduction
Performance is a big issue for Microsoft SQL Server. Every developer faces query performance problems and tries to tune queries accordingly. The performance of Microsoft SQL Server can be hampered for any number of reasons, including hardware and networking, so we have to find the proper reason behind it. It takes time to find the real cause and provide an actual solution. It is just like a real-world situation.
Suppose we have a fever and go to a doctor for treatment. The doctor knows very well that a fever is not a disease; it is just a symptom of a disease. To find the root cause of the disease, the doctor orders some tests, but before that he provides a quick fix for the fever by prescribing tablets.
In this article, we are trying to provide some quick fixes for performance-related problems without finding the actual cause. We will also point out common mistakes that we make with Microsoft SQL Server.
Quick Solutions for Performance
Auto Create Statistics in SQL Server
Many Microsoft SQL Server databases do not have Auto Create Statistics turned on. The statistical histograms are used by the Query Optimizer to choose the optimal query execution plan. If a query predicate contains a column with statistics, the Query Optimizer does not have to guess the number of rows affected by that query; it has enough information to create a good execution plan.
SQL Server creates statistics in different ways:
- Statistics are automatically created for each new index.
- If the database setting AUTO_CREATE_STATISTICS is on, SQL Server automatically creates statistics for non-indexed columns that are used in your queries.
When we set the AUTO_CREATE_STATISTICS option on, the Query Optimizer creates statistics on individual columns used in a predicate, if these statistics are not already available. These statistics are necessary to generate the query plan. They are created on columns that do not have a histogram in an existing statistics object, and the Query Optimizer uses them to determine the optimal query execution plan.
We can enable automatic statistics creation by running this SQL statement:
ALTER DATABASE TestDB
SET AUTO_CREATE_STATISTICS ON
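To verify which databases already have this option (and the auto-update options covered next) enabled, we can check the standard catalog view sys.databases:
SELECT name,
       is_auto_create_stats_on,
       is_auto_update_stats_on,
       is_auto_update_stats_async_on
FROM sys.databases
ORDER BY name;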
Auto Update Statistics in SQL Server
Statistics are checked before query compilation or before executing a cached query plan. Statistics are considered out of date when any of the following is true (a query to inspect statistics freshness follows the list):
- There was a data change on an empty table.
- The number of rows in the table was 500 or less at the time of statistics creation, and the column modification counter of the leading column of the statistics object has changed by more than 500 since then.
- The table had more than 500 rows when the statistics were gathered, and the column modification counter of the leading column of the statistics object has changed by more than 500 + 20% of the number of rows in the table when the statistics were gathered.
- A table in TempDB with fewer than 6 rows had at least 6 row modifications.
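To see how stale the statistics in the current database actually are, one option (available since SQL Server 2008 R2 SP2 / 2012 SP1) is the sys.dm_db_stats_properties function, which exposes the last update time and the modification counter per statistics object:
SELECT OBJECT_NAME(s.[object_id]) AS Table_name,
       s.name AS Stats_name,
       sp.last_updated,
       sp.rows,
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.[object_id], s.stats_id) AS sp
WHERE OBJECTPROPERTY(s.[object_id], 'IsUserTable') = 1
ORDER BY sp.modification_counter DESC;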
We can turn on automatic statistics update by running this SQL statement:
ALTER DATABASE TestDB
SET AUTO_UPDATE_STATISTICS ON
Outdated statistics can cause a lot of performance issues, so it is recommended to leave this option enabled (the default is ON). The usual symptoms of non-updated statistics are suboptimal query plans and degraded performance. You can also set the database to update statistics asynchronously:
ALTER DATABASE TestDB
SET AUTO_UPDATE_STATISTICS_ASYNC ON
If we enable this option, the Query Optimizer will run the query first and update the outdated statistics afterwards. When this option is OFF, the Query Optimizer updates outdated statistics before compiling the query. Asynchronous updates can be useful in OLTP environments, while they can have negative effects in data warehouses.
Database AutoGrowth
We use the SQL Server database Autogrowth setting to automate database file growth. This property is very helpful if used carefully with proper planning, but it can have a negative performance impact if we leave it at its default values without paying attention to it. Too many autogrow events can degrade the overall performance of the database. Below are points we can consider to reduce autogrow events in a database and improve performance.
Autogrowth events are expensive operations that slow down the performance of our database, because whenever an autogrowth event is performed, SQL Server holds up database processing. This equates to slower response times for the SQL commands being processed against the database that is growing.
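One common way to see recent autogrowth events is to read the default trace, assuming it has not been disabled on the instance:
DECLARE @path NVARCHAR(260);
SELECT @path = [path] FROM sys.traces WHERE is_default = 1;

SELECT te.name AS Event_name,
       t.DatabaseName,
       t.FileName,
       t.StartTime,
       t.Duration
FROM sys.fn_trace_gettable(@path, DEFAULT) AS t
INNER JOIN sys.trace_events AS te
    ON t.EventClass = te.trace_event_id
WHERE te.name IN ('Data File Auto Grow', 'Log File Auto Grow')
ORDER BY t.StartTime DESC;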
Best Practice
We can improve database performance and manage disk space utilization by focusing on the points below. This will also reduce the number of autogrowth events occurring in the database files (a configuration sketch follows the list).
1. Analyze your database growth pattern and set an optimum database size based on it. The database should be sized so that it rarely needs to grow, and if it does grow, make sure Instant File Initialization is enabled on that SQL Server instance. This will drastically reduce day-to-day autogrowth events.
2. Proactively monitor database file sizes and their autogrowth events. This helps us analyze the growth pattern of database files and avoid fragmentation.
3. Consider defragmenting your database file system if a lot of autogrowth events have occurred on your databases.
4. Never leave your database autogrowth at the default values; change it as per the growth pattern you have analyzed over time.
5. Avoid autogrowth in percentages; instead, give a specific size in MB/GB. Microsoft suggests setting the autogrow increment to about one-eighth the size of the file, and testing it before deploying to production.
6. Set the MAXSIZE setting for each file to prevent any one file from growing to a point where it uses up all available disk space.
7. The growth increment of the transaction log must be large enough to stay ahead of the needs of your transaction units. Even with autogrow turned on, you can receive a message that the transaction log is full if it cannot grow fast enough to satisfy the needs of your query.
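As a sketch of points 4-6, assuming logical file names TestDB_Data and TestDB_Log (your actual names will differ; check sys.database_files), the growth settings can be applied like this:
ALTER DATABASE TestDB
MODIFY FILE (NAME = TestDB_Data, FILEGROWTH = 512MB, MAXSIZE = 50GB);

ALTER DATABASE TestDB
MODIFY FILE (NAME = TestDB_Log, FILEGROWTH = 256MB, MAXSIZE = 10GB);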
Identifying the Missing Index
When we think about the performance of a query, the very first thing we have to think about is indexing. Finding the proper columns of a table to index is a challenge.
We can identify missing indexes in our SQL queries in three primary ways:
- Running the Database Engine Tuning Advisor
- Querying the missing index Dynamic Management Views
- Letting the SQL Server engine prompt for missing indexes when you generate execution plans in SSMS
The following query uses the missing index DMVs to list suggested indexes for the current database, ordered by their estimated advantage:
SELECT DISTINCT
       CONVERT(DECIMAL(18, 2), user_seeks * avg_total_user_cost * (avg_user_impact * 0.01)) AS [index_advantage],
       migs.last_user_seek,
       mid.[statement] AS [Database.Schema.Table],
       mid.equality_columns,
       mid.inequality_columns,
       mid.included_columns,
       migs.unique_compiles,
       migs.user_seeks,
       migs.avg_total_user_cost,
       migs.avg_user_impact,
       OBJECT_NAME(mid.[object_id]) AS [Table Name],
       p.rows AS [Table Rows]
FROM sys.dm_db_missing_index_group_stats AS migs WITH (NOLOCK)
INNER JOIN sys.dm_db_missing_index_groups AS mig WITH (NOLOCK)
    ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details AS mid WITH (NOLOCK)
    ON mig.index_handle = mid.index_handle
INNER JOIN sys.partitions AS p WITH (NOLOCK)
    ON p.[object_id] = mid.[object_id]
WHERE mid.database_id = DB_ID()
ORDER BY index_advantage DESC
OPTION (RECOMPILE);
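As a hypothetical example of acting on this output: suppose the query reports equality_columns = [CustomerID] and included_columns = [OrderDate], [TotalDue] for an assumed table dbo.SalesOrder. The matching index would be:
CREATE NONCLUSTERED INDEX IX_SalesOrder_CustomerID
ON dbo.SalesOrder (CustomerID)
INCLUDE (OrderDate, TotalDue);
Evaluate each suggestion before creating it; the missing index DMVs often propose overlapping indexes, and every new index adds write overhead.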
Identify Unused Index
We create indexes to improve the performance of SQL queries. But what happens if an index is never used by SQL Server, or if we create indexes on columns that do not need them? Every index also slows down INSERT/UPDATE operations, so over-indexing is bad as well.
SQL Server provides a significant amount of index information via Dynamic Management Views (DMVs). The dm_db_index_usage_stats DMV displays essential information about index usage, and it can be a useful tool in identifying unused SQL Server indexes. When an index is used for the first time, a new row gets created in the dm_db_index_usage_stats DMV and is subsequently updated every time the index is used. However, as with every DMV, dm_db_index_usage_stats contains only data collected since the last SQL Server service restart (a restart resets the DMV). Therefore, it is critical that sufficient time has passed since the last restart to correctly determine which indexes are good candidates to be dropped.
SELECT objects.name AS Table_name,
       indexes.name AS Index_name,
       dm_db_index_usage_stats.user_seeks,
       dm_db_index_usage_stats.user_scans,
       dm_db_index_usage_stats.user_updates
FROM sys.dm_db_index_usage_stats
INNER JOIN sys.objects
    ON dm_db_index_usage_stats.[object_id] = objects.[object_id]
INNER JOIN sys.indexes
    ON indexes.index_id = dm_db_index_usage_stats.index_id
   AND dm_db_index_usage_stats.[object_id] = indexes.[object_id]
WHERE dm_db_index_usage_stats.database_id = DB_ID() -- the DMV is server-wide; restrict to the current database
  AND indexes.is_primary_key = 0                    -- exclude primary key constraints
  AND indexes.is_unique = 0                         -- exclude unique key constraints
  AND dm_db_index_usage_stats.user_lookups = 0
  AND dm_db_index_usage_stats.user_seeks = 0
  AND dm_db_index_usage_stats.user_scans = 0
ORDER BY dm_db_index_usage_stats.user_updates DESC;
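Once an index has been confirmed unused over a full business cycle (including month-end and year-end workloads), it can be dropped. The index and table names below are assumed for illustration only:
DROP INDEX IX_SalesOrder_Region ON dbo.SalesOrder;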
Using Proper Data Type
Did you know that poor data type choices can have a significant impact on database design and performance? Developers and DBAs can improve database performance by understanding the data types supported by SQL Server and the implications of choosing among them. A best practice is to "right-size" data types by asking business questions and determining the data types best suited to the organization's and application's needs.
Right-sizing can result in tremendous storage savings, which can lead to faster database performance. Other considerations to keep in mind are SQL Server's 8KB data page size limit and the conditions that can lead to page splitting. Also watch out for implicit conversions, an unfortunate by-product of mismatching data types (see the sketch below). Taking steps to avoid mismatches and page splitting can greatly enhance performance.
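As a small sketch of an implicit conversion, assume a table dbo.Accounts whose AccountNumber column is VARCHAR(20) and indexed. Comparing it to an NVARCHAR literal forces SQL Server to convert the column (NVARCHAR has higher data type precedence), which can turn an index seek into a scan:
-- Implicit CONVERT on the column; the index may no longer support a seek
SELECT AccountID FROM dbo.Accounts WHERE AccountNumber = N'1001';

-- The literal matches the column's type; an index seek is possible
SELECT AccountID FROM dbo.Accounts WHERE AccountNumber = '1001';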
Clean T-SQL
When writing T-SQL code, we have to keep it simple. If we write complex SQL (for example, joining many tables), the query optimizer does not get enough time to build multiple candidate execution plans and choose the most effective one. So we must keep our SQL as simple as possible.
We have to check the EXECUTION PLAN and try to understand the root cause of a poorly performing query.
There are several things we have to take care of when writing SQL statements. Some of them are mentioned below.
1. Don't use SELECT *; list only the columns you need in the SELECT statement.
2. Try to join on columns that have indexes.
3. Don't use scalar UDFs within the SQL statement.
4. Don't apply functions to columns in the WHERE clause (see the sketch after this list).
5. Avoid correlated subqueries.
6. Don't try to join too many tables together.
7. Prefer EXISTS over the IN clause.
8. In stored procedures, use SET NOCOUNT ON.
9. Avoid the APPLY clause where possible; try to convert it into a JOIN.
10. Avoid cursors and always try to write set-based SQL.
11. Keep the logic as simple as possible.
12. If needed, use temp tables.
13. Try to avoid unnecessary DISTINCT and ORDER BY clauses.
14. Use proper data types in tables and avoid data conversions in SQL statements.
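As a sketch of point 4, assume a table dbo.Orders with an index on OrderDate. Wrapping the column in a function makes the predicate non-sargable, while a range rewrite lets the optimizer use the index:
-- Non-sargable: YEAR() is evaluated for every row, so the index cannot seek
SELECT OrderID FROM dbo.Orders WHERE YEAR(OrderDate) = 2023;

-- Sargable rewrite: the optimizer can seek on the OrderDate index
SELECT OrderID
FROM dbo.Orders
WHERE OrderDate >= '20230101' AND OrderDate < '20240101';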