Sunday 15 July 2012

Index and Disk Space


Introductions

Disk space is a very important factor to Create, Rebuild or Drop Index. Disk space causes the performance degradation of an index or even fails. So the understanding about disk space is very important factors.  In this article I am trying to discuss related to it.
In this article we are discussing about

  1. Space required in Index Data Definition Language Operations
  2. An Example of Index Disk Space
  3. Transaction Log Disk Space for Index Operations

Space required in Index DDL Operations

Here I am mentioning some Index operations not required any additional disk space is mentioned bellow.

Ø  ALTER INDEX with REORGANIZE options not required any additional disk space as the index is reorganized within the existing pages allocated to it; no new pages are allocated. But the log space is required.
Ø  When we are dropping a non-clustered index by DROP INDEX or when we are dropping a clustered index offline without specifying the MOVE TO clause and non-clustered index do not exist.
Ø  When we are CREATE TABLE with PRIMARY KEY or UNIQUE constraints

The index operation that required additional disk space is mentioned bellow.
All other index DDL operations require additional temporary disk space to use during the operation, and the permanent disk space to store the new index structure or structures.


When a new index structure is created, disk space for both the old (source) and new (target) structures is required in their appropriate files and file groups. The old structure is not de-allocated until the index creation transaction commits.

The DDL operations that needs new index structures and required additional disk space is mentioned bellow.

Ø  CREATE INDEX

Ø  CREATE INDEX WITH DROP_EXISTING

Ø  ALTER INDEX REBUILD

Ø  ALTER TABLE ADD CONSTRAINT (PRIMARY KEY or UNIQUE)

Ø  ALTER TABLE DROP CONSTRAINT (PRIMARY KEY or UNIQUE) when the constraint is based on a clustered index

Ø  DROP INDEX MOVE TO (Applies only to clustered indexes.)

An Example of Index Disk Space

I am finding this example from MSDN.

The formula is used to measure, disk space requirements is mentioned bellow

IndexSize = NoOfRows *  AvgSizeOfIndex * 2 * (1/(100-FILLFACTOR)/100

In this example, disk space requirements to create a clustered index are determined. Assume the following conditions are true before creating the clustered index:

Ø  The existing table (heap) contains 1 million rows. Each row is 200 bytes long.

Ø  Non-clustered index A contains 1 million rows. Each row is 50 bytes long.

Ø  Non-clustered index B contains 1 million rows. Each row is 80 bytes long.

Ø  The index create memory option is set to 2 MB.

A fill factor value of 80 is used for all existing and new indexes. This means the pages are 80% full.

Note: As a result of creating a clustered index, the two non-clustered indexes must be rebuilt to replace the row indicator with the new clustered index key.

Disk Space Calculations for an Offline Index Operation

In the following steps, both temporary disk space to be used during the index operation and permanent disk space to store the new indexes are calculated. The calculations shown are approximate: results are rounded up and consider only the size of index leaf level. The tilde (~) is used to indicate approximate calculations.
  1.  Determine the size of the existing (source) structures

    Heap: 1 million * 200 bytes ~ 200 MB
    Non-clustered index A: 1 million * 50 bytes / 80% ~ 63 MB
    Non-clustered index B: 1 million * 80 bytes / 80% ~ 100 MB
    Total size of existing structures: 363 MB

  2. Determine the size of the new (target) index structures

    Assume that the new clustered key is 24 bytes long including a uniqueifier. The row indicator (8 bytes long) in both non-clustered indexes will be replaced by this clustered key.

    Clustered index: 1 million * 200 bytes / 80% ~ 250 MB
    Non-clustered index A: 1 million * (50 – 8 + 24) bytes / 80% ~ 83 MB
    Non-clustered index B: 1 million * (80 – 8 + 24) bytes / 80% ~ 120 MB
    Total size of new structures: 453 MB

    Total disk space required to support both the source and target structures for the duration of the index operation is 816 MB (363 + 453). The space currently allocated to the source structures will be deallocated after the index operation is committed.

  3. Determine additional temporary disk space for sorting

    Space requirements are shown for sorting in tempdb (with SORT_IN_TEMPDB set to ON) and sorting in the target location (with SORT_IN_TEMPDB set to OFF).

# When SORT_IN_TEMPDB is set to ON, tempdb must have sufficient disk space to hold the largest index (1 million * 200 bytes ~ 200 MB). Fill factor is not considered in the sorting operation. Additional disk space (in the tempdb location) equal to the index create memory Option value = 2 MB. Total size of temporary disk space with SORT_IN_TEMPDB set to ON ~ 202 MB.


# When SORT_IN_TEMPDB is set to OFF (default), the 250 MB of disk space already considered for the new index in step 2 is used for sorting. Additional disk space (in the target location) equal to the index create memory Option value = 2 MB. Total size of temporary disk space with SORT_IN_TEMPDB set to OFF = 2 MB.

Using tempdb, a total of 1018 MB (816 + 202) would be needed to create the clustered and non-clustered indexes. Although using tempdb increases the amount of temporary disk space used to create an index, it may reduce the time that is required to create an index when tempdb is on a different set of disks than the user database. For more information about using tempdb, see tempdb and Index Creation.

Without using tempdb, a total of 818 MB (816+ 2) would be needed to create the clustered and non-clustered indexes.

Disk Space Calculations for an Online Clustered Index Operation

When you create, drop, or rebuild a clustered index online, additional disk space is required to build and maintain a temporary mapping index. This temporary mapping index contains one record for each row in the table, and its contents are the union of the old and new bookmark columns.

To calculate the disk space needed for an online clustered index operation, follow the steps shown for an offline index operation and add those results to the results of the following step.

Determine space for the temporary mapping index. In this example, the old bookmark is the row ID (RID)) of the heap (8 bytes) and the new bookmark is the clustering key (24 bytes including a uniqueifier). There are no overlapping columns between the old and new bookmarks.

Temporary mapping index size = 1 million * (8 bytes + 24 bytes) / 80% ~ 40 MB. This disk space must be added to the required disk space in the target location if SORT_IN_TEMPDB is set to OFF, or to tempdb if SORT_IN_TEMPDB is set to ON.

Transaction Log Disk Space for Index Operations

To make sure that the Index operations can be rollback the transaction log cannot be truncated until the index operations has completed. Therefore, the transaction log must have sufficient room to store both the index operation transactions and any concurrent user transactions for the duration of the index operation. This is for both offline online index operations.

When we are using large scale index operation we must remember
  1. The Transaction log should be backed up and truncated before index operation online. The log has sufficient space to store the project index and user transactions.

  2. SORT_IN_TEMPDB options should be ON. This separates the index transactions from concurrent user transactions.

  3. Use database recovery model that allows minimal logging of the index operation. This may reduce the size of the log and prevent the log from filling the log space. 

  4. Don't run the ONLINE index operations in explicit transactions. The log will not be truncated until the explicit transaction ends.

Hope you like it.


Posted by: MR. JOYDEEP DAS

2 comments:

  1. # When SORT_IN_TEMPDB is set to ON, tempdb must have sufficient disk space to hold the largest index (1 million * 200 bytes ~ 200 MB). Fill factor is not considered in the sorting operation. Additional disk space (in the tempdb location) equal to the index create memory Option value = 2 MB. Total size of temporary disk space with SORT_IN_TEMPDB set to ON ~ 202 MB.


    above is not true. based on microsoft article, about 20% of index size is needed in tempdb if SORT_IN_TEMPDB set to ON.

    ReplyDelete