Archive for the ‘Performance Tuning’ Category

I use this script all the time when setting up an SSIS package. (Unfortunately, I can’t remember where I found the original code. I’ve adapted it slightly, so if anyone recognises the original then let me know and I’ll link to it.)

The Problem

When setting up a data flow in SSIS the data transfer speed can be very slow because the default settings in the package have not been optimised.

The Solution

SSIS Properties

The code below calculates a suggested buffer size for each table in the database. I take the MaxBufferSize column and round it down to the nearest thousand – so 87235 becomes 87000 – and use this value as the DefaultBufferMaxRows value. I change the DefaultBufferSize from 10485760 to 104857600 (the same number with a zero added to the end). Finally, I add values to the BlobTempStoragePath and BufferTempStoragePath properties. Normally I use C:\temp, but make sure the directory exists, and you are probably better off choosing a path that is not on the C drive.

SELECT s.[name] + '.' + t.[name] AS TableName,
       SUM(max_length) AS [row_length],
       10485760 / SUM(max_length) AS MaxBufferSize
FROM sys.tables AS t
JOIN sys.columns AS c
    ON t.object_id = c.object_id
JOIN sys.schemas AS s
    ON t.schema_id = s.schema_id
GROUP BY s.[name], t.[name];
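If you would rather have the rounding done for you, the same query can emit the rounded-down figure directly. This variant is my own addition; it simply integer-divides by 1000 and multiplies back up:

```sql
-- Same query, but rounds MaxBufferSize down to the nearest thousand,
-- ready to paste straight into DefaultBufferMaxRows.
SELECT s.[name] + '.' + t.[name] AS TableName,
       SUM(max_length) AS [row_length],
       (10485760 / SUM(max_length) / 1000) * 1000 AS DefaultBufferMaxRows
FROM sys.tables AS t
JOIN sys.columns AS c ON t.object_id = c.object_id
JOIN sys.schemas AS s ON t.schema_id = s.schema_id
GROUP BY s.[name], t.[name];
```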

These changes allow SSIS to load more rows into memory at once and so should speed up your loading. I tend to use an OLE DB connection for both Source and Destination.


We had a deadlock on a SQL Server 2008 R2 server but did not have the deadlock trace flags running. Luckily, we now have Extended Events available for deadlocks.

To investigate I ran this code (taken from a post by Jonathan Kehayias):

select CAST(
           REPLACE(
               REPLACE(XEventData.XEvent.value('(data/value)[1]', 'varchar(max)'),
                       '<victim-list>', '<deadlock><victim-list>'),
               '<process-list>', '</victim-list><process-list>')
       as xml) as DeadlockGraph
FROM (select CAST(target_data as xml) as TargetData
      from sys.dm_xe_session_targets st
      join sys.dm_xe_sessions s on s.address = st.event_session_address
      where name = 'system_health') AS Data
CROSS APPLY TargetData.nodes ('//RingBufferTarget/event') AS XEventData (XEvent)
where XEventData.XEvent.value('@name', 'varchar(4000)') = 'xml_deadlock_report';

This gave me a deadlock graph. Actually, it gave a malformed deadlock graph, because we need to patch the server.

To fix this I replaced the final </deadlock> with </victim-list>, although you may not need to do this.

The graph gave me these pieces of information. The info for both processes was the same (other than the SPIDs and so on) because it was the same stored procedure, running twice, which caused the deadlock.

<process id="process50934c8" taskpriority="0" logused="568" waitresource="PAGE: 9:4:8453" waittime="2822" ownerId="73869146"
transactionname="INSERT" lasttranstarted="2012-05-04T18:15:41.003" XDES="0x17d4aa3b0" lockMode="IX" schedulerid="12"
kpid="5132" status="background" spid="35" sbid="0" ecid="0" priority="0" trancount="2">
<frame procname="" line="120" stmtstart="9448" stmtend="11092" sqlhandle="0x030009002a854154312dff0045a000000100000000000000" />
<frame procname="" line="1" stmtstart="102" sqlhandle="0x010009009c37861ef05ac675050000000000000000000000" />
<frame procname="" line="1" sqlhandle="0x0400ff7fbe80662601000000000000000000000000000000" />
<frame procname="" line="78" stmtstart="4870" stmtend="5260" sqlhandle="0x030009000c8ae7465a93010145a000000100000000000000" />
<frame procname="" line="34" stmtstart="1774" stmtend="1888" sqlhandle="0x030009009b87944d7ab3ff00489f00000100000000000000" />
<inputbuf />

I took the SQL handles and ran them through sys.dm_exec_sql_text:

SELECT [text] FROM sys.dm_exec_sql_text(0x01000900953c3516b086df9f020000000000000000000000);
SELECT [text] FROM sys.dm_exec_sql_text(0x0400ff7fbe80662601000000000000000000000000000000);
SELECT [text] FROM sys.dm_exec_sql_text(0x030009000c8ae7465a93010145a000000100000000000000);
SELECT [text] FROM sys.dm_exec_sql_text(0x030009009b87944d7ab3ff00489f00000100000000000000);

This listed three different stored procedures. Only one of them performed an INSERT, making it the obvious candidate, but I wanted to confirm it.

Back to the graph:
<pagelock fileid="4" pageid="8453" dbid="9" objectname="" id="locked7e4900" mode="S" associatedObjectId="72057594040614912">

From this I can get the object that was blocked, either by this:

dbcc traceon (3604); -- needs to be on to display the results of dbcc page
dbcc page (9, 4, 8453, 0);

Partial result set:

name              object_id  type  type_desc
RealNameWithheld  133575514  U     USER_TABLE

I confirmed this result by running this:

SELECT OBJECT_NAME(object_id)
FROM sys.partitions
WHERE partition_id = 72057594040614912;

This uses the associatedObjectId from the deadlock graph as the partition_id.

The result: SameTableName

Back to the graph to look at this:
<pagelock fileid="4" pageid="8453" dbid="9" objectname="" id="locked7e4900" mode="S" associatedObjectId="72057594040614912">
<owner id="process3ce748" mode="S" />
<waiter id="process50934c8" mode="IX" requestType="convert" />
<pagelock fileid="4" pageid="8453" dbid="9" objectname="" id="locked7e4900" mode="S" associatedObjectId="72057594040614912">
<owner id="process50934c8" mode="S" />
<waiter id="process3ce748" mode="IX" requestType="convert" />

In the stored procedure there was this code:

INSERT INTO RealNameWithheld (Column Listing)
SELECT A mix of variables and column names
FROM RealNameWithheld
WHERE request_id = ISNULL(@LastRequestID, -1)

The table being inserted into is the same as the table being read by the SELECT. The SELECT asks for a shared lock (S); the INSERT asks for an intent exclusive lock (IX).

Both procedures ran at almost the same time – 3 milliseconds apart – and so both acquired a shared lock at almost the same time. They both then tried to acquire the exclusive lock but to do that they needed to wait for the other to relinquish its shared lock. A normal Select would eventually return its results and drop the lock. In this case it wouldn’t drop the lock as it is waiting to escalate it.

This problem doesn't occur often, so one option is simply to leave things as they are. To fix it, we could add a query hint to the SELECT part of the statement – WITH (TABLOCK). With this hint the first query to reach the table gains a lock on the whole table, and the subsequent query then has to wait.
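A sketch of what the hinted statement would look like. The column names are placeholders, since the real ones are withheld:

```sql
-- The TABLOCK hint on the SELECT side takes a table-level lock up front,
-- so two concurrent executions serialise instead of deadlocking.
INSERT INTO RealNameWithheld (col1, col2)
SELECT col1, col2
FROM RealNameWithheld WITH (TABLOCK)
WHERE request_id = ISNULL(@LastRequestID, -1);
```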

In other circumstances we could use WITH (NOLOCK), but that would be too risky on this table, where the data is updated frequently.

A workaround: first SELECT into a temporary table, then use the temporary table to INSERT back into the original. This will be slightly slower but won't cause the deadlock.
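A sketch of that workaround, again with placeholder column names:

```sql
-- The SELECT takes and releases its shared lock before the INSERT starts,
-- so the two statements never hold conflicting locks at the same time.
SELECT col1, col2
INTO #staging
FROM RealNameWithheld
WHERE request_id = ISNULL(@LastRequestID, -1);

INSERT INTO RealNameWithheld (col1, col2)
SELECT col1, col2
FROM #staging;

DROP TABLE #staging;
```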

These notes are primarily taken from SQL Server 2008 Internals – Chapter 6, Indexes.

Indexes – in general

The leaf level of an index contains something for every row of the table in indexed order.
Each non-leaf level stores something for every page of the level below.
When INCLUDE is used (non-clustered indexes only), leaf-level pages can contain wider rows – wider than the 900-byte, 16-column key maximum.
An index with a smaller key size scales better: a smaller key means fewer pages, which means fewer non-leaf levels and fewer IOs to traverse the index.
The goal of an index is not to be extremely narrow – extremely narrow indexes can be of less use than wider ones – but you need to be aware of the consequences of a wider index. INCLUDE and filters affect both the size and the usefulness of the index.
Index pages have almost the same structure as data pages except they store index records instead of data records.
8kb or 8,192 bytes
96 byte header
Off-set arrays of 2 bytes per row
Data rows have an Ncol field and a null bitmap whether or not any columns allow NULL. Index rows only have an Ncol field and null bitmap if NULLs are allowed in any of the columns.
Every index maintains a histogram representing the distribution of values for the leading column of the index. (To see the stats for a table use sp_autostats 'TableName'.)
A constraint is a logical construct; an index is a physical one.
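The leaf and non-leaf levels described above can be seen directly with sys.dm_db_index_physical_stats in DETAILED mode (a sketch; the table name is a placeholder):

```sql
-- Show each level of every index on a table:
-- index_level 0 is the leaf, higher numbers are the non-leaf levels.
SELECT index_id, index_level, page_count, record_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID('dbo.SomeTable'), NULL, NULL, 'DETAILED')
ORDER BY index_id, index_level DESC;
```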

Clustered Indexes

When a clustered index is created, the data becomes the leaf level of the index.
If there is no clustered index (a heap) then rows are identified by their physical row identifier (RID). This process of looking up corresponding data rows in the base table is known as a Bookmark Lookup.
The primary reason why a clustered index must be unique is so that non-clustered index entries can point to exactly one specified row.
If you build a clustered index without specifying the UNIQUE keyword, SQL Server guarantees uniqueness by adding a hidden uniquifier column to the rows.
The uniquifier is a 4-byte integer. Once added, it becomes part of the clustering key, meaning that it is duplicated in every non-clustered index. If a non-unique clustered index contains no duplicates, no uniquifier is stored – it is only added when necessary.
If a non-unique clustered index is created and duplicates are found the duplicate values will have the uniquifier value added to them:
Smith, Smith1, Smith2, Jones, Jones1, Jones2 – where 1 and 2 are the uniquifier.
There is no difference between the structure of a heap and the row structure of a clustered index unless the clustered index is not unique in which case a uniquifier is added to duplicate rows.
Be sure to choose a clustering key which is not volatile.
A table’s clustering key should also be chosen to minimise fragmentation for inserts.
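A sketch of the uniquifier in action, on a hypothetical table. The non-unique clustered index forces SQL Server to add a uniquifier to the duplicate 'Smith' rows, which shows up as a difference between the minimum and maximum record sizes:

```sql
CREATE TABLE dbo.People (LastName varchar(50) NOT NULL);

-- Non-unique clustered index: duplicate keys receive a hidden uniquifier.
CREATE CLUSTERED INDEX cix_People ON dbo.People (LastName);

INSERT INTO dbo.People VALUES ('Smith'), ('Smith'), ('Jones');

-- The duplicate 'Smith' row carries the extra 4-byte uniquifier, so
-- max_record_size_in_bytes exceeds min_record_size_in_bytes at the leaf.
SELECT min_record_size_in_bytes, max_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID('dbo.People'), NULL, NULL, 'DETAILED')
WHERE index_level = 0;
```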

Non-Clustered Indexes

For a non-clustered index the leaf level is a separate and additional structure which contains a copy of some of the table data.
The leaf level of a non-clustered index consists of the index key, any included columns, and either the clustered index or the row’s physical RID.
A non-clustered index is there to either help point to the data or to answer a query directly (covering index).
The bookmark lookup of a row occurs when a non-clustered index does not have all the data required by the query but the query is driven by a predicate that the index can help find.
The base table’s structure – either a heap or with a clustered index – affects the structure of your non-clustered index.
A non-clustered index can have non-key columns included in the leaf level using INCLUDE.
Create a filtered index by adding a WHERE clause to the CREATE INDEX command.
IGNORE_DUP_KEY doesn’t allow the uniqueness of the index to be violated; instead, it makes a violation in a multi-row modification non-fatal to the non-violating rows. The violating rows are discarded.
If a non-clustered index is non-unique, a copy of the clustered index key or the RID will appear on the non-leaf levels as well as the leaf level of the index.
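A sketch of IGNORE_DUP_KEY, using hypothetical names. The duplicate row in the multi-row INSERT is discarded with a warning instead of failing the whole statement:

```sql
CREATE TABLE dbo.Codes (Code int NOT NULL);

CREATE UNIQUE INDEX uix_Codes ON dbo.Codes (Code)
    WITH (IGNORE_DUP_KEY = ON);

-- The second 1 is discarded ("Duplicate key was ignored.");
-- the non-violating rows 1, 2 and 3 are inserted successfully.
INSERT INTO dbo.Codes VALUES (1), (2), (1), (3);
```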

Filtered Indexes

Filtered indexes can be used to enforce uniqueness – e.g. you can have any number of Ancillary Add On Codes but only one where Active = True, so you can create a unique index on Ancillary Add On Codes with the filter WHERE Active = True. You may also have a column which accepts either NULLs or unique values – if not NULL it must be unique. (A plain unique index will only allow one NULL entry.)
Unique if not Null is useful for SPARSE columns.
Useful for busy queries over a narrow range.
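Sketches of both patterns, with hypothetical table and column names:

```sql
-- At most one active row per add-on code; inactive duplicates are allowed.
CREATE UNIQUE INDEX uix_ActiveCode
    ON dbo.AncillaryAddOn (AddOnCode)
    WHERE Active = 1;

-- "Unique if not NULL": any number of NULLs, but non-NULL values must be unique.
CREATE UNIQUE INDEX uix_OptionalRef
    ON dbo.Orders (ExternalRef)
    WHERE ExternalRef IS NOT NULL;
```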

Duplicate Indexes

SQL Server will allow you to create multiple identical indexes – which will need to be maintained separately and consume excess space on your disks.
To spot a duplicate index you need to consider these points:
Include columns only appear at the leaf level of the non-clustered index.
The clustered index key or the RID (for a heap) will appear at the leaf level of a non-clustered index.
A non-unique non-clustered index will also have the clustered key or RID at the non-leaf levels.
Columns must appear in the same order for two indexes to be duplicates.
Table A
Col A – Clustered key
Col B
Col C
Col D
Col E
Index A – Non Unique – on ColB
Index B – Non Unique – on ColB, ColA
Index C – unique – on ColB, ColA
Index D – unique on ColA, ColB
Index E – Non Unique – on ColB – Include ColA
Index F – Non Unique – on ColB
Index A = Index B: although A is declared on ColB alone, the clustering key ColA is added at every level of the non-unique index, so both end up with ColB, ColA at the leaf and non-leaf levels.
Index C != Index D, because the column order is different.
Index E = Index F = Index C: all have the same non-leaf columns (ColB and ColA) and the same leaf-level columns (ColB and ColA).
You don't need to add or INCLUDE a clustered index column in a non-clustered index, as it is there by default.
Index  Index columns             Non-leaf level  Leaf level
A      ColB (non-unique)         ColB, ColA      ColB, ColA
B      ColB, ColA (non-unique)   ColB, ColA      ColB, ColA
C      ColB, ColA (unique)       ColB, ColA      ColB, ColA
D      ColA, ColB (unique)       ColA, ColB      ColA, ColB
E      ColB INCLUDE ColA         ColB, ColA      ColB, ColA
F      ColB (non-unique)         ColB, ColA      ColB, ColA
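A starting point for spotting candidate duplicates is to list each index's key columns in declared order and compare. This is my own sketch – it does not account for the implicit clustering-key columns discussed above, and the table name is a placeholder:

```sql
-- List every index with its key columns in declared order; indexes on the
-- same table with the same column list are duplicate candidates.
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name AS index_name,
       i.is_unique,
       STUFF((SELECT ', ' + c.name
              FROM sys.index_columns ic
              JOIN sys.columns c
                ON c.object_id = ic.object_id AND c.column_id = ic.column_id
              WHERE ic.object_id = i.object_id
                AND ic.index_id = i.index_id
                AND ic.is_included_column = 0
              ORDER BY ic.key_ordinal
              FOR XML PATH('')), 1, 2, '') AS key_columns
FROM sys.indexes i
WHERE i.object_id = OBJECT_ID('dbo.TableA')  -- placeholder table name
ORDER BY key_columns;
```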

A good blog post on duplicate indexes
Kimberly Tripp – Duplicate Indexes
SQL University: Index Usage (The Scary DBA)

MCM Training videos on indexing
Indexing Strategies
Index Internals
The Clustered Index Debate

And that book again – I can't overstate how much I like this book…
SQL Server 2008 Internals – Chapter 6 Indexes

I recently added a filtered index to a table without any problems. Later that night jobs began to fail with the error:

Description: Executing the query "NameOfStoredProcedure" failed with the following error: "UNKNOWN TOKEN failed because the following SET options have incorrect settings: 'QUOTED_IDENTIFIER'. Verify that SET options are correct for use with indexed views and/or indexes on computed columns and/or filtered indexes and…

The problem was with the QUOTED_IDENTIFIER setting for the stored procedure. For the query to work, QUOTED_IDENTIFIER needs to be set to ON. The problem is that each stored procedure inherits the settings that were in effect when it was created.

You can check these settings by right-clicking on the stored proc and selecting properties. In the general tab there is a list of options which includes the QUOTED_IDENTIFIER option. These values will be set to either True or False.

The default value for QUOTED_IDENTIFIER is ON.
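Rather than checking procedures one at a time in the properties dialog, the captured setting can be queried in bulk (this query is my addition):

```sql
-- Find procedures created with QUOTED_IDENTIFIER OFF; these will fail
-- when they touch a table with a filtered index.
SELECT OBJECT_SCHEMA_NAME(m.object_id) + '.' + OBJECT_NAME(m.object_id) AS proc_name,
       m.uses_quoted_identifier
FROM sys.sql_modules m
JOIN sys.objects o ON o.object_id = m.object_id
WHERE o.[type] = 'P'
  AND m.uses_quoted_identifier = 0;
```

Any procedure this returns needs to be dropped and recreated with SET QUOTED_IDENTIFIER ON in effect.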

From BOL


When a stored procedure is created, the SET QUOTED_IDENTIFIER and SET ANSI_NULLS settings are captured and used for subsequent invocations of that stored procedure.

Create Procedure

The Database Engine saves the settings of both SET QUOTED_IDENTIFIER and SET ANSI_NULLS when a Transact-SQL procedure is created or modified. These original settings are used when the procedure is executed. Therefore, any client session settings for SET QUOTED_IDENTIFIER and SET ANSI_NULLS are ignored when the procedure is running.

Other SET options, such as SET ARITHABORT, SET ANSI_WARNINGS, or SET ANSI_PADDING, are not saved when a procedure is created or modified. If the logic of the procedure depends on a particular setting, include a SET statement at the start of the procedure to guarantee the appropriate setting. When a SET statement is executed from a procedure, the setting remains in effect only until the procedure has finished running. The setting is then restored to the value it had when the procedure was called. This enables individual clients to set the options they want without affecting the logic of the procedure.

Any SET statement can be specified inside a procedure, except SET SHOWPLAN_TEXT and SET SHOWPLAN_ALL. These must be the only statements in the batch. The SET option chosen remains in effect during the execution of the procedure and then reverts to its former setting.

This goes through all databases and finds any indexes with fragmentation greater than 15%, then reindexes them. The loop near the beginning is there because sys.dm_db_index_physical_stats needs a database ID. If you pass NULL as the database ID it will go through all databases – which causes a problem if any of them is not writeable (read-only, offline, in recovery).

To use the NULL options, the account running the command needs the VIEW SERVER STATE permission.
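Granting it looks like this (the login name is a placeholder):

```sql
-- VIEW SERVER STATE is a server-level permission, granted from master.
USE master;
GRANT VIEW SERVER STATE TO [DOMAIN\MaintenanceAccount];
```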

This task will re-index any index where fragmentation is greater than 15%, provided there are more than 50 pages in the index. If there are fewer than that, the chances are the index isn't being used and the query optimiser will perform a table scan instead.

I took the initial idea of using this function to populate a table for reindexing from a blog but can’t remember which. I have adapted it to suit the tasks I wanted to perform.

I find STATISTICS_NORECOMPUTE = OFF a little unintuitive. It means "do not not recompute stats" – or in English: recompute the stats.

On most servers this task takes between 5 and 20 minutes, so I run it every night. There are a few servers, though, where it runs on a weekly basis instead.

I also add an extra line after AND ps.index_type_desc <> 'heap' to exclude some databases: AND [Name] NOT IN ('LIST OF DBNAMES TO EXCLUDE') – the names should look like ('dbname', 'dbname2'). I use this where a database does a lot of ETL work during the night, so it is best to run a separate re-index job for that database once the ETL has finished.

If speed is a problem you can replace
FROM sys.dm_db_index_physical_stats (@dbCountID, NULL, NULL, NULL, 'detailed')
with
FROM sys.dm_db_index_physical_stats (@dbCountID, NULL, NULL, NULL, NULL)
This will then use the default LIMITED mode to gather statistics on the indexes – of course, the faster version does not return such accurate data.

USE tempdb;

IF OBJECT_ID('indexData') IS NOT NULL
    DROP TABLE [indexData];

CREATE TABLE [dbo].[indexData](
    [dbName]     varchar(250) NOT NULL,
    [di]         int          NOT NULL,
    [oi]         varchar(150) NOT NULL,
    [Frag]       int          NOT NULL,
    [pages]      int          NOT NULL,
    [iType]      varchar(250) NOT NULL,
    [oName]      varchar(250) NULL,
    [schemaName] varchar(250) NULL);

DECLARE @dbCountID int;
DECLARE @nextID int;
DECLARE @topID int;
SET @dbCountID = 0;
SELECT @topID = MAX(database_id) FROM master.sys.databases;

-- Loop through each database in turn; dm_db_index_physical_stats needs a database ID.
WHILE @dbCountID < @topID
BEGIN
    -- Find the next online, writeable database; stop when there are no more.
    SET @nextID = NULL;
    SELECT @nextID = MIN(database_id)
    FROM master.sys.databases
    WHERE database_id > @dbCountID
      AND is_read_only = 0 AND state_desc = 'ONLINE';
    IF @nextID IS NULL BREAK;
    SET @dbCountID = @nextID;

    INSERT INTO [indexData] ([dbName], [di], [oi], [Frag], [pages], [iType])
    SELECT DISTINCT DB_NAME(@dbCountID), @dbCountID,
           ps.object_id, ps.avg_fragmentation_in_percent, ps.page_count, ps.index_type_desc
    FROM sys.dm_db_index_physical_stats (@dbCountID, NULL, NULL, NULL, 'detailed') ps
    LEFT OUTER JOIN master.sys.databases AS sd
        ON ps.database_id = sd.database_id
    WHERE ps.database_id > 4            -- skip the system databases
      AND sd.is_read_only = 0 AND sd.state_desc = 'ONLINE'
      AND ps.page_count > 50
      AND ps.avg_fragmentation_in_percent > 15
      AND ps.index_type_desc <> 'HEAP'
      AND sd.database_id = @dbCountID;
END

-- Resolve object and schema names in the context of each database.
DECLARE @dbName varchar(250);
DECLARE dbCursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT DISTINCT [dbName] FROM [indexData];

OPEN dbCursor;
FETCH NEXT FROM dbCursor INTO @dbName;
WHILE @@FETCH_STATUS = 0
BEGIN
    DECLARE @sql varchar(500);
    SET @sql = 'USE [' + @dbName + '];'
             + ' UPDATE tempdb..[indexData]'
             + ' SET oName = OBJECT_NAME(oi),'
             + ' schemaName = (SELECT SCHEMA_NAME(sobj.schema_id) FROM sys.objects sobj WHERE sobj.object_id = oi)'
             + ' WHERE dbName = ''' + @dbName + ''';';
    EXEC (@sql);
    FETCH NEXT FROM dbCursor INTO @dbName;
END
CLOSE dbCursor;
DEALLOCATE dbCursor;

-- Rebuild every index on each fragmented object.
DECLARE @dbName2 varchar(5000);
DECLARE dbCursor2 CURSOR LOCAL FAST_FORWARD FOR
    SELECT DISTINCT '[' + dbName + '].[' + schemaName + '].[' + oName + ']'
    FROM [indexData]
    WHERE oName NOT LIKE 'perfValuesForProcess%';

OPEN dbCursor2;
FETCH NEXT FROM dbCursor2 INTO @dbName2;
WHILE @@FETCH_STATUS = 0
BEGIN
    DECLARE @sql2 varchar(8000);
    SET @sql2 = 'ALTER INDEX ALL ON ' + @dbName2
              + ' REBUILD WITH (SORT_IN_TEMPDB = ON, STATISTICS_NORECOMPUTE = OFF);';
    EXEC (@sql2);
    FETCH NEXT FROM dbCursor2 INTO @dbName2;
END
CLOSE dbCursor2;
DEALLOCATE dbCursor2;

DROP TABLE [indexData];

MSDN: sys.dm_db_index_physical_stats