SQL Server Waits

When diagnosing performance problems with SQL Server, one of the first things I do is check the wait conditions on the server.  The current wait conditions on the server can give insight into what is affecting performance.  They can help identify the server resource being stressed (CPU, I/O, locking, and so on), which should narrow down the search for the cause of the problem.

The T-SQL Script

I use a T-SQL script to check wait conditions. I don’t remember where I originally acquired this script (I didn’t write it myself), but it has proven to be very useful.  I have confirmed it to work on SQL Server 2005 and 2008.

The script queries the dm_os_wait_stats dynamic management view and performs calculations on the information found there. It outputs a list of the waits are occurring on the server, including the percentage each wait contributes to the total wait time.

Here is the script:

IF NOT EXISTS(SELECT id FROM tempdb.dbo.sysobjects WHERE [name] like ‘#waits%’)
BEGIN 
  — Get starting statistics 
  CREATE TABLE #Waits( 
    WaitType nvarchar(60) NOT NULL, 
    WaitTime bigint NULL, 
    [%Waiting] decimal(12, 2) NULL, 
    MonitorDate datetime NOT NULL 
  )

  INSERT INTO #Waits 
  SELECT 
    wait_type, 
    wait_time_ms / 1000, 
    CONVERT(DECIMAL(12,2), wait_time_ms * 100.0 / SUM(wait_time_ms) OVER()), 
    GETDATE() 
  FROM sys.dm_os_wait_stats 
  WHERE wait_type NOT LIKE ‘%SLEEP%’
END

— Get the current stats and the percent change since the initial capture
SELECT 
NEW.WaitType, 
        NEW.WaitTime AS NewWaitTime, 
        NEW.[%Waiting] AS [New%Waiting], 
        OLD.WaitTime AS OldWaitTime, 
        OLD.[%Waiting] AS [Old%Waiting], 
        NEW.WaitTime – OLD.WaitTime AS Change
INTO    #tmpStats
FROM    ( 
        SELECT WaitType = wait_type, 
               WaitTime = wait_time_ms / 1000, 
               [%Waiting] = CONVERT(DECIMAL(12,2),
                           
wait_time_ms * 100.0 / SUM(wait_time_ms) OVER()) 
        FROM  sys.dm_os_wait_stats 
        WHERE wait_type NOT LIKE ‘%SLEEP%’ 
        ) NEW 
        INNER JOIN 
        ( 
        SELECT WaitType, WaitTime, [%Waiting] FROM #Waits 
        ) OLD 
           ON NEW.WaitType = OLD.WaitType

DECLARE @totalchange DECIMAL(10, 2)
SELECT @totalchange = SUM(change) FROM #tmpStats

SELECT  *,
        CASE WHEN @TotalChange > 0
        THEN CONVERT(DECIMAL(10,2), CONVERT(DECIMAL(10,2),change) / @totalchange * 100)
        ELSE 0
        END AS [%WaitingSinceStart]
FROM    #tmpStats
ORDER BY [%WaitingSinceStart] DESC

DROP TABLE #tmpStats

Usage

To analyze the wait conditions on a server, perform the following steps:

1) Open a query window in the SQL Server Management Studio

2) Execute the script

Upon this first execution of the script, you will get little useful information.  Really all the first execution does is populate a temp table with a snapshot of the current wait state of the server (selected from the dm_os_waits_stats view).

3) DON’T CLOSE THE QUERY WINDOW! 

You want to maintain the temp table created by the first execution of the script. 

4) Wait 30 seconds, or 1 minute, or 5 minutes, or 10 minutes (or whatever makes sense). 

The length of time you wait depends on the load on your server.  The heavier the load, the less time you’ll need to wait. 

5) Execute the script again. 

It will select a new snapshot of the server’s wait state from the dm_os_waits_stats view, compare it to the original snapshot stored in the temporary table, and calculate the differences in wait times between the two.  Now the output will accurately show the wait conditions that have occurred since the first time the script was run. 

You can repeat this process as many times as necessary.  As long as you reuse the same the query window (without dropping the temp table), you’ll continue to see the cumulative statistics since the first execution of the script.

NOTE: It is important to first execute this process during a period of normal database usage, so that the baseline wait conditions on the server can be established.  There are always wait conditions occurring, even when a server is under no stress, so it is important to understand what is “normal” for the server.  That way, problems can be more easily identified when the wait conditions deviate from the baseline conditions.

Analyzing the Script Output

There are many different types of waits that might occur.  Here is an example of the script output:

                      New      New     Old      Old            %Waiting
WaitType              WaitTime Waiting WaitTime Waiting Change SinceStart
——————— ——– ——- ——– ——- —— ———-
SQLTRACE_BUFFER_FLUSH 28156047 49.16   28155996 49.16   51     35.17
CXPACKET              4202791  7.34    4202751  7.34    40     27.59
PAGEIOLATCH_SH        2040584  3.56    2040562  3.56    22     15.17
SOS_SCHEDULER_YIELD   12441434 21.72   12441417 21.72   17     11.72
LATCH_EX              4595795  8.02    4595784  8.02    11     7.59
PAGELATCH_UP          27982    0.05    27981    0.05    1      0.69
PAGEIOLATCH_EX        91517    0.16    91516    0.16    1      0.69
IO_COMPLETION         42555    0.07    42554    0.07    1      0.69
PAGELATCH_EX          30927    0.05    30926    0.05    1      0.69
PAGELATCH_KP          0        0.00    0        0.00    0      0.00

 
On many servers, SQLTRACE_BUFFER_FLUSH will take up the majority of the wait time when the server is under no stress.  That wait type is related to the default server trace that is enabled in SQL Server 2005 and 2008.  Generally, it is a good thing if this wait type is taking up the majority of the wait time on the server.

CXPACKET and SOS_SCHEDULER_YIELD are related to parallelization and CPU usage.  Any sustained increase in wait times for these two wait types indicates CPU pressure.  An increase in PAGEIOLATCH_SH waits indicates IO issues.  Increased CPU and IO waits may indicate a problem with a SQL query, high traffic conditions, or a non-SQL-related server issue.  If waits like LCK_M_IX, LCK_M_X, or LCK_M_S (all generally very low) begin to increase, it is an indication of locking/blocking issues.  Many other wait types are documented here.

Generally, wait statistics will not pinpoint the exact cause of performance problems, but they can be very useful for identifying the areas that need investigation.

Advertisements

3 Responses to SQL Server Waits

  1. Pingback: Check if SQL Server is Under CPU Pressure « LichtenBytes

  2. Pingback: SQL Server CPU Waits « LichtenBytes

  3. Pingback: SQL Server – Finding the Source of Excessive I/O | LichtenBytes

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: