SQL Server Waits
October 18, 2011 3 Comments
When diagnosing performance problems with SQL Server, one of the first things I do is check the wait conditions on the server. The current wait conditions on the server can give insight into what is affecting performance. They can help identify the server resource being stressed (CPU, I/O, locking, and so on), which should narrow down the search for the cause of the problem.
The T-SQL Script
I use a T-SQL script to check wait conditions. I don’t remember where I originally acquired this script (I didn’t write it myself), but it has proven to be very useful. I have confirmed it to work on SQL Server 2005 and 2008.
The script queries the dm_os_wait_stats dynamic management view and performs calculations on the information found there. It outputs a list of the waits are occurring on the server, including the percentage each wait contributes to the total wait time.
Here is the script:
To analyze the wait conditions on a server, perform the following steps:
1) Open a query window in the SQL Server Management Studio
2) Execute the script
Upon this first execution of the script, you will get little useful information. Really all the first execution does is populate a temp table with a snapshot of the current wait state of the server (selected from the dm_os_waits_stats view).
3) DON’T CLOSE THE QUERY WINDOW!
You want to maintain the temp table created by the first execution of the script.
4) Wait 30 seconds, or 1 minute, or 5 minutes, or 10 minutes (or whatever makes sense).
The length of time you wait depends on the load on your server. The heavier the load, the less time you’ll need to wait.
5) Execute the script again.
It will select a new snapshot of the server’s wait state from the dm_os_waits_stats view, compare it to the original snapshot stored in the temporary table, and calculate the differences in wait times between the two. Now the output will accurately show the wait conditions that have occurred since the first time the script was run.
You can repeat this process as many times as necessary. As long as you reuse the same the query window (without dropping the temp table), you’ll continue to see the cumulative statistics since the first execution of the script.
NOTE: It is important to first execute this process during a period of normal database usage, so that the baseline wait conditions on the server can be established. There are always wait conditions occurring, even when a server is under no stress, so it is important to understand what is “normal” for the server. That way, problems can be more easily identified when the wait conditions deviate from the baseline conditions.
Analyzing the Script Output
There are many different types of waits that might occur. Here is an example of the script output:
On many servers, SQLTRACE_BUFFER_FLUSH will take up the majority of the wait time when the server is under no stress. That wait type is related to the default server trace that is enabled in SQL Server 2005 and 2008. Generally, it is a good thing if this wait type is taking up the majority of the wait time on the server.
CXPACKET and SOS_SCHEDULER_YIELD are related to parallelization and CPU usage. Any sustained increase in wait times for these two wait types indicates CPU pressure. An increase in PAGEIOLATCH_SH waits indicates IO issues. Increased CPU and IO waits may indicate a problem with a SQL query, high traffic conditions, or a non-SQL-related server issue. If waits like LCK_M_IX, LCK_M_X, or LCK_M_S (all generally very low) begin to increase, it is an indication of locking/blocking issues. Many other wait types are documented here.
Generally, wait statistics will not pinpoint the exact cause of performance problems, but they can be very useful for identifying the areas that need investigation.