Check if SQL Server is Under CPU Pressure

Last fall, I wrote about how to check wait conditions in SQL Server.  As I pointed out in that post, if a high percentage of your server’s wait time is due to waits of the type SOS_SCHEDULER_YIELD, you may be experiencing CPU pressure on your server.

You can use the following query to further investigate if your server is waiting on CPU resources.

SELECT scheduler_id, current_tasks_count, runnable_tasks_count
FROM sys.dm_os_schedulers
WHERE scheduler_id < 255

Here is an example of the output of this query:

scheduler_id current_tasks_count runnable_tasks_count
———— ——————- ——————–
0            8                   0
1            11                  0

If you run this query frequently and see the runnable_tasks_count consistenty above zero, that is an indication that the server is in fact CPU-bound.  If you see runnable_tasks_count in double digits for any length of time, that is cause for immediate concern.

The sys.dm_os_schedulers dynamic management view which is used in this query gives information on the SQL Server schedulers.  Schedulers are responsible for allocating system resources (such as CPU time) to SQL Server processes.  sys.dm_os_schedulers contains one row per scheduler for schedulers mapped to individual processors.  “Active worker” schedulers are those dedicated to regular user-controlled tasks; these are assigned IDs less than 255.  Schedulers with IDs above 255 are called “hidden”schedulers; these are used by system processes.  You should be more concerned with the “active worker” schedulers, and that’s why the query explicitly excludes schedulers 255 and up.

Optionally, instead of checking for particular scheduler_id values, you could modify the query to check the scheduler status, as follows:

SELECT scheduler_id, current_tasks_count, runnable_tasks_count
FROM sys.dm_os_schedulers
WHERE status = ‘VISIBLE ONLINE’

The WHERE clause (status = ‘VISIBLE ONLINE’) will limit the result set to only those schedulers that are available for user processes.

These queries should work in SQL Server 2005, 2008, 2008 R2, and 2012.

One final point… before running out to upgrade your server, you should ensure that your queries and stored procedures are using the correct plans. An incorrect or corrupt plan can cause SQL Server to consume far more of the CPU resources than should normally be necessary.  In that case, adding physical resources to your server is not the answer.

Determine the Space Used by Each Table in a SQL Server Database

The following script will work on SQL Server 2005 and 2008 (and 2012?).  Use it to determine a number of metrics about each table in a database, including number of rows, data size, index size, unused space, and total space reserved for the table.

SET NOCOUNT ON

DECLARE @table_name VARCHAR(500)

DECLARE @SpaceUsed TABLE (    
    tablename sysname,
    row_count INT,
    reserved VARCHAR(50),
    data VARCHAR(50),
    index_size VARCHAR(50),
    unused VARCHAR(50)
)

DECLARE curTables CURSOR FOR 
SELECT s.name + ‘.’ + t.name  
FROM sys.tables t  INNER JOIN sys.schemas s
    ON t.schema_id = s.schema_id

OPEN curTables
FETCH NEXT FROM curTables INTO @table_name
WHILE @@FETCH_STATUS = 0 
BEGIN  
    SET @table_name = REPLACE(REPLACE(@table_name, ‘[',''), ']‘, ”)

    — make sure the object exists before calling sp_spacedused
    IF EXISTS(SELECT OBJECT_ID FROM sys.objects WHERE OBJECT_ID = OBJECT_ID(@table_name))
    BEGIN
        INSERT INTO @SpaceUsed EXEC sp_spaceused @table_name, false
    END
    FETCH NEXT FROM curTables INTO @table_name
END
CLOSE curTables
DEALLOCATE curTables

– Compute the total size of each table and add the schemaname to the result set
SELECT    s.name as schemaname,
        su.tablename,
        su.row_count,
        su.data,
        su.index_size,
        su.unused,
        CONVERT(decimal, CONVERT(bigint, REPLACE(su.data, ‘ KB’, ”)) +
            CONVERT(bigint, REPLACE(su.index_size, ‘ KB’, ”)) +
            CONVERT(bigint, REPLACE(su.unused, ‘ KB’, ”))) / 1000 AS [total (MB)]
FROM    @SpaceUsed su INNER JOIN sys.tables t
            ON t.name = su.tablename
        INNER JOIN sys.schemas s
            ON t.schema_id = s.schema_id
ORDER BY 
        [total (MB)] DESC,
        schemaname,
        tablename

 

Here is an example of the output you can expect from this script:

schemaname tablename     row_count   data      index_size unused   total (MB)
———- ————- ———– ——— ———- ——– ———-
dbo        Page          116213      71864 KB  66304 KB   344 KB   138.512000
dbo        PageName      164508      17040 KB  37520 KB   336 KB   54.896000
dbo        IndicatedPage 103127      6208 KB   32 KB      40 KB    6.280000
dbo        Page_PageType 115770      4840 KB   32 KB      0 KB     4.872000
dbo        Item          254         136 KB    144 KB     48 KB    0.328000
dbo        Title         77          64 KB     224 KB     0 KB     0.288000

SQL Server Waits

When diagnosing performance problems with SQL Server, one of the first things I do is check the wait conditions on the server.  The current wait conditions on the server can give insight into what is affecting performance.  They can help identify the server resource being stressed (CPU, I/O, locking, and so on), which should narrow down the search for the cause of the problem.

The T-SQL Script

I use a T-SQL script to check wait conditions. I don’t remember where I originally acquired this script (I didn’t write it myself), but it has proven to be very useful.  I have confirmed it to work on SQL Server 2005 and 2008.

The script queries the dm_os_wait_stats dynamic management view and performs calculations on the information found there. It outputs a list of the waits are occurring on the server, including the percentage each wait contributes to the total wait time.

Here is the script:

IF NOT EXISTS(SELECT id FROM tempdb.dbo.sysobjects WHERE [name] like ‘#waits%’)
BEGIN 
  — Get starting statistics 
  CREATE TABLE #Waits( 
    WaitType nvarchar(60) NOT NULL, 
    WaitTime bigint NULL, 
    [%Waiting] decimal(12, 2) NULL, 
    MonitorDate datetime NOT NULL 
  )

  INSERT INTO #Waits 
  SELECT 
    wait_type, 
    wait_time_ms / 1000, 
    CONVERT(DECIMAL(12,2), wait_time_ms * 100.0 / SUM(wait_time_ms) OVER()), 
    GETDATE() 
  FROM sys.dm_os_wait_stats 
  WHERE wait_type NOT LIKE ‘%SLEEP%’
END

– Get the current stats and the percent change since the initial capture
SELECT 
NEW.WaitType, 
        NEW.WaitTime AS NewWaitTime, 
        NEW.[%Waiting] AS [New%Waiting], 
        OLD.WaitTime AS OldWaitTime, 
        OLD.[%Waiting] AS [Old%Waiting], 
        NEW.WaitTime – OLD.WaitTime AS Change
INTO    #tmpStats
FROM    ( 
        SELECT WaitType = wait_type, 
               WaitTime = wait_time_ms / 1000, 
               [%Waiting] = CONVERT(DECIMAL(12,2),
                           
wait_time_ms * 100.0 / SUM(wait_time_ms) OVER()) 
        FROM  sys.dm_os_wait_stats 
        WHERE wait_type NOT LIKE ‘%SLEEP%’ 
        ) NEW 
        INNER JOIN 
        ( 
        SELECT WaitType, WaitTime, [%Waiting] FROM #Waits 
        ) OLD 
           ON NEW.WaitType = OLD.WaitType

DECLARE @totalchange DECIMAL(10, 2)
SELECT @totalchange = SUM(change) FROM #tmpStats

SELECT  *,
        CASE WHEN @TotalChange > 0
        THEN CONVERT(DECIMAL(10,2), CONVERT(DECIMAL(10,2),change) / @totalchange * 100)
        ELSE 0
        END AS [%WaitingSinceStart]
FROM    #tmpStats
ORDER BY [%WaitingSinceStart] DESC

DROP TABLE #tmpStats

Usage

To analyze the wait conditions on a server, perform the following steps:

1) Open a query window in the SQL Server Management Studio

2) Execute the script

Upon this first execution of the script, you will get little useful information.  Really all the first execution does is populate a temp table with a snapshot of the current wait state of the server (selected from the dm_os_waits_stats view).

3) DON’T CLOSE THE QUERY WINDOW! 

You want to maintain the temp table created by the first execution of the script. 

4) Wait 30 seconds, or 1 minute, or 5 minutes, or 10 minutes (or whatever makes sense). 

The length of time you wait depends on the load on your server.  The heavier the load, the less time you’ll need to wait. 

5) Execute the script again. 

It will select a new snapshot of the server’s wait state from the dm_os_waits_stats view, compare it to the original snapshot stored in the temporary table, and calculate the differences in wait times between the two.  Now the output will accurately show the wait conditions that have occurred since the first time the script was run. 

You can repeat this process as many times as necessary.  As long as you reuse the same the query window (without dropping the temp table), you’ll continue to see the cumulative statistics since the first execution of the script.

NOTE: It is important to first execute this process during a period of normal database usage, so that the baseline wait conditions on the server can be established.  There are always wait conditions occurring, even when a server is under no stress, so it is important to understand what is “normal” for the server.  That way, problems can be more easily identified when the wait conditions deviate from the baseline conditions.

Analyzing the Script Output

There are many different types of waits that might occur.  Here is an example of the script output:

                      New      New     Old      Old            %Waiting
WaitType              WaitTime Waiting WaitTime Waiting Change SinceStart
——————— ——– ——- ——– ——- —— ———-
SQLTRACE_BUFFER_FLUSH 28156047 49.16   28155996 49.16   51     35.17
CXPACKET              4202791  7.34    4202751  7.34    40     27.59
PAGEIOLATCH_SH        2040584  3.56    2040562  3.56    22     15.17
SOS_SCHEDULER_YIELD   12441434 21.72   12441417 21.72   17     11.72
LATCH_EX              4595795  8.02    4595784  8.02    11     7.59
PAGELATCH_UP          27982    0.05    27981    0.05    1      0.69
PAGEIOLATCH_EX        91517    0.16    91516    0.16    1      0.69
IO_COMPLETION         42555    0.07    42554    0.07    1      0.69
PAGELATCH_EX          30927    0.05    30926    0.05    1      0.69
PAGELATCH_KP          0        0.00    0        0.00    0      0.00

 
On many servers, SQLTRACE_BUFFER_FLUSH will take up the majority of the wait time when the server is under no stress.  That wait type is related to the default server trace that is enabled in SQL Server 2005 and 2008.  Generally, it is a good thing if this wait type is taking up the majority of the wait time on the server.

CXPACKET and SOS_SCHEDULER_YIELD are related to parallelization and CPU usage.  Any sustained increase in wait times for these two wait types indicates CPU pressure.  An increase in PAGEIOLATCH_SH waits indicates IO issues.  Increased CPU and IO waits may indicate a problem with a SQL query, high traffic conditions, or a non-SQL-related server issue.  If waits like LCK_M_IX, LCK_M_X, or LCK_M_S (all generally very low) begin to increase, it is an indication of locking/blocking issues.  Many other wait types are documented here.

Generally, wait statistics will not pinpoint the exact cause of performance problems, but they can be very useful for identifying the areas that need investigation.

Follow

Get every new post delivered to your Inbox.