Identify Wait Events in MS SQL 2005 that Impacts Performance of the Application

21 Apr

While working with MSSQL 2005, I suggest having an eye on the wait events of the database server. Wait events happening at the servers often at times are the main reasons as why we quite often see the high response time at the UI Level. If the query is waiting for something at the database level, then the transaction which was supposed to take 2 secs will end up taking 3 secs and quite often unless we have clear understanding and knowledge on the database performance tuning aspects basics, it becomes quite a hard task to identify and fix such performance issues.

Most often in database world, Wait events normally falls in 3 categories,

· Resource Wait: This often is seen when the threads wait for some resources like IO/CPU etc.

· Queue Wait: Occur when a worker thread is idle, waiting for work to fall in its plate.

· External waits: Occurs when waiting for an external event like fetching data from some other table or doing some custom stuff in the box outside DB layer.

Let’s think about what happens when we fire a query to update a row, I feel below sequence of event happens

· The optimizer uses statistics to help decide on the best way to execute the query.

· The query is executed.

· The row will be read from disk into memory.(Exactly the reason as why I always say DB action are memory bound)

· The update will be written to the transaction log.

· The row will be changed in memory.

· Confirmation will be sent back to the client.

Wait events can happen at any of these above steps and in fact they are can identified by the help of pulling data out from the DMV Wait stats tables and correlating these information with Perf Mon counters.

Wait events timing at the sql server level can be identified by querying below DMV’s in MS SQL 2005,

· sys.dm_exec_requests

· sys.dm_os_waiting_tasks

· sys.dm_os_wait_stats

Let me give a you a quick walkthrough of each of the DMV’s above and illustrate as how we can identify some issues,

sys.dm_exec_requests – Session Level Information

This DMV gives information at session level, i.e at the higher level as which SPID is consuming high CPU, or having high waits times. System level processes and queries which are running parallel are often hard to detect with just this DMV’s.However this DMV’s gives some information on Wait Events.

Below is copy pasted information from the MS SQL team site which I feel is much more easier to understand with regard to this DMV,

Each SQL Server session has a unique session_id in the system DMV sys.dm_exec_requests. The stored procedure sp_who2 provides a list of these sessions in addition to other connection information such as command, resource, wait types, wait time, and status. User queries will have a session_id > 50. Common status values are ‘running’, ‘runnable’ and ‘suspended’, as described in the Execution Model discussion. A session status of ‘Sleeping’ indicates SQL Server is waiting for the next SQL Server command.

PS: sp_who2 can be run with SSMS studio, just type sp_who2 in query window and execute the command. It gives you very valuable information of the running processes at the Database level at that particular point of time.

sys.dm_os_waiting_tasks – All Waiting Tasks

This DMV gives information about the entire waiting task that is waiting for some reason. As long as tasks are waiting for something, they can seen here by querying these DMV’s.Again below is the MS explanation of this DMV’s which I find is quite easy to understand,

The waiter list that shows all waiting sessions and the reasons for the waits can be found in the DMV sys.dm_os_waiting_tasks. The session_id, wait type, and associated wait time can be seen. In addition, if the session is blocked from acquiring a lock, the session holding (known as blocking) the lock as well as the blocked resource is shown in the columns blocking_session_id and resource.

Below is the one of the SP which I had in my library which I often use in case I have access to DB,

SELECT wt.*,

st.text

FROM sys.dm_os_waiting_tasks wt LEFT JOIN sys.dm_exec_requests er

ON wt.waiting_task_address = er.task_address

OUTER APPLY sys.dm_exec_sql_text(er.sql_handle) st

ORDER BY wt.session_id

sys.dm_os_wait_stats – Aggregated time by Wait Type

This DMV is an aggregation of all wait times from all queries since SQL Server started and is recommended approach from MS for server wide tuning.

Below is the explanation of this DMV from MS,

Sys.dm_os_wait_stats is the DMV that contains wait statistics, which are aggregated across all session ids since the last restart of SQL Server or since the last time that the wait statistics were reset manually using DBCC SQLPERF (‘sys.dm_os_wait_stats’, CLEAR). Resetting wait statistics can be helpful before running a test or workload.

Anytime a session_id waits for a resource, the session_id is moved to the waiter list along with an associated wait type. The DMV sys.dm_os_waiting_tasks shows the waiter list at a given moment in time. Waits for all session_id are aggregated in sys.dm_os_wait_stats. While collecting information from this DMV, its often requires that we clear stats and run the test with before and after change,

Stats can be cleared by running in SSMS,

DBCC sqlperf (‘sys.dm_os_wait_stats’, clear)

The stored procedures track_waitstats_2005 and get_waitstats_2005 can be used to measure the wait statistics for a given workload.

Here is the track_waitstats_2005 SP

SELECT request_session_id AS Session,

resource_database_id AS DBID,

Resource_Type,

resource_description AS Resource,

request_type AS Type,

request_mode AS Mode,

request_status AS Status

FROM sys.dm_tran_locks

Here is the get_wait stats_2005 SP,

if exists (select * from sys.objects where object_id = object_id(N'[dbo].[get_waitstats_2005]’) and OBJECTPROPERTY(object_id, N’IsProcedure’) = 1)

    drop procedure [dbo].[get_waitstats_2005]

GO

CREATE proc [dbo].[get_waitstats_2005] (@report_format varchar(20)=’all’, @report_order varchar(20)=’resource’)

as

— this proc will create waitstats report listing wait types by percentage.  

—     (1) total wait time is the sum of resource & signal waits, @report_format=’all’ reports resource & signal

—    (2) Basics of execution model (simplified)

—        a. spid is running then needs unavailable resource, moves to resource wait list at time T0

—        b. a signal indicates resource available, spid moves to runnable queue at time T1

—        c. spid awaits running status until T2 as cpu works its way through runnable queue in order of arrival

—    (3) resource wait time is the actual time waiting for the resource to be available, T1-T0

—    (4) signal wait time is the time it takes from the point the resource is available (T1)

—          to the point in which the process is running again at T2.  Thus, signal waits are T2-T1

—    (5) Key questions: Are Resource and Signal time significant?

—        a. Highest waits indicate the bottleneck you need to solve for scalability

—        b. Generally if you have LOW% SIGNAL WAITS, the CPU is handling the workload e.g. spids spend move through runnable queue quickly

—        c. HIGH % SIGNAL WAITS indicates CPU can’t keep up, significant time for spids to move up the runnable queue to reach running status

—     (6) This proc can be run when track_waitstats is executing

— Revision 4/19/2005

— (1) add computation for CPU Resource Waits = Sum(signal waits / total waits)

— (2) add @report_order parm to allow sorting by resource, signal or total waits

set nocount on

declare @now datetime, @totalwait numeric(20,1), @totalsignalwait numeric(20,1), @totalresourcewait numeric(20,1)

    ,@endtime datetime,@begintime datetime

    ,@hr int, @min int, @sec int

if not exists (select 1 from sysobjects where id = object_id ( N'[dbo].[waitstats]’) and OBJECTPROPERTY(id, N’IsUserTable’) = 1)

begin

        raiserror(‘Error [dbo].[waitstats] table does not exist’, 16, 1) with nowait

        return

end

if lower(@report_format) not in (‘all’,’detail’,’simple’)

    begin

        raiserror (‘@report_format must be either ”all”,”detail”, or ”simple”’,16,1) with nowait

        return

    end

if lower(@report_order) not in (‘resource’,’signal’,’total’)

    begin

        raiserror (‘@report_order must be either ”resource”, ”signal”, or ”total”’,16,1) with nowait

        return

    end

if lower(@report_format) = ‘simple’ and lower(@report_order) <> ‘total’

    begin

        raiserror (‘@report_format is simple so order defaults to ”total”’,16,1) with nowait

        select @report_order = ‘total’

    end

select  @now=max(now),@begintime=min(now),@endtime=max(now)

from [dbo].[waitstats] where [wait_type] = ‘Total’

— subtract waitfor, sleep, and resource_queue from Total

select @totalwait = sum([wait_time_ms]) + 1, @totalsignalwait = sum([signal_wait_time_ms]) + 1 from waitstats 

–where [wait_type] not in (‘WAITFOR’,’SLEEP’,’RESOURCE_QUEUE’, ‘Total’, ‘***total***’) and now = @now

where [wait_type] not in (‘CLR_SEMAPHORE’,’LAZYWRITER_SLEEP’,’RESOURCE_QUEUE’,’SLEEP_TASK’,’SLEEP_SYSTEMTASK’,’Total’,’WAITFOR’, ‘***total***’) and now = @now

select @totalresourcewait = 1 + @totalwait – @totalsignalwait

— insert adjusted totals, rank by percentage descending

delete waitstats where [wait_type] = ‘***total***’ and now = @now

insert into waitstats select ‘***total***’,0,@totalwait,0,@totalsignalwait,@now

select ‘start time’=@begintime,’end time’=@endtime

        ,’duration (hh:mm:ss:ms)’=convert(varchar(50),@endtime-@begintime,14)

        ,’report format’=@report_format, ‘report order’=@report_order

if lower(@report_format) in (‘all’,’detail’)

    begin

—– format=detail, column order is resource, signal, total.  order by resource desc

    if lower(@report_order) = ‘resource’

    select [wait_type],[waiting_tasks_count]

        ,’Resource wt (T1-T0)’=[wait_time_ms]-[signal_wait_time_ms]

        ,’res_wt_%’=cast (100*([wait_time_ms] – [signal_wait_time_ms]) /@totalresourcewait as numeric(20,1))

        ,’Signal wt (T2-T1)’=[signal_wait_time_ms]

        ,’sig_wt_%’=cast (100*[signal_wait_time_ms]/@totalsignalwait as numeric(20,1))

        ,’Total wt (T2-T0)’=[wait_time_ms]

        ,’wt_%’=cast (100*[wait_time_ms]/@totalwait as numeric(20,1))

    from waitstats 

    where [wait_type] not in (‘CLR_SEMAPHORE’,’LAZYWRITER_SLEEP’,’RESOURCE_QUEUE’,’SLEEP_TASK’,’SLEEP_SYSTEMTASK’,’Total’,’WAITFOR’)

    and now = @now

    order by ‘res_wt_%’ desc

—– format=detail, column order signal, resource, total.  order by signal desc

    if lower(@report_order) = ‘signal’

    select [wait_type],[waiting_tasks_count]

        ,’Signal wt (T2-T1)’=[signal_wait_time_ms]

        ,’sig_wt_%’=cast (100*[signal_wait_time_ms]/@totalsignalwait as numeric(20,1))

        ,’Resource wt (T1-T0)’=[wait_time_ms]-[signal_wait_time_ms]

        ,’res_wt_%’=cast (100*([wait_time_ms] – [signal_wait_time_ms]) /@totalresourcewait as numeric(20,1))

        ,’Total wt (T2-T0)’=[wait_time_ms]

        ,’wt_%’=cast (100*[wait_time_ms]/@totalwait as numeric(20,1))

    from waitstats 

    where [wait_type] not in (‘CLR_SEMAPHORE’,’LAZYWRITER_SLEEP’,’RESOURCE_QUEUE’,’SLEEP_TASK’,

‘SLEEP_SYSTEMTASK’,’Total’,’WAITFOR’)

    and now = @now

    order by ‘sig_wt_%’ desc

—– format=detail, column order total, resource, signal.  order by total desc

    if lower(@report_order) = ‘total’

    select [wait_type],[waiting_tasks_count]

        ,’Total wt (T2-T0)’=[wait_time_ms]

        ,’wt_%’=cast (100*[wait_time_ms]/@totalwait as numeric(20,1))

        ,’Resource wt (T1-T0)’=[wait_time_ms]-[signal_wait_time_ms]

        ,’res_wt_%’=cast (100*([wait_time_ms] – [signal_wait_time_ms]) /@totalresourcewait as numeric(20,1))

        ,’Signal wt (T2-T1)’=[signal_wait_time_ms]

        ,’sig_wt_%’=cast (100*[signal_wait_time_ms]/@totalsignalwait as numeric(20,1))

    from waitstats 

    where [wait_type] not in (‘CLR_SEMAPHORE’,’LAZYWRITER_SLEEP’,’RESOURCE_QUEUE’,’SLEEP_TASK’,

‘SLEEP_SYSTEMTASK’,’Total’,’WAITFOR’)

    and now = @now

    order by ‘wt_%’ desc

end

else

—- simple format, total waits only

    select [wait_type],[wait_time_ms]

            ,percentage=cast (100*[wait_time_ms]/@totalwait as numeric(20,1))

    from waitstats 

    where [wait_type] not in (‘CLR_SEMAPHORE’,’LAZYWRITER_SLEEP’,’RESOURCE_QUEUE’,’SLEEP_TASK’,

‘SLEEP_SYSTEMTASK’,’Total’,’WAITFOR’)

    and now = @now

    order by percentage desc

—- compute cpu resource waits

select ‘total waits’=[wait_time_ms],’total signal=CPU waits’=[signal_wait_time_ms]

    ,’CPU resource waits % = signal waits / total waits’=cast (100*[signal_wait_time_ms]/[wait_time_ms] as numeric(20,1)), now

from [dbo].[waitstats]

where [wait_type] = ‘***total***’

order by now

GO

exec [dbo].[get_waitstats_2005] @report_format=’detail’,@report_order=’resource’

 

The stored procedure get_wait stats_2005 reports the wait types that are collected by track_wait stats_2005. The get_wait stats_2005 procedure can be run during the execution of track_waitstats or after track_wait stats is completed. Running get_wait stats_2005 during the execution of track_waitstats_2005 will return a report of intermediate results while running get_waitstats_2005 at the conclusion of track_waitstats_2005 will return the final wait statistics report. The report provides a detailed picture of different wait types during the time measured, and the accumulated wait time for each.

Get_waitstats_2005 reports information about waits. Total wait time is composed of resource waits and signal waits. Resource waits are computed by subtracting signal waits from total waits. Because signal waits represent the amount of time spent waiting in the runnable queue for CPU resources, they are a measure of CPU pressure. The application blueprints identify the significance CPU pressure by comparing signal waits with total waits

PS: Before you run the above SP, make sure you have wait table in the database, or you can run the below SP to create one for you. (This SP is also pulled out from MS site),

if exists (select * from sys.objects where object_id = object_id(N'[dbo].[track_waitstats_2005]’) and OBJECTPROPERTY(object_id, N’IsProcedure’) = 1)
    drop procedure [dbo].[track_waitstats_2005]
go
CREATE proc [dbo].[track_waitstats_2005] (@num_samples int=10
                                ,@delay_interval int=1
                                ,@delay_type nvarchar(10)=’minutes’
                                ,@truncate_history nvarchar(1)=’N’
                                ,@clear_waitstats nvarchar(1)=’Y’)
as

— This stored procedure is provided “AS IS” with no warranties, and confers no rights. 
— Use of included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm

— T. Davidson
— @num_samples is the number of times to capture waitstats, default is 10 times
— default delay interval is 1 minute
— delaynum is the delay interval – can be minutes or seconds
— delaytype specifies whether the delay interval is minutes or seconds
— create waitstats table if it doesn’t exist, otherwise truncate
— Revision: 4/19/05 
— (1) added object owner qualifier
— (2) optional parameters to truncate history and clear waitstats
set nocount on
if not exists (select 1 from sys.objects where object_id = object_id ( N'[dbo].[waitstats]’) and OBJECTPROPERTY(object_id, N’IsUserTable’) = 1)
    create table [dbo].[waitstats] 
        ([wait_type] nvarchar(60) not null, 
        [waiting_tasks_count] bigint not null,
        [wait_time_ms] bigint not null,
        [max_wait_time_ms] bigint not null,
        [signal_wait_time_ms] bigint not null,
        now datetime not null default getdate())
If lower(@truncate_history) not in (N’y’,N’n’)
    begin
    raiserror (‘valid @truncate_history values are ”y” or ”n”’,16,1) with nowait    
    end
If lower(@clear_waitstats) not in (N’y’,N’n’)
    begin
    raiserror (‘valid @clear_waitstats values are ”y” or ”n”’,16,1) with nowait    
    end
If lower(@truncate_history) = N’y’ 
    truncate table dbo.waitstats
If lower (@clear_waitstats) = N’y’ 
    dbcc sqlperf ([sys.dm_os_wait_stats],clear) with no_infomsgs — clear out waitstats
 
declare @i int,@delay varchar(8),@dt varchar(3), @now datetime, @totalwait numeric(20,1)
    ,@endtime datetime,@begintime datetime
    ,@hr int, @min int, @sec int
select @i = 1
select @dt = case lower(@delay_type)
    when N’minutes’ then ‘m’
    when N’minute’ then ‘m’
    when N’min’ then ‘m’
    when N’mi’ then ‘m’
    when N’n’ then ‘m’
    when N’m’ then ‘m’
    when N’seconds’ then ‘s’
    when N’second’ then ‘s’
    when N’sec’ then ‘s’
    when N’ss’ then ‘s’
    when N’s’ then ‘s’
    else @delay_type
end
if @dt not in (‘s’,’m’)
begin
    raiserror (‘delay type must be either ”seconds” or ”minutes”’,16,1) with nowait
    return
end
if @dt = ‘s’
begin
    select @sec = @delay_interval % 60, @min = cast((@delay_interval / 60) as int), @hr = cast((@min / 60) as int)
end
if @dt = ‘m’
begin
    select @sec = 0, @min = @delay_interval % 60, @hr = cast((@delay_interval / 60) as int)
end
select @delay= right(‘0’+ convert(varchar(2),@hr),2) + ‘:’ + 
    + right(‘0’+convert(varchar(2),@min),2) + ‘:’ + 
    + right(‘0’+convert(varchar(2),@sec),2)
if @hr > 23 or @min > 59 or @sec > 59
begin
    select ‘delay interval and type: ‘ + convert (varchar(10),@delay_interval) + ‘,’ + @delay_type + ‘ converts to ‘ + @delay
    raiserror (‘hh:mm:ss delay time cannot > 23:59:59’,16,1) with nowait
    return
end
while (@i <= @num_samples)
begin
            select @now = getdate()
            insert into [dbo].[waitstats] ([wait_type], [waiting_tasks_count], [wait_time_ms], [max_wait_time_ms], [signal_wait_time_ms], now)    
            select [wait_type], [waiting_tasks_count], [wait_time_ms], [max_wait_time_ms], [signal_wait_time_ms], @now
                from sys.dm_os_wait_stats
            insert into [dbo].[waitstats] ([wait_type], [waiting_tasks_count], [wait_time_ms], [max_wait_time_ms], [signal_wait_time_ms], now)    
                select ‘Total’,sum([waiting_tasks_count]), sum([wait_time_ms]), 0, sum([signal_wait_time_ms]),@now
                from [dbo].[waitstats]
                where now = @now
            select @i = @i + 1
            waitfor delay @delay
end
— create waitstats report
execute dbo.get_waitstats_2005
go
exec dbo.track_waitstats_2005 @num_samples=20
                                ,@delay_interval=30
                                ,@delay_type=’s’
                                ,@truncate_history=’y’
                                ,@clear_waitstats=’y’
go

 

The above queries  I hope will help you to identify the wait events that consuming more of your resources and contributing to the latency of your application.Wait events and Queues often go hand in hand, when we see queuing, it means that that something somewhere is waiting for some resources, and above SP’s from MS does the good job of identifying those events that are waiting for something.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s

%d bloggers like this: