Importance of IIS HTTP ERR logs in Performance Engineering

18 Feb

One among many skills required to be good Performance Engineer is to know, understand, interpret and analyze the logs of the backend servers and trace back those events to the load test executed during that known time period. Many times it so happens that in spite of having perfect script, perfect workload model, response times of the tests do not meet the service level agreements or they are just beyond the range as expected by the business owners or application developers. There could be many reasons for this, but to pin point the exact cause and come up with conclusive data, it becomes extremely necessary that we understand importance of logging and analyzing the server logs for the backend servers. So in this post I would be focusing my effort in explaining the importance of HTTP ERR logs which is the part of IIS logging.

Since IIS handles applications which are HTTP based, errors originating from the HTTP API of the IIS are logged in HTTP ERR logs. The most common types of errors logged here are network drop connections, connection dropped by the application pool, Service Unavailable errors, bad client request etc. Best part of this log is that it gives you the precise information as which URL or application pool is responsible for incorrect behavior, something which is very valuable in case if you have 10s or 100’s of sites sitting on the same IIS box or sharing the same application pool. All these information one can easily correlate with load test results to build a strong case that the application under test has some issues which needs some attention from the concerned team and may possibly have performance impact.

HttpErr logs comes with standard fields as shown below.It has all the information that requires for troubleshooting any fatal issue that occurs during performance testing. One thing I would like to mention here is that in case if you have IIS server sitting behind the  load balancers, then it might happen that the IP what you see in these logs might belong to the load balancer rather than end client users.Again this could be due to your infrastructure/environment design.

Field

Description

Date

The Date field follows the W3C format. This field is based on Coordinated Universal Time (UTC). The Date field is always ten characters in the form of YYYY-MM-DD. For example, May 1, 2003 is expressed as 2003-05-01.

Time

The Time field follows the W3C format. This field is based on UTC. The time field is always eight characters in the form of MM: HH: SS. For example, 5:30 PM (UTC) is expressed as 17:30:00.

Client IP Address

The IP address of the affected client. The value in this field can be either an IPv4 address or an IPv6 address. If the client IP address is an IPv6 address, the ScopeId field is also included in the address.

Client Port

The port number for the affected client.

Server IP Address

The IP address of the affected server. The value in this field can be either an IPv4 address or an IPv6 address. If the server IP address is an IPv6 address, the ScopeId field is also included in the address.

Server Port

The port number of the affected server.

Protocol Version

The version of the protocol that is being used.
If the connection has not been parsed sufficiently to determine the protocol version, a hyphen (0x002D) is used as a placeholder for the empty field.
If either the major version number or the minor version number that is parsed is greater than or equal to 10, the version is logged as HTTP/?

Verb

The verb state that the last request that is parsed passes. Unknown verbs are included, but any verb that is more than 255 bytes is truncated to this length. If a verb is not available, a hyphen (0x002D) is used as a placeholder for the empty field.

CookedURL + Query

The URL and any query that is associated with it are logged as one field that is separated by a question mark (0x3F). This field is truncated at its length limit of 4096 bytes.
If this URL has been parsed (“cooked”), it is logged with local code page conversion, and is treated as a Unicode field.
If this URL has not been parsed (“cooked”) at the time of logging, it is copied exactly, without any Unicode conversion.
If the HTTP API cannot parse this URL, a hyphen (0x002D) is used as a placeholder for the empty field.

Protocol Status

The protocol status cannot be greater than 999.
If the protocol status of the response to a request is available, it is logged in this field.
If the protocol status is not available, a hyphen (0x002D) is used as a placeholder for the empty field.

Site ID

Not used in this version of the HTTP API. A placeholder hyphen (0x002D) always appears in this field.

Reason Phrase

This field contains a string that identifies the type of error that is being logged. This field is never left empty.

Queue Name

This the request queue name.

Another good thing about HTTP ERR logs is that they can be queried exactly the way we query IIS Access logs by using log parser.

Below is one such example query, we can use to query HTTP ERR logs is as below,

LogParser‘SELECT TO_STRING(date,'YYYY-MM-DD'),TO_STRING
(time,'hh:mm:ss'),c-ip,c-port,s-ip,s-port,cs-version,cs-method,
cs-uri,sc-status,s-siteid,s-reason,s-queuename
 From \\YOUR PATH TO HTTP ERROR LOG’

Most often the HTTPERR logs are written in SYSTEMROOT\SYSTEM32\Log Files\HTTPERR Folder of the IIS box, however since this location is configurable  so at times we might need to confirm the exact location with Server administrators.Also in some rare cases , these logs might also have been disabled due to various reasons, so in those cases we might need to check this with Server administrators and get those enabled though I feel no one really disables it for any reason.

Once you query these logs with log parser you should be able to get the details as shown below.

Httperror1

As you can see the wealth of the information lies in the s-reason codes which helps us in identifying the issues with the application and aids a lot in troubleshooting some of the hardest performance bottlenecks.

Below is the Microsoft definition of the reason codes that comes as part of HTTP ERR logs,

AppOffline: A service unavailable error occurred (an HTTP error 503). The service is not available because application errors caused the application to be taken offline.

AppPoolTimer: A service unavailable error occurred (an HTTP error 503). The service is not available because the application pool process is too busy to handle the request.

AppShutdown : A service unavailable error occurred (an HTTP error 503). The service is not available because the application shut down automatically in response to administrator policy.

Bad Request : A parse error occurred while processing a request.

Client_Reset : The connection between the client and the server was closed before the request could be assigned to a worker process. The most common cause of this behavior is that the client prematurely closes its connection to the server.

Connection_Abandoned_By_AppPool : A worker process from the application pool has quit unexpectedly or orphaned a pending request by closing its handle.

Connection_Abandoned_By_ReqQueue : A worker process from the application pool has quit unexpectedly or orphaned a pending request by closing its handle. Specific to Windows Vista and Windows Server 2008.

Connection_Dropped : The connection between the client and the server was closed before the server could send its final response packet. The most common cause of this behavior is that the client prematurely closes its connection to the server.

ConnLimit : A service unavailable error occurred (an HTTP error 503). The service is not available because the site level connection limit has been reached or exceeded.

Connections_Refused : The kernel NonPagedPool memory has dropped below 20MB and http.sys has stopped receiving new connections.

Disabled : A service unavailable error occurred (an HTTP error 503). The service is not available because an administrator has taken the application offline.

EntityTooLarge : An entity exceeded the maximum size that is permitted.

FieldLength : A field length limit was exceeded.

Forbidden : A forbidden element or sequence was encountered while parsing.

Header : A parse error occurred in a header.

Hostname : A parse error occurred while processing a Hostname.

Internal : An internal server error occurred (an HTTP error 500).

Invalid_CR/LF : An illegal carriage return or line feed occurred.

N/A : A service unavailable error occurred (an HTTP error 503). The service is not available because an internal error (such as a memory allocation failure) occurred.

N/I : A not-implemented error occurred (an HTTP error 501), or a service unavailable error occurred (an HTTP error 503) because of an unknown transfer encoding.

Number : A parse error occurred while processing a number.

Precondition : A required precondition was missing.

QueueFull : A service unavailable error occurred (an HTTP error 503). The service is not available because the application request queue is full.

RequestLength : A request length limit was exceeded.

Timer_AppPool:The connection expired because a request waited too long in an application pool queue for a server application to de queue and process it. This timeout duration is ConnectionTimeout. By default, this value is set to two minutes.

Timer_Connection Idle :The connection expired and remains idle. The default ConnectionTimeout duration is two minutes.

Timer_EntityBody :The connection expired before the request entity body arrived. When it is clear that a request has an entity body, the HTTP API turns on the Timer_EntityBody timer. Initially, the limit of this timer is set to the ConnectionTimeout value (typically 2 minutes). Each time another data indication is received on this request, the HTTP API resets the timer to give the connection two more minutes (or whatever is specified in ConnectionTimeout).

Timer_HeaderWait :The connection expired because the header parsing for a request took more time than the default limit of two minutes.

Timer_MinBytesPerSecond :The connection expired because the client was not receiving a response at a reasonable speed. The response send rate was slower than the default of 240 bytes/sec. This can be controlled with the MinFileBytesPerSec metabase property.

Timer_ReqQueue : The connection expired because a request waited too long in an application pool queue for a server application to dequeue. This timeout duration is ConnectionTimeout. By default, this value is set to two minutes. Specific to Windows Vista and Windows Server 2008.

Timer_Response : Reserved. Not currently used.

URL : A parse error occurred while processing a URL.

URL_Length : A URL exceeded the maximum permitted size.

Verb : A parse error occurred while processing a verb.

Version_N/S : A version-not-supported error occurred (an HTTP error 505).

So next time if you are load testing any .Net Web Application installed on IIS Web Server, make sure that you ask for access to IIS HTTP ERR logs as well in addition to the IIS Access logs.It will help you identifying issues in the fraction of seconds compared to any other methods used quite often.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: